Why would scipy/numpy wrapped in @as_op cause much faster sampling than using pytensor operations?

Do you get any BLAS warnings when you import pymc?

You may want to remove those Deterministic calls. They are only needed if you want those quantities recorded in the final trace, but they can slow down sampling considerably.