Any possible ways to make pm.sample() faster?

It’s true that one needs to be much more specific about exactly what kind of speed is desired. I can give you a zillion samples instantly if you’re happy with them all being $\theta = -7.1285$. But the speed seen in simpler sampling algorithms is often illusory: more samples per unit time, but less information per sample (that is, a lower effective sample size), and often less information per unit time. There are certainly situations in which it makes sense to strategically select sampling algorithms, but it should be done with caution and is generally not a profitable way to find “speed”.
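To make the "less information per sample" point concrete, here is a rough sketch in plain Python. It is not PyMC code: the AR(1) chain is only my stand-in for a cheap, highly autocorrelated sampler, and `effective_sample_size` is a crude hypothetical estimator written for this post (in practice you would use ArviZ's diagnostics instead).

```python
import random


def effective_sample_size(draws, max_lag=500):
    """Crude ESS estimate: n / (1 + 2 * sum of the leading positive autocorrelations)."""
    n = len(draws)
    mean = sum(draws) / n
    centered = [d - mean for d in draws]
    var = sum(c * c for c in centered) / n
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1)):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:  # stop at the first non-positive autocorrelation
            break
        tau += 2.0 * rho
    return n / tau


def ar1_chain(n, rho, rng):
    """Draws with a standard normal marginal but lag-1 correlation rho (assumed toy sampler)."""
    scale = (1.0 - rho**2) ** 0.5
    chain = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        chain.append(rho * chain[-1] + scale * rng.gauss(0.0, 1.0))
    return chain


rng = random.Random(0)
n = 4000

independent = [rng.gauss(0.0, 1.0) for _ in range(n)]  # idealised sampler
correlated = ar1_chain(n, 0.95, rng)                   # "fast but sticky" sampler

ess_independent = effective_sample_size(independent)
ess_correlated = effective_sample_size(correlated)

print(f"independent draws: n={n}, ESS is roughly {ess_independent:.0f}")
print(f"correlated draws:  n={n}, ESS is roughly {ess_correlated:.0f}")
```

Both lists contain the same number of draws, but the correlated one is worth far fewer independent draws. A sampler that produces draws like that would have to be many times faster per draw just to break even.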

Apologies if my initial comment was blunt. I was trying to be quick, but it may have come off as dismissive. My general advice is to (almost) never specify step functions by hand, because doing so often hurts performance relative to the step functions PyMC selects by default (which is why it seemed relevant to the original request).