Estimate total time when using pymc to sample

As you’ve found, MH is a very simple algorithm that does exactly what it says: tune for X steps, then draw Y times. It evaluates the logp exactly once per draw to work out that draw’s acceptance probability.
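To make the "one evaluation per draw" point concrete, here is a toy random-walk Metropolis sampler in pure Python that counts logp calls (an illustrative sketch, not PyMC's implementation). The current point's logp is cached, so only the proposal needs a fresh evaluation each step:

```python
import math
import random

def metropolis(logp, start, tune, draws, scale=1.0, seed=0):
    """Toy random-walk Metropolis sampler that counts logp evaluations."""
    rng = random.Random(seed)
    n_evals = 0

    def logp_counted(x):
        nonlocal n_evals
        n_evals += 1
        return logp(x)

    x = start
    logp_x = logp_counted(x)  # one evaluation to initialize
    samples = []
    for step in range(tune + draws):
        proposal = x + rng.gauss(0.0, scale)
        logp_prop = logp_counted(proposal)  # exactly one evaluation per step
        # Accept with probability min(1, p(proposal)/p(x))
        if math.log(rng.random()) < logp_prop - logp_x:
            x, logp_x = proposal, logp_prop
        if step >= tune:
            samples.append(x)
    return samples, n_evals

# Standard-normal logp, up to an additive constant
samples, n_evals = metropolis(lambda x: -0.5 * x * x, start=0.0, tune=500, draws=1000)
print(n_evals)  # 1 initial + (tune + draws) = 1501
```

Note that tuning steps cost exactly as much as kept draws, so both count toward total runtime.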

pm.sample assigns step methods based on the variables in your model and whether gradient information is available. Continuous variables with gradients get NUTS; continuous variables without gradients get Slice. Discrete variables get one of CategoricalGibbsMetropolis, BinaryGibbsMetropolis, or plain Metropolis. Actually I’m surprised your model gave you Metropolis instead of Slice – usually Metropolis is the final fallback when nothing else applies. Maybe it’s because of the potential?
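The assignment rules above can be sketched as a small decision function. This is a deliberately simplified model of the behavior, not PyMC's actual code (the real logic lives inside pm.sample and handles more cases); only the step-method names mirror PyMC's classes:

```python
def assign_step_method(is_discrete, has_gradient, dtype=None):
    """Simplified sketch of pm.sample's default step-method assignment.

    is_discrete / has_gradient / dtype are hypothetical flags standing in
    for the properties pm.sample inspects on each free variable.
    """
    if not is_discrete:
        # Continuous: NUTS if gradients are available, otherwise Slice
        return "NUTS" if has_gradient else "Slice"
    if dtype == "bool":
        return "BinaryGibbsMetropolis"
    if dtype == "categorical":
        return "CategoricalGibbsMetropolis"
    # Final fallback for other discrete variables
    return "Metropolis"

print(assign_step_method(is_discrete=False, has_gradient=True))                # NUTS
print(assign_step_method(is_discrete=True, has_gradient=False, dtype="bool"))  # BinaryGibbsMetropolis
```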

Anyway, the number of logp evaluations per draw varies by sampler. NUTS can use a large number (including gradient evaluations) to generate a single proposal, while Metropolis uses exactly one evaluation per proposal. You’ve also stumbled onto why it’s not advised to use wall-clock time to compare the speed of different samplers; see here for some related discussion.
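If you still want a rough time estimate, you can time one logp evaluation and multiply. A back-of-envelope sketch (`estimate_sampling_time` is a hypothetical helper, not a PyMC API; the costs below are made-up numbers). For Metropolis, evaluations per draw is exactly 1; for NUTS, PyMC caps each draw at 2**max_treedepth leapfrog steps (default max_treedepth=10), each needing a logp-and-gradient evaluation, so plugging that in gives a loose upper bound:

```python
def estimate_sampling_time(logp_cost_s, tune, draws, chains, evals_per_draw):
    """Back-of-envelope wall-time estimate: total evaluations times unit cost.

    Tuning steps cost the same per step as kept draws, so both count.
    """
    return (tune + draws) * chains * evals_per_draw * logp_cost_s

# Metropolis: one logp evaluation per proposed step.
mh_time = estimate_sampling_time(
    logp_cost_s=1e-4, tune=1000, draws=1000, chains=4, evals_per_draw=1
)

# NUTS: at most 2**max_treedepth leapfrog steps per draw -> upper bound.
nuts_time_upper = estimate_sampling_time(
    logp_cost_s=1e-4, tune=1000, draws=1000, chains=4, evals_per_draw=2**10
)

print(f"Metropolis ~{mh_time:.1f} s; NUTS at most ~{nuts_time_upper:.1f} s")
```

In practice NUTS usually terminates well before the tree-depth cap, and chains run in parallel, so treat both numbers as rough bounds rather than predictions.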