Latest pymc version autoassigns Metropolis step to continuous variable

I updated pymc to the latest version and ran the model in this gist (should run as is). Unexpectedly, the Metropolis step is chosen for word_order_prior, learning_weights, and softmax_alpha:

This happens even though all three have continuous distributions, and a previous version of pymc v5 automatically used HMC for them. Did something change that is causing this (unexpected to me) behaviour? Thanks for the help.

If it helps, here’s the output of:

import sys

# Print the version of every imported module that exposes one
for name, module in sorted(sys.modules.items()):
    if hasattr(module, '__version__'):
        print(f"{name}: {module.__version__}")

What happens if you call model.dlogp()?

I get a NotImplementedError. So I guess one of the operators I am using was implemented in the previous PyMC version but is not in the new one. How could I find which one it is?

I guess for now the easiest option is to use the old version - I am glad this is not pointing to a deeper problem with my model!

You can find which one is problematic with model.dlogp(vars=[x]), picking one at a time. If you can share the model (or a smaller version of the model) I can also have a look.
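The one-at-a-time check could be wrapped in a small loop. This is only a sketch: the helper name find_vars_without_gradient is made up for illustration, and it assumes the failure surfaces as a NotImplementedError when model.dlogp(vars=[x]) is called, as reported above.

```python
def find_vars_without_gradient(model, rvs):
    """Return {variable name: error message} for each variable whose dlogp fails."""
    failures = {}
    for rv in rvs:
        try:
            # Ask for the gradient of the model logp w.r.t. this single variable.
            model.dlogp(vars=[rv])
        except NotImplementedError as err:
            failures[getattr(rv, "name", str(rv))] = str(err) or "NotImplementedError"
    return failures


# Hypothetical usage, assuming `model` is your pm.Model:
# print(find_vars_without_gradient(model, model.free_RVs))
```

Any variable that shows up in the returned dict depends on at least one operation with no gradient, which would explain the fallback to Metropolis for it.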

Thanks! learning_weights, softmax_alpha, and word_order_prior all raise the NotImplementedError, while hyper_gammas doesn’t (or did you mean I should check something else?).

Unfortunately, the version I posted on the gist is already the stripped-down version of the model. It is a pretty complicated model, especially with the big scan operation. To be honest, it's been a bit of a nightmare to make this run on the real dataset (which is pretty large). I tried a bunch of things to make it run better, as I documented in this post. Eventually I decided to go brute force and fit it with plain pm.sample; it took about 1000 hours, and convergence was still very poor. This is despite extensive parameter recovery simulations (on smaller simulated datasets) working well.

I’d debug this by drilling down into each component function and making sure it’s differentiable with respect to all its (non-data) inputs. For example, to test probs_languages_to_probs_scenes_multiple_participants, I’d do something like this:

interpretation_weights = pt.tensor('interpretation_weights', shape=(1, None, None, None), dtype='float64')
word_order_weights = pt.tensor('word_order_weights', shape=(1, 1, 1, 1), dtype='float64')
signals = pt.tensor('signals', shape=(None, None, None), dtype='int64')
scenes_trials = pt.tensor('scenes_trials', shape=(1, None, None, None), dtype='int64')
word_orders = pt.tensor('word_orders', shape=(None, None), dtype='int64')
softmax_alphas = pt.tensor('softmax_alphas', shape=(None, None), dtype='float64')

output = probs_languages_to_probs_scenes_multiple_participants(interpretation_weights,

That is, create symbolic tensor variables for each input (with the expected shapes and dtypes), then test pytensor.grad(output.sum(), input) for each input. If you hit a NotImplementedError, the message will tell you more about what is breaking the flow of gradients in the model. At that point I’d drill down again and test every individual computation inside the offending function to see exactly which operation doesn’t have a gradient.
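As a rough sketch of that per-input test, the loop below tries the gradient for each named input and collects whatever error comes back. The helper name check_gradients and its grad_fn parameter are made up for illustration; by default it falls back to pytensor.grad(cost, wrt).

```python
def check_gradients(cost, inputs, grad_fn=None):
    """Try grad_fn(cost, inp) for every named input; return {name: error} for failures."""
    if grad_fn is None:
        import pytensor  # imported lazily, only when no grad_fn is supplied
        grad_fn = pytensor.grad
    failures = {}
    for name, inp in inputs.items():
        try:
            grad_fn(cost, inp)
        except Exception as err:
            # Keep the exception type: it hints at *why* the gradient is missing.
            failures[name] = f"{type(err).__name__}: {err}"
    return failures


# Hypothetical usage with the symbolic inputs defined above
# (only the float-valued, non-data inputs need a gradient):
# print(check_gradients(output.sum(), {
#     "interpretation_weights": interpretation_weights,
#     "word_order_weights": word_order_weights,
#     "softmax_alphas": softmax_alphas,
# }))
```

An empty dict means every listed input is differentiable through that function, so the problem lies in another component.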