There are a few interesting questions here you might be asking:
- How can I sample from my model?
So, the best option is to reparametrize your model, which is (unfortunately!) context-dependent and maybe the hardest thing to get right. One common trick is to non-center your model: if you have something like
```python
mu = pm.Something(...)
std = pm.SomethingElse(...)
latent = pm.Normal("latent", mu, std)
```
You might change that to
```python
mu = pm.Something(...)
std = pm.SomethingElse(...)
latent_unscaled = pm.Normal("latent_unscaled", 0, 1)
latent = pm.Deterministic("latent", mu + std * latent_unscaled)
```
(similarly for other location-scale parameters).
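To make that concrete, here's a minimal runnable sketch of the non-centered version as a full (if silly) model. Everything except the non-centering trick itself (the priors, the shapes, the fake data, the `obs` likelihood) is made up purely for illustration:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
fake_y = rng.normal(1.0, 2.0, size=(8, 20))  # 8 groups, 20 fake observations each

with pm.Model():
    mu = pm.Normal("mu", 0.0, 5.0)    # group-level location
    std = pm.HalfNormal("std", 2.0)   # group-level scale
    # Draw a standard normal, then shift and scale it deterministically, so the
    # geometry the sampler sees doesn't depend on mu and std.
    latent_unscaled = pm.Normal("latent_unscaled", 0.0, 1.0, shape=8)
    latent = pm.Deterministic("latent", mu + std * latent_unscaled)
    pm.Normal("obs", latent[:, None], 1.0, observed=fake_y)
    idata = pm.sample()
```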
Another thing to look for is “heavy-tailed” parameters, like a Cauchy or a StudentT with a small number of degrees of freedom – can you replace those with Gaussians? Maybe you can’t, because then it wouldn’t be The One True Model, and you’ll need to roll your own sampler!
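As a purely illustrative swap (not anything from your model), the before/after might be as small as this:

```python
import pymc as pm

with pm.Model():
    # Heavy tails: the sampler has to chase occasional huge values of sigma
    sigma = pm.HalfCauchy("sigma", beta=1.0)

with pm.Model():
    # Gaussian tails: usually a much friendlier geometry, if your model can live with it
    sigma = pm.HalfNormal("sigma", sigma=1.0)
```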
- How can I tell why I’m getting divergences?
When you have a divergence, you just get a rejected sample, so you can see which parameter values caused the rejection. I think @aseyboldt might have hooked up some tools to also see the momentum draws that led to a divergence. Note that the trajectory is deterministic once you know the starting point and the momentum, so you could (in theory) recreate that trajectory and look at the gradients/log densities at each point. That would require a lot of code, and you’d probably just be like “boy, those gradients sure changed fast! I bet there was some intense curvature in this part of parameter space!”, then go back to the reparametrization advice above. But maybe it would be more useful! And maybe you’d share that code back, which would be neat!
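If you just want to see where the divergences landed, something like this is a decent start (assuming ArviZ and an `idata` returned by `pm.sample()`; the variable names are the ones from the sketch above):

```python
import arviz as az

# Boolean (chain, draw) mask of which transitions diverged
div_mask = idata.sample_stats["diverging"].values
print(f"{int(div_mask.sum())} divergent transitions")

# The parameter values at the divergent draws
divergent_latents = idata.posterior["latent"].values[div_mask]

# Pair plot with divergent draws highlighted; in a centered model they
# tend to pile up in the neck of the funnel (small std)
az.plot_pair(idata, var_names=["mu", "std"], divergences=True)
```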