How to improve sampling or re-parametrize model

Models with linear combinations of random variables are going to have identification problems, because there are infinitely many combinations of q90_stress and sigma_stress that can produce a given value of mu_stress. This blog post by Michael Betancourt has more information than you’d ever like to know on the subject. This is a hopeless situation in frequentist modeling, but in the Bayesian framework you can use priors to pin down a subset of the probability space to focus on, which can sometimes resolve the situation.
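To make the degeneracy concrete, here’s a toy numpy sketch. The additive relationship below is a hypothetical stand-in (I don’t know your exact likelihood), but the point carries over: any change in log(q90_stress) can be exactly absorbed by a change in sigma_stress, so the data alone can’t tell the two apart.

```python
import numpy as np

# Hypothetical stand-in for the model's linear combination:
# mu_stress = log(q90_stress) + c * sigma_stress for some constant c
# (here c is the z-score linking a 90th percentile to a mean).
c = 1.2816

def mu_stress(q90_stress, sigma_stress):
    return np.log(q90_stress) + c * sigma_stress

# Two very different parameter pairs...
mu_a = mu_stress(q90_stress=2.0, sigma_stress=1.0)
mu_b = mu_stress(q90_stress=2.0 * np.exp(c), sigma_stress=0.0)

# ...produce the identical mu_stress, so the likelihood cannot
# distinguish them -- only priors can.
assert np.isclose(mu_a, mu_b)
```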

My go-to diagnostic in these situations, after seeing the awful trace and energy plot, is to check the pair plots. Here’s what I got when I ran your model (with the modification suggested by @cluhmann):

As you can see, all of the divergences occur when sigma_stress is sufficiently small. You can also kind of see an elliptical cloud between mu_stress of -2 and -4 in the lower-right plot, with a long, thin tail after that. I might think about focusing on this region, since you have a “degree of freedom” to decide which linear combinations of the variables to consider. It seems like that region maps to sigma_stress between roughly 0.5 and 2, and you might even have some domain knowledge that makes those values more “reasonable”.
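For reference, the pair plot came from something like the ArviZ call below (variable names assumed from your model). The `divergences=True` flag overlays the divergent draws on the scatter, which is how the “divergences cluster at small sigma_stress” pattern shows up. Here I build a stand-in posterior with `az.from_dict` just so the snippet is self-contained; in practice you’d pass the InferenceData returned by `pm.sample()`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import numpy as np
import arviz as az

# Stand-in posterior with the same variable names as the model.
rng = np.random.default_rng(0)
idata = az.from_dict(
    posterior={
        "mu_stress": rng.normal(-3.0, 0.5, size=(4, 500)),
        "sigma_stress": rng.gamma(2.0, 1.0, size=(4, 500)),
    },
    sample_stats={"diverging": rng.random(size=(4, 500)) < 0.05},
)

# divergences=True marks the divergent draws so you can see
# where in parameter space they concentrate.
axes = az.plot_pair(
    idata,
    var_names=["mu_stress", "sigma_stress"],
    divergences=True,
)
```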

Some other assorted thoughts:

  1. You don’t actually care about q90_stress; the value you use in your model is log(q90_stress). It might make sense to model the log value directly on \mathbb{R}, then exponentiate it after modeling if you want to reason about q90_stress itself.
  2. Normal distributions, including truncated and half-normal variants, put a lot of probability mass right at zero. In cases like yours, where small values cause problems, I like to use Gamma(2, b), because it has no density at zero and its mode at 1/b.
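As a quick numerical check of point 2 (plain numpy, no PyMC needed): the Gamma(2, b) density is b²·x·e^(-bx), which is exactly zero at x = 0 and peaks at 1/b, so the sampler isn’t drawn into the problematic sigma_stress ≈ 0 region the way it is with a half-normal.

```python
import numpy as np

def gamma2_pdf(x, b):
    """Density of Gamma(alpha=2, rate=b): b^2 * x * exp(-b * x)."""
    return b**2 * x * np.exp(-b * x)

b = 1.0  # puts the mode at 1/b = 1.0, comfortably away from zero
x = np.linspace(0.0, 10.0, 100_001)
density = gamma2_pdf(x, b)

print(density[0])             # 0.0 -- no mass piled up at zero
print(x[np.argmax(density)])  # ~1.0 -- the mode sits at 1/b
```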

Anyway, you can try tinkering with these things. It might also be worth trying pm.sample_smc, a gradient-free sampler that progressively refines an estimated posterior via importance sampling. It’s true that NUTS is SOTA, but if you have a really degenerate posterior geometry (i.e. a straight line, because of the linear dependence between variables), the Hamiltonian simulation NUTS needs might be too hard to do, and SMC is a good second best.

The best thing, though, would be to get some more information, like some features of the devices, that would help you identify the linear function at the heart of your model.
