A typical suggestion I see for model specification is to switch between centered and non-centered parameterisations for scale parameters; see, e.g., https://twiecki.io/blog/2017/02/08/bayesian-hierchical-non-centered/
The intuition I have behind this is that, in the centered model, the posterior geometry looks very different for different values of the scale parameter, which makes it a nightmare to sample from.
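To make that intuition concrete, here is a minimal sketch in plain Python (standard library only). It is not taken from the linked post: the HalfNormal(1) prior on the scale, the zero mean, and the names `tau`, `z` and `theta` are assumptions chosen for illustration. It draws from the prior only, and shows that the spread of `theta` depends strongly on `tau`, while the spread of the standardised variable `z` does not.

```python
import math
import random


def non_centered_draws(n, rng):
    """Draw (tau, z, theta) with theta = tau * z.

    Assumed toy model: tau ~ HalfNormal(1), z ~ Normal(0, 1).
    In distribution this matches the centered form theta ~ Normal(0, tau),
    but a sampler working on (tau, z) sees z with the same scale for every tau.
    """
    draws = []
    for _ in range(n):
        tau = abs(rng.gauss(0.0, 1.0))  # HalfNormal(1) prior on the scale (assumption)
        z = rng.gauss(0.0, 1.0)         # standardised offset, independent of tau
        draws.append((tau, z, tau * z))
    return draws


def spread(values):
    """Population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def spread_by_tau(draws, index, threshold=0.5):
    """Spread of column `index` for draws with tau below / at or above `threshold`."""
    small = [d[index] for d in draws if d[0] < threshold]
    large = [d[index] for d in draws if d[0] >= threshold]
    return spread(small), spread(large)


draws = non_centered_draws(20000, random.Random(0))
print("theta spread (small tau, large tau):", spread_by_tau(draws, index=2))
print("z spread     (small tau, large tau):", spread_by_tau(draws, index=1))
```

The `theta` column is what a centered sampler has to explore, so its step size would need to change with `tau`; the `z` column is what a non-centered sampler explores. Whether the same picture holds once data are conditioned on depends on how informative the data are.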
I basically have the same issue, but I’m using a negative binomial output distribution with an unknown dispersion parameter. I place a prior over this dispersion, similar to placing a prior over a scale parameter, and I suspect that the posterior looks very different for different values of this parameter, which causes sampling problems. For example, I noticed an approximately 2x speed-up when I changed my model to use a fixed dispersion instead of a prior over the dispersion.
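As a rough illustration of why the dispersion might behave like a scale parameter, here is a small standard-library sketch. It assumes the mean/dispersion parameterisation in which the variance is mu + mu^2 / alpha (so a small `alpha` means heavy overdispersion); the model in question may use a different convention, and `nb_logpmf`, `nb_variance` and the example values are hypothetical names and numbers for illustration only.

```python
import math


def nb_logpmf(y, mu, alpha):
    """Log pmf of a negative binomial with mean `mu` and dispersion `alpha`.

    Assumed parameterisation: Var[y] = mu + mu**2 / alpha.
    """
    return (
        math.lgamma(y + alpha) - math.lgamma(alpha) - math.lgamma(y + 1)
        + alpha * math.log(alpha / (alpha + mu))
        + y * math.log(mu / (alpha + mu))
    )


def nb_variance(mu, alpha):
    """Variance implied by the (mu, alpha) parameterisation above."""
    return mu + mu ** 2 / alpha


# Same mean, very different likelihood widths as alpha changes.
for alpha in (0.5, 5.0, 50.0):
    sd = math.sqrt(nb_variance(10.0, alpha))
    print(f"alpha={alpha:5.1f}  sd of y={sd:6.2f}  log p(y=30)={nb_logpmf(30, 10.0, alpha):8.3f}")
```

With the mean held fixed, the width of the likelihood (and so how tightly the data constrain the other parameters) changes a lot with `alpha`, which is at least consistent with the posterior geometry varying across dispersion values. This does not by itself show that the dispersion is the cause of the slowdown.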
I’m wondering whether this is caused by an issue similar to the one with centered parameterisations of scale parameters, or whether the prior I place on the dispersion is ‘poor’ in some respect.
Thanks