Non-centered hierarchical model produces weird-looking posteriors

Hi all,

I am fitting a logistic regression model in which some parameters are modelled as random effects, using what I hope is a non-centered parameterization. I don’t get any divergences and the ESS values look OK. However, the posterior distributions of the parameters look odd to me.

Here are the relevant lines for one of the hierarchical parameters in the model:

...
# non-centered parameterization: sample a standard normal and scale it
# by the group-level sigma instead of sampling the effects directly
family_sigma = pm.HalfNormal("family_sigma", 1.5)
family_raw = pm.Normal("family_raw", 0.0, 1.0, shape=df.family.nunique())
a_family = pm.Deterministic("family", family_raw * family_sigma)

# integer index mapping each row to its family
family = pm.intX(df.family)

mu = intercept + a_family[family] + ...
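
For reference, my understanding is that this is the non-centered rewrite of the centered version below (shown only for comparison, I’m not running this):

# centered equivalent: family effects drawn directly from Normal(0, family_sigma)
a_family = pm.Normal("family", 0.0, family_sigma, shape=df.family.nunique())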

Here are some trace plots:

Firstly, I’m wondering about the posterior distribution of the sigma parameter: it no longer looks like a half-normal distribution, but I guess that could be fine since the posterior is a combination of the prior and the likelihood. Secondly, some of the distributions of the family parameters are really skewed, e.g. the blue and green ones.

My questions are: is there anything wrong with how I’m implementing the non-centered parameterization, and if not, what could be the reason for these weird distribution shapes? Is this something that can be expected when multiplying the family_raw and family_sigma parameters, or should I try to respecify the model?

Thanks in advance!

Your intuition is correct for the first question: the posterior is the normalized product of the likelihood and the prior, so you should not expect the posterior to have the same form as the prior (except in some conjugate cases).
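
To make that concrete (this is just Bayes’ theorem, nothing specific to your model):

$$
p(\theta \mid y) \;=\; \frac{p(y \mid \theta)\,p(\theta)}{p(y)} \;\propto\; p(y \mid \theta)\,p(\theta)
$$

so the half-normal prior on family_sigma only weights the posterior; the likelihood is free to pull its shape away from a half-normal.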

The skew in the posterior of some of the families may be fine, particularly if your dataset is small; it suggests some residual uncertainty about the location of those parameter values. You did not state how many tuning and post-tuning samples you drew, but make sure your Monte Carlo error is at least an order of magnitude smaller than the posterior standard deviation.
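
You can read the Monte Carlo error straight out of the summary table. A rough sketch, assuming your trace object is called trace:

import arviz as az

# mcse_mean is the Monte Carlo standard error of the posterior mean,
# sd is the posterior standard deviation; mcse_mean should be roughly
# 10x smaller than sd (or more)
summary = az.summary(trace, var_names=["family_sigma", "family"])
print(summary[["mean", "sd", "mcse_mean", "ess_bulk", "ess_tail"]])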


Thanks a lot for your reply. I ran 1000 tuning steps and 1000 sampling iterations, but will increase this. The MC error seems to be OK, and the uncertainty hypothesis makes sense to me.
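
For the longer run I’ll try something along these lines (the exact numbers and target_accept=0.9 are just my guess at more conservative settings, not something suggested above):

# hypothetical rerun with more tuning and draws, inside the model context
with model:
    trace = pm.sample(draws=2000, tune=2000, target_accept=0.9)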
