Non-centered hierarchical model with constraints

aasensioramos · September 27, 2017, 11:46am

I’m trying to fit a hierarchical model to some data where some of the variables are constrained
for physical reasons to be in an interval. The model (only the part that is relevant here)
in centered way reads something like this:

model = pm.Model()
with model:
    muO = pm.Uniform('muO',lower=8.40,upper=9.1)
    sigmaO = pm.HalfNormal('sigmaO',sd=1.)

    bounded_normal = pm.Bound(pm.Normal,lower=8.40,upper=9.1)
    AO = bounded_normal('AO', mu=muO, sd=sigmaO, shape=n)

This makes AO to be bounded variables in the physically relevant space. However, I’m facing
a problem if I want to write this in non-centered way to check whether I have the well-known
funnel (of hell, to be more dramatic as suggested by @twiecki :))

model = pm.Model()
with model:
    muO = pm.Uniform('muO',lower=8.40,upper=9.1)
    sigmaO = pm.HalfNormal('sigmaO',sd=1.)

    AO_offset = pm.Normal('AO_offset', mu=0, sd=1, shape=n)
    AO = pm.Deterministic('AO', muO + AO_offset*sigmaO)

This model does not allow to easily incorporate bounds for AO. Any idea of how to
do it efficiently?

Andrés

junpenglao · September 27, 2017, 6:38pm

hmmm, not sure what is the best way to deal with this, but I would model A on the real line and then transfer it into a bounded Deterministic:

a, b = 8.40, 9.1
with model:
    muO = pm.Normal('muO',0, 100)
    sigmaO = pm.HalfNormal('sigmaO',sd=1.)

    AO_offset = pm.Normal('AO_offset', mu=0, sd=1, shape=n)
    Am = muO + AO_offset*sigmaO
    AO = pm.Deterministic('AO', (b - a) * tt.nnet.sigmoid(Am) + a)

aseyboldt · September 27, 2017, 6:55pm

Also, you don’t actually need to reparameterize to find out if you are in trouble. If everything is fine you shouldn’t see any warnings, get gelman_rubin close to 1 and a reasonable effective sample size. On top of that you can make scatter plots, especially between scale and location parameters and see if there are correlations for funnels.

aasensioramos · September 27, 2017, 7:16pm

Neat trick. This will make my hyperparameters slightly more difficult to understand but I think this can be done.

aasensioramos · September 27, 2017, 7:18pm

The point of this is that, even though the gelman_rubin test is close to 1, I’m watching some indications in sigmaO of lack of convergence, which is exactly the behavior found by @twiecki in http://twiecki.github.io/blog/2017/02/08/bayesian-hierchical-non-centered/.

twiecki · September 28, 2017, 8:57am

Would be cool if the Interval transform could be used here. It should definitely be possible but the API is not really tuned for it.

twiecki · September 28, 2017, 6:35pm

Actually, there is: http://docs.pymc.io/api/bounds.html?highlight=bounded

aasensioramos · September 28, 2017, 7:58pm

I can apply Bound to a distribution, but not to a Deterministic, right?

junpenglao · September 29, 2017, 7:27am

You are correct. You can bound a deterministic with tt.clip but it mess up the gradient at the boundaries. Other option is to used tt.switch for the value outside of your bound.

aasensioramos · September 29, 2017, 11:37am

A first preliminary test with tt.clip looks promising. Thanks!

Maria · October 17, 2018, 11:21pm

I encountered the same problem. Would it be possible to set the bounds on the offset variable? E.g., for variable a that is bounded between 0 and 1, could you do:

mu_a = pm.Bound(pm.Normal, lower=0, upper=1)('mu_a', mu=0.5, sd=0.5)
sd_a = pm.HalfNormal('sd_a', sd=0.5)
offset_a = pm.Bound(pm.Normal, lower=-mu_a/sd_a, upper=(1-mu_a)/sd_a)(
                    'offset_a', mu=0, sd=1, shape=n)

a = pm.Deterministic('a', mu_a + offset_a * sd_a)

Thanks!

junpenglao · October 18, 2018, 5:34am

You can do it this way, but I wont recommend it - you should model the random variable on the Real line and use logistic transformation instead.

Olaf · September 30, 2020, 11:38pm

I try to model standard deviation, which obviously must be positive, but in the centered parameterization I get a handful of divergences. I tried to naively re-parameterize with sigma = pm.Deterministic('sigma', global_sigma + offset * global_sigma_sd), and got a Bad Initial Energy error when sampling.
then in order to put a lower bound on the deterministic variable I tried with sigma.clip(0.0, np.inf) and tt.switch(sigma<0, 0.0, sigma), but again I always get a Bad Initial Energy error when sampling.
When instead of a 0-lower bound I tried 1e-15, with clip I get:

ValueError: Mass matrix contains zeros on the diagonal.

and with tt.switch:

The chain contains only diverging samples. The model is probably misspecified.

I tested with a latent variable:

offset = pm.Normal('offset', mu=0, sigma=1.0)
b_offset = pm.Bound(pm.Normal, lower=0.0)('bound_offset', mu=global_sigma + offset, sigma=1)
sigma = pm.Deterministic('sigma', b_offset * global_sigma_sd)

which gave no divergences and took 2-4 times longer to sample, but a + b * c != (a + b) * c, so that doesn’t seem like a good idea.
How can this be solved? Is the best option to transform the data, e.g. square demeaned data and log-transform and then model mu instead of sigma?
This is related to this ongoing struggle.

Topic		Replies	Views
Non-centered multivariate normal Questions	7	1280	January 18, 2024
Non-centered parametrisation of bounded variables version agnostic modeling	4	316	January 31, 2024
Non-centered parameterization of a beta distribution? Questions	12	2753	March 22, 2021
Hierarchy on Half Normal Distribution v5 modeling , hierarchical	7	126	July 10, 2024
Converting Multi-Level Hierarchical GLM to Non-Centered Form v5 development , modeling	0	392	July 12, 2022

Non-centered hierarchical model with constraints

Related topics