# Non-centered parametrisation of bounded variables

Hi,

I am trying to estimate parameters for a hierarchical model with two parameters (alpha and beta) for each of n subjects.
I have a ground truth (simulated) dataset with alphas ~ Uniform(0,1) and betas ~ Gamma(5,1) and I know that alpha’s range is [0,1], while beta’s range is [0,+inf].
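For reference, the ground-truth parameters can be simulated along these lines (a sketch; `n_subj` and the seed are illustrative, not the values from my actual experiment):

```python
import numpy as np

rng = np.random.default_rng(42)
n_subj = 50  # illustrative number of subjects

# per-subject ground-truth parameters
alphas_true = rng.uniform(0.0, 1.0, size=n_subj)           # alpha in [0, 1]
betas_true = rng.gamma(shape=5.0, scale=1.0, size=n_subj)  # beta in [0, +inf)
```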

Centered parametrisation of the parameters was giving poor estimates so I’ve been looking into non-centered parametrisation.

I’ve found an example Stan implementation of a model similar to mine, which uses Stan’s constrained real variables (alpha_b, beta_a, beta_b > 0, beta > 0) and Phi_approx to restrict the alphas to [0,1].
Alpha’s non-centered parametrisation has location ~ Normal(0,3) and scale ~ Cauchy(0,5), while beta’s parametrisation is drawn from a Gamma distribution with shape and scale both drawn from Normal(1,5).
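Written out, the alpha part of that parametrisation is just a non-centered normal pushed through the normal CDF (numpy sketch; Stan’s Phi_approx is a fast logistic approximation to the same CDF, and the hyperparameter values here are illustrative):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0      # location and scale hyperparameters (illustrative)
z = rng.normal(size=5)    # standardised per-subject offsets, z ~ Normal(0, 1)

# non-centered alpha: unconstrained normal mapped into (0, 1) by the normal CDF
alpha = norm.cdf(mu + sigma * z)
```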

I’ve tried to re-implement this in PyMC with the following code, but the estimates are not very good.

```python
# alpha priors
alpha_a = pm.Normal('alpha_a', mu=0, sigma=3)
alpha_b = pm.Cauchy('alpha_b', alpha=0, beta=5)
alpha_raw = pm.Normal('alpha_raw', mu=0, sigma=1, shape=n_subj)

# beta priors
beta_a = pm.Normal('beta_a', mu=1, sigma=5)
beta_b = pm.Normal('beta_b', mu=1, sigma=5)

# non-centered parametrisation
alpha = pm.Deterministic('alpha', pm.math.invprobit(alpha_a + alpha_b * alpha_raw))
beta = pm.Gamma('beta', beta_a, beta_b, shape=n_subj)
```

The estimated alphas and betas are all within a very small range (alphas are tightly packed around 0.5, betas all around 3.5).

Does this non-centered parametrisation make sense, or are there better ways to do this for bounded variables?
Thanks!
Filippo

I’ve tried to restrict the problem to only estimate alpha, using a different non-centered parametrisation that has been employed in similar problems and models (in particular in the hBayesDM library):

```
alpha_a ~ Normal(0,1)
alpha_b ~ HalfNormal(0,0.2)
alpha_raw ~ Normal(0,1)
alpha = InverseProbit(alpha_a + alpha_b * alpha_raw)
```

Despite the changes, I’m still having issues with this parametrisation.
Compared to the original parametrisation there are no divergences anymore, but the sampler consistently estimates alpha_a to be within [-3,-2], which causes the InverseProbit to push the values of alpha very close to 0.
I tried varying the priors for alpha_a and alpha_b, but the sampler keeps pushing alpha_a to negative values.
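As a quick sanity check on that squashing effect (scipy used here just to evaluate the normal CDF):

```python
from scipy.stats import norm

# the InverseProbit is the standard normal CDF; for alpha_a in [-3, -2]
# (and a small scale) alpha ends up in roughly [0.001, 0.023]
lo = norm.cdf(-3.0)  # ~0.00135
hi = norm.cdf(-2.0)  # ~0.02275
```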

I’ve included an example of the code in this notebook.

I’m not sure how to interpret the pair plots of alpha against alpha_a and alpha_b.
Do these make sense or is there some (maybe obvious) problem with them?
Thanks!
Filippo

Hi,

I’m wondering if anyone could provide any help with this, in particular with the following questions:

• should the CustomDist `logp` return a single value or the likelihood for each subject? In the linked notebook, the logp function computes the choice probabilities for each subject using `pytensor.scan` and then returns a vector of length n_subj with the log-likelihood for each subject.
• are the choices and the outcomes passed correctly to the `logp` function? They are merged into a single `observed` tensor which is later “unpacked” inside the `logp` function.
• I’ve attached the pair plots between the non-centered location (alpha_a) and scale (alpha_b) parameters and alpha itself. I’m not sure how to interpret these plots and whether there is something obviously wrong (and hopefully a fix for it).
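On the second bullet, the pack/unpack pattern is roughly the following (numpy sketch; the actual model does the unpacking on pytensor tensors, and the array shapes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_subj, n_trials = 3, 5
choices = rng.integers(0, 2, size=(n_subj, n_trials))  # binary choices
outcomes = rng.normal(size=(n_subj, n_trials))         # trial outcomes

# merge into a single "observed" tensor along a new trailing axis
packed = np.stack([choices, outcomes], axis=-1)

# ...and unpack again inside logp
choices_u = packed[..., 0].astype(int)
outcomes_u = packed[..., 1]
```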

I believe that the `logp` function is correct: I have tested it and the resulting log-likelihoods are the same as those of a non-vectorised implementation.
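The kind of check I ran, with a toy Gaussian likelihood standing in for the actual model (the likelihood itself is illustrative, not the task’s):

```python
import numpy as np

def loglik_loop(params, data):
    # reference implementation: one subject at a time
    return np.array([-0.5 * np.sum((d - p) ** 2) for p, d in zip(params, data)])

def loglik_vec(params, data):
    # vectorised implementation: all subjects at once
    return -0.5 * np.sum((data - params[:, None]) ** 2, axis=1)

rng = np.random.default_rng(2)
params = rng.normal(size=4)
data = rng.normal(size=(4, 10))
same = np.allclose(loglik_loop(params, data), loglik_vec(params, data))
```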

Any help is greatly appreciated.

Thanks!
Filippo

Traces where alpha_a becomes very negative, resulting in alpha being close to 0.

location alpha_a vs alpha

scale alpha_b vs alpha

Doesn’t matter, except for model comparison, where you would want the probability per subject or even per trial.

This should be fine.

Your model could still be unidentified or have a bug. You don’t have that many subjects; does it get better if you double them? Also, maybe try setting `alpha_a` to its true value: does that at least converge to the correct estimates? If not, I suspect an error in the logp.

Hi Ricardo,

I’ve tested the models with 10, 50, and 500 subjects and with 50, 200, and 500 trials, and the behaviour is always the same.

I changed the ground truth alphas to be sampled from a Normal(0.6, 0.1) and set `alpha_a = 0.6`.
In this case the estimates for `alpha_b` shift into the [2,2.5] range.

The model should be able to work correctly with this task/setup (my model is almost a 1:1 conversion of a Stan model which works fine on the same task).
At this point it must be a bug in the `logp`, but I’m not sure what it could be: the `logp` gives the same NLLs as a non-vectorised implementation when run “deterministically” with fixed parameters.

Thanks for the help!