Issues with Truncated Normal

Hello,

Has anyone had an issue using a truncated normal distribution for the likelihood? I'm trying to fit a very simple model for my target variable, which has been scaled to the [0, 1] range. The version using a normal likelihood is below.

coords = {'cann': [1, 0]}
with pm.Model(coords=coords) as cannibal_model:
    cannibal = pm.Data('cannibal', cann_idx, mutable=True)
    obs = pm.Data('obs', obs_array, mutable=True)

    # beta = pm.Normal('beta', mu=0, sigma=0.1)
    alpha = pm.Normal('alpha', mu=0, sigma=0.05, dims=['cann'])

    mu = alpha[cannibal]
    sigma = pm.HalfCauchy('sigma', beta=0.1)

    eaches = pm.Normal('predicted_eaches',
                       mu=mu,
                       sigma=sigma,
                       # lower=0,
                       # upper=1,
                       observed=obs)

    idata = pm.sampling_jax.sample_numpyro_nuts(draws=1000, tune=2000, target_accept=0.95)

This gives me a great trace plot, but my posterior predictive check (PPC) produces estimates above 1.


[Image: posterior predictive plot with estimates exceeding 1]
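For reference, here is roughly how a PPC can be generated from the model above (a minimal sketch; it assumes the cannibal_model context and idata from the code block above):

import arviz as az

with cannibal_model:
    # Draw posterior predictive samples from the fitted model
    idata.extend(pm.sample_posterior_predictive(idata))

# Overlay the observed data on the posterior predictive draws
az.plot_ppc(idata, var_names=['predicted_eaches'])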

When I run the same model with a truncated normal for the likelihood, as below, the trace looks off but the PPC looks… better.


coords = {'cann': cann}
with pm.Model(coords=coords) as cannibal_model:
    cannibal = pm.Data('cannibal', cann_idx, mutable=True)
    obs = pm.Data('obs', obs_array, mutable=True)

    # beta = pm.Normal('beta', mu=0, sigma=0.1)
    alpha = pm.Normal('alpha', mu=0, sigma=0.05, dims=['cann'])

    mu = alpha[cannibal]
    sigma = pm.HalfCauchy('sigma', beta=0.1)

    eaches = pm.TruncatedNormal('predicted_eaches',
                                mu=mu,
                                sigma=sigma,
                                # lower=0,
                                upper=1,
                                observed=obs)

    idata = pm.sampling_jax.sample_numpyro_nuts(draws=1000, tune=2000, target_accept=0.95)


[Image: trace plot and posterior predictive plot for the truncated-normal model]

Is this a parameterization issue, or is this behavior characteristic of a truncated normal?

What does your data actually look like? The PPC plot with that spike at 1 is very strange, unless you have censored data rather than truncated data.

Here is a histogram of my actual data:

[Image: histogram of the actual data]

I'm not sure what you mean by censored data.

You said you scaled your data to be in the [0, 1] range; how did you do that?

The difference between censored and truncated data is illustrated here: Bayesian regression with truncated or censored data — PyMC example gallery
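In PyMC terms the two cases look roughly like this (a minimal sketch with made-up parameters): truncation renormalizes the density over the bounds, while censoring keeps the full density but records out-of-bounds draws at the bounds as point masses.

import pymc as pm

with pm.Model():
    # Truncated: values outside [0, 1] are impossible by assumption;
    # the normal density is renormalized over the interval.
    y_trunc = pm.TruncatedNormal('y_trunc', mu=0.5, sigma=0.3, lower=0, upper=1)

    # Censored: out-of-bounds values do occur, but are recorded at the
    # bounds, producing point masses at 0 and 1.
    latent = pm.Normal.dist(mu=0.5, sigma=0.3)
    y_cens = pm.Censored('y_cens', latent, lower=0, upper=1)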


I used a standard scikit-learn min/max scaler.
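For reference, MinMaxScaler maps the sample minimum to exactly 0 and the sample maximum to exactly 1, so multiple raw values sitting at a cap all land on 1 after scaling. A toy sketch with made-up values:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Several raw observations sit at a cap of 10
raw = np.array([[3.0], [7.0], [10.0], [10.0], [10.0]])

obs_array = MinMaxScaler().fit_transform(raw).ravel()
print(obs_array)  # [0.    0.5714...    1.    1.    1.]  -- spike at 1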

Oh this is interesting. I didn’t know this existed. Thank you! I will try these techniques.

Your priors look very strict given your data. The data mean is clearly at or near 1, but you specified the prior as Normal(0, 0.05). Your sigma prior may also be pretty extreme.

As you can see in the trace plot, the model is not really converging; the chains are arriving at different conclusions.
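A quick prior predictive check makes this visible (a minimal sketch, reusing the cannibal_model context from above):

import arviz as az

with cannibal_model:
    prior_idata = pm.sample_prior_predictive()

# With alpha ~ Normal(0, 0.05), essentially all prior mass for mu lies
# within about +/-0.15 of zero, far from data concentrated near 1.
az.plot_dist(prior_idata.prior['alpha'].values)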

This seems to be doing better. I changed the model to the following:

coords = {'cann': cann}
with pm.Model(coords=coords) as cannibal_model:
    cannibal = pm.Data('cannibal', cann_idx, mutable=True)
    obs = pm.Data('obs', obs_array, mutable=True)

    # beta = pm.Normal('beta', mu=0, sigma=0.1)
    alpha = pm.Normal('alpha', mu=0.99, sigma=0.07, dims=['cann'])
    sigma = pm.HalfNormal('sigma', 1)
    y_latent = pm.Normal.dist(mu=alpha[cannibal], sigma=sigma)

    eaches = pm.Censored('predicted_eaches',
                         dist=y_latent,
                         lower=0,
                         upper=1,
                         observed=obs)

    idata = pm.sampling_jax.sample_numpyro_nuts(draws=1000, tune=2000, target_accept=0.95)

[Image: trace plot for the censored model]

[Image: posterior predictive plot for the censored model]

I think this is a good base to expand out from. Thank you for the help.