Eight school problem with student t distribution for treatment effects

Hello,

Thanks a lot for your help.

What did the trick for me was reparametrizing the Student-t as explained here.

All the main parameters now have an r_hat of 1.00, and only a few experiments (among 600) have an r_hat of 1.01.
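For reference, the scale-mixture identity behind that reparametrization — if tau_prime ~ Gamma(nu/2, nu/2) and alpha ~ N(0, 1), then alpha / sqrt(tau_prime) follows a Student-t with nu degrees of freedom — can be checked numerically. A standalone sketch with numpy/scipy (the seed, nu, and sample size are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu = 4.0
n = 200_000

# tau_prime ~ Gamma(nu/2, rate=nu/2); numpy's gamma is parametrized by scale, so scale = 2/nu
tau_prime = rng.gamma(shape=nu / 2, scale=2 / nu, size=n)
alpha = rng.standard_normal(n)
samples = alpha / np.sqrt(tau_prime)

# Kolmogorov-Smirnov test against a Student-t with nu degrees of freedom
ks = stats.kstest(samples, stats.t(df=nu).cdf)
print(ks.statistic)  # close to zero: samples are consistent with t(nu)
```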

The code, translated into eight-schools coordinates, now looks like:

import numpy as np
import pymc as pm
import arviz as az

J = 8
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

with pm.Model() as NonCentered_eight:
    # base parameters
    mu = pm.Normal("mu", mu=0, sigma=5)
    tau = pm.HalfCauchy("tau", beta=5)
    # Student-t reparametrization for theta_tilde:
    # alpha / sqrt(tau_prime) ~ StudentT(nu) when tau_prime ~ Gamma(nu/2, nu/2)
    nu = pm.Gamma("nu", 2, 0.1)
    half_nu = pm.Deterministic("half_nu", 0.5 * nu)
    tau_prime = pm.Gamma("tau_prime", half_nu, half_nu)
    alpha = pm.Normal("alpha", 0, 1, shape=y.shape)
    theta_tilde = pm.Deterministic("theta_tilde", alpha / pm.math.sqrt(tau_prime))
    # non-centered parametrization
    theta = pm.Deterministic("theta", mu + tau * theta_tilde)
    obs = pm.Normal("obs", mu=theta, sigma=sigma, observed=y)
    prior = pm.sample_prior_predictive(samples=10000)
    trace = pm.sampling_jax.sample_numpyro_nuts(tune=40000, target_accept=0.99)
    posterior_predictive = pm.sample_posterior_predictive(trace)

model_summary = az.summary(trace)
model_summary

However, posterior predictive checks (on my real-life use case) show that my model underestimates the mass at or around zero.

Indeed, my actual distribution has both a peak around zero (like a normal distribution) and fat tails (like a Student-t with few degrees of freedom), whereas in the Student-t there is a trade-off between those two features (i.e. the fatter the tails, the lower the mass around zero).
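For the record, that trade-off is easy to see with scipy (unit-scale densities, separate from the model above):

```python
from scipy import stats

# at fixed unit scale, lowering nu fattens the tails but also lowers the peak at zero
for df in (1, 2, 5, 30):
    d = stats.t(df=df)
    print(f"nu={df}: density at 0 = {d.pdf(0):.4f}, P(X > 3) = {d.sf(3):.4f}")
```

Rescaling the Student-t to raise its peak shrinks the tails by the same factor, so a single Student-t cannot have both.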

Thus, I may try a mixture model (point-normal or point-Student-t) in the coming days.
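As a quick sanity check that a mixture can indeed have both a sharp peak and fat tails (pure scipy; the 0.6/0.4 weights and the scales are made up for illustration, not fitted to anything):

```python
from scipy import stats

def mix_pdf(x):
    # 60% narrow normal (mass near zero) + 40% fat-tailed Student-t
    return 0.6 * stats.norm(scale=0.5).pdf(x) + 0.4 * stats.t(df=2, scale=2).pdf(x)

def mix_sf(x):
    # mixture survival function P(X > x)
    return 0.6 * stats.norm(scale=0.5).sf(x) + 0.4 * stats.t(df=2, scale=2).sf(x)

ref = stats.norm()  # standard normal as a reference
print(mix_pdf(0.0), ref.pdf(0.0))  # the mixture has a higher peak at zero...
print(mix_sf(4.0), ref.sf(4.0))    # ...and much fatter tails than the normal
```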

That’ll be another topic.

Best regards,

Timothée