Maybe the sigma parameter of your mu is causing the issue? A 0.01 SD parameter for a Gaussian is quite narrow, so maybe a standard Gaussian provides better coverage (always better to conduct prior predictive checks). Just by changing that parameter you can recover the value of the central tendency of simulated data:
import numpy as np
import arviz as az
import pymc as pm
np.random.seed(27)
M = np.random.normal(12, 3, 160*32) #mean 12 and SD 3
school_idx = np.array([np.repeat(i, 32) for i in range(160)]).flatten()
schools = np.arange(160).astype("str")
coords = {"school": schools,"obs_id":np.arange(len(M))}
with pm.Model(coords=coords) as model:
school = pm.ConstantData("shool_idx", school_idx, dims="obs_id")
sigmas = pm.Uniform('sigmas', 0, 10, shape=2)
tau = pm.Deterministic('tau', sigmas**(-2))
mu = pm.Normal('mu', mu=0, sigma=1)
alphas = pm.Normal('alphas', mu=mu, sigma=tau[1]**0.5, dims='school')
math = pm.Normal('math', mu=alphas[school], sigma=tau[0]**0.5, observed=M, dims='obs_id')
with model:
idata = pm.sample(nuts_sampler="numpyro", random_seed=27)
az.summary(idata, var_names=['mu','sigmas','tau'])
Running chain 0: 100%|██████████| 2000/2000 [00:06<00:00, 325.59it/s]
Running chain 1: 100%|██████████| 2000/2000 [00:06<00:00, 325.70it/s]
Running chain 2: 100%|██████████| 2000/2000 [00:06<00:00, 325.86it/s]
Running chain 3: 100%|██████████| 2000/2000 [00:06<00:00, 326.02it/s]
Sampling time = 0:00:06.318887
Transforming variables...
Transformation time = 0:00:00.116303
Out[31]:
mean sd hdi_3% hdi_97% ... mcse_sd ess_bulk ess_tail r_hat
mu 11.997 0.045 11.916 12.079 ... 0.002 389.0 623.0 1.01
sigmas[0] 0.335 0.003 0.328 0.341 ... 0.000 5458.0 3010.0 1.00
sigmas[1] 7.631 1.544 5.048 9.997 ... 0.082 184.0 318.0 1.03
tau[0] 8.897 0.177 8.556 9.227 ... 0.002 5458.0 3010.0 1.00
tau[1] 0.020 0.010 0.010 0.039 ... 0.001 184.0 318.0 1.03
[5 rows x 9 columns]
May be also possible to remove the sigmas and assign tau directly as a HalfNormal? That may help to better recover the data total SD (if that’s desired):
with pm.Model(coords=coords) as model:
school = pm.ConstantData("shool_idx", school_idx, dims="obs_id")
tau = pm.HalfNormal('tau', 1, shape=2)
mu = pm.Normal('mu', mu=0, sigma=1)
alphas = pm.Normal('alphas', mu=mu, sigma=tau[1], dims='school')
math = pm.Normal('math', mu=alphas[school], sigma=tau[0], observed=M, dims='obs_id')
with model:
idata = pm.sample(nuts_sampler="numpyro", random_seed=27)
az.summary(idata, var_names=['mu','tau'])
Running chain 0: 100%|██████████| 2000/2000 [00:07<00:00, 277.07it/s]
Running chain 1: 100%|██████████| 2000/2000 [00:07<00:00, 277.15it/s]
Running chain 2: 100%|██████████| 2000/2000 [00:07<00:00, 277.26it/s]
Running chain 3: 100%|██████████| 2000/2000 [00:07<00:00, 277.42it/s]
Sampling time = 0:00:07.397331
Transforming variables...
Transformation time = 0:00:00.078383
Out[34]:
mean sd hdi_3% hdi_97% ... mcse_sd ess_bulk ess_tail r_hat
mu 11.999 0.047 11.908 12.085 ... 0.003 134.0 401.0 1.04
tau[0] 2.982 0.030 2.922 3.035 ... 0.001 1507.0 1982.0 1.00
tau[1] 0.115 0.059 0.025 0.216 ... 0.012 12.0 22.0 1.30
[3 rows x 9 columns]
Hope this makes sense and helps a bit.