Hi,
I read everything rather quickly, please tell me if I miss something. You are using a dataset with only two groups and using two parameters to estimate it. If you want to pool towards the mean I would suggest that you only specify a global mean with a larger sd. You can see that in this way you achieve the pooled effect.
with pm.Model() as hierarchical_model:
sigma_hierarchical = pm.HalfCauchy("sigma_hierarchical", beta=5, shape=n_sites)
# The step that makes it hierarchical
# We only assume beta is linked to the global
mu_global = pm.Normal("mu_global", mu=0, sd=5)
#sd_global = pm.HalfCauchy("sd_global", beta=0.1)
beta_hierarchical = pm.Normal(
"beta_hierarchical", mu=mu_global, sd=0.5, shape=n_sites
)
# Define likelihood
likelihood = pm.Normal(
"y_hierarchical",
mu=beta_hierarchical[site_indices] * data["x"],
sigma=sigma_hierarchical[site_indices],
observed=data["y"],
)
# Inference!
trace_hierarchical = pm.sample(draws=3000)
posterior_beta_hierarchical = trace_hierarchical.get_values(
"beta_hierarchical", burn=1000, chains=[1]
)