I am new to the world of PyMC3, so apologies if my wording is not 100% correct.
I am trying to build a hierarchical Bayesian model with many predictors.
I am able to get a working version by adapting the syntax of this tutorial.
However, the code becomes a bit wordy, since I need to define the hyperpriors and parameters for 20+ predictors (the snippet below only uses an intercept and four predictors, but you get the point):
import pymc3 as pm

with pm.Model() as hbm1:
    # Hyperpriors for group nodes
    mu_a = pm.Normal("mu_a", mu=0.0, sigma=100)
    sigma_a = pm.HalfNormal("sigma_a", 5.0)
    mu_b = pm.Normal("mu_b", mu=0.0, sigma=100)
    sigma_b = pm.HalfNormal("sigma_b", 5.0)
    mu_c = pm.Normal("mu_c", mu=0.0, sigma=100)
    sigma_c = pm.HalfNormal("sigma_c", 5.0)
    mu_d = pm.Normal("mu_d", mu=0.0, sigma=100)
    sigma_d = pm.HalfNormal("sigma_d", 5.0)
    mu_e = pm.Normal("mu_e", mu=0.0, sigma=100)
    sigma_e = pm.HalfNormal("sigma_e", 5.0)
    # Group-level coefficients (one value per group)
    a = pm.Normal("a", mu=mu_a, sigma=sigma_a, shape=n_groups)
    b = pm.Normal("b", mu=mu_b, sigma=sigma_b, shape=n_groups)
    c = pm.Normal("c", mu=mu_c, sigma=sigma_c, shape=n_groups)
    d = pm.Normal("d", mu=mu_d, sigma=sigma_d, shape=n_groups)
    e = pm.Normal("e", mu=mu_e, sigma=sigma_e, shape=n_groups)
    # Model error
    eps = pm.HalfCauchy("eps", 5.0)
    y_est = (a[group_id] +
             b[group_id] * df["0"].values +
             c[group_id] * df["1"].values +
             d[group_id] * df["2"].values +
             e[group_id] * df["3"].values)
    # Data likelihood
    y_like = pm.Normal("yoy_like", mu=y_est, sigma=eps, observed=df["target"])

with hbm1:
    hbm1_trace = pm.sample(2000, tune=2000, target_accept=0.9)
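Just to make the repetition explicit: I could of course generate the per-predictor hyperpriors in a plain Python loop, something like the rough sketch below (the names are made up for illustration), but that still leaves me with 20+ separately named variables to keep track of:

coef_names = ["a", "b", "c", "d", "e"]
with pm.Model() as hbm1_loop:
    coefs = {}
    for name in coef_names:
        # one pair of hyperpriors and one vector of group-level coefficients per predictor
        mu_k = pm.Normal(f"mu_{name}", mu=0.0, sigma=100)
        sigma_k = pm.HalfNormal(f"sigma_{name}", 5.0)
        coefs[name] = pm.Normal(name, mu=mu_k, sigma=sigma_k, shape=n_groups)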
Instead of defining the priors for each parameter individually, I tried to wrap everything in one array with an extra dimension. See the example below:
X = df[["cst", *predictors[:n_predictors]]].values

with pm.Model() as hbm2:
    # Shared hyperpriors for all coefficients
    mu = pm.Normal("mu", mu=0.0, sigma=100)
    sigma = pm.HalfNormal("sigma", 5.0)
    # One coefficient per predictor (rows) and per group (columns)
    beta = pm.Normal("beta", mu=mu, sigma=sigma, shape=(n_predictors + 1, n_groups))
    # Model error
    eps = pm.HalfCauchy("eps", 5.0)
    y_est = (X * beta[:, group_id].T).sum(axis=-1)
    # Data likelihood
    y_like = pm.Normal("yoy_like", mu=y_est, sigma=eps, observed=df["target"])

with hbm2:
    hbm2_trace = pm.sample(2000, tune=2000, target_accept=0.9)
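For reference, this is how I understand the broadcasting in the y_est line, written out as a plain NumPy shape check on dummy data (the numbers are made up; this is only meant to illustrate the shapes):

import numpy as np

n_obs, n_groups, n_predictors = 8, 3, 4
X_dummy = np.random.randn(n_obs, n_predictors + 1)           # intercept column + predictors
beta_dummy = np.random.randn(n_predictors + 1, n_groups)     # one column of coefficients per group
group_id_dummy = np.random.randint(0, n_groups, size=n_obs)  # group index for every observation

# beta_dummy[:, group_id_dummy] has shape (n_predictors + 1, n_obs); transposing it
# aligns it with X_dummy, so the elementwise product summed over the last axis
# gives one fitted value per observation.
y_est_dummy = (X_dummy * beta_dummy[:, group_id_dummy].T).sum(axis=-1)
print(y_est_dummy.shape)  # (n_obs,)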
I now have a couple of questions:
- Do the first and the second syntax produce exactly the same model, or am I doing something wrong?
- What would be the equivalent of the command
  pm.traceplot(hbm1_trace, ["a", "b", "c"])
  for the second model? I would like to slice my beta and show the distributions for the individual predictors, but I can't figure out how to do that properly (I sketch below what I imagine this could look like).
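For the second question, I imagine the answer involves converting the trace with ArviZ and selecting along the predictor dimension, roughly like the sketch below. I am assuming ArviZ's default dimension names here (beta_dim_0 for the predictor axis, beta_dim_1 for the group axis), so this may well not be the idiomatic way:

import arviz as az

# Convert the MultiTrace to InferenceData so that beta keeps named dimensions.
idata = az.from_pymc3(hbm2_trace, model=hbm2)

# Plot only the first three rows of beta (intercept plus the first two predictors),
# keeping all groups.
az.plot_trace(idata, var_names=["beta"], coords={"beta_dim_0": [0, 1, 2]})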