What is the best way to define a multivariate regression in the context of hierarchical Bayesian models?

I am new to the world of PyMC3, so apologies if my wording is not 100% correct.

I am trying to build a hierarchical Bayesian model with many predictors.

I am able to get a working version by adapting the syntax of this tutorial.

However, the code becomes a bit wordy, since I need to define the parameters for 20+ predictors (the snippet below uses only a handful of predictors, but you get the point).

with pm.Model() as hbm1:
    # Hyperpriors for group nodes
    mu_a = pm.Normal("mu_a", mu=0.0, sigma=100)
    sigma_a = pm.HalfNormal("sigma_a", 5.0)
    mu_b = pm.Normal("mu_b", mu=0.0, sigma=100)
    sigma_b = pm.HalfNormal("sigma_b", 5.0)
    mu_c = pm.Normal("mu_c", mu=0.0, sigma=100)
    sigma_c = pm.HalfNormal("sigma_c", 5.0)
    mu_d = pm.Normal("mu_d", mu=0.0, sigma=100)
    sigma_d = pm.HalfNormal("sigma_d", 5.0)
    mu_e = pm.Normal("mu_e", mu=0.0, sigma=100)
    sigma_e = pm.HalfNormal("sigma_e", 5.0)

    a = pm.Normal("a", mu=mu_a, sigma=sigma_a, shape=n_groups)
    b = pm.Normal("b", mu=mu_b, sigma=sigma_b, shape=n_groups)
    c = pm.Normal("c", mu=mu_c, sigma=sigma_c, shape=n_groups)
    d = pm.Normal("d", mu=mu_d, sigma=sigma_d, shape=n_groups)
    e = pm.Normal("e", mu=mu_e, sigma=sigma_e, shape=n_groups)

    # Model error
    eps = pm.HalfCauchy("eps", 5.0)

    y_est = (a[group_id] +
             b[group_id] * df["0"].values +
             c[group_id] * df["1"].values +
             d[group_id] * df["2"].values +
             e[group_id] * df["3"].values)


    # Data likelihood
    y_like = pm.Normal("yoy_like", mu=y_est, sigma=eps, observed=df["target"])

with hbm1:
    hbm1_trace = pm.sample(2000, tune=2000, target_accept=0.9)

Instead of defining the priors for each parameter individually, I tried to wrap everything in a single array with one extra dimension. See the example below:

X = df[["cst", *predictors[:n_predictors]]].values
with pm.Model() as hbm2:
    # Hyperpriors for group nodes
    mu = pm.Normal("mu", mu=0.0, sigma=100)
    sigma = pm.HalfNormal("sigma", 5.0)

    beta = pm.Normal("beta", mu=mu, sigma=sigma, shape=(n_predictors + 1, n_groups))

    # Model error
    eps = pm.HalfCauchy("eps", 5.0)

    y_est = (X * beta[:, group_id].T).sum(axis=-1)

    # Data likelihood
    y_like = pm.Normal("yoy_like", mu=y_est, sigma=eps, observed=df["target"])
    
with hbm2:
    hbm2_trace = pm.sample(2000, tune=2000, target_accept=0.9)

I now have a couple of questions:

  • Do the first and the second syntax produce exactly the same model, or am I doing something wrong?
  • What would be the equivalent of the command pm.traceplot(hbm1_trace, ["a", "b", "c"]) for the second model? I would like to slice my beta and show the distributions for the individual predictors, but I can't figure out how to do that properly.

Hi,
Regarding the first question: no, it is not the same model. In the second version all the betas share a single hyperprior, so they will be correlated a priori through the common mu and sigma. Besides, in the second model you only have one “sigma” instead of several.
If the first model is too “wordy”, why don't you use a shape parameter for the mus and sigmas? See the sketch below.
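Something along these lines (an untested sketch, reusing X, group_id, n_predictors, and n_groups from your second snippet):

with pm.Model() as hbm1b:
    # One hyperprior entry per coefficient instead of a scalar pair each
    mu = pm.Normal("mu", mu=0.0, sigma=100, shape=n_predictors + 1)
    sigma = pm.HalfNormal("sigma", 5.0, shape=n_predictors + 1)

    # Row i of beta gets its own hyperpriors mu[i], sigma[i];
    # the [:, None] adds an axis so they broadcast over groups
    beta = pm.Normal("beta", mu=mu[:, None], sigma=sigma[:, None],
                     shape=(n_predictors + 1, n_groups))

    # Model error
    eps = pm.HalfCauchy("eps", 5.0)

    y_est = (X * beta[:, group_id].T).sum(axis=-1)

    # Data likelihood
    y_like = pm.Normal("y_like", mu=y_est, sigma=eps, observed=df["target"])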

Regarding the second question, I don't know the exact syntax off the top of my head, but you will have more possibilities by using ArviZ instead of pm.traceplot. Something like the sketch below might work.
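For example (a sketch; "beta_dim_0" is the default dimension name ArviZ assigns to beta's first axis when no explicit dims are declared, and recent PyMC3 versions delegate pm.traceplot to az.plot_trace anyway):

import arviz as az

# Convert the PyMC3 trace to InferenceData, then slice beta along its
# first (predictor) dimension to plot only selected coefficients
idata = az.from_pymc3(trace=hbm2_trace, model=hbm2)
az.plot_trace(idata, var_names=["beta"], coords={"beta_dim_0": [0, 1, 2]})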

@greghor you could try Bambi to build hierarchical models on top of PyMC3 using a formula interface like the one in R. There are several examples in the Examples section of the documentation; a rough sketch of what your model could look like is below.
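For instance, a hypothetical sketch (assuming your predictor columns are named x0 through x3 and your group column is named group; the exact formula API depends on your Bambi version):

import bambi as bmb

# Common slopes plus group-varying intercept and slopes, roughly
# analogous to hbm1/hbm2 (Bambi parameterizes group effects as
# deviations around the common effects)
model = bmb.Model("target ~ x0 + x1 + x2 + x3 + (1 + x0 + x1 + x2 + x3 | group)", df)
results = model.fit(draws=2000, tune=2000, target_accept=0.9)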
Also, feel free to ping me if you have any questions about how it works.

Hey Thomas, thanks for your reply.
Fair point, I shall use a shape parameter for the mus and sigmas as well!

Thank you for the suggestion. It looks super interesting; too bad I didn't find this earlier. Love the name :deer:
