Creating a multivariable regressor with multiple groups

stickykeys · November 6, 2018, 2:52am

I’m trying to create a multivariable linear regression model with multiple groups. The closest I could find online was this resource from one of the authors:

http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/

Although this one creates multiple groups there is only a single metric predictor. I think the code below creates what I want, but it seems kind of contrived since I end up using list comprehension.

model_input_x = [theano.shared(X_train[:,i].reshape(-1,1)) for i in range(0,X_train.shape[1])]
model_input_cat = theano.shared(np.asarray(X_train[:,-1], dtype='int32').reshape(-1,1))
model_output = theano.shared(y_train)

with pm.Model() as hierarchical_model:
    mu_beta = pm.Normal('mu_beta', mu=0., sd=1, shape=(num_features))
    sigma_beta = pm.HalfCauchy('sigma_beta', beta=1, shape=(num_features))
   
    beta = [pm.Normal('beta_'+str(i), mu=mu_beta[i], sd=sigma_beta[i], shape=(num_hierarchy)) for i in range(0,num_features)]

    alpha = pm.Normal('alpha', mu=0, sd=1, shape=(1))

    epsilon = pm.HalfCauchy('eps', beta=1) # Model error prior
    
    y_est = alpha + sum([beta_i[model_input_cat]*model_input_x[i] for i,beta_i in enumerate(beta)])
    
    y_like = pm.Normal('y_like', mu=y_est, sd=epsilon, observed=model_output)

Is this the best way to build a hierarchical model where each feature/group combination has it’s own slope and slopes across groups for a given feature come from a common distribution?

I think this is the model I’m creating with this code (given 2 features). The pm.model_to_graphviz tool output is also shown

junpenglao · November 6, 2018, 6:07am

Generally these kind of question could be expressed as a mixed effect model. You can have a look at this post: Constructing the random effects model matrix Z. @Jack_Caster’s repository https://github.com/JackCaster/GLM_with_PyMC3 is pretty great.

stickykeys · November 6, 2018, 7:08pm

I looked through the sleep deprivation notebook, but do not understand the ‘random effects model matrix’. Is the point of this matrix to provide a way to split off the average or ‘fixed’ intercept from the within group intercept? (whereas in my model above there is no average slope only within group slopes). Would the mixed linear model equivalent look like this:

with pm.Model() as hierarchical_model:
    mu_m_slope = pm.Normal('mu_m_slope', mu=0., sd=1, shape=(num_features))
    sigma_m_slope = pm.HalfCauchy('sigma_m_slope', beta=1, shape=(num_features))
    
    fixed_intercept = pm.Normal('f_intercept', mu=0, sd=1)
    fixed_slope = pm.Normal('f_slope', mu=0, sd=1, shape=(num_features))
    
    random_slope = [pm.Normal('m_slope_'+str(i), mu=mu_m_slope[i], sd=sigma_m_slope[i], shape = num_hierarchy) for i in range(0,num_features)]
    epsilon = pm.HalfCauchy('eps', beta=1) # Model error prior

    y_est = (fixed_intercept + fixed_slope*model_input_x + sum([beta_i[model_input_cat]*model_input_x[i] for i,beta_i in enumerate(random_slope)]))
    y_like = pm.Normal('y_like', mu=y_est, sd=epsilon, observed=model_output)

I’ve also been looking through what materials I can find online about linear mixed models, but still feel quite confused. Are there resources you could recommend for a beginner? I’ve read through most of “Doing Bayesian Analysis”.

junpenglao · November 7, 2018, 8:47am

Yes, recall that if you have individual slope for each group, it is the same as having a common intercept (or fixed effect, but dont mind too much of the terminology as the literature could be confusing, instead focus on the input matrix X) and the group “deviation” from the common average. It is all about parameterization.

Your code should work as is, but you dont need a nested for loop to construct your random slope, you can do:
random_slope = pm.Normal('m_slope_', mu=mu_m_slope, sd=sigma_m_slope, shape = (num_features, num_hierarchy))

stickykeys · November 7, 2018, 5:23pm

Okay, it’s making more sense. Thanks for the help!

I get the following error when trying to switch to random slope without the nested for loop:

ValueError: operands could not be broadcast together with shapes (2,4) (2,)

Which I assume is due to the difference in shape between random_slope (2,4) and mu_m_slope (2,).

If you don’t use the nested for loop, how does random_slope determine which element of mu_m_slope (shape=num_features), to associate with which random_slope element?

junpenglao · November 8, 2018, 6:48am

You can use a matrix multiplication, and then index to the output.

Topic		Replies	Views
Hierarchical Regression Model with Multiple Predictors Questions	1	798	June 21, 2018
Multivariate hierarchical linear model with different hyperpriors Questions	2	715	November 2, 2018
Hierarchical multiple regression with multiple level groups v5 modeling	5	1083	June 23, 2023
Making a vector of variables from a list Questions	5	1218	December 28, 2017
Incorporating multiple variables in Hierarchial model v5 hierarchical	0	482	December 30, 2022

Creating a multivariable regressor with multiple groups

Related topics