Creating a multivariable regressor with multiple groups


#1

I’m trying to create a multivariable linear regression model with multiple groups. The closest I could find online was this resource from one of the authors:

http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/

Although this one creates multiple groups there is only a single metric predictor. I think the code below creates what I want, but it seems kind of contrived since I end up using list comprehension.

model_input_x = [theano.shared(X_train[:,i].reshape(-1,1)) for i in range(0,X_train.shape[1])]
model_input_cat = theano.shared(np.asarray(X_train[:,-1], dtype='int32').reshape(-1,1))
model_output = theano.shared(y_train)

with pm.Model() as hierarchical_model:
    mu_beta = pm.Normal('mu_beta', mu=0., sd=1, shape=(num_features))
    sigma_beta = pm.HalfCauchy('sigma_beta', beta=1, shape=(num_features))
   
    beta = [pm.Normal('beta_'+str(i), mu=mu_beta[i], sd=sigma_beta[i], shape=(num_hierarchy)) for i in range(0,num_features)]

    alpha = pm.Normal('alpha', mu=0, sd=1, shape=(1))

    epsilon = pm.HalfCauchy('eps', beta=1) # Model error prior
    
    y_est = alpha + sum([beta_i[model_input_cat]*model_input_x[i] for i,beta_i in enumerate(beta)])
    
    y_like = pm.Normal('y_like', mu=y_est, sd=epsilon, observed=model_output)

Is this the best way to build a hierarchical model where each feature/group combination has it’s own slope and slopes across groups for a given feature come from a common distribution?

I think this is the model I’m creating with this code (given 2 features). The pm.model_to_graphviz tool output is also shown


#2

Generally these kind of question could be expressed as a mixed effect model. You can have a look at this post: Constructing the random effects model matrix Z. @Jack_Caster’s repository https://github.com/JackCaster/GLM_with_PyMC3 is pretty great.


#3

I looked through the sleep deprivation notebook, but do not understand the ‘random effects model matrix’. Is the point of this matrix to provide a way to split off the average or ‘fixed’ intercept from the within group intercept? (whereas in my model above there is no average slope only within group slopes). Would the mixed linear model equivalent look like this:

with pm.Model() as hierarchical_model:
    mu_m_slope = pm.Normal('mu_m_slope', mu=0., sd=1, shape=(num_features))
    sigma_m_slope = pm.HalfCauchy('sigma_m_slope', beta=1, shape=(num_features))
    
    fixed_intercept = pm.Normal('f_intercept', mu=0, sd=1)
    fixed_slope = pm.Normal('f_slope', mu=0, sd=1, shape=(num_features))
    
    random_slope = [pm.Normal('m_slope_'+str(i), mu=mu_m_slope[i], sd=sigma_m_slope[i], shape = num_hierarchy) for i in range(0,num_features)]
    epsilon = pm.HalfCauchy('eps', beta=1) # Model error prior

    y_est = (fixed_intercept + fixed_slope*model_input_x + sum([beta_i[model_input_cat]*model_input_x[i] for i,beta_i in enumerate(random_slope)]))
    y_like = pm.Normal('y_like', mu=y_est, sd=epsilon, observed=model_output)

I’ve also been looking through what materials I can find online about linear mixed models, but still feel quite confused. Are there resources you could recommend for a beginner? I’ve read through most of “Doing Bayesian Analysis”.


#4

Yes, recall that if you have individual slope for each group, it is the same as having a common intercept (or fixed effect, but dont mind too much of the terminology as the literature could be confusing, instead focus on the input matrix X) and the group “deviation” from the common average. It is all about parameterization.

Your code should work as is, but you dont need a nested for loop to construct your random slope, you can do:
random_slope = pm.Normal('m_slope_', mu=mu_m_slope, sd=sigma_m_slope, shape = (num_features, num_hierarchy))


#5

Okay, it’s making more sense. Thanks for the help!

I get the following error when trying to switch to random slope without the nested for loop:

ValueError: operands could not be broadcast together with shapes (2,4) (2,)

Which I assume is due to the difference in shape between random_slope (2,4) and mu_m_slope (2,).

If you don’t use the nested for loop, how does random_slope determine which element of mu_m_slope (shape=num_features), to associate with which random_slope element?


#6

You can use a matrix multiplication, and then index to the output.