Multivariate hierarchical linear model with different hyperpriors

Hi,

I’m trying to build a multivariate hierarchical linear model, but I’m having trouble finding resources on how to do so. I can find examples of hierarchical univariate models and non-hierarchical multivariate models, but not the two together.

Fundamentally, I’m trying to build the following linear model (one per category) with shared hyperpriors between categories:

y_est = a[category] + b[category]*X

where category is an element of {0, 1, 2, 3} and X is a num_training_samples × num_pred (10) matrix of covariates.
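
Written out per training sample i, that's:

y_est[i] = a[c[i]] + sum_j b[c[i], j] * X[i, j]

where c[i] is the category of sample i, so the intercept and all 10 slopes vary by category.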

I’ve adapted the scikit-learn-style linear model wrapper at https://github.com/parsing-science/pymc3_models to perform this regression.

In the code below (not the full code, just the part that constructs the model), the variables mean the following:

num_pred - number of predictors or covariates (10)
num_cats - number of categories for which to compute different coefficients (4)
model_cats - vector representing the category for each training sample [0, 1, 2, 3, 0, 1, 2, 3 … etc]

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as T

model_input  = theano.shared(np.zeros([num_training_samples, num_pred]))
model_output = theano.shared(np.zeros(num_training_samples, dtype='float'))
model_cats   = theano.shared(np.zeros(num_training_samples, dtype='int'))

model = pm.Model()

with model:
    # Hyperpriors for group nodes
    mu_alpha    = pm.Normal('mu_alpha', mu=0, sd=100)
    mu_beta     = pm.Normal('mu_beta', mu=0,  sd=100)
    sigma_alpha = pm.HalfCauchy('sigma_alpha', 5)
    sigma_beta  = pm.HalfCauchy('sigma_beta',  5)

    # model error
    eps = pm.HalfCauchy('eps', 5)
    
    # alpha and betas
    alpha = pm.Normal('alpha', mu=mu_alpha, sd=sigma_alpha, shape=(num_cats,))
    betas = pm.Normal('beta',  mu=mu_beta,  sd=sigma_beta,  shape=(num_cats, num_pred))

    # select each sample's intercept and slopes by category;
    # row-wise dot product of the selected slopes with the covariates
    c = model_cats
    test_est = alpha[c] + T.sum(betas[c] * model_input, 1)
    y_est = pm.Normal('y_est', mu=test_est, sd=eps, observed=model_output)
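
For completeness, the wrapper later swaps real data into the shared variables before sampling; a rough sketch of that step, with X_train, y_train, and cats_train as hypothetical names:

model_input.set_value(X_train)
model_output.set_value(y_train)
model_cats.set_value(cats_train.astype('int'))

with model:
    trace = pm.sample(2000, tune=1000)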

So I am getting one alpha per category, and all 4 alphas share the same hyperprior.

I don’t believe my beta hyperpriors are quite right. Since each beta represents a different covariate, it seems like a mistake to use the same hyperpriors for all of them. I would like my beta hyperpriors to be shared between categories, but not between different covariates.

I’ve made the following change to the shapes to accomplish this:

    mu_beta     = pm.Normal('mu_beta', mu=0, sd=100, shape=(1, num_pred))
    sigma_beta  = pm.HalfCauchy('sigma_beta', 5, shape=(1, num_pred))
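
As a sanity check on the intended broadcast (plain NumPy, just to show the shapes line up with the (num_cats, num_pred) betas):

import numpy as np

num_cats, num_pred = 4, 10
mu_beta_val = np.zeros((1, num_pred))       # one hyperprior mean per covariate
draws = mu_beta_val + np.random.randn(num_cats, num_pred)
assert draws.shape == (num_cats, num_pred)  # every category row sees the same per-covariate means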

The model builds and inference runs, but I am having trouble getting it to converge.

Is my concept reasonable?
Is my modification to the code above the correct way to accomplish my goal?

Your model might be over-specified, which makes inference difficult - I suggest you try more informative priors, maybe starting with sd=10 for the Normal hyperpriors and a HalfNormal with sd=2.5 in place of the HalfCauchys.
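
Something like this, as a sketch against the model above (whether sd=2.5 is tight enough depends on the scale of your data):

    mu_alpha    = pm.Normal('mu_alpha', mu=0, sd=10)
    mu_beta     = pm.Normal('mu_beta', mu=0, sd=10, shape=(1, num_pred))
    sigma_alpha = pm.HalfNormal('sigma_alpha', sd=2.5)
    sigma_beta  = pm.HalfNormal('sigma_beta', sd=2.5, shape=(1, num_pred))
    eps         = pm.HalfNormal('eps', sd=2.5)  # assuming the advice applies to the noise term too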

Thanks for the response! I’ll try running that over the weekend to see how the inference works.

In the meantime, I feel a bit shaky about what the appropriate shapes of mu_beta and sigma_beta ought to be.

Does my intuition of making them num_pred-dimensional (one prior per covariate) make sense, or should I just keep a single scalar prior for all the covariates in the model specification?
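
To make the question concrete, these are the two variants I’m weighing (a sketch; only one can live in the model at a time, since PyMC3 won’t allow duplicate variable names):

    # Option A: one hyperprior per covariate, shared across categories
    mu_beta     = pm.Normal('mu_beta', mu=0, sd=100, shape=(1, num_pred))
    sigma_beta  = pm.HalfCauchy('sigma_beta', 5, shape=(1, num_pred))

    # Option B: a single scalar hyperprior shared by all covariates and categories
    mu_beta     = pm.Normal('mu_beta', mu=0, sd=100)
    sigma_beta  = pm.HalfCauchy('sigma_beta', 5)

    # in both cases the betas themselves stay per-category, per-covariate:
    betas = pm.Normal('beta', mu=mu_beta, sd=sigma_beta, shape=(num_cats, num_pred))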