Hi,
I’m trying to build a multivariate hierarchical linear model, but I’m having trouble finding resources on how to do so. I can find examples of hierarchical univariate models and non-hierarchical multivariate models, but not the two together.
Fundamentally, I’m trying to build the following linear model (one per category) with shared hyperpriors between categories:
y_est = a[category] + b[category]*X
where category takes values in [0, 1, 2, 3] and X is an input matrix of shape num_training_samples by num_pred, with num_pred = 10 covariates.
I’ve adapted the scikit-learn-wrapped linear model implementation at https://github.com/parsing-science/pymc3_models to perform this regression.
In the code below (not the full code, just the part that constructs the model), the variables are:
num_pred - number of predictors or covariates (10)
num_cats - number of categories for which to compute different coefficients (4)
model_cats - vector giving the category of each training sample, e.g. [0, 1, 2, 3, 0, 1, 2, 3 … etc]
import numpy as np
import pymc3 as pm
import theano
import theano.tensor as T

# shared containers so the real data can be swapped in before sampling
model_input = theano.shared(np.zeros([num_training_samples, num_pred]))
model_output = theano.shared(np.zeros(num_training_samples, dtype='float'))
model_cats = theano.shared(np.zeros(num_training_samples, dtype='int'))

model = pm.Model()
with model:
    # Hyperpriors for group nodes
    mu_alpha = pm.Normal('mu_alpha', mu=0, sd=100)
    mu_beta = pm.Normal('mu_beta', mu=0, sd=100)
    sigma_alpha = pm.HalfCauchy('sigma_alpha', 5)
    sigma_beta = pm.HalfCauchy('sigma_beta', 5)

    # model error
    eps = pm.HalfCauchy('eps', 5)

    # one alpha per category, one row of betas per category
    alpha = pm.Normal('alpha', mu=mu_alpha, sd=sigma_alpha, shape=(num_cats,))
    betas = pm.Normal('beta', mu=mu_beta, sd=sigma_beta, shape=(num_cats, num_pred))

    # pick the intercept and coefficient row for each sample's category
    c = model_cats
    test_est = alpha[c] + T.sum(betas[c] * model_input, 1)

    y_est = pm.Normal('y_est', test_est, sd=eps, observed=model_output)
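For context, this is roughly how I feed data into the shared variables and sample (illustrative only; X_train, y_train and cats_train stand in for my actual training arrays, and the sampler settings are just placeholders):

# illustrative: swap the real training data into the shared containers
model_input.set_value(X_train.astype('float64'))
model_output.set_value(y_train.astype('float64'))
model_cats.set_value(cats_train.astype('int64'))

with model:
    trace = pm.sample(2000, tune=1000)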
So I am getting one alpha per category, and all four alphas share the same hyperpriors.
I don’t believe my beta hyperpriors are quite right, though. Since each beta corresponds to a different covariate, it seems like a mistake to use the same hyperpriors for all of them. I would like my beta hyperpriors to be shared between categories, but not between covariates.
I’ve made the following change to the shapes to accomplish this:

    mu_beta = pm.Normal('mu_beta', mu=0, sd=100, shape=(1, num_pred))
    sigma_beta = pm.HalfCauchy('sigma_beta', 5, shape=(1, num_pred))
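My understanding is that this relies on NumPy-style broadcasting: the (1, num_pred) hyperpriors broadcast row-wise against the (num_cats, num_pred) betas, so all categories share one hyperprior per covariate. A quick shape check outside the model (purely illustrative):

import numpy as np

num_cats, num_pred = 4, 10
mu_beta = np.zeros((1, num_pred))       # one hyperprior mean per covariate
betas = np.zeros((num_cats, num_pred))  # one row of coefficients per category

# each column of betas lines up with the matching hyperprior column
print((betas + mu_beta).shape)  # (4, 10)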
The model seems to build and inference runs, but I am having trouble getting it to converge.
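For what it’s worth, I’m judging convergence with the usual diagnostics on the resulting trace (illustrative; I mainly look at Rhat, effective sample size, and the traceplots):

print(pm.summary(trace))  # Rhat and effective sample size per variable
pm.traceplot(trace)       # visual check that the chains mix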
Is my concept reasonable?
Is my modification to the code above the correct way to accomplish my goal?