I am fitting a multi level model for linear regression where I would like to use multiple columns. I am using the code from this tutorial with a variable intercept and variable slope per county:
with Model() as varying_intercept_slope:
mu_a = Normal('mu_a', mu=0., tau=0.0001)
sigma_a = Uniform('sigma_a', lower=0, upper=100)
tau_a = sigma_a**-2
mu_b = Normal('mu_b', mu=0., tau=0.0001)
sigma_b = Uniform('sigma_b', lower=0, upper=100)
tau_b = sigma_b**-2
# Random intercepts
a = Normal('a', mu=mu_a, tau=tau_a, shape=len(set(county)))
# Random slopes
b = Normal('b', mu=mu_b, tau=tau_b, shape=len(set(county)))
# Model error
sigma_y = Uniform('sigma_y', lower=0, upper=100)
tau_y = sigma_y**-2
# Expected value
y_hat = a[county] + b[county] * floor_measure
# Data likelihood
y_like = Normal('y_like', mu=y_hat, tau=tau_y, observed=log_radon)
How could we use more than one predictor for the vector b? Right now the model is only using floor_measure, but what if I want to use other predictors (p predictors)? Is there a way to use the shape argument to be able to specify priors for each of them and still use the multi-level per county slope?
You can create a Matrix from your conditions and do appropriate matrix multiplication for this (not using indexing). For example, see some example here https://github.com/junpenglao/GLMM-in-Python/blob/master/pymc3_different_sampling.py#L16-L53
Thank you for your reply! Found this library the does pretty much what I wanted to do: https://github.com/bambinos/bambi
This allows me to specify a glm formula and specify multi-level terms.
Let’s say I have a hiearchical model with variable intercepts and variable slopes per group (in my case per store). For the intercepts I have simply one intercept per store all sharing one prior, this makes sense.
But then for slopes, I want to be able to define one slope per group per predictor. So a multi-dimensional random variable.
# Prior for slopes
mu_b = pm.Normal('mu_beta', mu=0, sd=.1)
sigma_b = pm.HalfNormal('sigma_beta', sd=.1)
# Intercept for each store, distributed around group mean mu_a
a = pm.Normal('intercept', mu=mu_a, sd=sigma_a, shape=len(uniqueStores))
# Slope for each store, distributed around group mean mu_b
b = pm.Normal('b', mu=mu_b, sd=sigma_b, shape=(len(uniqueStores), X.shape))
b has shape=(len(uniqueStores), X.shape).
Would it be possible to define a different prior for the different slopes for each predictor? Because right now if I understand correctly b only has one prior defined for all groups and all predictors slopes. This doesn’t really makes sense. Ideally we should have one prior per predictor partially pooling all groups for this predictor slope.
I could add each predictor slope random-variable with it’s pooled prior manually, but I would rather keep this vectorized. Is there any way of doing this?
If the different priors are from the same family of distribution (e.g., Normal distribution with different mu and std), you can specify the prior using a vector for mu and a vector for sd.
If you want different distributions as prior, you will need one RV per slope. You can do
b = pm.Deterministic('beta', tt.stack([beta1, beta2...])) to vectorized it.