Specify covariance between intercepts and slopes in a 'no pooling' model


I’m fitting a linear model where I have pressure and depth measurements for 4 groups. I’m estimating a separate slope for each group as well as separate group means.

In a hierarchical model, this would be akin to fitting a varying-intercept, varying-slope model and, if needed, allowing for covariance between the two. This can be expressed in the form shown here: https://stackoverflow.com/questions/39364919/pymc3-how-to-model-correlated-intercept-and-slope-in-multilevel-linear-regressi.

However, in this case I’m actually not trying to fit a hierarchical model. That is, I’m fitting a ‘no pooling’ model, as Gelman and Hill refer to it in their 2007 book on multilevel models.

What I do want is to still be able to specify that the group means and slopes are drawn from a multivariate normal distribution with some prior vector of means and a vector of prior standard deviations (not hyperpriors).

For instance, in the radon example from Gelman and Hill (2007), if instead of fitting a hierarchical/partial-pooling model I wanted to fit a no-pooling model, I’d do something like this:

with pm.Model() as m1:
    a = pm.Normal('a', mu=0, sd=1, shape=n_counties)
    b = pm.Normal('b', mu=0, sd=1, shape=n_counties)

    # Model error
    eps = pm.HalfCauchy('eps', 5)

    radon_est = a[county_idx] + b[county_idx] * data.floor.values

    # Data likelihood
    radon_like = pm.Normal('radon_like', mu=radon_est, sd=eps, observed=data.log_radon)
    m1_trace = pm.sample(2000)

Instead, I’d now like the parameters a and b to be drawn from a multivariate normal distribution, with a vector of means and a vector of standard deviations along with the appropriate LKJ correlation matrix.

How can I incorporate that into the above model?

Thank you!

In the terminology of the Radon example, using the same prior distribution for your parameters across counties is indeed going to induce cross-county pooling unless you have a fixed mean and standard deviation known ahead of time.

Could you clarify a bit more what you mean by no pooling? My interpretation of “no pooling” means that a and b are going to be independent draws from a distribution with hyperparameters that are fixed, not estimated.

Thanks for taking a look!

I think what I’m trying to achieve is a fixed-effects model where each group gets its own slope (so an interaction between the groups and the continuous variable). Here the group means are the parameter a and the group slopes are b, but I want to explicitly specify that these two come from a joint multivariate normal distribution and model the correlation between them.

Okay, then you’ll just need to follow the instructions here: https://docs.pymc.io/notebooks/LKJ.html

At some stage, you’ll need to do something like this:

a = coefs[:,0]
b = coefs[:,1]

to take an Nx2 array and split it into two Nx1 vectors of parameters. Does that make sense?
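To make that concrete, here is a small numpy-only sketch (illustrative values, not the PyMC3 model itself) of both pieces: assembling the 2x2 covariance from a vector of standard deviations and a correlation matrix, which is what the LKJ machinery parameterizes, and splitting an Nx2 draw into the two parameter vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scales and correlation for (intercept, slope)
sds = np.array([10.0, 1.0])    # standard deviations of a and b
rho = 0.5                      # correlation between a and b
R = np.array([[1.0, rho],
              [rho, 1.0]])     # correlation matrix (what the LKJ prior is over)

# Covariance = diag(sds) @ R @ diag(sds)
Cov = np.diag(sds) @ R @ np.diag(sds)

# Draw N (intercept, slope) pairs, then split the Nx2 array into two vectors
N = 10
coefs = rng.multivariate_normal(mean=np.zeros(2), cov=Cov, size=N)
a = coefs[:, 0]   # N group intercepts
b = coefs[:, 1]   # N group slopes
```

In the PyMC3 version, the same split applies to the draw from pm.MvNormal; the covariance (or its Cholesky factor) is just built from random variables instead of fixed numpy values.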

That seems like what I need to implement for this case - thank you so much!

Hi @ckrapu - I used the reference above and some examples from the Bayesian Analysis with Python book by @aloctavodia to build a covariance matrix and then specify the intercepts and slopes as being drawn from a multivariate normal distribution.

I’m getting an error that indicates something is not right with the shapes: IndexError: index 10 is out of bounds for axis 0 with size 10

I’ve recreated an example below - appreciate any insight. Thank you!

import pandas as pd
import numpy as np
import pymc3 as pm

from theano import tensor as tt

## create dependent and independent continuous data
x = np.random.normal(0, 1, size=100)
y = np.random.normal(1, 2, size=100)

## create 10 groups
ids = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
group_ids = np.repeat(ids, 10)
group_ids = pd.Categorical(group_ids)

with pm.Model() as cov_m1:
    sigma = pm.StudentT('sigma', nu=3, mu=0, sd=0.5)
    ab_mu = np.array([0, 0.3])
    sd_a = 10
    sd_b = 1
    p = pm.Uniform('p', -1., 1.)
    ## build the 2x2 covariance matrix from the two fixed scales and the correlation p
    Cov = pm.math.stack(([sd_a**2, sd_a*sd_b*p], [sd_a*sd_b*p, sd_b**2]))
    ## specify as drawn from a multivariate normal with shape (10, 2) -- 10 groups
    ab = pm.MvNormal('ab', mu=ab_mu, cov=Cov, shape=(10, 2))
    mu = ab[:, 0][group_ids] + ab[:, 1][group_ids] * x
    ## name the likelihood y_obs to avoid shadowing the data array y
    y_obs = pm.Normal('y_obs', mu=mu, sd=sigma, observed=y)

Python indexing starts at 0! I changed the group_ids to start from 0 instead of 1 and it works.

However, the sampler doesn’t: I get a bad initial energy error when I run it. Is this parametrization really that inefficient compared with an LKJ prior?
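One thing worth checking before blaming the parametrization (my guess at the cause, not something the traceback proves): the StudentT prior on sigma has support on the whole real line, so roughly half of its mass is on negative values, which pm.Normal cannot accept as a sd and which commonly triggers bad initial energy. A quick numpy check of the sign issue:

```python
import numpy as np

rng = np.random.default_rng(1)

# Draws from a Student-t centered at 0, matching the shape of the sigma prior
# in the model above: scale 0.5, nu = 3 degrees of freedom
draws = 0.5 * rng.standard_t(df=3, size=10_000)

# Fraction of draws that would be invalid as a standard deviation
frac_negative = np.mean(draws < 0)   # close to 0.5 by symmetry
```

If that is the culprit, a positive-support prior such as pm.HalfStudentT('sigma', nu=3, sd=0.5) should help. The endpoints p = -1 and p = 1 of the uniform correlation prior also make the stacked covariance singular, which the LKJ/Cholesky parametrization sidesteps.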