How to build a multi-covariance model in pymc3?

Hey community,

I am trying to study the covariance between several variables coming from two(or more) different engines. Variables are temperature, vibration, humidity, etc. I built a simple covariance model using 2 variables of one engine using this tutorial ([LKJ](https://docs.pymc.io/notebooks/LKJ.html))

Currently, I want to use 2 engines and study the independent covariance of the variables. But, I’m stuck with not understanding the shape parameter. The error is given at the bottom. The idea is to eventually build a hierarchical model. But, I am not sure if my strategy is correct.

with pm.Model() as model:    
      p = pm.Dirichlet('p', a = np.ones(2), shape = (2, 2))
      category = pm.Categorical('category', p=p, shape= 2) 

     # Covariance matrix model for engine 1
     # LKJ prior for covariance matrix 
     packed_L1 = pm.LKJCholeskyCov('packed_L1', n=2, eta=2., sd_dist=pm.HalfCauchy.dist(6))    
     L1 = pm.expand_packed_triangular(2, packed_L1)    
     cov1 = pm.Deterministic('cov1', L1.dot(L1.T))


     # Covariance matrix model for engine 2
     # LKJ prior for covariance matrix 
     packed_L2 = pm.LKJCholeskyCov('packed_L2', n=2, eta=2., sd_dist=pm.HalfCauchy.dist(6))        
     L2 = pm.expand_packed_triangular(2, packed_L2)    
     cov2 = pm.Deterministic('cov2', L2.dot(L2.T))

     # stack covariance of model 1 and model 2
     cov_both = t.stack([cov1,cov2])

     # Let's assume each component has an independent mean
     mu = pm.Normal('mu', mu=0, sigma=6, shape=(2, 2)) 

     # f(x|mu, cov)
     components = pm.MvNormal('components', mu[category], cov_both[category], shape = (2, 2))      

     obs = pm.Mixture('obs', w=p,
                  comp_dists = components, 
                  observed = [df_bearing.iloc[:, k:k+2] for k in [0, 2]], # to get 2 variables as a df at a time
                   shape=(2, 2))
    
     n_samples = 1000

     step = pm.Metropolis()
     trace_ = pm.sample(n_samples, step, cores=1)

I get this error:
—> 53 components = pm.MvNormal(‘components’, np.array([0, 0]), np.array([[[1, 0], [0,1]], [[1,2], [2,1]]]), shape = 2)
54
55

3 frames
/usr/local/lib/python3.6/dist-packages/pymc3/distributions/multivariate.py in init(self, mu, cov, chol, tau, lower, *args, **kwargs)
59 cov = tt.as_tensor_variable(cov)
60 if cov.ndim != 2:
—> 61 raise ValueError(‘cov must be two dimensional.’)
62 self.chol_cov = cholesky(cov)
63 self.cov = cov

ValueError: cov must be two dimensional.

1 Like

Hi rahul.ai and community, a colleague and I am struggling with a similar issue. If anybody had idea that would be super helpful. All the best, Martin

Unfortunately, currently the multivariate distributions in PyMC3 are a bit incomplete as they does not support “batch” of multivariate random variable - so you will need to build the MvNormal in a for loop and then use pm.Mixture.

You can find some example here: Mixed multivariate Gauss distribution

1 Like