Prior Predictive Simulation from a multivariate normal distribution

I’m building a model where the group means and slopes are drawn from a multivariate normal distribution, with a uniform prior on the correlation between the two.

While running prior predictive simulation, I noticed that the array drawn from the MvNormal consists of NaN, and all of the associated summaries are NaN as well. I’d like to set up some sensible priors to (hopefully) speed up sampling.

Below is some code to create fake data and run the model.

import pandas as pd 
import numpy as np 
import pymc3 as pm
import arviz as az

N = 100 
M = 10 
idx = np.repeat(range(M), N)

sd_alpha = 1.5 
sd_beta = 2.5 
p = 0.4

cov_real = [[sd_alpha**2, sd_alpha*sd_beta*p], 
                         [sd_alpha*sd_beta*p, sd_beta**2]]

means = [2.5, 1]
alpha_beta = np.random.multivariate_normal(means, cov = cov_real, size = M)

alpha_real = alpha_beta[:,0]
beta_real = alpha_beta[:, 1]

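x_m = np.random.normal(10, 1, len(idx))
# eps_real was not defined in the original post; the noise sd of 0.5 is an assumed value
eps_real = np.random.normal(0, 0.5, len(idx))
y_m = alpha_real[idx] + beta_real[idx]*x_m + eps_real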

with pm.Model() as cov_m1 : 
    
    sigma = pm.HalfStudentT('sigma', nu = 3, sd = 0.5)
    
    ab_mu = [0, 0] 
    
    sd_a = 1 
    
    sd_b = 1
    
    p = pm.Uniform('p', -0.5, 0.5) 
        
    Cov = pm.math.stack(([sd_a**2, sd_a*sd_b*p], [sd_a*sd_b*p, sd_b**2])) 
        
    ab = pm.MvNormal('ab', mu=ab_mu, cov=Cov, shape = (10, 2))  
    
    mu = ab[:,0][idx] + ab[:,1][idx]*x_m 
    
    y_pm = pm.Normal('y_pm', mu=mu, sd=sigma, observed=y_m) 

When I run the following

with cov_m1: 
    prior_cov_m1 = pm.sample_prior_predictive()
az.summary(prior_cov_m1)

I get the following.

[Screenshot: az.summary output, image not preserved]

Similarly

[Second screenshot, image not preserved]

Appreciate any insight into what I’m missing in the model statement in order to simulate from the prior. I’ve tried even tighter priors but the problem persists.

Thank you

Hi @dilsher_dhillon
The MvNormal random method was refactored in the 3.10 release. Can you upgrade pymc3 to 3.10.0 and repeat the experiment?

Regarding the az.summary picture, ArviZ appears to be interpreting the 500 draws as chains and the first dimension of the MvNormal shape as draws. I would use az.from_pymc3 so the shapes are handled correctly.

with cov_m1: 
    prior_cov_m1 = pm.sample_prior_predictive()
    prior_cov_m1 = az.from_pymc3(prior=prior_cov_m1)
az.summary(prior_cov_m1.prior)
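To illustrate the shape issue described above, here is a minimal NumPy-only sketch. It assumes the prior draws for 'ab' come back with shape (500, 10, 2), i.e. 500 draws of a (10, 2) variable, and that a raw array is read as (chain, draw, ...). The random array is only a stand-in for the real prior samples, and the variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws, n_groups = 500, 10

# Stand-in for the prior samples of 'ab': (draws, groups, 2)
ab_prior = rng.normal(size=(n_draws, n_groups, 2))

# If the first two axes are read as (chain, draw), this looks like
# 500 chains with 10 draws each, which is not what was sampled
n_chains_wrong, n_draws_wrong = ab_prior.shape[:2]
print("misread as:", n_chains_wrong, "chains x", n_draws_wrong, "draws")

# Adding a leading chain axis of length 1 gives (chain, draw, groups, 2)
ab_with_chain = ab_prior[np.newaxis, ...]
print("with chain axis:", ab_with_chain.shape)
```

Going through az.from_pymc3 should spare you from doing this reshaping by hand.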

Let me know if this works for you :)


Thank you so much for your reply, @Sayam753 .

A noob question - when updating pymc3 using conda, will it install the 3.10 release? I did that and I still see that the pymc3 version is 3.8.

Being a pip user, I have no experience with conda. Let’s ping @MarcoGorelli and @Spaak to help figure out the installation issue.

Granted I’m a conda noob so I don’t know how to resolve this, but I can confirm it reproduces:

$ conda create -n temp python=3.8
$ conda activate temp
$ conda install -c conda-forge pymc3
$ python -c 'import pymc3; print(pymc3.__version__)'
3.8

so I too got PyMC3 version 3.8 installed.
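For checking this from inside Python rather than from the shell, a small sketch like the one below might help. The helper names (version_tuple, installed_version) are made up for illustration, and the 3.10.0 threshold is simply the release mentioned earlier in this thread. Note that comparing the raw strings would be misleading, since "3.8" sorts after "3.10.0" as text.

```python
import re
from importlib import metadata

MIN_VERSION = (3, 10, 0)  # release mentioned in this thread


def version_tuple(version):
    """Turn '3.10.0' into (3, 10, 0), keeping at most three numeric parts."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def installed_version(package="pymc3"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("pymc3 is not installed in this environment")
elif version_tuple(found) < MIN_VERSION:
    print(f"pymc3 {found} is older than 3.10.0")
else:
    print(f"pymc3 {found} is recent enough")
```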


My setup has been to make a conda environment, install mkl and pygpu (conda install mkl pygpu), and then install pymc3 via pip (pip install -U pymc3). This works fine, and I get the latest version.


Regarding installation - it works for me to do

conda install -c conda-forge pymc3=3.10.0

though I get some warnings

$ conda install -c conda-forge pymc3=3.10.0
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/marco/miniconda3/envs/temp

  added / updated specs:
    - pymc3=3.10.0

Weirdly enough, your PyMC3 seems to be coming from Anaconda’s main channel @dilsher_dhillon, not from conda-forge – in which case it’s expected that it’s gonna be 3.8, since Anaconda probably didn’t update the version yet

@Sayam753 @AlexAndorra @MarcoGorelli. Thank you so much for the help! Using pip instead worked for me.

This is what I did

conda create --name pymc3_2021 python=3.9.1
conda install -c anaconda numpy
conda install -c conda-forge theano=1.19.1
pip install -U pymc3

This gives me version 3.10, and prior sampling for the MvNormal seems to work well now. The intervals are still quite wide, but this gives me the opportunity to work with better priors!
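As a cross-check that does not depend on the PyMC3 version, the priors on p and ab can also be simulated by hand in NumPy. This is only a rough sketch of the model above, not what sample_prior_predictive does internally: it uses sd_a = sd_b = 1 and p ~ Uniform(-0.5, 0.5) from the model, and evaluates the implied group mean at x = 10 (an assumed typical value, since x_m is centred there). The 3% and 97% percentiles are an arbitrary choice of interval.

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws, n_groups = 500, 10
sd_a, sd_b = 1.0, 1.0

ab_draws = np.empty((n_draws, n_groups, 2))
for i in range(n_draws):
    # One prior draw of the correlation, then the covariance it implies
    p = rng.uniform(-0.5, 0.5)
    cov = np.array([[sd_a**2, sd_a * sd_b * p],
                    [sd_a * sd_b * p, sd_b**2]])
    # Intercept/slope pairs for every group under this covariance
    ab_draws[i] = rng.multivariate_normal([0.0, 0.0], cov, size=n_groups)

print("any NaN in ab:", np.isnan(ab_draws).any())

# Implied prior on the linear predictor at an assumed typical x of 10
x_typical = 10.0
mu_prior = ab_draws[..., 0] + ab_draws[..., 1] * x_typical
lo, hi = np.percentile(mu_prior, [3, 97])
print("3% and 97% percentiles of mu at x=10:", lo, hi)
```

With a slope sd of 1 and x around 10, the implied prior on mu spans tens of units, so tightening the slope prior or centring x would be natural places to start.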