Prior Predictive Simulation from a multivariate normal distribution

dilsher_dhillon · December 17, 2020, 8:01pm

I’m building a model where the group means and slopes are drawn from a multivariate normal distribution, with a uniform prior on the correlation between the two.

While running a prior predictive simulation, I noticed that the array drawn from the MvNormal consists of NaN, and all the associated summaries are NaN as well. I’d like to set up some sensible priors to (hopefully) speed up sampling.

Below is some code to create fake data and run the model.

import pandas as pd 
import numpy as np 
import pymc3 as pm
import arviz as az

N = 100 
M = 10 
idx = np.repeat(range(M), N)

sd_alpha = 1.5 
sd_beta = 2.5 
p = 0.4

cov_real = [[sd_alpha**2, sd_alpha*sd_beta*p], 
                         [sd_alpha*sd_beta*p, sd_beta**2]]

means = [2.5, 1]
alpha_beta = np.random.multivariate_normal(means, cov = cov_real, size = M)

alpha_real = alpha_beta[:,0]
beta_real = alpha_beta[:, 1]

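# Observation noise. eps_real is not defined above; a standard deviation of 0.5 is assumed.
eps_real = np.random.normal(0, 0.5, size=len(idx))
x_m = np.random.normal(10, 1, len(idx))
y_m = alpha_real[idx] + beta_real[idx]*x_m + eps_real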

with pm.Model() as cov_m1 : 
    
    sigma = pm.HalfStudentT('sigma', nu = 3, sd = 0.5)
    
    ab_mu = [0, 0] 
    
    sd_a = 1 
    
    sd_b = 1
    
    p = pm.Uniform('p', -0.5, 0.5) 
        
    Cov = pm.math.stack(([sd_a**2, sd_a*sd_b*p], [sd_a*sd_b*p, sd_b**2])) 
        
    ab = pm.MvNormal('ab', mu=ab_mu, cov=Cov, shape = (10, 2))  
    
    mu = ab[:,0][idx] + ab[:,1][idx]*x_m 
    
    y_pm = pm.Normal('y_pm', mu=mu, sd=sigma, observed=y_m)

When I run the following

with cov_m1: 
    prior_cov_m1 = pm.sample_prior_predictive()

az.summary(prior_cov_m1)

I get the following.

[Screenshot of the az.summary output, not preserved]

Similarly

[Second screenshot, not preserved]

Appreciate any insight into what I’m missing in the model statement in order to simulate from the prior. I’ve tried even tighter priors but the problem persists.

Thank you

Sayam753 · December 18, 2020, 4:03am

Hi @dilsher_dhillon
The MvNormal random method was refactored in the 3.10 release, so can you upgrade pymc3 to 3.10.0 and repeat the experiment?

Regarding the az.summary picture, ArviZ is misinterpreting the 500 draws as chains and the MvNormal distribution shape as draws. I would use az.from_pymc3 to handle the shapes.

with cov_m1: 
    prior_cov_m1 = pm.sample_prior_predictive()
    prior_cov_m1 = az.from_pymc3(prior=prior_cov_m1)
az.summary(prior_cov_m1.prior)

Let me know if this works for you
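
For what it's worth, the shape mix-up can be seen without PyMC3 at all. The sketch below is NumPy only and assumes ArviZ reads raw arrays as (chain, draw, *variable shape), which is its convention for dictionary input. The array is a random placeholder standing in for the prior draws, not output from the model above.

```python
import numpy as np

# Stand-in for the dict returned by pm.sample_prior_predictive():
# 500 prior draws of a variable declared with shape=(10, 2).
# (Random placeholder values, not output from the model above.)
rng = np.random.default_rng(0)
prior = {"ab": rng.normal(size=(500, 10, 2))}

# Read as (chain, draw, ...), this looks like 500 chains of 10 draws each.
as_read = dict(zip(("chain", "draw", "dim_0"), prior["ab"].shape))

# Prepending a length-1 chain axis gives the intended layout:
# 1 chain, 500 draws, and a (10, 2) variable.
fixed = {name: values[np.newaxis, ...] for name, values in prior.items()}

print(as_read)
print(fixed["ab"].shape)
```

Going through az.from_pymc3(prior=...) as above should take care of this bookkeeping, so the manual axis is only there to show what goes wrong.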

dilsher_dhillon · December 21, 2020, 5:59pm

Thank you so much for your reply, @Sayam753 .

A noob question - when updating pymc3 using conda, will it install the 3.10 release? I did that and I still see that the pymc3 version is 3.8.

Sayam753 · December 21, 2020, 6:20pm

Being a pip user, I have no experience with conda. Let’s ping @MarcoGorelli and @Spaak to help figure out the installation issues.

MarcoGorelli · December 21, 2020, 6:54pm

Granted I’m a conda noob so I don’t know how to resolve this, but I can confirm it reproduces:

$ conda create -n temp python=3.8
$ conda activate temp
$ conda install -c conda-forge pymc3
$ python -c 'import pymc3; print(pymc3.__version__)'
3.8

and I too got PyMC3 version 3.8 installed.

My setup has been to make a conda environment, install mkl and pygpu (conda install mkl pygpu), and then install pymc3 via pip (pip install -U pymc3). This works fine, and I get the latest version.

MarcoGorelli · December 22, 2020, 8:26am

Regarding installation - it works for me to do

conda install -c conda-forge pymc3=3.10.0

though I get some warnings

$ conda install -c conda-forge pymc3=3.10.0
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/marco/miniconda3/envs/temp

  added / updated specs:
    - pymc3=3.10.0

AlexAndorra · December 23, 2020, 2:06pm

Weirdly enough, your PyMC3 seems to be coming from Anaconda’s main channel @dilsher_dhillon, not from conda-forge – in which case it’s expected that it’s gonna be 3.8, since Anaconda probably didn’t update the version yet
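
To check what actually got installed, `conda list pymc3` prints a Channel column next to the version. From inside Python, a small sketch like the one below reports the installed version without importing the package. The helper name `installed_version` is made up for illustration, and it assumes Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version found in the active environment, or None if pymc3 is missing.
print("pymc3:", installed_version("pymc3"))
```

Running this inside the activated environment should show whether the upgrade took effect there.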

dilsher_dhillon · December 29, 2020, 3:46pm

@Sayam753 @AlexAndorra @MarcoGorelli. Thank you so much for the help! Using pip instead worked for me.

This is what I did

conda create --name pymc3_2021 python=3.9.1 
conda install -c anaconda numpy 
conda install -c conda-forge theano=1.19.1
pip install -U pymc3

This gives me version 3.10, and prior sampling for the MvNormal seems to work well now. The intervals are still quite wide, but this gives me the opportunity to work with better priors!
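
For anyone who wants to sanity-check these priors without PyMC3, here is a rough NumPy-only sketch of the same prior simulation. It assumes the priors from the model above (unit standard deviations, correlation uniform on (-0.5, 0.5)) and evaluates the linear predictor at x = 10, roughly the centre of x_m. The observation noise sigma is left out, and the seed and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws, n_groups = 500, 10

# Priors from the model: unit standard deviations and a Uniform(-0.5, 0.5) correlation.
sd_a, sd_b = 1.0, 1.0
p = rng.uniform(-0.5, 0.5, size=n_draws)

# One 2x2 covariance matrix per prior draw, shape (n_draws, 2, 2).
cov = np.empty((n_draws, 2, 2))
cov[:, 0, 0] = sd_a**2
cov[:, 1, 1] = sd_b**2
cov[:, 0, 1] = cov[:, 1, 0] = sd_a * sd_b * p

# Draw (intercept, slope) pairs via the Cholesky factor: ab = L @ z with z ~ N(0, I).
chol = np.linalg.cholesky(cov)                 # (n_draws, 2, 2)
z = rng.standard_normal((n_draws, n_groups, 2))
ab = np.einsum("dij,dgj->dgi", chol, z)        # (n_draws, n_groups, 2)

# Implied group-level mean of y at a typical predictor value.
x_typical = 10.0
mu = ab[..., 0] + ab[..., 1] * x_typical

print("any NaN:", np.isnan(ab).any())
print("3% and 97% quantiles of mu:", np.percentile(mu, [3, 97]))
```

Since x sits around 10, a slope standard deviation of 1 already implies a standard deviation of about 10 for the linear predictor, which may be part of why the intervals look wide. Centring x or tightening the slope prior would be the usual things to try.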
