Segmentation Fault with pm.MatrixNormal

I have a pymc3 model with parameters sampled from the pm.MatrixNormal. The size of the matrix is 768*30. But while running the model I am getting segmentation fault. But reducing number of parameters is helping. Are those numbers too big to handle for pymc3 models? If needed I can elaborate my model in more details. Any help is welcome.

A segfault is always a pretty bad bug. Could you post something that I can reproduce? (github is a better place for this, but here is fine too if you prefer).

Here is the piece of code you can try:

with pm.Model() as model:

    #prior sampling
    u = pm.MvNormal('u',mu=mu_u,cov=cov_u,shape=u_dim)
    transcript0 = pm.MvNormal('transcript0',mu=mu_trans,cov=cov_trans,shape=trans_dim)
    eta_u_transcript = pm.MatrixNormal('eta_u_transcript',colcov=cov_u, rowcov=cov_trans, shape=(trans_dim, u_dim))
    
    sigma_transcript_sq = pm.InverseGamma('sigma_transcript_sq',alpha=1,beta=1)

    transcript_mean =   tt.dot(eta_u_transcript ,u)+transcript0 
    transcript = pm.MvNormal('transcript', mu= transcript_mean, cov = sigma_transcript_sq*np.eye(trans_dim), observed = data["transcript"] )  
    trace=pm.sample()

where

mu_u = np.zeros(u_dim)
cov_u = np.eye(u_dim)
mu_trans = np.zeros(trans_dim)
cov_trans = np.eye(trans_dim)

and in my case:

u_dim, trans_dim = 30, 768

and

`len(data["transcript"])= 2400`

Please let me know if you need anything else.

Hm, I get a shape error with this code:

import pandas as pd
import theano.tensor as tt
import pandas as pd

u_dim, trans_dim = 30, 768

mu_u = np.zeros(u_dim)
cov_u = np.eye(u_dim)
mu_trans = np.zeros(trans_dim)
cov_trans = np.eye(trans_dim)


data = pd.DataFrame({"transcript": np.random.randn(2400)})


with pm.Model():
    #prior sampling
    u = pm.MvNormal('u',mu=mu_u,cov=cov_u,shape=u_dim)
    transcript0 = pm.MvNormal('transcript0',mu=mu_trans,cov=cov_trans,shape=trans_dim)
    eta_u_transcript = pm.MatrixNormal('eta_u_transcript',colcov=cov_u, rowcov=cov_trans, shape=(trans_dim, u_dim))
    
    sigma_transcript_sq = pm.InverseGamma('sigma_transcript_sq',alpha=1,beta=1)

    transcript_mean =   tt.dot(eta_u_transcript ,u)+transcript0 #+ tt.dot(data['a'] , eta_a_transcript)
    transcript = pm.MvNormal('transcript', mu= transcript_mean, cov = sigma_transcript_sq*np.eye(trans_dim), observed = data["transcript"] )  
    trace=pm.sample()

because len(data['transcript']) != trans_dim. Can you give me a snippet of code that I can execute as it is, and that segfaults for you?

Independent of the segfault: You don’t need to use MvNormal unless you have a non-diagonal covariance. Having huge diagonal covariance matrices is a huge amount of unnecessary work during sampling. Just use pm.Normal instead. Or is that just to make it executable for me? :slight_smile:

Thanks for the suggestion about MvNormal. I changed it here. Here is the code you can try:

import pandas as pd
import theano.tensor as tt
import pandas as pd
import numpy as np
import pymc3 as pm



u_dim, trans_dim = 30, 768

mu_u = np.zeros(u_dim)
cov_u = np.eye(u_dim)
mu_trans = np.zeros(trans_dim)
cov_trans = np.eye(trans_dim)



data_dict = {'transcript':np.random.normal(size=(2400,768))}

print(data_dict['transcript'].dtype)

with pm.Model():
    #prior sampling
    u = pm.Normal('u',mu=0,tau=1,shape=u_dim)
    transcript0 = pm.Normal('transcript0',mu=0,tau=1,shape=trans_dim)
    eta_u_transcript = pm.MatrixNormal('eta_u_transcript',colcov=cov_u, rowcov=cov_trans, shape=(trans_dim, u_dim))
    
    sigma_transcript_sq = pm.InverseGamma('sigma_transcript_sq',alpha=1,beta=1)

    transcript_mean =   tt.dot(eta_u_transcript ,u)+transcript0 
    transcript = pm.MvNormal('transcript', mu= transcript_mean, cov = sigma_transcript_sq*np.eye(trans_dim), observed = data_dict['transcript'] )  
    trace=pm.sample()

This doesn’t segfault for me…
Does it do that every time for you? When it starts sampling or later? What operating system are you one, and what is the output of pip freeze or conda list and np.__config__.show()? Also what version of pymc3?

Yes it does segfault everytime for me. The output for pip freeze is in the attached file: freeze_output.txt (5.1 KB)