Mass matrix contains zeros on the diagonal for messy data

Newbayesian · July 12, 2021, 5:16pm

This is a relatively simple question. I’ve had no issues working with generated distributions, or any kind of “toy data”, but now that I have attempted to use some real world data that is a bit more messy I am having issues.

Here is the real data

Here is the data I generated using a best-fit inverse Gaussian distribution

I am successful in imputing missing values from the generated data as follows:

import numpy as np
import numpy.ma as ma
import pymc3 as pm

# `test` (a masked array) and `data` are assumed to be defined earlier.
with pm.Model() as model:

    # Random starting values, one per masked entry (not used further below)
    initial = np.random.randn(ma.count_masked(test)).astype('float64')
    #init = np.abs(np.random.randn(14990,1))

    alpha_m=pm.HalfNormal('alpha_m',1,testval=1)
    beta_m=pm.HalfNormal('beta_m',1,testval=1)
    mu_m=pm.Normal('mu_m',0,1,testval=0)
    x = pm.InverseGamma('x',alpha=alpha_m,beta=beta_m,mu=mu_m,observed=data)
    approx=pm.sample(2000,tune=2000,progressbar=True,chains=1,init='adapt_diag',target_accept=.99,discard_tuned_samples=True)

However, when I use the “real world” data set, sampling fails immediately with the "Mass matrix contains zeros on the diagonal" error. I’ve tried changing the priors, changing the sampling technique, and quite a few other things, but I am at a loss. Is there something I am missing in terms of sampling from a messy distribution like this?

twiecki · July 13, 2021, 5:57pm

As the data is IID I don’t think you get any benefit by imputing missing values. Also, is the real data more noisy? You could do a prior predictive check and see if that matches with the real data at all.
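For anyone unsure what a prior predictive check involves, here is a minimal hand-rolled sketch in plain Python. It is not the PyMC API and not the poster's exact model: it assumes only the two HalfNormal(1) priors on alpha and beta feed an inverse gamma likelihood (the `mu_m` term is left out), and the `observed` values are made-up placeholders. The idea is to simulate data from the priors alone and compare a few quantiles with the real data.

```python
import math
import random


def prior_predictive_draws(n_draws, seed=0):
    """Simulate data from the priors alone (no observed data involved)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_draws:
        # HalfNormal(1) priors, mirroring alpha_m and beta_m in the question.
        alpha = abs(rng.gauss(0.0, 1.0))
        beta = abs(rng.gauss(0.0, 1.0))
        if alpha == 0.0 or beta == 0.0:
            continue
        # If G ~ Gamma(shape=alpha, rate=beta), then 1/G ~ InverseGamma(alpha, beta).
        g = rng.gammavariate(alpha, 1.0 / beta)
        # Skip numerically degenerate draws (underflow to 0, inf, or nan).
        if math.isfinite(g) and g > 0.0 and math.isfinite(1.0 / g):
            draws.append(1.0 / g)
    return draws


def quantile(values, q):
    """Nearest-rank quantile of a non-empty list, for 0 <= q <= 1."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


simulated = prior_predictive_draws(5000)
# Made-up placeholder values; substitute the real measurements here.
observed = [0.4, 0.7, 0.9, 1.1, 1.6, 2.3, 3.8, 5.2]
for q in (0.05, 0.5, 0.95):
    print(q, quantile(simulated, q), quantile(observed, q))
```

If the simulated quantiles sit orders of magnitude away from the observed ones, the priors and likelihood together are a poor match for the data before any sampling has happened.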

Newbayesian · July 15, 2021, 6:18pm

Thank you for the response. As far as the data being IID, agreed, but this is part of a larger problem I am trying to solve, where imputation will in fact make a difference. Yes, the real data is quite a bit noisier. Even if I set the priors to be as informative as possible, to the point where I am almost pre-specifying what they should be, I still get the same error. Here is the result of that prior predictive check, with blue being the prior predictive and orange being the true noisy data.

twiecki · July 20, 2021, 9:33am

Maybe the likelihood is too unforgiving. You could try a Lognormal, or even better a LogStudentT (which doesn’t exist, but you should be able to create one with pm.Bound; it might not work with observed data anymore, but give that a try).
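One way to get a feel for the Lognormal suggestion before building a model: if x follows a Lognormal(mu, sigma), then log(x) follows a Normal(mu, sigma). The sketch below (plain Python, with a hypothetical helper name and made-up placeholder data) checks that every value is strictly positive and finite, then summarizes the data on the log scale. The resulting numbers could serve as a rough guide for prior scales, which is an assumption about how one might use them rather than something stated in this thread.

```python
import math
import statistics


def lognormal_summary(values):
    """Return (mean, stdev) of log(values), i.e. Normal parameters on the log scale."""
    # A Lognormal likelihood only supports strictly positive, finite values.
    bad = [v for v in values if not (v > 0 and math.isfinite(v))]
    if bad:
        raise ValueError(f"{len(bad)} value(s) are not strictly positive and finite")
    logs = [math.log(v) for v in values]
    return statistics.mean(logs), statistics.stdev(logs)


# Made-up placeholder values; substitute the real measurements here.
observed = [0.4, 0.7, 0.9, 1.1, 1.6, 2.3, 3.8, 5.2]
mu_hat, sigma_hat = lognormal_summary(observed)
print(mu_hat, sigma_hat)
```

The positivity check is worth running on its own, since values outside a likelihood's support give a log-probability of negative infinity.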

Newbayesian · July 23, 2021, 7:16am

Thank you, I will give it a try!
