Sampling fails after upgrading to v4

Hi,

I was previously using pymc3 v3.11.2 and decided to upgrade to v4. After upgrading and making only the necessary changes, the NUTS sampler fails with the error message below, which it didn’t do before:

ValueError: Mass matrix contains zeros on the diagonal.
The derivative of RV Uptime_interval__.ravel()[[0]] is zero.
The derivative of RV Downtime_interval__.ravel()[[0]] is zero.
The derivative of RV L.ravel()[] is zero.

I know this error is notoriously difficult to debug and I’m not really sure what could be causing it. I had encountered it before with pymc3, but the current version of my model runs fine on pymc3.
The only changes I made were renaming theano to aesara and changing how I pass variables to pm.DensityDist.
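For context, the pm.DensityDist change I mean is the new v4 signature, where the distribution parameters are passed positionally and the logp callable receives the value first, followed by those parameters. A minimal toy sketch of how I’m calling it now (not my actual model; the names and the logp are made up):

import numpy as np
import pymc as pm

with pm.Model():
    mu = pm.Normal("mu", 0.0, 1.0)
    # v3 equivalent was roughly: pm.DensityDist("obs", logp_func, observed=data)
    pm.DensityDist(
        "obs",
        mu,                                                 # parameter forwarded to logp
        logp=lambda value, mu: -0.5 * (value - mu) ** 2,    # toy custom log-density
        observed=np.array([0.1, -0.3, 0.7]),
    )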

Does anyone know what change in pymc could cause this?

It would be quite a bit of extra work to post a working version of my model here, so I would prefer not to do that.
Some details that I think may be relevant:
Using the NUTS sampler, both init="jitter+adapt_diag" and init="adapt_diag" run into this problem (roughly as in the sketch below); I had seen on other posts that switching the init method sometimes fixes this error, but it didn’t here.
The model is essentially a product of pm.Normal distributions.
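For reference, this is roughly how I’m selecting the init method (a trimmed-down sketch, not my real model):

import pymc as pm

with pm.Model():
    pm.Normal("x", 0.0, 1.0)
    # same failure with both init settings; "jitter+adapt_diag" is the default
    pm.sample(draws=1000, tune=1000, chains=2, cores=1, init="adapt_diag")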

I’m thankful for any insights on how to solve this problem.

You’ll have to share your model for us to be able to help.

I don’t think I can post a fully working version here, because it requires quite a bit of data stored in other files.
But here are the essentials:

import numpy as np
import pymc as pm

s = {
    'draws': 10000,
    'tune': 5000,
    'chains': 2,
    'cores': 1,
}
ev_B = {
    'A': 0.16128047381554936,
    'B': 0.005666983263328754,
    'C': 8.388210965590476,
    'D': 0.004900584215030823,
    'E': 0.903214838018954
}
ids = ['A', 'B', 'C', 'D', 'E']
with pm.Model():
    # load_models returns a dictionary with `ids` as keys
    # and objects as items. Each object loads data from a file
    # and has a `predict` method. `filepaths` is defined elsewhere
    # and points to those data files.
    model_dict = load_models(ids, filepaths)
    priors = [
        pm.Uniform('Uptime', lower=100, upper=2000),
        pm.Uniform('Downtime', lower=0, upper=10000),
    ]
    evidence = [ev_B[i] for i in ids]
    sigma = [0.1 * ev_B[i] for i in ids]
    models = [model_dict[i].predict(priors) for i in ids]
    distrib = [pm.Normal(i, mu=m, sigma=s, observed=o)
               for i, m, s, o in zip(ids, models, sigma, evidence)
              ]
    def joint(*args):
        # custom logp passed to DensityDist: multiplies its inputs together
        return np.product([*args])
    l = pm.DensityDist('L', *distrib, logp=joint)
    pm.sample(**s)

The sampling actually works if I use tune=50 and draws=100. Is there some way to gain more insight into which specific sample is causing the crash?
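Would something along these lines be a reasonable way to check that, assuming I’m reading the v4 API right and the model above is assigned to a `model` variable instead of an anonymous context manager?

ip = model.initial_point()       # starting values the sampler uses
print(model.point_logps(ip))     # per-variable logp at that point
dlogp = model.compile_dlogp()    # compiled gradient of the joint log-density
print(dlogp(ip))                 # zero entries should match the error above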