Get nan or inf from model.logp (model.test point) is an attestation of incorrectly configured model?


#4

I’m not sure if this is it, but it looks like @junpenglao recently (~a week ago) made some changes to master which involve a fix for test points being set on transformed variables. See for instance: https://github.com/pymc-devs/pymc3/pull/2328

Could you try downloading master and then running your model?


#5

I don’t think this is related to the transformed test points.
@marcio Could you also provide the data (or some simulation data)? It would be easier to diagnose the problem.


#6

Are you sure you are pasting the model.bijection.rmap(model.dlogp_array(model.dict_to_array(model.test_point))) in https://pastebin.com/QXM3DYMg2? To me it looks like the model.test_point.


#7

Yes, I am pasting the output of model.bijection.rmap


#8

loss_shifted40 = [155, 267, 105, 223, 183, 189, 262, 178, 272, 187]
fem_cases = [2, 2, 2, 3, 1, 0, 1, 0, 0, 0]
mz_freq = [0.05, 0.047, 0.045, 0.063, 0.06, 0.062, 0.053, 0.057, 0.061, 0.058]


#9

what about N and pp_mean?


#10

pp_mean: 0.0595759774518
N: 3392425


#11

A few thoughts:

1, In your model, the observed part you are doing:

with model:
    ...
    mz = pm.Uniform("mz", lower=0, upper=1, observed=mz_freq)
    M = pm.Binomial("M", n=Yp, p=mz, observed=loss_shifte

However, the node mz is not doing anything, since there is no prior on mz, in effect, this is equivalent to mz = mz_freq

2, Since we got -inf from model.logp (model.logp(model.test_point)), we want to identify which RV node is causing the problem. @aseyboldt suggest doing model.bijection.rmap(model.dlogp_array(model.dict_to_array(model.test_point))) to get the logp of each RV but actually I am not sure that’s correct. I am doing below instead:

# for RV in model.basic_RVs:
#    print(RV.name, RV.logp(model.test_point))
model.check_test_point()

which gives:

lower_beta_interval__ -1.3862943611198906
upper_beta_interval__ -1.3862943611198906
beta_interval__ -13.862943611198906
beta_gamma_interval__ -1.3862943611198906
alpha_gamma_interval__ -1.3862943611198906
gamma_log__ 0.14556084518264534
I_interval__ -13.862943611198906
S_interval__ -13.862943611198908
po_interval__ -1.3862943611198906
Y -18.29351833496805
Yp -3.574558647108001
Yo -26.885429018279982
M -inf

So the problem is the node M = pm.Binomial("M", n=Yp, p=mz_freq, observed=loss_shifted40), and looking closer at the test_value of the input parameter:

In[1]: Yp.tag.test_value
Out[1]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

The n of the Binomial is smaller than the observed, which causing the -inf in the log-likelihood. An easy temporally fix is to provide a valid test value to Yp:

Yp = pm.Poisson("Yp", mu=mu_Yp, shape=len(fem_cases), testval=loss_shifted40+1000)

which seems to resolve the problem.

So below is the full code I ran, hope you find it helpful.

import pymc3 as pm
import numpy as np

loss_shifted40 = np.array([155, 267, 105, 223, 183, 189, 262, 178, 272, 187])
fem_cases = np.array([2, 2, 2, 3, 1, 0, 1, 0, 0, 0])
mz_freq = np.array([0.05, 0.047, 0.045, 0.063, 0.06, 0.062, 0.053, 0.057, 0.061, 0.058])

pp_mean = 0.0595759774518
N = 3392425
with pm.Model() as model:

    pp = pp_mean
    lower_beta = pm.Uniform("lower_beta", lower=0, upper=1)
    upper_beta = pm.Uniform("upper_beta", lower=lower_beta, upper=lower_beta+3)
    beta = pm.Uniform("beta", lower=lower_beta, upper=upper_beta, shape=len(fem_cases))

    beta_gamma = pm.Uniform("beta_gamma", lower=0, upper=120)
    alpha_gamma = pm.Uniform("alpha_gamma", lower=(2/3)*beta_gamma/7, upper=(4/3)*beta_gamma/7)
    gamma = pm.Gamma("gamma", alpha=alpha_gamma, beta=beta_gamma)
    
    I = pm.Uniform("I", lower=0, upper=fem_cases.sum(), shape=len(fem_cases))
    R = pm.Deterministic("R", (gamma*I))
    S = pm.Uniform("S", lower=0, upper=N-(I+R), shape=len(fem_cases))

    inc = pm.Deterministic("inc", beta*I*(S)/(S+I+R)-R)
    
    po = pm.Uniform("po", lower=0, upper=1)
 
    Y = pm.Poisson('Y', mu=inc, shape=len(fem_cases))

    mu_Yp = pm.Deterministic("mu_Yp", var=Y*pp)
    
    Yp = pm.Poisson("Yp", mu=mu_Yp, shape=len(fem_cases), testval=loss_shifted40+1000)

    Yo = pm.Binomial('Yo', n=Y, p=po, observed=fem_cases)    

    M = pm.Binomial("M", n=Yp, p=mz_freq, observed=loss_shifted40)
    
    trace = pm.sample(1e4)

Frequently Asked Questions
NaN occurred in optimization with ADVI
Average Loss in optimization output
Using stick breaking algorithm example but receiving gradient problem
pm.Potential without Theano?
#12
with model:
    trace = pm.sample(1e4, njobs=4)
pm.traceplot(trace[7000:]);

Hmm S seems a bit problematic… The next step I would suggest to take is reparameterize the model and try to avoid using bounded (ie. Uniform) variable for the prior.


#13

One more question, it seems obvious, but just checking: does Yp.tag.test_value outputs the start value of this variable, right?


#14

Yes, it can be defined by providing testval in the variable declares (e.g., pm.Normal(..., testval=...), or by the start karg in the calling of sampling (e.g., pm.sample(..., start=start,...)) where start is a dict with the same format as model.test_point.


#15
for RV in model.basic_RVs:
    print(RV.name, RV.logp(model.test_point))

Using the procedure you taught me, I got this:

lower_beta_interval__ -1.3862943611198906
upper_beta_interval__ -1.3862943611198906
beta_interval__ -40.20253647247681
beta_gamma_interval__ -4.795859992452562
alpha_gamma_interval__ -1.3862943611198904
gamma_log__ -2.300013578603738
I_interval__ -40.20253647247681
S_interval__ -40.20253647247685
po_interval__ -2.407945608651872
Y -160.51534229164463
Yp -119.61861672892917
Yo -4612.920761723422
M -inf

and

Yp.tag.test_value:

array([608, 608, 608, 608, 608, 608, 608, 608, 608, 608, 608, 608, 608,
       608, 608, 608, 608, 608, 608, 608, 608, 608, 608, 608, 608, 608,
       608, 608, 608])

Instead of using mz_freq, I used mz_freq.mean(), what solved the problem, even without setting a testval. Thank you very much my friend!


#16

You are welcome! I am glad you find it helpful :wink:


#17

@junpenglao, still in this model, I would like to store the sample, but got ValueError: This backend does not support sampler stats. Why would this happen? Thank you in advance!


#18

What backend are you specified in pm.sample?


#19

db = pm.backends.Text(‘trace’)
trace = pm.sample(5e4, njobs=4, step=[pm.Metropolis(vars=[Y, Yp]), pm.NUTS()], tune=1000, progressbar=True, trace=db)


#20

I see, well I have never work with backend before (and we are thinking of removing it actually), if you want to save it in a file I think there are better ways to do so. @colcarroll?


#21

Oh, good! I’m almost jumping through the window, the sample took a day to finish. HAHAHAHAHA (desperate laugh). Thank you very much @junpenglao, I will wait for the @colcarroll’s teachings!


#22

Haven’t tried this, but it should work (give it a shot with very few samples first!):

import cPickle as pickle # if python2
import pickle # if python3

...

trace = pm.sample(5e4, njobs=4, step=[pm.Metropolis(vars=[Y, Yp]), pm.NUTS()], tune=1000)

with open('trace.pkl', 'wb') as buff:
    pickle.dump(trace, buff)

You can then recover the trace with

with open('trace.pkl', 'rb') as buff:
    trace = pickle.load(buff)

This is great for local development on your computer. A few “gotchas” that probably won’t matter:

  1. You cannot pickle in python2 and unpickle in python3
  2. Do not unpickle a file from an untrusted source, as it can run arbitrary code

Using text backend raises sampler stats error
Loaded saved Text Trace does not show any summary or traceplot
#23

Thank you very much @colcarroll !