I’m building a complex model that has a convergence problem. When initializing with NUTS, I get this error message:
ValueError: Bad initial energy: nan. The model might be misspecified.
How can I find out which random variable produces this ‘nan’ energy?
So far my workaround is to use a Metropolis step for one particular variable instead of NUTS, but this is not satisfying. I suspect that this variable is not the real problem, because the error persists even if I replace it with a constant (with a reasonable value).
I printed the logp value of the test point for each of my random variables :
(last value is the whole model logp)
None of them is Inf or NaN. I’m a bit surprised to see positive logp values, but I’m not sure that this is a problem (?).
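On the positive logp values: a log density is positive wherever the density itself exceeds 1, which happens for narrow continuous distributions, so this alone does not indicate a problem. A minimal sketch in plain Python (not PyMC3; `normal_logpdf` is a helper written only for this illustration):

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a Normal(mu, sigma) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

# Narrow distribution: the density at the mean exceeds 1, so logp > 0.
narrow = normal_logpdf(0.0, mu=0.0, sigma=0.1)
# Wide distribution: the density is below 1 everywhere, so logp < 0.
wide = normal_logpdf(0.0, mu=0.0, sigma=10.0)

print(narrow > 0, wide < 0, math.isfinite(narrow))
```

What matters for the sampler is that every value is finite, not its sign.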
How can I get more information about this ‘nan energy’?
The error message was improved recently. Are you on master? It should now print a bit more information about which RV is nan.
I’ve updated pymc3 with this line:
pip install git+https://github.com/pymc-devs/pymc3
(I guess that is “master”?)
But I still get no clue about which RV goes wrong…
If the energy problem does not come from an invalid start value (i.e., model.test_point) causing a non-finite logp, it is usually due to a non-finite gradient. This can be difficult to diagnose, so here are all the steps you can take to identify the problem:
import pymc3 as pm

with pm.Model() as model:
    ...  # your model definition

# make sure all test values are finite
print(model.test_point)
# make sure all logp are finite
print(model.check_test_point())

with model:
    step = pm.HamiltonianMC()

q0 = step._logp_dlogp_func.dict_to_array(model.test_point)
p0 = step.potential.random()
# make sure the potentials are all finite
start = step.integrator.compute_state(q0, p0)

# make sure model logp and its gradients are finite
logp, dlogp = step.integrator._logp_dlogp_func(q0)

# make sure velocity is finite
v = step.integrator._potential.velocity(p0)
kinetic = step.integrator._potential.energy(p0, velocity=v)
Any time you see an array containing a non-finite element, you can map it back into a dict to see which RV is causing the problem. For example, if dlogp contains a non-finite value, map dlogp back into a dict keyed by variable name, find the RV whose entries are non-finite, and adjust the prior for that RV accordingly.
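To make the "map it back into a dict" idea concrete, here is a rough sketch in plain Python. It does not call PyMC3 at all: `nonfinite_by_name`, the `layout` list, and the variable names are all made up for illustration, and the real ordering of variables in the flat array is whatever your model's function object uses.

```python
import math

def nonfinite_by_name(flat, layout):
    """Map a flat list back to variable names and report non-finite entries.

    `layout` is a list of (name, size) pairs giving the order in which the
    variables were concatenated into `flat` (a hypothetical layout).
    """
    bad = {}
    offset = 0
    for name, size in layout:
        chunk = flat[offset:offset + size]
        idx = [i for i, value in enumerate(chunk) if not math.isfinite(value)]
        if idx:
            bad[name] = idx
        offset += size
    return bad

# Hypothetical gradient for three variables, with one NaN inside "beta".
layout = [("mu", 1), ("sigma_log__", 1), ("beta", 3)]
dlogp = [0.5, -1.2, 0.1, float("nan"), 2.0]
print(nonfinite_by_name(dlogp, layout))
```

The returned dict only lists the variables that have at least one non-finite entry, together with the positions of those entries inside the variable.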
Hope this is clear!
Thanks for all these tricks…
I found no divergent parameters…
The error disappeared when I switched from the default initializer, ‘jitter+adapt_diag’,
to ‘adapt_diag’ (alone, without the jitter).
I guess that the jitter somehow pushes some parameter outside its legitimate support…?
Does that make sense?
Yep, that makes sense. It can happen.
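For anyone hitting the same thing, here is a toy illustration in plain Python of how jitter can break an otherwise valid start. It is not PyMC3 code: `logp` is a made-up density that is only defined for x > 0, and `jittered_start` is a hypothetical helper. The only part taken from PyMC3 is the idea that ‘jitter+adapt_diag’ adds uniform jitter in [-1, 1] to the starting point.

```python
import math
import random

def logp(x):
    """Toy log density that is only defined for x > 0 (hypothetical)."""
    if x <= 0.0:
        return float("nan")
    return -x + math.log(x)

def jittered_start(start, rng, max_tries=100):
    """Add uniform(-1, 1) jitter to `start`, retrying until logp is finite."""
    for _ in range(max_tries):
        candidate = start + rng.uniform(-1.0, 1.0)
        if math.isfinite(logp(candidate)):
            return candidate
    return start  # fall back to the unjittered start

rng = random.Random(0)
start = 0.5  # valid start: logp(start) is finite

# Count how many plain jittered starts end up with a non-finite logp.
raw = [start + rng.uniform(-1.0, 1.0) for _ in range(1000)]
n_bad = sum(not math.isfinite(logp(x)) for x in raw)
print("non-finite starts out of 1000:", n_bad)

# Retrying until the jittered point is valid avoids the problem.
safe = jittered_start(start, random.Random(1))
print("finite logp at safe start:", math.isfinite(logp(safe)))
```

In a real model the simplest options remain the ones from this thread: use ‘adapt_diag’ without jitter, or move the test values further away from the region where logp or its gradient stops being finite.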