Energy change in leapfrog step is too large: what is special about Emax=1000?

My model diverges on about 0.5% of the samples. The error messages (in trace.report._chain_warnings) look like this:

Energy change in leapfrog step is too large: 1017.4813457729051.
Energy change in leapfrog step is too large: 1059.9371476906745.
Energy change in leapfrog step is too large: 1139.1986720946406.
Energy change in leapfrog step is too large: 1059.8736709577333.
Energy change in leapfrog step is too large: 2289.9443530453555.
etc.

The overly large energy changes are not huge: none are inf,
and none even exceed 4000. Are these worth worrying about? In particular,
should I figure out why the energy changes are sometimes greater
than the default of 1000? Or should I raise that default with a larger
Emax? Or just ignore the occasional excessive energy change?

Is there anything special about the default Emax of 1000?

Some details: hierarchical model with lots of beta distributions. NUTS. target_accept=0.8

Emax=1000 is the default that most PPLs use for the HMC energy error. There may be a reference explaining why, but I don't remember it clearly. The check tells you whether the leapfrog integration is imprecise because of the geometry of your model.

In general, though, my experience is that Emax=1000 is already a fairly lenient threshold, since the energy change is usually quite small (depending on your model as well). The divergences should come from the energy error raised here, so yes, you should figure out why it is large.
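
To make the energy check concrete, here is a minimal sketch in plain Python. It is not PyMC's implementation: it assumes a 1-D standard normal target with a unit mass matrix, and the names `EMAX`, `energy_change`, and `is_divergent` are made up for illustration. It runs a leapfrog trajectory and compares the Hamiltonian at the end with the one at the start; a well-behaved trajectory changes the energy only slightly, while an unstable one blows far past the threshold.

```python
import math

EMAX = 1000.0  # threshold discussed in this thread; the constant name is hypothetical


def potential(q):
    # Negative log density of a standard normal, up to an additive constant.
    return 0.5 * q * q


def grad_potential(q):
    return q


def hamiltonian(q, p):
    # Potential energy plus kinetic energy (unit mass assumed).
    return potential(q) + 0.5 * p * p


def leapfrog(q, p, step_size, n_steps):
    # Half step for momentum, alternating full steps, closing half step.
    p = p - 0.5 * step_size * grad_potential(q)
    for i in range(n_steps):
        q = q + step_size * p
        if i < n_steps - 1:
            p = p - step_size * grad_potential(q)
    p = p - 0.5 * step_size * grad_potential(q)
    return q, p


def energy_change(q0, p0, step_size, n_steps):
    q1, p1 = leapfrog(q0, p0, step_size, n_steps)
    return hamiltonian(q1, p1) - hamiltonian(q0, p0)


def is_divergent(delta_h, emax=EMAX):
    # Flag the trajectory when the energy error is NaN or exceeds the threshold.
    return math.isnan(delta_h) or abs(delta_h) > emax


small = energy_change(1.0, 1.0, step_size=0.1, n_steps=20)
large = energy_change(1.0, 1.0, step_size=2.5, n_steps=20)

print("step 0.1:", small, "divergent:", is_divergent(small))
print("step 2.5:", large, "divergent:", is_divergent(large))
```

In this toy case the step size of 2.5 is above the stability limit of the integrator, so the energy error grows exponentially. In a real hierarchical model the same thing tends to happen locally, in regions where the posterior curvature is too high for the adapted step size, which is why a handful of energy changes just above 1000 still point at a geometry problem rather than at the threshold itself.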
