UserWarning after sampling

Hi all,

What do these warnings mean?
1 - UserWarning: Chain 0 reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.

2 - UserWarning: Chain 0 contains 6 diverging samples after tuning. If increasing target_accept does not help try to reparameterize.

What can I do to prevent this? Thank you very much!


Both warnings are related to properties of the Hamiltonian Monte Carlo sampler. You can have a look at @colcarroll’s recent talk for some background: Talk/Essay: “Hamiltonian Monte Carlo in PyMC3”. In the references section you can find the most relevant papers on this issue.


I find Betancourt (2017) an excellent paper for gaining some intuition on these issues.

That leaves us with the subject of reparameterization, but I do not quite get how to apply that yet…
Maybe I should study these examples a little more:
http://pymc-devs.github.io/pymc3/notebooks/Diagnosing_biased_Inference_with_Divergences.html
http://twiecki.github.io/blog/2017/02/08/bayesian-hierchical-non-centered/


Much of the advice in the Stan manual, section 26, also applies to PyMC3, especially when the NUTS sampler is used. There are lots of reparameterization tips in there.
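
For example, a common trick from that chapter is the non-centered parameterization of a hierarchical model. Here is a minimal sketch in PyMC3 (the model and variable names are just illustrative, not from this thread):

import pymc3 as pm

# Centered parameterization: theta depends directly on mu and sigma.
# The resulting funnel-shaped posterior often triggers divergences.
with pm.Model() as centered:
    mu = pm.Normal('mu', mu=0, sd=5)
    sigma = pm.HalfCauchy('sigma', beta=5)
    theta = pm.Normal('theta', mu=mu, sd=sigma, shape=8)

# Non-centered parameterization: sample a standardized offset and
# rescale it, which decouples theta's scale from sigma and usually
# gives NUTS a much easier geometry.
with pm.Model() as non_centered:
    mu = pm.Normal('mu', mu=0, sd=5)
    sigma = pm.HalfCauchy('sigma', beta=5)
    offset = pm.Normal('offset', mu=0, sd=1, shape=8)
    theta = pm.Deterministic('theta', mu + sigma * offset)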


Excellent!
That certainly made sampling faster in some of my hierarchical models, with fewer divergences.
A paper specifically on hierarchical models (Betancourt & Girolami, 2013):

Also, regarding the warnings:
http://mc-stan.org/misc/warnings.html


What is the expected false positive rate on the tree depth warning? Because I’m getting the tree depth warning even on the toy model described here.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb
import pandas as pd
import pymc3 as pm

%matplotlib inline

# Toy model: ten independent standard-normal parameters
model = pm.Model()
with model:
    mu1 = pm.Normal("mu1", mu=0, sd=1, shape=10)

with model:
    step = pm.NUTS()
    trace = pm.sample(2000, tune=1000, init=None, step=step, njobs=2)

… generates …

…/python2.7/site-packages/pymc3/step_methods/hmc/nuts.py:448: UserWarning: Chain 0 reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.

There really shouldn’t be a warning about max treedepth in this example. Strangely, I can’t reproduce it either. What version are you using?
About false positives for max depth in general: warnings about max depth are far less serious than divergences. Divergences indicate that we might not get accurate results; a high tree depth only indicates that we aren’t sampling very efficiently. We print a warning if we reach the max depth in more than 5% of the samples, so things might not be terrible if you see one of those, but I think it is usually worth investigating why we have that many large trees.
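
If you want to check this in your own trace, the relevant NUTS statistics can be inspected directly. A small sketch (assumes a trace returned by pm.sample with NUTS; the default max_treedepth is 10):

import numpy as np

# Fraction of samples that hit the maximum tree depth
depth = trace.get_sampler_stats('depth')
print('fraction at max depth:', np.mean(depth >= 10))

# Number of divergent transitions -- these are the more serious ones
print('divergences:', trace.get_sampler_stats('diverging').sum())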

pymc3==3.1
but I’m experiencing some other strange behavior, so I’m going to try reinstalling and see if that helps.

I did a bit of digging, and this is actually a surprisingly interesting case. With the default target acceptance rate of 0.8, there seems to be a good chance that the step size lands just right relative to the posterior, so the trajectory oscillates and breaks the termination condition in NUTS. As a consequence it keeps integrating until it reaches the max treedepth. In this case the problem seems to be fixed by changing target_accept to pretty much anything else.
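
For reference, a sketch of how to do that in this example (the value 0.9 is arbitrary; the default is 0.8):

with model:
    step = pm.NUTS(target_accept=0.9)
    trace = pm.sample(2000, tune=1000, step=step, njobs=2)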

The same thing seems to happen in Stan, but for some reason it settles on a slightly smaller step size. I get similar behaviour if I change adapt_delta (the target_accept analogue) to 0.7, although it doesn’t look as bad.

The difference in step size is a bit worrisome; I wonder where it is coming from. Sadly, we can’t compare the acceptance rates directly, as Stan doesn’t export the mean across the tree.
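
On the PyMC3 side, that tree-averaged statistic is available as a sampler stat; a sketch (assumes a NUTS trace):

# Acceptance statistic averaged across each NUTS tree
accept = trace.get_sampler_stats('mean_tree_accept')
print('mean acceptance:', accept.mean())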

@colcarroll @junpenglao If you have any ideas about this, that would be great. I’m a bit at a loss at the moment.

The code:

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm

model = pm.Model()
with model:
    mu1 = pm.Normal("mu1", mu=0, sd=1, shape=10)

with model:
    step = pm.NUTS(scaling=np.ones(10))
    trace = pm.sample(1000, tune=1000, step=step, njobs=50, discard_tuned_samples=False)

# Mean tree depth over the last 1000 (post-tuning) samples, per chain
plt.plot(np.array(trace.get_sampler_stats('depth', combine=False))[:, -1000:].mean(-1))

# stan

%%file test.stan
data {}
parameters {
    real a[10];
}
model {
    a ~ normal(0, 1);
}


import pystan

model_ = pystan.StanModel(file='test.stan')
trace_ = model_.sampling(chains=50, control={'adapt_delta': 0.7})
# Acceptance statistic over the last 1000 samples, per chain
accept = np.array([stats['accept_stat__'] for stats in trace_.get_sampler_params()])
plt.plot(accept[:, -1000:].mean(axis=-1))

I don’t have enough expertise to comment on this; I would suggest we invite the Stan devs for discussion.
