There were * divergences after tuning. Increase `target_accept` or reparameterize. & The number of effective samples is smaller than 10% for some parameters

Hi,

I have an array of 413 integers ranging from 0 to 6.

`np.unique(dist, return_counts=True)`

(array([0., 1., 2., 3., 4., 5., 6.]),
 array([144, 125,  93,  34,  12,   4,   1]))

I’d like to model this with a beta binomial distribution:

with pm.Model() as model:
    alpha = pm.Uniform('alpha', lower=0, upper=50)
    beta = pm.Uniform('beta', lower=0, upper=50)
    z = pm.BetaBinomial('z', 7, alpha, beta, observed=dist - 1, shape=(413))

    trace = pm.sample(17000, tune=1500, init='adapt_diag', chains=4, cores=4, max_treedepth=50, target_accept=.60)

When I run this example I always get:

There were 1113 divergences after tuning. Increase `target_accept` or reparameterize.
There were 794 divergences after tuning. Increase `target_accept` or reparameterize.
There were 400 divergences after tuning. Increase `target_accept` or reparameterize.
There were 590 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.

Also alpha does not seem to converge.

What am I doing wrong?

There may be multiple issues at play here. For example, I would advise against using uniform priors on alpha and beta. But the critical piece seems to be the parameters being passed into pm.BetaBinomial(). Passing n, alpha, and beta as positional arguments requires knowing the assumed positions (which the documentation doesn’t provide). I think this should help:

z = pm.BetaBinomial('z', n=7, alpha=alpha, beta=beta, observed=data)

Hi,
Thanks for your reply.
I will change the parameter assignment.
With respect to the uniform priors on alpha and beta: what are the alternatives?

Regards Hans

I would specifically (try to) avoid using any sort of “truncated” distributions (including uniform, half normal, etc.). They are fine in principle, but the sharp boundaries tend to make NUTS unhappy. So maybe something like a gamma? It really depends on the application.

I would also suggest cranking up the target acceptance. It defaults to 0.8, but setting it to 0.9 or 0.95 can often alleviate smaller sampling issues (possibly at the expense of efficiency).
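For concreteness, a rough sketch of how that would look with the model above (the 0.9 is just an example value, and the other arguments simply mirror your call):

with model:
    # raise the target acceptance rate from the default 0.8; NUTS then takes
    # smaller steps, which usually reduces divergences at some cost in speed
    trace = pm.sample(tune=1500, chains=4, cores=4, target_accept=0.9)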

Hi
With respect to the target acceptance, I just tried a couple of settings to get rid of the target_accept message.

As I understand from the BetaBinomial, alpha and beta must be positive.
Don’t they need to be represented by a distribution that is at least truncated at 0?

I played around with an online simulator:
https://www.randomservices.org/random/apps/SpecialSimulator.html
That sets the range of alpha and beta from 0 to 50.
So I just took that same range.
But I am a bit of a dummy :wink:

Regards Hans

Alpha and beta do need to be positive. So you probably do want the priors for these parameters to have positive support (i.e., to suggest that only positive values are credible). But there is a difference between a prior that naturally has positive support (e.g., a gamma distribution) and a prior that is simply chopped off (i.e., truncated) at values <=0.

Consider the gamma distribution below. Note how the shape of this distribution between x=0 and x=4 strongly implies that negative values are not credible (even though only positive values of x are shown). So as the sampler wanders semi-blindly around the parameter space, this distribution should discourage it from moving farther and farther left long before the sampler attempts to explore negative values.

[image: probability density of a gamma distribution, plotted for x between 0 and 4]
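(The original plot isn’t reproduced here; something like the sketch below draws a comparable curve, using an arbitrary Gamma(2, 1) purely for illustration, since the figure’s exact parameters aren’t given.)

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# arbitrary shape/scale chosen only for illustration
x = np.linspace(0, 4, 200)
plt.plot(x, stats.gamma.pdf(x, a=2, scale=1))
plt.xlabel('x')
plt.ylabel('density')
plt.show()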

So, to repeat: the uniform prior you were using is totally fine from a mathematical point of view. But the sampler (i.e., MCMC/NUTS) wanders around the parameter space trying out parameter values one by one, and distributions with “sharp corners” (e.g., truncated distributions) don’t provide much guidance about invalid regions of the parameter space until the sampler has already landed in such a region (at which point it’s too late, the divergence has already occurred).
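To make that concrete, a rough sketch of your model with gamma priors swapped in for the uniforms. The Gamma(2, 0.1) hyperparameters (prior mean 20) are only an illustrative guess on my part, not something prescribed here:

with pm.Model() as model:
    # gamma priors keep alpha and beta positive without a hard boundary at 0;
    # the (2, 0.1) values are an assumption, not a recommendation
    alpha = pm.Gamma('alpha', alpha=2, beta=0.1)
    beta = pm.Gamma('beta', alpha=2, beta=0.1)
    # `data` stands for the observed counts (called `dist` in the original post)
    z = pm.BetaBinomial('z', n=7, alpha=alpha, beta=beta, observed=data)

    trace = pm.sample(tune=1500, chains=4, cores=4)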

Hi
It worked :slight_smile:

with pm.Model() as model:
    alpha = pm.Uniform('alpha', lower=0, upper=50)
    beta = pm.Uniform('beta', lower=0, upper=50)
    z = pm.BetaBinomial('z', n=6, alpha=alpha, beta=beta, observed=dist - 1, shape=(413))

    idata = pm.sample(27000, tune=1500, init='adapt_diag', chains=4, cores=4)

It still gives the acceptance message, however.
But now it is too high??

Auto-assigning NUTS sampler...
Initializing NUTS using adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta, alpha]
Sampling 4 chains, 0 divergences: 100%|██████████| 114000/114000 [00:48<00:00, 2361.94draws/s]
The acceptance probability does not match the target. It is 0.8982175303601605, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.9146879438754264, but should be close to 0.8. Try to increase the number of tuning steps.
The number of effective samples is smaller than 25% for some parameters.

Super!
Thanks!

The acceptance rate is higher than expected. During tuning, the sampler tries to select sampling parameters that will achieve a certain level of performance (e.g., the proportion of proposals accepted). The higher-than-expected acceptance rate suggests that the sampling was not as efficient as it could have been (which is consistent with the lower-than-expected number of effective samples), but this is probably fine. If you were running this “for real” (e.g., in production, needing to make this inference many times, etc.), you might want to improve the efficiency.
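If you do want to dig into the efficiency, a quick sketch of how the effective sample size can be inspected with ArviZ (assuming the idata object from your last run):

import arviz as az

# ess_bulk / ess_tail and r_hat for the hyperparameters; effective sample
# sizes far below the total number of draws indicate slow mixing
print(az.summary(idata, var_names=['alpha', 'beta']))

Re-sampling with more tuning steps (e.g., tune=3000, as the warning suggests) is usually the cheapest first thing to try.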