There were * divergences after tuning. Increase `target_accept` or reparameterize. & The number of effective samples is smaller than 10% for some parameters

Hi,

I have an array of 413 integers ranging from 0 to 6.

`np.unique(dist, return_counts=True)`

(array([0., 1., 2., 3., 4., 5., 6.]),
 array([144, 125,  93,  34,  12,   4,   1]))

I’d like to model this with a beta binomial distribution:

with pm.Model() as model:
    alpha = pm.Uniform('alpha', lower=0, upper=50)
    beta = pm.Uniform('beta', lower=0, upper=50)
    z = pm.BetaBinomial('z', 7, alpha, beta, observed=dist - 1, shape=(413))

    trace = pm.sample(17000, tune=1500, init='adapt_diag', chains=4, cores=4, max_treedepth=50, target_accept=.60)

When I run this example I always get:

There were 1113 divergences after tuning. Increase `target_accept` or reparameterize.
There were 794 divergences after tuning. Increase `target_accept` or reparameterize.
There were 400 divergences after tuning. Increase `target_accept` or reparameterize.
There were 590 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.

Also alpha does not seem to converge.

What am I doing wrong?

There may be multiple issues at play here. For example, I would advise against using uniform priors on alpha and beta. But the critical piece seems to be the parameters being passed into pm.BetaBinomial(). Passing n, alpha, and beta as positional arguments requires knowing the assumed positions (which the documentation doesn’t provide). I think this should help:

z = pm.BetaBinomial('z', n=7, alpha=alpha, beta=beta, observed=data)

Hi,
Thanks for your reply.
I will change the parameter assignment.
With respect to the uniform priors on alpha and beta: what are the alternatives?

Regards Hans

I would specifically (try to) avoid using any sort of “truncated” distributions (including uniform, half normal, etc.). They are fine in principle, but the sharp boundaries tend to make NUTS unhappy. So maybe something like a gamma? It really depends on the application.

I would also suggest cranking up the target acceptance. It defaults to 0.8, but setting it to 0.9 or 0.95 can often alleviate smaller sampling issues (possibly at the expense of efficiency).
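For concreteness, a rough sketch of how that would look with the model above (the 0.9 is just an example value, and the other arguments simply mirror your call):

with model:
    # raise the target acceptance rate from the default 0.8; NUTS then takes
    # smaller steps, which usually reduces divergences at some cost in speed
    trace = pm.sample(tune=1500, chains=4, cores=4, target_accept=0.9)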

Hi
With respect to the target acceptance, I just tried a couple of settings to get rid of the target_accept message.

As I understand from the BetaBinomial, alpha and beta must be positive.
Don’t they need to be represented by a distribution that is at least truncated at 0?

I played around with an online simulator:
https://www.randomservices.org/random/apps/SpecialSimulator.html
That sets the range of alpha and beta from 0 to 50.
So I just took that same range.
But I am a bit of a dummy :wink:

Regards Hans

Alpha and beta do need to be positive. So you probably do want the priors for these parameters to have positive support (i.e., to suggest that only positive values are credible). But there is a difference between a prior that naturally has positive support (e.g., a gamma distribution) and a prior that is simply chopped off (i.e., truncated) at values <=0.

Consider the gamma distribution below. Note how the shape of this distribution between x=0 and x=4 strongly implies that negative values are not credible (even though only positive values of x are shown). So as the sampler wanders semi-blindly around the parameter space, this distribution should discourage it from moving farther and farther left long before the sampler attempts to explore negative values.

[image: probability density of a gamma distribution, plotted for x between 0 and 4]
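(The original plot isn’t reproduced here; something like the sketch below draws a comparable curve, using an arbitrary Gamma(2, 1) purely for illustration, since the figure’s exact parameters aren’t given.)

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# arbitrary shape/scale chosen only for illustration
x = np.linspace(0, 4, 200)
plt.plot(x, stats.gamma.pdf(x, a=2, scale=1))
plt.xlabel('x')
plt.ylabel('density')
plt.show()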

So, to repeat: the uniform prior you were using is totally fine from a mathematical point of view. But the sampler (i.e., MCMC/NUTS) wanders around the parameter space trying out parameter values one by one, and distributions with “sharp corners” (e.g., truncated distributions) don’t provide much guidance about invalid regions of the parameter space until the sampler has already landed in such a region (at which point it’s too late, the divergence has already occurred).
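To make that concrete, a rough sketch of your model with gamma priors swapped in for the uniforms. The Gamma(2, 0.1) hyperparameters (prior mean 20) are only an illustrative guess on my part, not something prescribed here:

with pm.Model() as model:
    # gamma priors keep alpha and beta positive without a hard boundary at 0;
    # the (2, 0.1) values are an assumption, not a recommendation
    alpha = pm.Gamma('alpha', alpha=2, beta=0.1)
    beta = pm.Gamma('beta', alpha=2, beta=0.1)
    # `data` stands for the observed counts (called `dist` in the original post)
    z = pm.BetaBinomial('z', n=7, alpha=alpha, beta=beta, observed=data)

    trace = pm.sample(tune=1500, chains=4, cores=4)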

Hi
It worked :slight_smile:

with pm.Model() as model:
    alpha = pm.Uniform('alpha', lower=0, upper=50)
    beta = pm.Uniform('beta', lower=0, upper=50)
    z = pm.BetaBinomial('z', n=6, alpha=alpha, beta=beta, observed=dist - 1, shape=(413))

    idata = pm.sample(27000, tune=1500, init='adapt_diag', chains=4, cores=4)

It still gives the acceptance message, however.
But now it is too high??

Auto-assigning NUTS sampler...
Initializing NUTS using adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta, alpha]
Sampling 4 chains, 0 divergences: 100%|██████████| 114000/114000 [00:48<00:00, 2361.94draws/s]
The acceptance probability does not match the target. It is 0.8982175303601605, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.9146879438754264, but should be close to 0.8. Try to increase the number of tuning steps.
The number of effective samples is smaller than 25% for some parameters.

Super!
Thanks!

The acceptance rate is higher than expected. During tuning, the sampler tries to select sampling parameters that will achieve a certain level of performance (e.g., the proportion of proposals accepted). The higher-than-expected acceptance rate suggests that the sampling was not as efficient as it could have been (which is consistent with the lower-than-expected number of effective samples), but this is probably fine. If you were running this “for real” (e.g., in production, needing to make this inference many times, etc.), you might want to improve the efficiency.
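If you do want to dig into the efficiency, a quick sketch of how the effective sample size can be inspected with ArviZ (assuming the idata object from your last run):

import arviz as az

# ess_bulk / ess_tail and r_hat for the hyperparameters; effective sample
# sizes far below the total number of draws indicate slow mixing
print(az.summary(idata, var_names=['alpha', 'beta']))

Re-sampling with more tuning steps (e.g., tune=3000, as the warning suggests) is usually the cheapest first thing to try.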