Struggling with Beta mixture models

Hi everyone,

I’m relatively new to PyMC3 - I can follow along with tutorials and examples, but I haven’t had much luck with my own models - so apologies in advance for any silly questions.

I’m working with some data which takes values between 0 and 1 and which is multi-modal, so I decided to use a mixture of Beta distributions. I’d also like the model to start from a Uniform distribution (I think this requires the expected value of my priors to be 1, so that each component is a Beta(1,1) distribution if I don’t pass any observations).
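As a quick sanity check on that reasoning (a minimal sketch using scipy.stats, separate from the model itself): Beta(1,1) has a constant density on [0,1], i.e. it is Uniform(0,1), and a Gamma(1,1) prior has mean 1, so at the prior mean each component is indeed Beta(1,1):

```python
from scipy.stats import beta, gamma

# Beta(1, 1) is the Uniform(0, 1) distribution: its pdf is 1 everywhere on [0, 1]
for x in [0.1, 0.5, 0.9]:
    print(beta(1, 1).pdf(x))  # constant density of 1

# A Gamma(alpha=1, beta=1) prior has mean alpha/beta = 1,
# so the prior-mean shape parameters give Beta(1, 1) components
print(gamma(a=1, scale=1).mean())
```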

I’ve generated some data which is reasonably similar to the data I’m working with, as follows:

import numpy as np

sample_1 = np.random.beta(1, 200, size=350)
sample_2 = np.random.beta(750, 1500, size=150)
sample_3 = np.random.beta(900, 600, size=100)
sample_4 = np.random.beta(1, 1, size=100)
sample = np.concatenate([sample_1, sample_2, sample_3, sample_4])

I’ve worked around some of the difficulties I had with mixture models (e.g. I can’t iterate over the Beta RV) and have come up with the following model. I was a bit stuck on what to pick as priors - I suspect this could be where my problem is arising…

import pymc3 as pm

K = 4  # number of mixture components

with pm.Model() as beta_model:
    w = pm.Dirichlet('w', a=np.ones(K), shape=K)
    alpha = pm.Gamma('alpha', alpha=1, beta=1, shape=K, testval=1)
    beta = pm.Gamma('beta', alpha=1, beta=1, shape=K, testval=1)
    comp_dists = [pm.Beta.dist(alpha=alpha[i], beta=beta[i]) for i in range(K)]
    obs = pm.Mixture('obs', w=w, comp_dists=comp_dists, observed=sample)

But it doesn’t seem to do a very good job with the data; sampling finishes with these warnings:

The acceptance probability does not match the target. It is 0.8983302428206118, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.8817013927564947, but should be close to 0.8. Try to increase the number of tuning steps.
The gelman-rubin statistic is larger than 1.4 for some parameters. The sampler did not converge.
The estimated number of effective samples is smaller than 200 for some parameters.

And here is the summary of the trace. Neither n_eff nor Rhat looks very happy.

           mean     sd  mc_error  hpd_2.5  hpd_97.5   n_eff   Rhat

w__0       0.05   0.05      0.00     0.00      0.12    1.03   6.12
w__1       0.28   0.28      0.03     0.00      0.58    1.00  28.05
w__2       0.45   0.11      0.01     0.30      0.59    1.02   8.16
w__3       0.22   0.12      0.01     0.08      0.36    1.01   8.92
alpha__0   3.66   3.08      0.29     0.01      8.45    1.12   3.36
alpha__1   0.86   0.55      0.02     0.00      1.65  218.44   1.03
alpha__2  11.05  10.20      1.01     0.81     23.98    1.01   9.37
alpha__3  13.84   7.70      0.75     4.15     25.03    1.04   5.37
beta__0    5.65   4.55      0.43     0.00     12.34    1.10   3.60
beta__1   14.91  13.60      1.35     0.02     31.27    1.01  10.28
beta__2   45.57  17.73      1.71    23.80     71.77    1.05   4.98
beta__3   36.42  27.16      2.67     5.90     73.01    1.02   7.13

I expect something is off with my parameterisation, but I’m a bit stuck on how to proceed. And this is actually better than what I get with my real dataset - there, NUTS samples very slowly (about 2 s per iteration), and I get a bad initial energy error if I use jitter.

Any advice you have would be very welcome!

Thank you.

Mixture models are difficult to sample - have a look at Properly sampling mixture models. You do need to take care when you parameterize your mixture model.

Thank you @junpenglao ! I will work my way through that thread and see if it helps.
I notice you said that in 1D it is usually easy to keep the parameters ordered between chains - could you point me in the right direction on how to achieve that?