Hi everyone,
I’m relatively new to PyMC3 - I can follow along with tutorials and examples, but I haven’t had much luck with my own models - so apologies in advance for any silly questions.
I’m working with some data which takes values between 0 and 1 and which is multi-modal, so I decided to use a mixture of Beta distributions. I’d also like the model to start from a Uniform distribution (I think this requires the expected values of my priors on the Beta parameters to be 1, so that I get a Beta(1,1) distribution if I don’t pass any observations).
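As a sanity check on that reasoning: Gamma(1, 1) priors do have mean 1, but strictly it’s only the Beta evaluated at those prior means that is Beta(1,1) - the marginal prior predictive can look different. Here’s a quick sketch to eyeball it (matplotlib assumed; note np.random.gamma takes shape and scale, so rate = 1 means scale = 1):

import numpy as np
import matplotlib.pyplot as plt

# alpha, beta ~ Gamma(1, 1), both with prior mean 1
a = np.random.gamma(1, 1, size=10000)
b = np.random.gamma(1, 1, size=10000)
x = np.random.beta(a, b)  # one draw from each implied Beta component

plt.hist(x, bins=50, density=True)  # how close to uniform is this, really?
plt.show()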
I’ve generated some data which is reasonably similar to the data I’m working with, as follows:
import numpy as np

sample_1 = np.random.beta(1, 200, size=350)     # sharp spike near 0
sample_2 = np.random.beta(750, 1500, size=150)  # narrow mode around 1/3
sample_3 = np.random.beta(900, 600, size=100)   # narrow mode around 0.6
sample_4 = np.random.beta(1, 1, size=100)       # uniform background
sample = np.concatenate([sample_1, sample_2, sample_3, sample_4])
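A quick histogram shows the spike at 0, the two interior modes, and the uniform background (sketch, assuming matplotlib):

import matplotlib.pyplot as plt

plt.hist(sample, bins=100, density=True)
plt.show()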
I’ve worked around some of the difficulties I had with mixture models (e.g. not being able to iterate over the Beta RV directly) and have come up with the following model. I was a bit stuck on what to pick as priors, and I suspect this could be where my problem is arising…
import pymc3 as pm

K = 4

with pm.Model() as beta_model:
    w = pm.Dirichlet('w', a=np.ones(K), shape=K)
    # Gamma(1, 1) priors have mean 1, so the "expected" component is Beta(1, 1)
    alpha = pm.Gamma('alpha', alpha=1, beta=1, shape=K, testval=1)
    beta = pm.Gamma('beta', alpha=1, beta=1, shape=K, testval=1)
    comp_dists = [pm.Beta.dist(alpha=alpha[i], beta=beta[i]) for i in range(K)]
    obs = pm.Mixture('obs', w=w, comp_dists=comp_dists, observed=sample)
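For completeness, I’m sampling with the defaults, roughly like this (a sketch - the draw/tune counts are illustrative rather than exactly what produced the output below):

with beta_model:
    trace = pm.sample(2000, tune=1000)  # NUTS with the default jitter+adapt_diag init

pm.summary(trace)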
But it doesn’t seem to do a very good job with the data. The sampler reports:
The acceptance probability does not match the target. It is 0.8983302428206118, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.8817013927564947, but should be close to 0.8. Try to increase the number of tuning steps.
The gelman-rubin statistic is larger than 1.4 for some parameters. The sampler did not converge.
The estimated number of effective samples is smaller than 200 for some parameters.
And here is the summary of the trace - neither n_eff nor Rhat looks very happy.
           mean     sd  mc_error  hpd_2.5  hpd_97.5   n_eff   Rhat
w__0       0.05   0.05      0.00     0.00      0.12    1.03   6.12
w__1       0.28   0.28      0.03     0.00      0.58    1.00  28.05
w__2       0.45   0.11      0.01     0.30      0.59    1.02   8.16
w__3       0.22   0.12      0.01     0.08      0.36    1.01   8.92
alpha__0   3.66   3.08      0.29     0.01      8.45    1.12   3.36
alpha__1   0.86   0.55      0.02     0.00      1.65  218.44   1.03
alpha__2  11.05  10.20      1.01     0.81     23.98    1.01   9.37
alpha__3  13.84   7.70      0.75     4.15     25.03    1.04   5.37
beta__0    5.65   4.55      0.43     0.00     12.34    1.10   3.60
beta__1   14.91  13.60      1.35     0.02     31.27    1.01  10.28
beta__2   45.57  17.73      1.71    23.80     71.77    1.05   4.98
beta__3   36.42  27.16      2.67     5.90     73.01    1.02   7.13
I expect something is off with my parameterisation, but I’m a bit stuck on how to proceed. And this is actually better than what I get with my real dataset - with the real data, NUTS samples very slowly (around 2 s per iteration) and I get a “Bad initial energy” error if I use the jitter initialisation.
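In case it’s relevant, this is roughly how I’ve been starting the sampler on the real data to dodge the jitter (a sketch - 'adapt_diag' is just the one alternative init I’ve tried):

with beta_model:
    trace = pm.sample(init='adapt_diag')  # skip the jitter that triggers the bad initial energy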
Any advice you have would be very welcome!
Thank you.