Very simple question. I am trying to model long tailed Beta-Geometric synthetic data which I create as:
from matplotlib import pylab as plt import numpy as np import pandas as pd import pymc3 as pm from pymc3.variational.callbacks import CheckParametersConvergence %matplotlib inline population_size = 100000 a, b = 0.33, 0.85 be = np.random.beta(a, b, size=population_size) geo =  for i in be: geo.append(np.random.geometric(p=(1-i))) geo = np.array(geo) print("Beta mean:", np.mean(be)) print("Syntethic population data follows a Beta distribution with a=%s; b=%s and mean=%s" % (a, b, np.mean(be))) plt.grid() plt.title('Observed Geometric') _ = plt.hist(geo, bins=20)
And then I try to recover my beta parameters using Metropolis:
import pymc3 as pm model = pm.Model() with model: beta_alpha = pm.Uniform('beta_alpha', 0.0001, 1) beta_beta = pm.Uniform('beta_beta', 0.0001, 1) beta = pm.Beta('beta', alpha=beta_alpha, beta=beta_beta) renewals = pm.Geometric('renewals', p=(1-beta), observed=geo) trace = pm.sample( 100000, step=pm.Metropolis(), njobs=4 ) pm.traceplot(trace)
The issue here is that the parameters are recovered incorrectly and also the mean of the beta seems to be wrong
Probably the reason is that my distribution is very very long tailed and this messes up with the simulation. However maybe you can you point me in some direction to improve this?
Thanks in advance for the help!