I just figured out what the problem was. When I set up the model, I chose to represent each `K` as a univariate `BetaBinomial`:
```python
import numpy as np
import pymc3 as pm

# random_mv_hypergeom and MvHypergeometric are defined earlier in the thread

if __name__ == '__main__':
    trueK = np.asarray([40, 60])
    N = trueK.sum()
    fn = 0.9
    n = (N*fn).astype('int32')
    obs = random_mv_hypergeom(K=trueK, n=n)[0]
    p = obs/n
    b = 10
    a = (1 + (b - 2)*p)/(1 - p)

    with pm.Model() as ex:
        K = pm.BetaBinomial('K', n=N, alpha=a, beta=b,
                            testval=(N*p).astype('int32'), shape=p.shape)
        p = pm.Deterministic('p', K/N)
        hyp = MvHypergeometric(name='hypergeom', K=K, n=n, observed=obs)
        tex1 = pm.sample(draws=10000, chains=2)  # pm.sample takes `chains`, not `nchains`
```
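A quick way to see why this specification fails: independent beta-binomial draws are not constrained to sum to `N`. A minimal numpy sketch (the `Beta` parameters here are illustrative, not the `a`, `b` computed above):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100

# 1000 pairs of independent beta-binomial draws:
# p_i ~ Beta(a_i, b_i), then k_i ~ Binomial(N, p_i)
p = rng.beta(a=[4.0, 6.0], b=[6.0, 4.0], size=(1000, 2))
k = rng.binomial(N, p)

# Nothing ties the two components together, so their sums wander around N
sums = k.sum(axis=1)
print(sums.min(), sums.max())
```

The sums scatter widely around `N` instead of hitting it exactly, which is exactly the inconsistency the multivariate hypergeometric likelihood cannot tolerate.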
It is important that all `K` sum up to `N`, but since each `BetaBinomial` is independent of the others, there is no guarantee of that. Changing the specification so that `K` has a `Multinomial` prior solved the issue:
```python
with pm.Model() as ex:
    pp = pm.Dirichlet('p', a=N*np.repeat(1/len(trueK), repeats=len(trueK)),
                      shape=trueK.shape,
                      testval=obs/n)  # testval must lie on the simplex, so normalize the counts
    KK = pm.Multinomial('K', n=N, p=pp, shape=trueK.shape)
    hyp = MvHypergeometric(name='hypergeom', K=KK, n=n, observed=obs)
```
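By contrast, a multinomial draw sums to `n` by construction, which is why the Dirichlet-multinomial specification keeps the components of `K` consistent. A quick numpy check (the Dirichlet concentration values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100

# Dirichlet prior on the category probabilities, then one multinomial draw each
p = rng.dirichlet(alpha=[50.0, 50.0], size=1000)
K = np.array([rng.multinomial(N, pi) for pi in p])

# Every sample sums to exactly N
print(np.unique(K.sum(axis=1)))
```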
There are warnings about divergences and the number of effective samples, and the traces for the `K` variable don't look very well behaved, but that seems to be out of this question's scope. I'll start another question in the forum regarding this problem.
Anyway, thanks for the attention and the tips!