Implementation of Multivariate Hypergeometric distribution not sampling correctly

I just figured out what the problem was. When I set up the model, I chose to represent each component of K as a univariate BetaBinomial:

    import numpy as np
    import pymc3 as pm

    # random_mv_hypergeom and MvHypergeometric are the custom multivariate
    # hypergeometric implementations defined earlier in this thread
    if __name__ == '__main__':

        trueK = np.asarray([40, 60])    # true counts of each category
        N = trueK.sum()                 # total population size

        fn = 0.9                        # fraction of the population that gets sampled
        n = (N*fn).astype('int32')
        obs = random_mv_hypergeom(K=trueK, n=n)[0]
        p = obs/n                       # observed proportions

        b = 10
        a = (1+(b-2)*p)/(1-p)           # places the mode of Beta(a, b) at p

        with pm.Model() as ex:
            K = pm.BetaBinomial('K', n=N, alpha=a, beta=b,
                                testval=(N*p).astype('int32'), shape=p.shape)
            p = pm.Deterministic('p', K/N)
            hyp = MvHypergeometric(name='hypergeom', K=K, n=n, observed=obs)

            tex1 = pm.sample(draws=10000, chains=2)
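For reference, one quick way to see the problem is to check whether the draws of K in the trace actually sum to N (a minimal sketch, assuming the model above has been sampled into tex1):

    # with independent BetaBinomial priors, nothing forces each draw of K to sum to N
    K_draws = tex1['K']                  # shape (draws * chains, 2)
    row_sums = K_draws.sum(axis=-1)
    print((row_sums == N).mean())        # fraction of draws whose components sum to N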

It is important that the components of K sum to N, but since the BetaBinomials are independent of one another, there is no guarantee of that. Changing the specification so that K has a Multinomial prior solved the issue:

    with pm.Model() as ex:
        pp = pm.Dirichlet('p', a=N*np.repeat(1/len(trueK), repeats=len(trueK)),
                          shape=trueK.shape,
                          testval=obs/n)   # start at the observed proportions
        KK = pm.Multinomial('K', n=N, p=pp, shape=trueK.shape)
        hyp = MvHypergeometric(name='hypergeom', K=KK, n=n,
                               observed=obs)

        tex2 = pm.sample(draws=10000, chains=2)   # same sampling settings as before
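The difference between the two priors is easy to see with plain numpy draws as well. This is just an illustrative sketch (the N and p below are made-up values, not the model's variables):

    import numpy as np

    rng = np.random.default_rng(0)
    N, p = 100, np.array([0.4, 0.6])

    # independent Binomial draws: each component varies on its own,
    # so the totals only hover around N
    indep = rng.binomial(N, p, size=(5, 2))
    print(indep.sum(axis=1))      # varies from draw to draw

    # Multinomial draws: the N trials are shared across the components,
    # so every draw sums to exactly N
    joint = rng.multinomial(N, p, size=5)
    print(joint.sum(axis=1))      # always N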

There are still warnings about divergences and the number of effective samples, and the traces for the K variable don’t look very well behaved, but that seems outside the scope of this question. I’ll start another question in the forum about it.

Anyway, thanks for your attention and the tips!
