Dirichlet-Multinomial Model does not appear to be sampling at all

Hi all!

I’ve been having a specific problem with my model - it appears to generate appropriate posteriors but only if I remove the final line of the model. I’ve reduced it down to the following minimum working example:


import numpy as np
import pymc3 as pm

sparsity=10 #not zero
beta=np.ones([3,4])/4 #input for dirichlet

with pm.Model() as mod:
    theta=pm.Dirichlet('theta',beta/sparsity,shape=(3,4))   
#    transition=pm.Multinomial('transition',1,theta,shape=(3,4))

    step=pm.Metropolis()
    trace = pm.sample(draws=5000,step=step)
_ = pm.traceplot(trace)
pm.summary(trace)
pm.plot_posterior(trace)

Essentially, the idea is that we have a dirichlet-multinomial system with a sparsity parameter that determines how the multinomial distribution behaves - a low sparsity results in the multinomial looking uniform.

Running the code with the Multinomial line commented out works as expected, giving extreme values of theta (when the sparsity parameter is high):

But uncommenting the multinomial results in the following MCMC plot:

I’ve tried changing samplers, parameters etc but with no luck. Any tips on this would be much appreciated.

Thanks!

I’ve also tried reducing it down to a one dimensional model (to then possibly loop over), but I get “The broadcastable pattern of the input is incorrect for this op”. Similar to this: https://github.com/pymc-devs/pymc3/issues/1304 but .flatten() does not help either.

The following should work for one dimension (work in terms of you can initiate the model, but the sampling is still stuck with Metropolis):

sparsity=1 #not zero
beta=np.ones(4,) #input for dirichlet

with pm.Model() as mod:
    theta=pm.Dirichlet('theta',beta/sparsity)
    transition=pm.Multinomial('transition',10,theta,shape=4)

The failed of Metropolis in high dimension is somehow expected, and in your model if Multinomial contains observed value it should sample using NUTS without problem. If you just want to simulate data, it might better just use scipy.stats random

Actually, upon a closer look it is actually a bug, if you print the model test_point:

with pm.Model() as mod:
    theta=pm.Dirichlet('theta',beta/sparsity)
    transition=pm.Multinomial('transition',1,theta,shape=4)
mod.test_point

which returns:

Out[14]: 
{'theta_stickbreaking__': array([ 0.,  0.,  0.]),
 'transition': array([0, 0, 0, 0])}

The transition is all zeros, which is wrong. I will report the bug to Github, meanwhile you can bypass this error by providing a valid start value. For example you can try the code below:

import numpy as np
import pymc3 as pm
import scipy.stats as st

sparsity = 1 #not zero
beta = np.ones([3,4])/4 #input for dirichlet
n = 10
testval = np.asarray([st.multinomial.rvs(p=a, n=n) for a in beta])
with pm.Model() as mod:
    theta=pm.Dirichlet('theta', beta/sparsity, shape=(3, 4))   
    transition=pm.Multinomial('transition', n, theta, shape=(3, 4), testval=testval)
    trace = pm.sample(5000, step=pm.Metropolis())
    
pm.traceplot(trace);
trace['transition']

Keep in mind that Metropolis acceptance is still quite low in this situation. And I observed that if the n in Multinominal is 1 the trace hardly move when the sparsity is high.

1 Like

That’s incredibly useful - thanks very much!

In case you want to speed up you inference a bit, you can use a Dirichlet Multinomial, where the intermediate probabilities (in your case theta) are marginalized out. I got my implementation running a few days ago. You can find it here: Gist

3 Likes

Ah thanks for that - I’ve been looking for something along those lines.

And the link doesn’t work…

Unfortunately, I’m unable to edit the message, here you go :slight_smile:

2 Likes