I’ve been having a specific problem with my model - it appears to generate appropriate posteriors but only if I remove the final line of the model. I’ve reduced it down to the following minimum working example:
import numpy as np
import pymc3 as pm
sparsity=10 #not zero
beta=np.ones([3,4])/4 #input for dirichlet
with pm.Model() as mod:
theta=pm.Dirichlet('theta',beta/sparsity,shape=(3,4))
# transition=pm.Multinomial('transition',1,theta,shape=(3,4))
step=pm.Metropolis()
trace = pm.sample(draws=5000,step=step)
_ = pm.traceplot(trace)
pm.summary(trace)
pm.plot_posterior(trace)
Essentially, the idea is that we have a dirichlet-multinomial system with a sparsity parameter that determines how the multinomial distribution behaves - a low sparsity results in the multinomial looking uniform.
Running the code with the Multinomial line commented out works as expected, giving extreme values of theta (when the sparsity parameter is high):
I’ve also tried reducing it down to a one dimensional model (to then possibly loop over), but I get “The broadcastable pattern of the input is incorrect for this op”. Similar to this: https://github.com/pymc-devs/pymc3/issues/1304 but .flatten() does not help either.
The following should work for one dimension (work in terms of you can initiate the model, but the sampling is still stuck with Metropolis):
sparsity=1 #not zero
beta=np.ones(4,) #input for dirichlet
with pm.Model() as mod:
theta=pm.Dirichlet('theta',beta/sparsity)
transition=pm.Multinomial('transition',10,theta,shape=4)
The failed of Metropolis in high dimension is somehow expected, and in your model if Multinomial contains observed value it should sample using NUTS without problem. If you just want to simulate data, it might better just use scipy.stats random
The transition is all zeros, which is wrong. I will report the bug to Github, meanwhile you can bypass this error by providing a valid start value. For example you can try the code below:
import numpy as np
import pymc3 as pm
import scipy.stats as st
sparsity = 1 #not zero
beta = np.ones([3,4])/4 #input for dirichlet
n = 10
testval = np.asarray([st.multinomial.rvs(p=a, n=n) for a in beta])
with pm.Model() as mod:
theta=pm.Dirichlet('theta', beta/sparsity, shape=(3, 4))
transition=pm.Multinomial('transition', n, theta, shape=(3, 4), testval=testval)
trace = pm.sample(5000, step=pm.Metropolis())
pm.traceplot(trace);
trace['transition']
Keep in mind that Metropolis acceptance is still quite low in this situation. And I observed that if the n in Multinominal is 1 the trace hardly move when the sparsity is high.
In case you want to speed up you inference a bit, you can use a Dirichlet Multinomial, where the intermediate probabilities (in your case theta) are marginalized out. I got my implementation running a few days ago. You can find it here: Gist