Hi,
I am trying to implement the ordered-probit model described in a (draft) paper by Liddell & Kruschke (paper and source code: http://osf.io/53ce9).
The paper implements the model in R/JAGS. The model is not terribly complex, but I am having difficulty getting it to work in PyMC3; my attempt is in this notebook: https://github.com/JWarmenhoven/DBDA-python/blob/master/Notebooks/Ordinal%20Model_Kruschke_Liddell.ipynb
Because of the custom Theano Op used to calculate the thresholded cumulative-normal probabilities, I cannot sample using NUTS, and the Metropolis sampler does not converge.
Likelihood: I am not sure whether I am using the right distribution and feeding it the correct data. The JAGS model seems to use counts for each ordinal value, not percentages.
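A minimal sketch of what I mean by counts, with made-up numbers and a simple Dirichlet prior standing in for the ordered-probit probabilities:

```python
import numpy as np
import pymc3 as pm

# Made-up counts per ordinal level (K = 5); not the paper's data
counts = np.array([10, 25, 40, 15, 10])

with pm.Model() as toy_model:
    # Placeholder prior over the category probabilities
    p = pm.Dirichlet("p", a=np.ones(5))
    # The likelihood takes counts per category, not percentages
    y = pm.Multinomial("y", n=counts.sum(), p=p, observed=counts)
    trace = pm.sample()
```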
Any pointers?
After tweaking the sampling a little, I get results comparable to those of the R/JAGS code. The sampler warns about low effective sample size for some parameters, though; I will have to investigate a bit more.
Also, I wonder how I could get around the Theano as_op function. I am not sure whether the ordered logistic distribution would fit in this setting.
There should be a way to rewrite the Theano as_op into a Theano expression using switch and the normal CDF.
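Something along these lines, perhaps; a rough sketch with hypothetical fixed cutpoints and made-up counts, using tt.erf for the normal CDF so the whole graph stays differentiable and NUTS can be used:

```python
import numpy as np
import theano.tensor as tt
import pymc3 as pm

def normal_cdf(x, mu, sigma):
    # Standard-normal CDF expressed via erf; differentiable end to end
    return 0.5 * (1.0 + tt.erf((x - mu) / (sigma * tt.sqrt(2.0))))

# Hypothetical fixed cutpoints for K = 5 ordinal levels
cutpoints = np.array([1.5, 2.5, 3.5, 4.5])
counts = np.array([10, 25, 40, 15, 10])  # made-up counts per level

with pm.Model() as ordinal_model:
    mu = pm.Normal("mu", mu=3.0, sd=2.0)
    sigma = pm.HalfNormal("sigma", sd=2.0)
    cdf = normal_cdf(tt.as_tensor_variable(cutpoints), mu, sigma)
    # Thresholded cumulative-normal probabilities: first category gets the
    # lower tail, middle ones the differences of adjacent CDF values,
    # and the last category the upper tail
    pr = tt.concatenate([cdf[:1], cdf[1:] - cdf[:-1], 1.0 - cdf[-1:]])
    y = pm.Multinomial("y", n=counts.sum(), p=pr, observed=counts)
    trace = pm.sample()  # no as_op, so NUTS gets gradients through erf
```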
I am also having trouble implementing the model described in the Liddell & Kruschke paper.
I am trying a simple example with fixed cutpoints, but pm.sample() gives ValueError: Mass matrix contains zeros on the diagonal.
Using pm.sample_smc() seems to work OK.
I’ve described my approach in a notebook (repo).
Does anybody have any ideas on getting the model to work with pm.sample(), or thoughts on how to go about debugging the problem? Thanks for any advice.
Did you try the solution in this Frequently Asked Questions post? It is likely that p = tt.where(p <= 0, 1e-50, p) in your implementation breaks the gradient.
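The usual fix from the FAQ, sketched here on a generic log for illustration, is to sanitize the input before the problematic op, because switch/where still evaluates the gradient of the untaken branch:

```python
import theano.tensor as tt

x = tt.vector("x")

# Naive guard: the gradient of tt.log(x) is still evaluated where
# x <= 0, which yields nan
bad = tt.switch(tt.gt(x, 0.0), tt.log(x), 0.0)

# FAQ pattern: replace invalid inputs first, so no branch ever sees them
x_safe = tt.switch(tt.gt(x, 0.0), x, 1.0)
good = tt.switch(tt.gt(x, 0.0), tt.log(x_safe), 0.0)
```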
I receive the same error without the tt.where line. I hadn’t seen the solution you linked to; I’ll look into it, thanks for your help.
I am seeing a nan at this step:
```python
# make sure model logp and its gradients are finite
# (here, step is the NUTS step method and q0 the initial point as a raveled array)
logp, dlogp = step.integrator._logp_dlogp_func(q0)
print(logp)
step._logp_dlogp_func.array_to_dict(dlogp)
```

Output:

```
-147.5278053781309
{'sigma_log__': array(nan), 'mu': array(214.06774447)}
```
I’ll keep investigating what might be causing it.
My model is more or less a direct translation of the JAGS version. Although it gets similar results, I will need to look into optimizing it.
I think the problem is with calculating the gradient when there is an erf function in the model. Because SMC uses Metropolis under the hood, it doesn’t use the gradient and therefore samples fine. If you set your sampler to a non-gradient-based one, such as Slice, it should sample.
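For example, a minimal sketch on a toy model; with the real model you would use its own context instead:

```python
import pymc3 as pm

with pm.Model() as m:
    mu = pm.Normal("mu", 0.0, 1.0)
    # Slice needs no gradients, so models whose logp graph is not
    # differentiable (e.g. custom as_op likelihoods) can still sample
    trace = pm.sample(step=pm.Slice())
```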
I looked back into this recently and I think it was due to the use of np.Inf values in the cutpoints; when I replace them with small or large finite numbers, it seems to sample fine. I’ve updated the notebook.
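Roughly what that replacement looks like (hypothetical values; anything far outside the response scale behaves like ±infinity for the CDF):

```python
import numpy as np

inner = np.array([1.5, 2.5, 3.5, 4.5])  # hypothetical interior cutpoints

# Before: infinite outer cutpoints produce a non-finite gradient
# cutpoints = np.concatenate([[-np.inf], inner, [np.inf]])

# After: large finite stand-ins keep logp and its gradient finite
cutpoints = np.concatenate([[-100.0], inner, [100.0]])
```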
Hmm. It does now sample, but horribly.