Ordered probit model for ordinal data

Hi,

I am trying to implement an ordered-probit model described in a (draft) paper by Liddell & Kruschke (paper and source code: http://osf.io/53ce9).

The paper implements the model in R/JAGS, and it is not terribly complex, but I am having difficulty getting it to work in PyMC3. My attempt is here: https://github.com/JWarmenhoven/DBDA-python/blob/master/Notebooks/Ordinal%20Model_Kruschke_Liddell.ipynb

Because of the custom Theano Op used to calculate the thresholded cumulative normal probabilities, I cannot sample with NUTS, and the Metropolis sampler does not converge.
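For context, here is a simplified sketch of what such a black-box Op looks like (the names and signature are illustrative, not the notebook's exact code). Because as_op wraps arbitrary Python, Theano cannot differentiate through it, which is what rules out NUTS:

```python
import numpy as np
import theano.tensor as tt
from theano.compile.ops import as_op
from scipy.stats import norm

# Illustrative black-box Op: each ordinal category's probability is the
# difference of normal CDFs evaluated at successive thresholds. Theano
# cannot compute a symbolic gradient through as_op, so gradient-based
# samplers like NUTS are unavailable.
@as_op(itypes=[tt.dvector, tt.dscalar, tt.dscalar], otypes=[tt.dvector])
def outcome_probabilities(theta, mu, sigma):
    cdf = norm.cdf(theta, loc=mu, scale=sigma)
    return np.concatenate(([cdf[0]], np.diff(cdf), [1.0 - cdf[-1]]))
```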

Likelihood: I am not sure whether I am using the right distribution or feeding it the correct data. The JAGS model seems to use counts for each ordinal value, not percentages.

Any pointers?

Try the recently implemented ordered logistic distribution, or build something similar using the ordered transform.
http://docs.pymc.io/api/distributions/discrete.html#pymc3.distributions.discrete.OrderedLogistic
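A minimal sketch of how that could look (the priors, category count, and data here are my own placeholders, not taken from the paper):

```python
import numpy as np
import pymc3 as pm

K = 5                                   # number of ordinal categories (assumed)
y = np.random.randint(0, K, size=100)   # stand-in responses, coded 0..K-1

with pm.Model() as model:
    # K-1 cutpoints, kept monotone by the ordered transform
    cutpoints = pm.Normal(
        'cutpoints', mu=0., sd=10., shape=K - 1,
        transform=pm.distributions.transforms.ordered,
        testval=np.arange(K - 1) - (K - 2) / 2.,
    )
    eta = pm.Normal('eta', mu=0., sd=10.)   # latent location (placeholder)
    pm.OrderedLogistic('y_obs', eta=eta, cutpoints=cutpoints, observed=y)
    trace = pm.sample(1000, tune=1000)
```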

This notebook might also give some inspiration: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/Ports/brms_monotonic_compare.ipynb

After tweaking the sampling a little, I get results comparable to those of the R/JAGS code. The sampler warns about a low effective sample size for some parameters, though, so I will have to investigate further.

Also, I wonder how I could get around the Theano as_op function. I am not sure the ordered logistic distribution would fit in this setting.

There should be a way to rewrite the Theano as_op into a symbolic Theano function using switch and the normal CDF.
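A sketch of that idea (my own construction, not tested against the notebook): express the normal CDF through tt.erf and build the category probabilities symbolically, so the whole graph stays differentiable:

```python
import theano.tensor as tt

def normal_cdf(x, mu, sigma):
    # normal CDF written with erf, which Theano can differentiate
    return 0.5 * (1.0 + tt.erf((x - mu) / (sigma * tt.sqrt(2.0))))

def category_probs(theta, mu, sigma):
    # theta: vector of K-1 interior thresholds; the CDF at the lower edge
    # of the first category is 0 and at the upper edge of the last is 1
    cdf = normal_cdf(theta, mu, sigma)
    upper = tt.concatenate([cdf, tt.ones(1)])
    lower = tt.concatenate([tt.zeros(1), cdf])
    p = upper - lower
    # clamp tiny negative values from round-off; note that a hard switch
    # like this has zero gradient in the clamped branch
    return tt.switch(tt.le(p, 0.0), 1e-12, p)
```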

I am also having trouble implementing the model described in the Liddell & Kruschke paper.

I am trying a simple example with fixed cutpoints, but pm.sample() fails with ValueError: Mass matrix contains zeros on the diagonal.

Using pm.sample_smc() seems to work OK.

I’ve described my approach in a notebook (repo).

Does anybody have any ideas on getting the model to work with pm.sample(), or thoughts on how to go about debugging the problem? Thanks for any advice.

Did you try the solution in this Frequently Asked Questions post? Most likely the p = tt.where(p <= 0, 1e-50, p) in your implementation breaks the gradient.
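A tiny demonstration of the failure mode I mean (my own sketch): the floor value is a constant, so the gradient is exactly zero wherever the condition fires, which could be consistent with the mass matrix gaining zeros on its diagonal:

```python
import numpy as np
import theano
import theano.tensor as tt

p = tt.dvector('p')
floored = tt.where(p <= 0, 1e-50, p)   # tt.where is an alias of tt.switch
g = theano.grad(tt.log(floored).sum(), p)
f = theano.function([p], g)
print(f(np.array([0.5, 0.0])))  # -> [2., 0.]: zero gradient where the floor is active
```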

I receive the same error without the tt.where line. I hadn't seen the solution you linked to; I'll look into it. Thanks for your help.

I am seeing a nan at this step:

```python
# make sure model logp and its gradients are finite
logp, dlogp = step.integrator._logp_dlogp_func(q0)
print(logp)
step._logp_dlogp_func.array_to_dict(dlogp)
```

Output:

```
-147.5278053781309
{'sigma_log__': array(nan), 'mu': array(214.06774447)}
```

I’ll keep investigating what might be causing it.

My model is more or less a direct translation of the JAGS version. Although it gets similar results, I will need to look into optimizing it.

I think the problem is with calculating the gradient when there is an erf function in the model. Because SMC uses Metropolis under the hood, it doesn't need the gradient and therefore samples fine. If you switch to a non-gradient-based sampler, such as Slice, it should sample.
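For example (assuming model is the pm.Model from the notebook, with import pymc3 as pm):

```python
with model:
    # Slice is gradient-free, so nan/undefined gradients in the logp
    # (e.g. from erf at extreme cutpoints) don't block sampling
    trace = pm.sample(2000, step=pm.Slice())
```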

I looked back into this recently, and I think it was due to the use of np.Inf in the cutpoints: when I replace them with very small or very large finite numbers, it seems to sample fine. I've updated the notebook.
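Concretely, something like this (the interior values and sentinel magnitudes are illustrative, not the notebook's exact numbers):

```python
import numpy as np

interior = np.array([1.5, 2.5, 3.5, 4.5])  # illustrative fixed interior cutpoints

# Before: infinite sentinels made the normal CDF gradient return nan
# cutpoints = np.concatenate(([-np.inf], interior, [np.inf]))

# After: large-but-finite sentinels keep logp and dlogp finite
cutpoints = np.concatenate(([-100.0], interior, [100.0]))
```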

Hmm. It does now sample, but horribly.