Ordered probit model for categorical data

EDIT 1: This model has been brought up here before. Seems like the author of that post was using a custom Op for the normal CDF, but it should be possible to do this in vanilla theano/pymc3 since tt.erf exists.

EDIT 2: Based on the previous post, I have re-written the description and code to be more consistent with previous work. The error does not go away, but I get closer.

I’m trying to model ordinal data (e.g., Likert scale) with K possible ordinal values.

The ordered probit model approach from Kruschke (Chap. 23 in DBDA) assumes there is an underlying latent normal random variable with unknown mean \mu and scale \sigma, which is chopped into K intervals by K-1 thresholds \theta_1 < \theta_2 < \dots < \theta_{K-1}. To identify the model, the lowest and highest thresholds are fixed at \theta_1 \equiv 1.5 and \theta_{K-1} \equiv K - 0.5, and the remaining K-3 interior thresholds are estimated.
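For concreteness, here is a quick NumPy sketch (the function name is mine) of how the thresholds carve the latent normal's CDF into K category probabilities:

```python
import numpy as np
from scipy.special import erf

def ordered_probit_probs(mu, sigma, K):
    """Probabilities of the K ordinal categories for a latent Normal(mu, sigma)
    cut at the thresholds 1.5, 2.5, ..., K - 0.5 (padded with -inf/+inf)."""
    theta = np.concatenate([[-np.inf], np.linspace(1.5, K - 0.5, K - 1), [np.inf]])
    cdf = 0.5 * (1 + erf((theta - mu) / (sigma * np.sqrt(2))))  # normal CDF via erf
    return cdf[1:] - cdf[:-1]  # mass between consecutive thresholds

p = ordered_probit_probs(mu=4.0, sigma=2.0, K=7)
print(p.round(3), p.sum())  # 7 non-negative probabilities; the sum telescopes to 1
```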

Here is a minimal example:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model():

    K = 7 
    data = np.random.randint(K, size=100)

    thresholds = np.linspace(1.5, K - .5, K - 1)
    obsthresh = np.ma.asarray(thresholds)
    obsthresh[1:-1] = np.ma.masked

    theta = pm.Normal("theta", mu=thresholds, sigma=K, shape=K - 1, observed=obsthresh)
    theta = tt.concatenate([[-np.inf], theta, [np.inf]])

    mu = pm.Normal(name="mu", mu=(1 + K) / 2, sigma=K)
    sig = pm.HalfCauchy(name="sig", beta=1)

    _p = 0.5 * (1 + pm.math.erf((theta - mu) / (sig * tt.sqrt(2))))
    p = _p[1:] - _p[0:-1]
    pm.Deterministic(name="p", var=p)

    # likelihood for the observed ordinal data
    pm.Categorical(name='y', p=p, observed=data)

    trace = pm.sample(chains=2)  # adding `step=pm.Slice()` here fixes "bad initial energy"

This works nicely with slice sampling but not with NUTS, and I don’t know why. Two possibilities come to mind:

  1. There is an erf in the model.
  2. The lack of a monotonicity constraint on the middle thresholds.

I can’t figure out if either or both of these are real problems. Any suggestions?
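Regarding (2), a quick pure-NumPy sanity check (function name is mine) shows that a non-monotone draw of the middle thresholds, which the unconstrained prior permits, turns one of the interval "probabilities" negative. That makes the Categorical log-likelihood undefined, which is exactly the kind of thing that produces "bad initial energy" under NUTS:

```python
import numpy as np
from scipy.special import erf

def interval_probs(theta_inner, mu=4.0, sigma=2.0):
    """Category probabilities implied by a given vector of interior thresholds."""
    theta = np.concatenate([[-np.inf], theta_inner, [np.inf]])
    cdf = 0.5 * (1 + erf((theta - mu) / (sigma * np.sqrt(2))))
    return cdf[1:] - cdf[:-1]

# Monotone thresholds: all probabilities are non-negative.
ok = interval_probs(np.array([1.5, 2.5, 3.5, 4.5, 5.5, 6.5]))
print((ok >= 0).all())

# A non-monotone draw (4.0 followed by 3.0): the CDF differences are no longer
# increasing, so one "probability" comes out negative.
bad = interval_probs(np.array([1.5, 4.0, 3.0, 4.5, 5.5, 6.5]))
print((bad < 0).any())
```

If that is indeed the culprit, one possible fix (which I have not verified on this model) would be to put the prior only on the K - 3 interior thresholds and declare them with pymc3's `ordered` transform (`pm.distributions.transforms.ordered`), which enforces monotonicity by construction instead of relying on the masked-array trick.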