Kruschke-style fixed threshold cumulative ordinal probit model

I’m still a relatively new user of PyMC and was interested in fitting a cumulative ordinal probit model, with the fixed threshold parameterisation popularised by John Kruschke in Doing Bayesian Data Analysis. Namely, for K ordinal categories the model has K + 1 parameters (K - 1 thresholds, plus the latent scale mean and standard deviation), two of which must be fixed for identifiability. The classic approach is to fix the location and scale at 0 and 1 respectively (using the standard normal or logistic CDF) and estimate all the thresholds, whereas Kruschke fixes the first and last thresholds and estimates the latent location and scale directly.
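For concreteness, the likelihood I have in mind (standard cumulative probit notation; my write-up, not Kruschke's exact notation) is

$$
\Pr(y = k) = \Phi\left(\frac{\theta_k - \mu}{\sigma}\right) - \Phi\left(\frac{\theta_{k-1} - \mu}{\sigma}\right), \qquad k = 1, \dots, K,
$$

with $\theta_0 = -\infty$ and $\theta_K = +\infty$, and with $\theta_1 = L$ and $\theta_{K-1} = U$ fixed in Kruschke's scheme.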

It seems this model parameterisation has been discussed a bit in the past (e.g. see here or here), but with no clear resolution as far as my Googling could tell.

I implemented a version that I had ported to Stan a while ago (edit: the simplex trick was something Bob Carpenter wrote out), and it seems to be working quite nicely (if sampling slowly), although I haven’t tested it extensively yet. Posting it here in case it’s of interest; any feedback or help making it cleaner would be welcome. I can do a more general comparison later on too.

import pymc as pm
import aesara.tensor as at

pm_ordinal = pm.Model()

with pm_ordinal:
    # `data` is a dictionary of observed data
    # Lower fixed threshold
    L = data["L"]
    # Upper fixed threshold
    U = data["U"]
    # Number of ordinal categories
    K = data["K"]
    # Number of free threshold increments (two of the K - 1 thresholds are fixed)
    J = K - 2
    # Ordinal data
    y = data["y"] - 1

    # Latent scale mean
    mu = pm.Normal("mu")
    
    # Latent scale standard deviation
    sigma = pm.HalfNormal("sigma", 5)

    # Represent the free thresholds as a simplex 
    theta_raw = pm.Uniform("theta_raw", shape=(J,), transform=pm.distributions.transforms.simplex)
    # Derive the estimated thresholds from theta_raw, where theta_partial[-1] == U
    theta_partial = pm.Deterministic("theta_partial", L + (U - L) * at.extra_ops.cumsum(theta_raw))
    # Concatenate L to theta_partial -- no idea if this is efficient or not
    theta_star = pm.Deterministic("theta_star", pm.math.concatenate( ([L], theta_partial)))

    y_lik = pm.OrderedProbit("y_lik", cutpoints=theta_star, eta=mu, sigma=sigma, observed=y)
    trace = pm.sample(draws=1000, random_seed=1234)
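In case it helps anyone trying this out, here is one entirely made-up way to build the `data` dict, using Kruschke’s usual convention of fixing the outer thresholds at 1.5 and K - 0.5 (the simulated values are just for illustration):

import numpy as np

rng = np.random.default_rng(1234)
K = 5
# Simulate latent normal responses and cut them at some "true" thresholds
latent = rng.normal(loc=3.0, scale=1.2, size=200)
true_thresholds = np.array([1.5, 2.5, 3.5, 4.5])
y_obs = np.digitize(latent, true_thresholds) + 1  # ordinal categories 1..K

data = {"L": 1.5, "U": K - 0.5, "K": K, "y": y_obs}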

Just realised that @drbenvincent has a similar approach here, which I missed.

Thanks for the contribution and for tagging me. We have plans to write up a fully worked example of ordinal regression here: Analysing ordinal data in PyMC · Issue #277 · pymc-devs/pymc-examples · GitHub. We are just waiting on a few things before we can make that happen.


Cool! You could take advantage of the built-in support for coords and dims, as sketched below. I don’t think there’s anything inefficient about the concatenate, but others would know more.
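Something like this, assuming the model above (the dim name "threshold" is just my choice):

import numpy as np
import pymc as pm

J = K - 2  # number of free threshold increments, as in the model above
coords = {"threshold": np.arange(J)}

with pm.Model(coords=coords) as pm_ordinal:
    # dims="threshold" labels this dimension in the resulting InferenceData
    theta_raw = pm.Uniform(
        "theta_raw",
        dims="threshold",
        transform=pm.distributions.transforms.simplex,
    )
    # ... rest of the model as before

For the sampling speed part, there are a couple of other NUTS implementations you might try: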

Using the Jax backend, you can try NumPyro’s sampler:

import pymc.sampling_jax

with pm_ordinal:
    idata = pm.sampling_jax.sample_numpyro_nuts(chains=4, tune=1000, draws=1000, target_accept=0.8)

or BlackJAX:

import pymc.sampling_jax

with pm_ordinal:
    idata = pm.sampling_jax.sample_blackjax_nuts(chains=4, tune=1000, draws=1000, target_accept=0.8)

Using the Numba backend, there is nutpie:

import nutpie

# nutpie compiles the model itself, so no model context manager is needed
compiled_model = nutpie.compile_pymc_model(pm_ordinal)
idata = nutpie.sample(compiled_model, chains=4, tune=1000, draws=1000, target_accept=0.8)

Great, thanks @bwengals!