I’m still a relatively new user of PyMC
and was interested in fitting a cumulative ordinal probit model with the fixed threshold parameterisation popularised by John Kruschke in Doing Bayesian Data Analysis. For K ordinal categories the model has K + 1 parameters (K - 1 thresholds, plus the latent scale mean and standard deviation), but only K - 1 of them are identifiable, so two must be pinned. The classic approach is to fix the location and scale at 0 and 1 respectively (i.e. use the standard normal or logistic CDF) and estimate all the thresholds, whereas Kruschke fixes the first and last thresholds and estimates the latent scale parameters directly.
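For concreteness, here is a small NumPy/SciPy sketch of how the category probabilities fall out of the cutpoints plus a latent mean and SD under the probit link. The K = 5 numbers are just made up for illustration, and the 1.5 / K - 0.5 endpoints follow the convention Kruschke uses in the book:

import numpy as np
from scipy.stats import norm

K = 5                                    # number of ordinal categories (made-up example)
L, U = 1.5, K - 0.5                      # first and last thresholds, fixed a la Kruschke
cutpoints = np.array([L, 2.4, 3.1, U])   # K - 1 thresholds; only the inner ones are free
mu, sigma = 3.0, 1.2                     # latent mean and SD, estimated in Kruschke's version

# P(y = k) is the normal mass between consecutive thresholds
cdf = norm.cdf((cutpoints - mu) / sigma)
probs = np.diff(np.concatenate([[0.0], cdf, [1.0]]))
print(probs, probs.sum())                # K probabilities that sum to 1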
It seems this model parameterisation has been discussed a bit in the past (e.g. see here or here), but as far as my Googling could tell there was no clear resolution.
I implemented a version that I had ported to Stan a while ago (edit: the simplex trick was something Bob Carpenter wrote out), and it seems to be working quite nicely (if sampling slowly), although I haven't tested it extensively yet. I'm posting it here in case it's of interest; any feedback or help with making it cleaner would be welcome, and I can do a more general comparison later on.
import pymc as pm
import aesara.tensor as at
pm_ordinal = pm.Model()
with pm_ordinal:
    # `data` is a dictionary of observed data
    # Lower fixed threshold
    L = data["L"]
    # Upper fixed threshold
    U = data["U"]
    # Number of ordinal categories
    K = data["K"]
    # Number of free gaps between L and U: the K - 1 cutpoints include the
    # two fixed endpoints, which leaves K - 2 increments for the simplex
    J = K - 2
    # Ordinal data, shifted to start at 0
    y = data["y"] - 1
    # Latent scale mean
    mu = pm.Normal("mu")
    # Latent scale standard deviation
    sigma = pm.HalfNormal("sigma", 5)
    # Represent the free threshold increments as a simplex (Bob Carpenter's trick)
    theta_raw = pm.Uniform("theta_raw", shape=(J,), transform=pm.distributions.transforms.simplex)
    # Cumulative sums of the simplex give the free thresholds, scaled so that
    # the last one lands exactly on U
    theta_partial = pm.Deterministic("theta_partial", L + (U - L) * at.extra_ops.cumsum(theta_raw))
    # Prepend L to get the full set of K - 1 cutpoints -- no idea if this is efficient or not
    theta_star = pm.Deterministic("theta_star", pm.math.concatenate(([L], theta_partial)))
    y_lik = pm.OrderedProbit("y_lik", cutpoints=theta_star, eta=mu, sigma=sigma, observed=y)
    trace = pm.sample(draws=1000, random_seed=1234)
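In case it helps, this is roughly how I'd put together the `data` dictionary for a simulated example and then eyeball the fit with ArviZ. The simulation values are made up, the responses are coded 1..K as the model above expects, and the endpoints are fixed at 1.5 and K - 0.5 per Kruschke:

import numpy as np
import arviz as az

rng = np.random.default_rng(1234)

K = 5                                        # e.g. a 1-5 Likert item
true_mu, true_sigma = 3.2, 1.5               # made-up latent parameters
true_cuts = np.array([1.5, 2.6, 3.4, 4.5])   # first and last fixed at 1.5 and K - 0.5

# Simulate ordinal responses by thresholding latent normal draws
latent = rng.normal(true_mu, true_sigma, size=500)
y_sim = np.digitize(latent, true_cuts) + 1   # coded 1..K

data = {"L": 1.5, "U": K - 0.5, "K": K, "y": y_sim}

# After sampling the model above, check that mu, sigma and the free
# thresholds are recovered reasonably well
az.summary(trace, var_names=["mu", "sigma", "theta_star"])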