How to setup a Censored OrderedLogistic model?

I’m trying to setup a model for an outcome that is ordinal and censored. I get an error on running sample():

IndexError: indices must be an integer array

I have made a simple example (below). Am I doing something wrong either conceptually or in my code?

import pymc as pm
import numpy as np

rng = np.random.default_rng()

# simulate data
N = 100
stages = np.arange(5)
probs = [0.10, 0.05, 0.25, 0.4, 0.2]
simdat = rng.choice(a=stages, p=probs, size=N)
censored = rng.binomial(n=1, p=0.25, size=N)

# model setup
upper = [simdat[i] if censored[i] == 1 else max(simdat) for i in np.arange(N)]
N_cutpoints = len(stages) - 1
with pm.Model() as model:
    cutpoints = pm.Normal("cutpoints", 
                          mu=np.arange(N_cutpoints) - 2.5, 
                          sigma=2, 
                          transform=pm.distributions.transforms.ordered,
                          shape=N_cutpoints)
    
    ol_dist = pm.OrderedLogistic.dist(eta=0, cutpoints=cutpoints)
    y_obs = pm.Censored("y_obs", 
                        ol_dist, 
                        lower=None,
                        upper=upper,
                        observed=simdat)

# sample
with model:
    idata = pm.sample()

You’re seeing a bug because Censored makes the data be float64 instead of int64, and then the logprob of Categorical (which the OrderedLogistic is) fails because it only works with integers.

However, the larger problem is that Categorical doesn’t have a logcdf method implemented yet, which Censored needs to work.

Needed changes: Allow censoring Categorical distributions by ricardoV94 · Pull Request #7662 · pymc-devs/pymc · GitHub

Oh, okay. Thanks. I suppose there isn’t an explicit place for it, so I’ll just say here that you do great work and are helpful and I really appreciate it.

2 Likes