Happy New Year, PyMC community!
Over the Christmas break I’ve been trying to learn some item response theory modelling. In particular, I’m trying to implement a graded response model for some Likert-scale rating data, but I’m running into shape errors when sampling, even though the prior predictive check runs fine and has the dimensions I expect.
Graded response models seem fiddly, as each item has one fewer “difficulty” parameters than response categories; these represent the cut points where respondents switch from, say, a “2” response to a “3”. The code below simulates some data matching my use case.
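(For reference, the model I’m aiming for is, as I understand it, P(response_ij ≥ k) = logit⁻¹(α_j · θ_i − κ_jk), with ordered cut points κ_j1 < … < κ_j,K−1 for each item j.)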
import aesara.tensor as at
import numpy as np
import pandas as pd
import pymc as pm
# Simulate some random data
n_raters = 60
n_items = 40
categories = 5
data = np.random.randint(0, high=categories,
                         size=(n_raters, n_items))
# Prepare dataset
data = (pd.DataFrame(data)
        .rename(index='rater_{}'.format, columns='item_{}'.format)
        .rename_axis(index='rater')
        .reset_index()
        .melt(id_vars='rater', var_name='item', value_name='response')
        .assign(rater=lambda x: pd.Categorical(x['rater']),
                item=lambda x: pd.Categorical(x['item']))
        )
# Set up coords
c = {'rater': data['rater'].cat.categories,
     'item': data['item'].cat.categories,
     'rater_c': data['rater'].cat.codes,
     'item_c': data['item'].cat.codes,
     'K': np.arange(data['response'].nunique() - 1)
     }
# Model
with pm.Model(coords=c) as graded_response:
    # Latent ability, with a hyperprior on its scale
    σ_θ = pm.HalfNormal('σ_θ', sigma=3)
    θ = pm.Normal('θ', mu=0, sigma=σ_θ, dims='rater')

    # Discrimination parameter, with a hyperprior on its scale
    σ_α = pm.HalfNormal('σ_α', sigma=3)
    α = pm.TruncatedNormal('α', mu=1, sigma=σ_α, lower=0.001, dims='item')

    # Cut points, ordered, with a dimension for each item
    cut = pm.Normal('cut', mu=np.linspace(-2, 2, len(c['K'])),
                    sigma=5,
                    transform=pm.distributions.transforms.ordered,
                    dims=('item', 'K'))

    # Logit predictor: discrimination × ability for each observation
    eta = pm.Deterministic('eta', α[c['item_c']] * θ[c['rater_c']])

    # Ordered-logistic likelihood over the observed responses
    pm.OrderedLogistic('ll', eta=eta, cutpoints=cut[c['item_c']],
                       observed=data['response'].values)

    # Sample prior predictive - works ok!
    # prpr = pm.sample_prior_predictive()

    # InferenceButton™
    posterior = pm.sample()
When running the above code I get:
ValueError: Input dimension mismatch. One other input has shape[1] = 4, but input[5].shape[1] = 40.
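To see where this mismatch arises, the symbolic shapes can be checked directly before sampling (a diagnostic sketch; the shapes in the comments are what I’d expect given the dims above, with 60 × 40 = 2400 observations):
print(cut.shape.eval())               # expecting (40, 4): ('item', 'K')
print(cut[c['item_c']].shape.eval())  # expecting (2400, 4): one row of cut points per observation
print(eta.shape.eval())               # expecting (2400,)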
So this must be an issue with the cut variable. Inspired by @drbenvincent’s work on ordinal regression, I’ve tried swapping the cut variable out for a cumulative-sum Dirichlet, like so:
cut = at.extra_ops.cumsum(pm.Dirichlet('cut', a=np.ones(len(c['K'])),
                                       dims=('item', 'K')))
but this gives some more strange shape errors in the prior predictive check, and it doesn’t sample.
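One thing I wondered about afterwards is whether the cumsum needs an explicit axis, since at.extra_ops.cumsum flattens its input when no axis is given. A sketch of what I mean (untested; 'raw_cut' is just a placeholder name, and I’m assuming the Dirichlet broadcasts its a across the item dimension):
raw = pm.Dirichlet('raw_cut', a=np.ones(len(c['K'])), dims=('item', 'K'))
cut = pm.Deterministic('cut', at.cumsum(raw, axis=-1), dims=('item', 'K'))
But I’m not sure that’s the whole story, since the cut points would then all live in (0, 1].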
I’ve managed to get this Stan code running, which does what I want and gives the same dimensions as the PyMC prior predictive for my model above, but I’d really like to stay in PyMC for this.
I’m hoping I’m missing something simple here - any advice is deeply appreciated. That said, I checked out the great talk with @Morgan_Pihl, @tcapretto, and @twiecki on IRT in PyMC here, and it made me think I might need something much more complex!
Thank you!