Item response theory model with odd shape mismatch

alj · January 5, 2023, 12:33pm

Happy new year PyMC community!

Over the Christmas break I’ve been trying to learn some item response theory modelling. In particular I am trying to implement a graded response model for some Likert scale rating data, but I am running into some shape errors upon sampling, even though the prior predictive check runs fine and has the dimensions I expect. I

Graded response models seem fiddly as each item has N-1 “difficulty” parameters that represent the cut points where respondents switch from, say, a “2” response to a “3”. The below code simulates some data and matches my use case.

import aesara.tensor as at
import numpy as np
import pandas as pd
import pymc as pm

# Simulate some random data
n_raters = 60
n_items = 40
categories = 5
data = np.random.randint(0, high=categories, 
                         size=(n_raters, n_items))

# Prepare dataset
data = (pd.DataFrame(data)
        .rename(index='rater_{}'.format, columns='item_{}'.format)
        .rename_axis(index='rater')
        .reset_index()
        .melt(id_vars='rater', var_name='item', value_name='response')
        .assign(rater=lambda x: pd.Categorical(x['rater']),
                item=lambda x: pd.Categorical(x['item']))
       )

# Set up coords
c = {'rater': data['rater'].cat.categories,
     'item': data['item'].cat.categories,
     'rater_c': data['rater'].cat.codes,
     'item_c': data['item'].cat.codes,
     'K': np.arange(data['response'].nunique() - 1)
    }

# Model
with pm.Model(coords=c) as graded_response:
    
    # Latent ability + hyper parameter on scale
    σ_θ = pm.HalfNormal('σ_θ', sigma=3)
    θ = pm.Normal('θ', mu=0, sigma=σ_θ, dims='rater')
    
    # Discrimination parameter with hyper parameter on scale
    σ_α = pm.HalfNormal('σ_α', sigma=3)
    α = pm.TruncatedNormal('α', mu=1, sigma=σ_α, lower=0.001, dims='item')
    
    # Cut points, ordered, with a dimension for each item
    cut = pm.Normal('cut', mu=np.linspace(-2, 2, len(c['K'])), 
                    sigma=5, 
                    transform=pm.distributions.transforms.ordered,
                    dims=('item', 'K'))
    
    # logit-predictor
    eta = pm.Deterministic('eta', α[c['item_c']] * θ[c['rater_c']])
    
    # Ordered logistic
    pm.OrderedLogistic('ll', eta=eta, cutpoints=cut[c['item_c']], 
                       observed=data['response'].values)
    
    # Sample prior predictive - works ok!
    #prpr = pm.sample_prior_predictive()
    
    # InferenceButton™
    posterior = pm.sample()

When running the above code I get:
ValueError: Input dimension mismatch. One other input has shape[1] = 4, but input[5].shape[1] = 40.
which must be an issue with the cut variable. Inspired by @drbenvincent 's work on ordinal regression I’ve tried swapping out the cut variable for a cumulative-sum-Dirichlet, like so:

cut = at.extra_ops.cumsum(pm.Dirichlet('cut', a=np.ones(len(c['K'])),
                                        shape=('item', 'K')))

but there are some more strange shape errors with this in the prior predictive check and it doesn’t sample.

I’ve managed to get this Stan code running which does what I want, and gives the same dimensions as the PyMC prior predictive for my model above, but I really want to stay in PyMC for this.

I’m hoping I am missing something simple here - any advice is deeply appreciated. However I checked out the great talk with @Morgan_Pihl, @tcapretto, and @twiecki on IRT in PyMC here, and that made me think I might need something much more complex!

Thank you!

tcapretto · January 5, 2023, 6:37pm

The problem is the transformation that’s used in the cutpoints. Run the following and you’ll see it starts sampling:

with pm.Model(coords=c) as graded_response:
    σ_θ = pm.HalfNormal('σ_θ', sigma=3)
    θ = pm.Normal('θ', mu=0, sigma=σ_θ, dims='rater')

    σ_α = pm.HalfNormal('σ_α', sigma=3)
    α = pm.TruncatedNormal('α', mu=1, sigma=σ_α, lower=0.001, dims='item')
    
    cut = pm.Normal(
        'cut', 
        mu=np.linspace(-2, 2, len(c['K'])), 
        sigma=5, 
        dims=('item', 'K')
    )
    # explicit sorting 
    # This is NOT a solution, just a trick I used to see if the problem was the transformation 
    cut = at.sort(cut, axis=-1) 
    
    eta = pm.Deterministic('eta', α[c['item_c']] * θ[c['rater_c']])
    
    # Ordered logistic
    output = pm.OrderedLogistic(
        'll', 
        eta=eta, 
        cutpoints=cut[c['item_c']], 
        observed=data['response'].to_numpy(),
    )
    idata = pm.sample()

I suggest updating to PyMC 5 and using pm.distributions.transforms.univariate_ordered.

pm.distributions.transforms.ordered was replaced here https://github.com/pymc-devs/pymc/pull/6255.

Also nootice PyMC 5 uses PyTensor (a fork of Aesara) instead of Aesara

with pm.Model(coords=c) as graded_response:
    σ_θ = pm.HalfNormal('σ_θ', sigma=3)
    θ = pm.Normal('θ', mu=0, sigma=σ_θ, dims='rater')

    σ_α = pm.HalfNormal('σ_α', sigma=3)
    α = pm.TruncatedNormal('α', mu=1, sigma=σ_α, lower=0.001, dims='item')
    
    cut = pm.Normal(
        'cut', 
        mu=np.linspace(-2, 2, len(c['K'])), 
        sigma=5, 
        transform=pm.distributions.transforms.univariate_ordered,
        dims=('item', 'K')
    )    
    eta = pm.Deterministic('eta', α[c['item_c']] * θ[c['rater_c']])
    
    # Ordered logistic
    output = pm.OrderedLogistic(
        'll', 
        eta=eta, 
        cutpoints=cut[c['item_c']], 
        observed=data['response'].to_numpy(),
    )
    idata = pm.sample()

If for some reason you still need PyMC 4, try using the latest release of PyMC 4.

Let me know if it works and have fun with IRT

Simon · January 5, 2023, 7:33pm

Another option could be using PyMC3, the model in this repo works well for me: GitHub - SimonErnesto/IRT_GRM_Bayesian: Bayesian approach to item response theory (IRT) implementing a graded response model (GRM) on questionnaire data.
It’s very similar to your model:

sub_lookup = dict(zip(data.subject.unique(), range(len(data.subject.unique()))))
sub_s = data.subject.replace(sub_lookup).values #subject index

item_lookup = dict(zip(data.item.unique(), range(len(data.item.unique()))))
item_i = data.item.replace(item_lookup).values #item index


responses = data.score.values
C = len(data.score.unique()) #Number of categories (unique scores)
I = len(data.item.unique()) #Number of items
D = len(data.dimension.unique()) #Number of participants/subjects
S = len(data.subject.unique()) #Number of participants/subjects

mat_dot = pm.math.matrix_dot


with pm.Model() as mod:
    
    #delta = pm.HalfNormal('delta', sigma=0.5, shape=(I), testval=np.ones(I)) #discrimination parameter
    delta = pm.Lognormal('delta', mu=0, sigma=0.5, shape=(I), testval=np.ones(I)) #discrimination parameter

    psi = pm.Normal('psi', mu=0, sigma=0.5, shape=S, testval=np.ones(S)) #participant/subject parameter
    
    ksigma = pm.HalfNormal('ksigma', 0.5)
    kmu = pm.Normal('kmu', [-4,-2,2,4], 0.5, shape=C-1)
    kappa = pm.Normal('kappa', mu=kmu, sigma=ksigma, transform=pm.distributions.transforms.ordered,
                      shape=(I,C-1), testval=np.arange(C-1)) #cutpoints/difficulty parameter 
    
    eta = delta[item_i]*psi[sub_s] #estimate
    
    y = pm.OrderedLogistic('y', cutpoints=kappa[item_i], eta=eta, observed=responses)

Hope it helps, in case you cannot get it running with PyMC4 or 5.

alj · January 5, 2023, 7:43pm

Thank you so much @tcapretto! Saved me once again from madness! This does indeed sample and looks good! I did get this warning after updating - not sure if this is something to worry too much about:

AmbiguityWarning: Ambiguities exist in dispatched function _unify

I had a functional trick in the end that was working, based on the ordinal regression discussions here. I had a Dirichlet prior with items x categories, with a cumulative sum across the categories axis. I then made it deterministic but sliced off the final column (which was always set to a fixed point), and rescaled it so it was between say ±8. Very hacky but it worked, to my surprise!

Really happy to see the fix for the .ordered transform. That has given me some headaches over the past few years. I also saw that there is an alternative parameterisation that estimates the cutpoints separately to the difficulty parameter, e.g. alpha * (theta - difficulty) - cutpoint that Morgan uses elsewhere on here. I don’t know a huge amount about IRT but this seems perhaps more interpretable, keeping the item difficulty distinct from the cutpoints - do you have any preferences or thoughts on that based on your work with PyMC Labs and Alva?

Thank you again!

alj · January 5, 2023, 7:48pm

Thank you Simon - this looks great, I wish I had seen this earlier this week! Its really reassuring to see that I had the structure right if nothing else - and so interesting that this worked out OK in PyMC3. It seems like the .univariated_ordered transform solved the problem, and its much nicer to be able to put priors on the cutpoints without tricky re-parameterisations. I wonder now if this transform will help the ordinal regression problems I’ve been struggling with.

Thanks for sharing, love those plots!

Morgan_Pihl · January 5, 2023, 9:01pm

Hello! Glad that you enjoyed our webinar @alj, and delighted that you even found your way to my old github repo! Great response from @tcapretto, I have nothing to add there.

When it comes to your question about the parameterization, I do have something to add. I actually first started out with an implementation of the “modified graded response model” (Muraki, 1990; Fitting a Polytomous Item Response Model to Likert-Type Data), which basically assumes the same distance between cutpoints for all items in a test. Instead of estimating K-1 cutpoints for N items (as in the GRM), leading to (K-1) * N item parameters, you would estimate N difficulties and K-1 “offsets” for the cutpoints, leading to (K-1) + N parameters. I initially found that to be simpler and faster, even though it’s based on a pretty restrictive assumption. Perhaps that’s why you’ve seen me using the parameterization you mention in the past.

Nowadays I prefer the original parameterization, with a twist introduced by @tcapretto et al. to improve identifiability for the alpha parameter. Instead of estimating the cutpoints directly, we estimate the cutpoints scaled by alpha! To get the cutpoints, we simply divide by alpha afterwards:

eta = alpha * theta - cutpoints_tilde
cutpoints = cutpoints_tilde / alpha

Best regards,
Morgan

alj · January 5, 2023, 9:45pm

Thanks for taking a look here @Morgan_Pihl - your presentation and Stan code were really helpful and great guidance for me setting up the above model (and prompted me to try Stan which I had not done yet!).

That’s really interesting regarding the alternative parameterisation. In the data I am working with here I am finding wide intervals and not much difference amongst the alpha parameter, which doesn’t feel quite right to me. I’ll try your above suggestion; do you have an ordered normal prior on the cutpoints and immediately scale by alpha, or did you use a different cutpoint prior? I remember @tcapretto alluding to a “clever way” of setting up the prior here which I was interested to learn more about!

Thank you and best wishes for 2023!

cluhmann · January 6, 2023, 2:38am

The ambiguity warning is likely due to you having both aesara and pytensor installed. If you are using PyMC 5, I would suggest removing aesara and that should eliminate the warnings.

alj · January 6, 2023, 11:49am

Thank you @cluhmann that has done the trick! Awesome stuff.

Topic		Replies	Views
Correct shapes in multidimensional item response 2PL v5 shape_issue , modeling	12	312	February 29, 2024
Bernoulli Mixture Model - shape mismatch v5 modeling	3	331	July 27, 2023
Shape mismatch error in 'sample_ppc' Questions	8	868	March 6, 2019
Time-series regression - shape error / Input dimension mis-match Questions time_series	8	1109	August 6, 2022
Predicting student exam scores v5 modeling	4	290	October 18, 2023

Item response theory model with odd shape mismatch

Related topics