Hello,
I’m trying to implement a model for ordered response data, where ratings are collected using a survey with a Likert-type rating scale (1-5).
The model proposed by Muraki (1990) is a modification of the Graded Response Model (Samejima, 1969). The probability of a response option k \in {1, ..., K} is defined as:
where the cumulative probability of an observed response X being higher than k depends on the person ability \theta, the question difficulty \beta and the option threshold \delta:
Code to generate artificial data:
import numpy as np
def sigmoid(x):
return 1 / (1 + np.exp(-x))
# Generate data
n_persons = 500
n_questions = 20
n_options = 5
# Latent variables
true_theta = np.random.normal(loc=0, scale=1, size=n_persons)
true_beta = np.random.normal(loc=0, scale=1, size=n_questions)
true_alpha = np.random.normal(loc=1, scale=0.1, size=n_questions)
true_delta = np.array([-1.5, -0.5, 0.5, 1.5])
# Cumulative probabilities
p_x_over_1 = np.zeros(shape=(n_persons, n_questions))
p_x_over_2 = np.zeros(shape=(n_persons, n_questions))
p_x_over_3 = np.zeros(shape=(n_persons, n_questions))
p_x_over_4 = np.zeros(shape=(n_persons, n_questions))
for j in range(n_questions):
eta1 = true_alpha[j] * (true_theta - true_beta[j] - true_delta[0])
p_x_over_1[:, j] = sigmoid(eta1)
eta2 = true_alpha[j] * (true_theta - true_beta[j] - true_delta[1])
p_x_over_2[:, j] = sigmoid(eta2)
eta3 = true_alpha[j] * (true_theta - true_beta[j] - true_delta[2])
p_x_over_3[:, j] = sigmoid(eta3)
eta4 = true_alpha[j] * (true_theta - true_beta[j] - true_delta[3])
p_x_over_4[:, j] = sigmoid(eta4)
# Likelihoods
p_x_1 = 1 - p_x_over_1
p_x_2 = p_x_over_1 - p_x_over_2
p_x_3 = p_x_over_2 - p_x_over_3
p_x_4 = p_x_over_3 - p_x_over_4
p_x_5 = p_x_over_4
# Likelihood 3-dimensional array
# n_options x n_persons x n_questions
p_x = np.array((p_x_1, p_x_2, p_x_3, p_x_4, p_x_5))
# Simulate response per person per question
x = np.zeros(shape=(n_persons, n_questions)).astype(int)
for i in range(n_persons):
for j in range(n_questions):
x[i, j] = np.random.choice(a=[0,1,2,3,4], size=1, p=p_x[:, i, j])[0]
# Response matrix to vector
x_obs= x.flatten('C')
person_index = np.repeat(range(n_persons), n_questions)
question_index = np.tile(range(n_questions), n_persons)
I’ve had some success with the model specification below. It runs fine using MAP and MCMC, but when trying to use ADVI, I keep getting errors.
# Define model
with pm.Model() as grm_model:
# Person variable
theta = pm.Normal(
'theta', mu=0, sd=1,
shape=n_persons
)
# Question variables
beta = pm.Normal(
'beta', mu=0, sd=1,
shape=n_questions
)
alpha = pm.TruncatedNormal(
'alpha', mu=1, sd=0.1, lower=0,
shape=n_questions
)
delta = pm.Normal(
'delta', mu=[-1.5, -0.5, 0.5, 1.5], sd=0.1,
transform=pm.distributions.transforms.ordered,
testval=[-1.5, -0.5, 0.5, 1.5],
shape=4
)
# Likelihood = Item Response Function
eta = alpha[question_index] * (theta[person_index] - beta[question_index])
# Outcome
x = pm.OrderedLogistic(
'x', eta=eta, cutpoints=delta,
observed=x_obs,
testval=np.random.randint(4, size=n_persons * n_questions),
shape=(n_persons * n_questions)
)
Running this code produces FloatingPointError: NaN occurred in optimization.
with grm_model:
advi = pm.ADVI()
approx = advi.fit(n=10000)
Running this code works fine:
with grm_model:
trace = pm.sample(
draws=500,
tune=500,
chains=4,
init='adapt_diag'
)
When removing the alpha parameter from the model, ADVI runs fine. I have tried experimenting with different distributions for alpha, such as HalfCauchy and Lognormal, but the error persists.
What could be the issue? Am I doing something wrong in defining the likelihood eta, or the way I’m using OrderedLogistic?