Review mIRT model and processes

As an introduction: we are currently developing an mIRT model to measure 28 underlying abilities of candidates based on their responses to our questionnaire. What we have done so far is inspired by one of the papers that incorporates the Q-matrix concept into mIRT.
The model we are currently experimenting with follows the formula stated in Reckase (2009):

P(y_{ij} = 1) = invlogit(a_i \theta_j^T + b_i)

where a_i is the vector of discrimination parameters of item i, b_i is its difficulty (intercept) parameter, and \theta_j is the ability vector of person j.

For context, our questionnaire consists of 96 questions. Each question is labelled as contributing to an ability using a Q-matrix.

Example of the Q-matrix:

| Questions | Reading | Writing | … (N abilities) |
|-----------|---------|---------|-----------------|
| Q1        | 0       | 1       | …               |
| Q2        | 1       | 0       | …               |

The Q-matrix (item_factors) above is applied as a mask on the discrimination parameters a before computing the probability.
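To illustrate the masking idea, here is a minimal sketch (the 3-item example and ability names are hypothetical, not our real 96 x 28 matrix): item_factors is a binary array of shape (n_items, n_factors), and multiplying it element-wise with the discrimination matrix zeroes out every item/ability loading that is not flagged in the Q-matrix.

import numpy as np

# Hypothetical Q-matrix for 3 items and 2 abilities (Reading, Writing)
item_factors = np.array([
    [0, 1],   # Q1 loads on Writing only
    [1, 0],   # Q2 loads on Reading only
    [1, 1],   # Q3 loads on both (hypothetical)
])

# A positive discrimination value for every item/ability pair
discrim_free = np.array([
    [0.8, 1.2],
    [1.5, 0.4],
    [0.9, 1.1],
])

# Element-wise masking: discriminations outside the Q-matrix are fixed to 0
discrim = discrim_free * item_factors
print(discrim)
# [[0.  1.2]
#  [1.5 0. ]
#  [0.9 1.1]]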

with pm.Model() as model:
    # Hyperpriors
    sigma_diff = pm.HalfCauchy("sigma_diff", beta=2)
    mu_diff = pm.Cauchy("mu_diff", alpha=0, beta=1)
    mu_discrim = pm.HalfCauchy("mu_discrim", beta=0.5)

    # Parameters priors #
    # Difficulty
    diff_offset = pm.Normal(
        "diff_offset", mu=0, sigma=1, shape=(self.n_items, 1)
    )
    diff = pm.Deterministic("diff", mu_diff + diff_offset * sigma_diff)
    # Discrimination #
    discrim_offset = pm.LogNormal(
        "discrim_offset", mu=0, sigma=0.3, shape=(self.n_items, self.n_factors)
    )
    # Masking with Q-matrix
    discrim = pm.Deterministic(
        "discrim", (mu_discrim + discrim_offset) * self.item_factors
    )
    # Abilities prior #
    theta = pm.Normal("theta", mu=0, sigma=1, shape=(self.n_persons, self.n_factors))

    # Linear predictor of shape (n_items, n_persons); response_train must follow the same layout
    kernel = pm.math.dot(discrim, theta.T) + diff
    probabilities = pm.math.invlogit(kernel)
    
    # Likelihood
    observed = pm.Bernoulli(
        "observed", p=probabilities, observed=self.response_train
    )

The Bayesian model above is used to estimate the item parameters (discrimination and difficulty).
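For reference, this is roughly how we fit the model and turn the posterior into point estimates of the item parameters (a sketch only; the sampler settings and the use of posterior means are illustrative, not tuned choices):

with model:
    idata = pm.sample(draws=1000, tune=1000, target_accept=0.9)

# Posterior means as point estimates of the item parameters
discrimination = idata.posterior["discrim"].mean(dim=("chain", "draw")).values  # (n_items, n_factors)
difficulty = idata.posterior["diff"].mean(dim=("chain", "draw")).values         # (n_items, 1)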

I assumed that the model is identifiable because the Q-matrix fixes a subset of the item discrimination parameters to zero.
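One simple (and certainly not sufficient) sanity check we could run on the Q-matrix itself is to verify that every item loads on at least one ability and every ability has at least one item, ideally including items that load on that ability alone; this helper is hypothetical, not from the paper we follow:

import numpy as np

def check_q_matrix(item_factors):
    """Basic structural checks on a binary Q-matrix of shape (n_items, n_factors)."""
    item_factors = np.asarray(item_factors)
    assert (item_factors.sum(axis=1) >= 1).all(), "some items load on no ability"
    assert (item_factors.sum(axis=0) >= 1).all(), "some abilities have no items"
    # Items loading on exactly one ability help anchor that ability's scale
    simple_items = item_factors[item_factors.sum(axis=1) == 1]
    return simple_items.sum(axis=0)  # count of single-ability items per ability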

After the item parameters have been obtained, the ability of each candidate is estimated by minimizing the negative log-likelihood with the SLSQP optimizer.
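For reference, the objective being minimized is the per-person negative log-likelihood; recoding the responses as y in {-1, +1} lets the Bernoulli likelihood be written with a single expit, since expit(z) gives P(correct) for y = +1 and expit(-z) = 1 - P(correct) for y = -1:

-\log L(\theta_j) = -\sum_{i=1}^{96} \log \, \mathrm{expit}\big( y_{ij} (a_i \theta_j^T + b_i) \big), \quad y_{ij} \in \{-1, +1\}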

thetas = np.zeros((self.n_factors, persons))
# Recode responses from {0, 1} to {-1, +1} so the likelihood uses a single expit
responses = np.where(responses == 0, -1, 1)

for n in range(persons):
    candidate_response = responses[:, n]

    def estimate_ability(theta):
        # Negative log-likelihood of this candidate's responses
        # Note: difficulty must be 1-D of shape (n_items,) here; squeeze it if it is (n_items, 1)
        irt = np.dot(discrimination, theta) + difficulty
        prob = scipy.special.expit(candidate_response * irt)
        ll = np.sum(np.log(prob + 1e-10))  # epsilon inside the log to avoid log(0)
        return -ll

    optimizer = scipy.optimize.minimize(
        estimate_ability,
        np.zeros(self.n_factors),
        bounds=[(-4, 4)] * self.n_factors,
        method="SLSQP",
    )
    if not optimizer.success:
        raise RuntimeError(f"Ability optimization failed for person {n}")
    thetas[:, n] = optimizer.x

The areas we are currently not confident about are:

  1. Is the model truly identifiable?
  2. Are there any mistakes or potential errors in the model?
  3. Is the ability estimation correct? (This is a bit beyond PyMC and Bayesian modeling.)

Any feedback and help would be appreciated!