I’m trying to implement a polytomous latent class model:
(x[i, j] | Theta[i] = c, Pi[c, j]) ~ Categorical(Pi[c, j]) for i = 1, 2, …, n and j = 1, 2, …, J
Theta[i] | Gamma ~ Categorical(Gamma) for i = 1, 2, …, n
Gamma ~ Dirichlet(Alpha[Gamma])
Pi[c,j] ~ Dirichlet(Alpha[Pi[c]]) for c = 1, 2, …, C and j = 1, 2, …, J
where n is the sample size, C is the number of classes, and J is the number of observables (survey questions).
Let:
numClasses = 2 # Want to identify 2 distinguishing categories of subjects
numObservables = 5 # Imagine a survey of 5 questions where each question can be coded as 0, 1, 2, or 3
numSample = 100 # The number of subjects who filled out the survey, so 5 questions per subject
lcaData = a 100 x 5 array of observed responses (one row per subject, one column per question), each response coded 0, 1, 2, or 3
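Since I can't share the real survey data, here is a minimal NumPy sketch that simulates a stand-in for lcaData from the generative model above. The seed, the flat Dirichlet concentrations, and the name numCategories (4, for the codes 0 to 3) are my own assumptions for illustration, not part of the real data.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, assumption for reproducibility

numClasses = 2
numObservables = 5
numCategories = 4  # hypothetical name: responses are coded 0, 1, 2, or 3
numSample = 100

# Gamma ~ Dirichlet(1, ..., 1): class membership probabilities, shape (C,)
gamma = rng.dirichlet(np.ones(numClasses))

# Pi[c, j] ~ Dirichlet(1, ..., 1): one probability vector over the response
# codes for every class and question, shape (C, J, K)
pi = rng.dirichlet(np.ones(numCategories), size=(numClasses, numObservables))

# Theta[i] ~ Categorical(Gamma): latent class of each subject, shape (n,)
theta = rng.choice(numClasses, size=numSample, p=gamma)

# x[i, j] ~ Categorical(Pi[Theta[i], j]) via inverse-CDF sampling.
# Only the first K - 1 cumulative sums are compared, so codes stay in 0..K-1.
cdf = pi[theta].cumsum(axis=-1)                       # shape (n, J, K)
u = rng.random((numSample, numObservables, 1))
lcaData = (u > cdf[..., :-1]).sum(axis=-1)            # shape (n, J)

print(lcaData.shape)
```

Note that in this simulation pi carries a third axis for the response codes, which neither of my PyMC3 attempts below has.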
Attempt 1:
with pm.Model() as lca_model:
    pi = pm.Dirichlet('pi', np.ones((numClasses, numObservables)), shape=(numClasses, numObservables))
    gamma = pm.Dirichlet('gamma', np.ones(2))
    theta = pm.Categorical('theta', gamma, shape=numSample)
    likelihood = pm.Categorical('likelihood', pi[theta], shape=numObservables, observed=lcaData)
This raises a shape mismatch error between (100,) and (100, 5).
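My guess at where the (100,) comes from, checked with plain NumPy indexing rather than PyMC3 itself: indexing a (2, 5) array with 100 class labels gives a (100, 5) array, and if Categorical treats the last axis as the category axis (an assumption on my part), that leaves one draw per subject, i.e. shape (100,), to be compared with the (100, 5) observations. The array pi_attempt1 and the all-zero theta are placeholders for illustration only.

```python
import numpy as np

numClasses, numObservables, numSample = 2, 5, 100

# Placeholder with the same shape as pi in attempt 1: (numClasses, numObservables)
pi_attempt1 = np.full((numClasses, numObservables), 1.0 / numObservables)

# Placeholder class labels, one per subject
theta = np.zeros(numSample, dtype=int)

p = pi_attempt1[theta]
print(p.shape)            # (100, 5)

# Assumption: the last axis of p is consumed as the category axis,
# so the remaining "batch" shape is one value per subject.
batch_shape = p.shape[:-1]
observed_shape = (numSample, numObservables)
print(batch_shape, observed_shape)   # (100,) (100, 5)
```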
Attempt 2:
with pm.Model() as lca_model:
    pi = pm.Dirichlet('pi', np.ones(numObservables))
    gamma = pm.Dirichlet('gamma', np.ones(2))
    theta = pm.Categorical('theta', gamma, shape=numSample)
    likelihood = pm.Categorical('likelihood', pi[theta], observed=lcaData)
This runs without a shape exception, but the resulting pi has shape 1 * 5 rather than the 2 * 5 I expected, i.e. 5 results (one per question) for each of the two classes.
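Looking at the shapes in plain NumPy, I suspect the second attempt only "works" by accident: a length-5 pi indexed by class labels 0 and 1 just picks out its first two entries. I also suspect that what the model needs is one probability vector over the 4 response codes for every class and question, i.e. shape (2, 5, 4), but I have not confirmed how to express that in PyMC3. The names numCategories, pi_attempt2, and pi_needed, and the uniform placeholder values, are my own for illustration.

```python
import numpy as np

numClasses, numObservables, numCategories, numSample = 2, 5, 4, 100

# Placeholder class labels in {0, 1}, one per subject
theta = np.random.default_rng(1).integers(0, numClasses, size=numSample)

# Same shape as pi in attempt 2: a single vector of length numObservables
pi_attempt2 = np.full(numObservables, 1.0 / numObservables)
picked = pi_attempt2[theta]
print(picked.shape)        # (100,): only entries 0 and 1 of pi are ever used

# Shape I suspect is needed: (classes, questions, response codes)
pi_needed = np.full((numClasses, numObservables, numCategories), 1.0 / numCategories)
per_subject = pi_needed[theta]
print(per_subject.shape)   # (100, 5, 4): leading axes line up with the 100 x 5 data
```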
I’m new to PyMC3, and I’ve been able to run a dichotomous version similar to the above, but I’m not sure about the translation to this model. Any advice would be appreciated.