Latent Class Analysis Model (Dirichlet / Categorical Implementation Question)


#1

I’m trying to implement a polytomous latent class model:

(x[i, j] | Theta[i] = c, Pi[j]) ~ Categorical(Pi[c, j]) for i = 1, 2, …, n and j = 1, 2, …, J
Theta[i] | Gamma ~ Categorical(Gamma) for i = 1, 2, …, n
Gamma ~ Dirichlet(Alpha[Gamma])
Pi[c, j] ~ Dirichlet(Alpha[Pi[c]]) for c = 1, 2, …, C and j = 1, 2, …, J

where n is the sample size, c indexes the latent classes (C in total), and j indexes the observables (J in total).

Let:

numClasses = 2 # Want to identify 2 distinguishing categories of subjects
numObservables = 5 # Imagine a survey of 5 questions where each question can be coded as 0, 1, 2, or 3
numSample = 100 # The number of subjects who filled out the survey, so 5 questions per subject

lcaData is a 100 x 5 array of observed responses (one row per subject, one column per question), where each response is 0, 1, 2, or 3.
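
For concreteness, here is a minimal NumPy sketch of drawing data from the model as specified above. It is only an illustration, not the actual survey data: the name numCategories, the fixed seed, and the flat Dirichlet(1, …, 1) priors are assumptions made for the example. The point it makes is that Pi[c, j] is itself a probability vector over the 4 response codes, so the full set of Pi parameters has three axes (class, question, response code).

```python
import numpy as np

numClasses = 2
numObservables = 5
numCategories = 4   # assumption: responses are coded 0, 1, 2, 3
numSample = 100

rng = np.random.default_rng(0)  # fixed seed, for illustration only

# Gamma ~ Dirichlet(1, ..., 1): class membership probabilities, shape (C,)
gamma = rng.dirichlet(np.ones(numClasses))

# Pi[c, j] ~ Dirichlet(1, ..., 1): one distribution over the response codes
# for every (class, question) pair, shape (C, J, K)
pi = rng.dirichlet(np.ones(numCategories), size=(numClasses, numObservables))

# Theta[i] ~ Categorical(Gamma): latent class of each subject, shape (n,)
theta = rng.choice(numClasses, size=numSample, p=gamma)

# x[i, j] ~ Categorical(Pi[Theta[i], j]): simulated responses, shape (n, J)
lcaData = np.empty((numSample, numObservables), dtype=int)
for i in range(numSample):
    for j in range(numObservables):
        lcaData[i, j] = rng.choice(numCategories, p=pi[theta[i], j])

print(gamma.shape, pi.shape, theta.shape, lcaData.shape)
# (2,) (2, 5, 4) (100,) (100, 5)
```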

Attempt 1:

with pm.Model() as lca_model:
    pi = pm.Dirichlet('pi', np.ones((numClasses, numObservables)), shape=(numClasses, numObservables))
    gamma = pm.Dirichlet('gamma', np.ones(2))
    theta = pm.Categorical('theta', gamma, shape=numSample)
    likelihood = pm.Categorical('likelihood', pi[theta], shape=numObservables, observed=lcaData)

This results in a shape mismatch of (100, ) (100, 5).
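
One plausible reading of that error can be reproduced with plain NumPy. The sketch below assumes that the last axis of the probability array is treated as the category axis, which is my understanding of how a categorical likelihood reads its p argument; the variable names and the seed are made up for the example. Under that assumption, pi[theta] is 100 probability vectors of length 5, which implies one value per subject, shape (100,), while the observed data has shape (100, 5).

```python
import numpy as np

numClasses, numObservables, numSample = 2, 5, 100
rng = np.random.default_rng(1)  # fixed seed, for illustration only

# Same parameter shape as Attempt 1: one length-5 probability vector per class.
pi_attempt1 = rng.dirichlet(np.ones(numObservables), size=numClasses)  # (2, 5)
theta = rng.integers(0, numClasses, size=numSample)                    # (100,)

# Indexing by class gives one length-5 probability vector per subject.
p = pi_attempt1[theta]
print(p.shape)  # (100, 5)

# If the last axis is the category axis, each row yields a single draw,
# so the implied value shape is (100,), not the (100, 5) of lcaData.
implied_values = np.array([rng.choice(p.shape[-1], p=row) for row in p])
print(implied_values.shape)  # (100,)
```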

When I run this:

with pm.Model() as lca_model:
    pi = pm.Dirichlet('pi', np.ones(numObservables))
    gamma = pm.Dirichlet('gamma', np.ones(2))
    theta = pm.Categorical('theta', gamma, shape=numSample)
    likelihood = pm.Categorical('likelihood', pi[theta], observed=lcaData)

The dimensions don’t throw an exception, but the resulting pi has shape 1 x 5 rather than 2 x 5. What I expected was 5 results (one per question) for each of the two classes.
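
The shapes in this second attempt can also be checked in NumPy. This is a sketch under assumptions: numCategories = 4 (the 0 to 3 coding), the seed, and the names pi_attempt2 and pi_full are made up for illustration. With a single length-5 vector, pi[theta] only ever picks the scalars pi[0] or pi[1], so there is no per-class table at all. If each question has its own distribution over the 4 response codes within each class, the parameter array would need a third axis, and indexing by class then gives one row of 5 distributions per subject.

```python
import numpy as np

numClasses, numObservables, numCategories, numSample = 2, 5, 4, 100
rng = np.random.default_rng(2)  # fixed seed, for illustration only
theta = rng.integers(0, numClasses, size=numSample)  # stand-in class labels

# Attempt 2: a single probability vector of length 5.
pi_attempt2 = rng.dirichlet(np.ones(numObservables))  # (5,)
# theta only takes values 0 and 1, so this selects scalars, not distributions.
print(pi_attempt2[theta].shape)  # (100,)

# The layout the model statement seems to call for: classes x questions x codes.
pi_full = rng.dirichlet(np.ones(numCategories), size=(numClasses, numObservables))
p_full = pi_full[theta]
print(pi_full.shape, p_full.shape)  # (2, 5, 4) (100, 5, 4)
```

In the last line, p_full[i, j] is the distribution over response codes for subject i on question j, which lines up element by element with a 100 x 5 data array.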

I’m new to PyMC3, and I’ve been able to run a dichotomous version similar to the above, but I’m not sure about the translation to this model. Any advice would be appreciated.


#2

You might find this post helpful: Naive Bayes model with PyMC3


#3

Thank you very much for the reply.