I have the following model, based on Latent Dirichlet Allocation (Blei, D.M., Ng, A.Y., Jordan, M.I., 2003. Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022), which has a pm.Categorical in it:
import numpy as np
import pymc3 as pm
import theano.tensor as t

alpha = np.ones((1, K))        # symmetric Dirichlet prior over the K topics
beta_prior = np.ones((1, V))   # symmetric Dirichlet prior over the V vocabulary words
num_words = df.shape[0]        # one row per token in the corpus

with pm.Model() as model:
    doc_num = pm.Data('i', df['Document'])                   # document index for each token
    theta = pm.Dirichlet("θ", a=alpha, shape=(D, K))         # per-document topic probabilities
    beta = pm.Dirichlet("beta", a=beta_prior, shape=(K, V))  # per-topic word probabilities
    # marginal word distribution for each token, given its document
    w = pm.Categorical("w",
                       p=t.dot(theta[doc_num], beta),
                       shape=num_words,
                       observed=df['Word'])
I was a little surprised to see that NUTS was automatically assigned as the sampler (and that sampling is expected to take approximately 24 hours on my MacBook Pro!). Should I be using a different sampling method for this? In the past I had seen PyMC assign a hybrid NUTS + Metropolis-Hastings sampler for models with discrete variables. Is NUTS assigned alone here because the only categorical RV is observed (I marginalized away the internal topic-assignment variable)?
Also, should I be replacing my use of Categorical with a Mixture?
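For context, the t.dot(theta[doc_num], beta) term is exactly the marginalization I mean: summing the per-token topic assignment z out of p(w | z) p(z | d) leaves the same per-document word distribution that an explicit mixture over topics would give. A quick numpy sanity check of that identity, with made-up toy sizes rather than my real corpus:

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, V = 3, 4, 5  # toy sizes, not the real corpus dimensions

# random row-stochastic theta (D x K) and beta (K x V)
theta = rng.dirichlet(np.ones(K), size=D)
beta = rng.dirichlet(np.ones(V), size=K)

# marginalized word probabilities, as passed to the Categorical's `p`
p_marginal = theta @ beta  # shape (D, V)

# explicit mixture: sum_k p(z=k | d) * p(w | z=k)
p_mixture = np.einsum('dk,kv->dv', theta, beta)

assert np.allclose(p_marginal, p_mixture)
# each row is still a proper probability distribution
assert np.allclose(p_marginal.sum(axis=1), 1.0)
```

So the two parameterizations should give the same likelihood; my question is whether pm.Mixture buys anything computationally over the dotted Categorical.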