Just realized I can mark this as solved thanks to @junpenglao’s suggestion of considering that non-available categories have probability 0 – this implies we know in advance which categories are absent. Of course, I’m curious about @aseyboldt thoughts on profiling and speed-ups if he’s got any!
Here is a simplified version of the model, for the sake of readability and understanding how to implement the idea with PyMC:
with pm.Model() as m_multi:
a_cluster = pm.Normal("a_cluster", -1.8, 0.5, shape=(Nclusters, Ncategories - 1))
a_pivot = tt.as_tensor_variable(np.full(shape=(Nclusters, 1), fill_value=-2.2))
a_cluster_f = tt.horizontal_stack(a_cluster a_pivot)
lat_p = tt.nnet.softmax(a_cluster_f[cluster_id])
# zero-inflation process:
# keep only preferences for available categories:
slot_p = cats_available * lat_p
# normalize preferences:
slot_p = slot_p / tt.sum(slot_p, axis=1, keepdims=True)
R = pm.Multinomial("R", n=N, p=slot_p, observed=sim_R)
trace_multi = pm.sample(2000, tune=3000, init="adapt_diag")
Hope this will help future users!
For the curious out there, the real model is much more complicated (varying-effects and covariance) but I’ll open source it – as well as real and simulated data – once I’ve had time to clean it up!
And again, a big thank you to Junpeng Couldn’t have done it without your help! Now I owe you a second beer when you come to Paris