Suppose I have the following model:
import numpy as np
import pymc3 as pm

with pm.Model() as model_hier:
    packed_L = pm.LKJCholeskyCov('packed_L', n=2, eta=2., sd_dist=pm.HalfCauchy.dist(5))
    L = pm.expand_packed_triangular(2, packed_L)
    Sigma = pm.Deterministic('Sigma', L.dot(L.T))
    mu = pm.Normal('mu', 0, 20, shape=2)
    beta = pm.MvNormal('beta', mu=mu, chol=L, shape=(16, 2))
    alphas = pm.invlogit(beta)
    post = pm.Multinomial('post', n=np.sum(some_values, axis=1), p=alphas,
                          observed=some_values)
The goal is to infer `mu`, `beta` and `Sigma`; `some_values` is an array with shape [16, 2]. Although there are many divergences, what I find weird is that the Multinomial works at all. Why is that weird? Because almost every row of `alphas` (a [16, 2] array) doesn't sum to one, and in a multinomial distribution the probabilities must sum to one. Yet it samples without any problem (besides the divergences, of course).
So, why does the multinomial work in this case?