# Best way to define a custom discrete distribution to use as a latent variable?

I want to use a mixture of multivariate Bernoulli distributions as an intermediate discrete latent variable in a hierarchical model. I figured it would probably be easiest to do this as a custom distribution. Thankfully someone else had already written the logp function, and it works great when the discrete variable is the observed variable.

``````python
import theano.tensor as tt
import pymc3 as pm

def bernoulli_loglh(theta, weights):
    def _logp(value):
        value = value > 0.5
        value_neg = 1 - value
        logtheta = tt.log(theta)
        neglogtheta = tt.log(1 - theta)

        # N*K matrix of likelihood contributions from X_i with theta_k
        betas = tt.tensordot(value, logtheta, axes=[1, 1]) \
            + tt.tensordot(value_neg, neglogtheta, axes=[1, 1])

        # Form alphas, a K*N matrix with the component weights included
        alphas = (betas + tt.log(weights)).T

        # Take the LSE column-wise to get an N-vector; subtracting and
        # re-adding alphas[0, :] stabilizes the exponentials. This gives
        # the total likelihood across all components for each X_i
        lse_clust = pm.math.logsumexp(alphas - alphas[0, :], axis=0) + alphas[0, :]

        # Final overall sum
        return tt.sum(lse_clust)
    return _logp
``````
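For sanity-checking the logp outside of Theano, here is a pure-NumPy version of the same computation (my own sketch, not from the original post; the function name is hypothetical):

``````python
import numpy as np

def bernoulli_mixture_loglh_np(value, theta, weights):
    """NumPy reference for the Theano logp above (hypothetical helper).

    value:   (N, D) binary data matrix
    theta:   (K, D) Bernoulli probabilities per mixture component
    weights: (K,)   mixture weights, summing to 1
    """
    value = (value > 0.5).astype(float)
    # (N, K) log-likelihood of each row under each component
    betas = value @ np.log(theta).T + (1 - value) @ np.log(1 - theta).T
    # Add the log mixture weights
    alphas = betas + np.log(weights)
    # Stabilized log-sum-exp across components (subtract the row max)
    m = alphas.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(alphas - m).sum(axis=1))
    return lse.sum()
``````

Comparing this against the Theano graph on a few random inputs is a cheap way to catch broadcasting or transposition bugs in the tensordot calls.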

The problem is that when I use it as a latent variable, I get continuous values during sampling, even when I explicitly request Metropolis-Hastings sampling. This makes sense, because the domain of the variable is never specified.

One hacky workaround I tried was to apply the `value > 0.5` thresholding everywhere the variable appeared, but I didn't really get good results with this, and ultimately I'd want something that works better.

How are you using `bernoulli_loglh` in your model? Via `pm.DensityDist`? One way to get only integer values is to create a Distribution class that inherits from `pm.Discrete`, but all that does is tell the Metropolis sampler to round its proposed values before evaluating the logp. You can achieve the same thing if you round the value inside your `_logp` and then save the rounded accepted values in a `pm.Deterministic`. Something like:

``````python
with pm.Model() as m:
    my_dist = pm.DensityDist('my_dist', bernoulli_loglh(...))
    # tt.round keeps the expression symbolic; np.round would not
    # reliably work on a Theano tensor
    my_values = pm.Deterministic('my_values', tt.round(my_dist))
``````

The same applies if you are using a `pm.Potential`.
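To see why rounding inside the logp works, here is a quick one-dimensional illustration in plain NumPy (my own sketch; the function name and `p` are hypothetical):

``````python
import numpy as np

def bernoulli_logp_rounded(v, p=0.7):
    """logp of a Bernoulli(p), evaluated at round(v).

    Rounding first makes the density piecewise constant: every
    continuous proposal in [0, 0.5) is scored as the state 0, and
    every proposal in (0.5, 1] as the state 1, so the sampler
    effectively only ever 'sees' the two discrete states.
    """
    x = np.round(v)  # snap the continuous proposal to {0, 1}
    return x * np.log(p) + (1 - x) * np.log(1 - p)
``````

This is exactly what a `pm.Discrete` subclass achieves by rounding the Metropolis proposal before evaluation; the `pm.Deterministic` in the snippet above then just records those rounded values in the trace.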