# Best way to define a custom discrete distribution to use as a latent variable?

I want to use a mixture of multivariate Bernoulli distributions as an intermediate discrete latent variable in a hierarchical model. I figured it would probably be easiest to do this as a custom distribution. Thankfully someone else had already written the logp function, and it works great when the discrete variable is the observed variable.

``````python
import theano.tensor as tt
import pymc3 as pm

def bernoulli_loglh(theta, weights):
    def _logp(value):
        value = value > 0.5
        value_neg = 1 - value
        logtheta = tt.log(theta)
        neglogtheta = tt.log(1 - theta)

        # N*K matrix of likelihood contributions from X_i with theta_k
        betas = tt.tensordot(value, logtheta, axes=[1, 1]) \
            + tt.tensordot(value_neg, neglogtheta, axes=[1, 1])

        # Form alphas, a K*N matrix with the component weights included
        alphas = (betas + tt.log(weights)).T

        # Take the LSE column-wise to get an N-vector; subtracting and
        # re-adding alphas[0, :] stabilizes the exponentials. This gives
        # the total likelihood across all components for each X_i
        lse_clust = pm.math.logsumexp(alphas - alphas[0, :], axis=0) + alphas[0, :]

        # Final overall sum
        return tt.sum(lse_clust)
    return _logp
``````
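For sanity-checking the logp outside of Theano, here is a pure-NumPy version of the same computation (my own sketch, not from the original post; the function name is hypothetical):

``````python
import numpy as np

def bernoulli_mixture_loglh_np(value, theta, weights):
    """NumPy reference for the Theano logp above (hypothetical helper).

    value:   (N, D) binary data matrix
    theta:   (K, D) Bernoulli probabilities per mixture component
    weights: (K,)   mixture weights, summing to 1
    """
    value = (value > 0.5).astype(float)
    # (N, K) log-likelihood of each row under each component
    betas = value @ np.log(theta).T + (1 - value) @ np.log(1 - theta).T
    # Add the log mixture weights
    alphas = betas + np.log(weights)
    # Stabilized log-sum-exp across components (subtract the row max)
    m = alphas.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(alphas - m).sum(axis=1))
    return lse.sum()
``````

Comparing this against the Theano graph on a few random inputs is a cheap way to catch broadcasting or transposition bugs in the tensordot calls.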

The problem is that when I use it as a latent variable, I get continuous values during sampling, even when I explicitly request Metropolis-Hastings sampling. This makes sense, because the domain of the variable is never specified.

One hacky workaround I tried was to apply the `value > 0.5` thresholding everywhere the variable appeared, but I didn't really get good results with this, and ultimately I'd want something that works better.

How are you using `bernoulli_loglh` in your model? Via `pm.DensityDist`? One way to get only integer values is to create a Distribution class that inherits from `pm.Discrete`, but all that does is tell the Metropolis sampler to round its proposed values before evaluating the logp. You can achieve the same thing if you round the value inside your `_logp` and then save the rounded accepted values in a `pm.Deterministic`. Something like:

``````python
with pm.Model() as m:
    my_dist = pm.DensityDist('my_dist', bernoulli_loglh(...))
    # tt.round keeps the expression symbolic; np.round would not
    # reliably work on a Theano tensor
    my_values = pm.Deterministic('my_values', tt.round(my_dist))
``````

The same applies if you are using a `pm.Potential`.
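To see why rounding inside the logp works, here is a quick one-dimensional illustration in plain NumPy (my own sketch; the function name and `p` are hypothetical):

``````python
import numpy as np

def bernoulli_logp_rounded(v, p=0.7):
    """logp of a Bernoulli(p), evaluated at round(v).

    Rounding first makes the density piecewise constant: every
    continuous proposal in [0, 0.5) is scored as the state 0, and
    every proposal in (0.5, 1] as the state 1, so the sampler
    effectively only ever 'sees' the two discrete states.
    """
    x = np.round(v)  # snap the continuous proposal to {0, 1}
    return x * np.log(p) + (1 - x) * np.log(1 - p)
``````

This is exactly what a `pm.Discrete` subclass achieves by rounding the Metropolis proposal before evaluation; the `pm.Deterministic` in the snippet above then just records those rounded values in the trace.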