Mixture of continuous and discrete logp

Your example doesn't really make sense to me… Y is a discrete variable, but it is a mixture of a Binomial and a Normal? Also, since the mixture components are already observed, I guess you can call comp_dist.distribution.logp(value) in your mix_mixlogp:

    import pymc3 as pm
    import theano.tensor as tt

    # Define the mixed logp
    def mix_mixlogp(w, comp_dists):
        def logp_(value):
            # Evaluate value under every component distribution
            comp_logp = tt.squeeze(tt.stack([comp_dist.distribution.logp(value)
                                             for comp_dist in comp_dists], axis=1))
            # Weight each component logp and marginalize over components
            return pm.math.logsumexp(tt.log(w) + comp_logp, axis=-1)
        return logp_
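
For context, a minimal sketch of how that logp might be wired into a model with pm.DensityDist (the data and component choices here are made up for illustration; comp1 and comp2 are model RVs only so that the .distribution attribute used above is available):

    import numpy as np
    import pymc3 as pm

    y = np.random.poisson(5, size=100)  # toy data, made up for illustration

    with pm.Model() as model:
        w = pm.Dirichlet('w', a=np.ones(2))
        # free RVs whose .distribution carries the component logp, as above
        comp1 = pm.Normal('comp1', mu=5., sigma=2.)
        comp2 = pm.Poisson('comp2', mu=5.)
        # marginalized mixture likelihood over the observed data
        mix = pm.DensityDist('mix', mix_mixlogp(w, [comp1, comp2]), observed=y)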

But I am not sure whether it really makes sense.

The advantage of a mixture model is that you don't need to know which portion of the data comes from which component - you don't need a discrete latent label, as each data point is evaluated under all components, and the weight (after inference) informs us which component each data point is more likely to belong to.
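
To make that concrete, here is a small numpy sketch (with made-up posterior means and fitted parameters) of how you would compute the per-point membership probabilities from the weights and the component densities:

    import numpy as np
    from scipy import stats
    from scipy.special import logsumexp

    y = np.random.poisson(5, size=100)  # toy data, made up
    w_hat = np.array([0.3, 0.7])        # hypothetical posterior mean of w

    # log-density of each point under each component (hypothetical parameters);
    # note for a continuous + discrete mixture this compares a pdf with a pmf,
    # which is part of why such a mixture may not really make sense
    comp_logp = np.stack([stats.norm(5., 2.).logpdf(y),
                          stats.poisson(5.).logpmf(y)], axis=1)

    # responsibility r[i, k] is proportional to w[k] * p_k(y[i]);
    # normalize in log space for numerical stability
    log_r = np.log(w_hat) + comp_logp
    r = np.exp(log_r - logsumexp(log_r, axis=1, keepdims=True))
    # r[i] now tells us which component point i more likely belongs to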