Mixture Model Latent Variable (augmented-data) example

Hi,
The example in:

contains a mathematical description of an augmented-data mixture model (a mixture model with latent variables) that is not implemented. It refers to another example where the model is claimed to be implemented:

However, it is not implemented there either.
Is an implementation planned?
Thanks in advance,
Lior

Does pymc.Mixture — PyMC 5.6.1 documentation do what you need?

Hi,
No, it does not; it demonstrates a standard-data mixture model, not the augmented-data one. The augmented-data model involves the cluster indicator, usually denoted z_i, for i in {1,…,N_sample}.
Thanks

I see, I didn’t know what you meant by augmented. For me that’s the normal Mixture model without marginalizing the indicator variable.
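
To make the distinction concrete, here is a sketch of the two formulations in standard notation. The symbols w (mixture weights), \theta_k (component parameters), f (component density) and K (number of components) are my own choices for illustration and are not taken from the linked example.

```latex
% Augmented-data (non-marginalized) form: the indicator z_i is kept as a latent variable
z_i \sim \mathrm{Categorical}(w_1, \dots, w_K), \qquad
y_i \mid z_i \sim f(y_i \mid \theta_{z_i}), \qquad i = 1, \dots, N_{\mathrm{sample}}

% Marginalized form: z_i is summed out of the likelihood
p(y_i \mid w, \theta) = \sum_{k=1}^{K} w_k \, f(y_i \mid \theta_k)
```

Summing the first form over the K possible values of z_i gives the second, so both describe the same distribution for y_i; the difference is only whether z_i is sampled explicitly.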

You can do that by sampling the latent variables and using those to index the parameters, a bit like the coal mining example: Introductory Overview of PyMC — PyMC 5.6.1 documentation

Otherwise, if you want a mixture of different distributions and can’t just use a parameter switch, you can use CustomDist, something like:

import pymc as pm

def dist(idx, mu1, mu2, sigma1, sigma2, size):
  # Components are built with the .dist API so they are not registered in the model
  comp1 = pm.Normal.dist(mu1, sigma1, size=size)
  comp2 = pm.Laplace.dist(mu2, sigma2, size=size)
  # switch selects comp1 where idx is nonzero and comp2 otherwise
  return pm.math.switch(idx, comp1, comp2)

with pm.Model() as m:
  # Prior for indicator variables
  idx = pm.Bernoulli("idx", p=0.5, size=(3,))  # can use categorical for more components
  # Priors for mixture component parameters (left unspecified here)
  mu1 = ...
  mu2 = ...
  sigma1 = ...
  sigma2 = ...
  # Mixture likelihood with custom distribution
  llike = pm.CustomDist("llike", idx, mu1, mu2, sigma1, sigma2, dist=dist, observed=[2, 2, 3])

There was some discussion some time ago about allowing such mixtures from the same Mixture constructor: Allow non-marginalized mixtures · Issue #5717 · pymc-devs/pymc · GitHub

If you think that would be useful, feel free to say so there.

Great!
Thanks a lot!