MarginalModel is very nice indeed and has uses beyond this. I guess the only two caveats are that there is no NUTS sampling at the moment (for this particular case, though that will probably be resolved later) and that missing data are not allowed. I have yet to compare the two in terms of robustness. MarginalModel may still be my go-to when the dataset is small to medium and there are no missing data.
In this case, I can use numpyro to get speed-ups, but pymc's own sampler seems to fare better when the model is not very well identified (close-ish cluster centers, etc.). numpyro seems to need better-informed priors (such as cluster centers set as relatively vague normals around centers obtained from k-means) to avoid getting stuck on some of the simulated data I have tried.
If someone can briefly explain how, I would be happy to contribute a document to the pymc example gallery covering:
- the importance of batched vs support dims when sampling mixture models
- how the likelihood changes when the batched dim is 1 vs when the support dim is 1
- how to use multiple univariates in mixture models in two ways (MarginalModel, and sticking univariates together)
This could be a nice semi-advanced tutorial that also exemplifies the important distinction between the two kinds of dimensions.