Hello, I’m trying to figure out the right way to specify a model for this data in PyMC3:
So I have about 70 observations, and only 1 of them is informative. I have a mixture of 2 classes, each with a different informative observation.
I’m able to capture the important observation when there’s only 1 class (not a mixture) with the code below. Can anybody give me a hint on how to extend it to a mixture? I don’t need a detailed solution, even a hint would be so helpful!
Thanks @junpenglao! This is a huge help, and I’m able to run my example based on the non-marginalized mixture. I’m still working on figuring out how the marginalized mixture works, because sampling performance with categorical variables is pretty brutal.
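For anyone following along, here’s a minimal numpy sketch (all values hypothetical) of what the marginalized mixture does: instead of sampling a discrete class label for each observation, the label is summed out of the likelihood with a log-sum-exp, which is why no categorical variable is left to sample.

```python
import numpy as np

# Hypothetical 2-component Gaussian mixture, just to show the computation.
w = np.array([0.3, 0.7])        # mixture weights
mu = np.array([-2.0, 2.0])      # component means
sigma = np.array([1.0, 1.0])    # component standard deviations
y = np.array([-2.1, 1.9, 0.0])  # observations

def normal_logpdf(y, mu, sigma):
    # log density of a Normal(mu, sigma) evaluated at y
    return -0.5 * np.log(2 * np.pi * sigma ** 2) - 0.5 * ((y - mu) / sigma) ** 2

# log p(y_n) = logsumexp_k [ log w_k + log N(y_n | mu_k, sigma_k) ]
log_terms = np.log(w) + normal_logpdf(y[:, None], mu, sigma)  # shape (N, K)
m = log_terms.max(axis=1, keepdims=True)                      # max for numerical stability
loglike_per_obs = (m + np.log(np.exp(log_terms - m).sum(axis=1, keepdims=True))).ravel()
total_loglike = loglike_per_obs.sum()
```

`pm.NormalMixture` does essentially this marginalization internally, whereas the non-marginalized version samples the per-observation labels explicitly.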
Hi @junpenglao. I’m trying to extend your notebook to the multivariate case that I have, but I can’t seem to get the dimensions correct. Can I still use the pm.NormalMixture function when I have multiple observations like in the picture above? I’m not sure where to apply the softmax, because my output data is a 2-D array and not a vector of classes.
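To clarify the mechanics (a hypothetical numpy sketch, not the model itself): a softmax is applied to unconstrained per-row *logits* to get class probabilities that sum to 1 along an axis — it would act on the mixture-weight parameters, not on the observed 2-D data.

```python
import numpy as np

def softmax(x, axis=-1):
    # Map unconstrained logits to probabilities summing to 1 along `axis`.
    z = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical logits for N = 3 rows and K = 2 classes.
logits = np.array([[0.5, -1.0],
                   [2.0,  2.0],
                   [-3.0, 0.0]])
probs = softmax(logits, axis=1)  # each row now sums to 1
```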
I have something close that’s giving my best result yet. It’s using a DensityDist, like the Gaussian Mixture Model with ADVI example in the PyMC3 docs. I’d like to try your idea of applying a softmax constraint, but I’m not sure where to do that. Is that an additional factor in the “logps” variable? I have a GitHub notebook modified to use my data if you’d like to take a peek :).
Hi @chris,
Actually, I don’t think my suggestion of using a softmax was well thought out. Looking at the data generation process, each row does not necessarily sum to 1, right? If you have K > 2 you can try modeling the pi using a softmax, but I am not sure it is better than using a Dirichlet (which uses a stick-breaking process internally).