The problem comes down to non-identifiability: multiple sets of parameters yield exactly the same likelihood, so the likelihood surface has several equivalent modes and the sampler can wander between them. The two ways of fixing this that I know of are:
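To see why this happens in a mixture model, here is a minimal NumPy/SciPy sketch (not PyMC code, just an illustration): swapping the component labels leaves the log-likelihood of the data completely unchanged, so the sampler has no way to prefer one labeling over the other.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic data from a two-component Gaussian mixture
data = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])

def mixture_loglik(mus, weights, data):
    # log p(x) = log sum_k w_k * N(x | mu_k, 1), summed over data points
    comp = np.stack([stats.norm.logpdf(data, mu, 1) for mu in mus])
    return np.logaddexp.reduce(np.log(weights)[:, None] + comp, axis=0).sum()

ll_a = mixture_loglik([-2.0, 2.0], [0.5, 0.5], data)
ll_b = mixture_loglik([2.0, -2.0], [0.5, 0.5], data)  # labels swapped
print(np.isclose(ll_a, ll_b))  # the two labelings are indistinguishable
```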
- Choose smarter priors that constrain your model such that label switching goes away.
- Add additional structure to your model that makes non-identifiability go away.
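As a generic illustration of the second option (assuming your components differ in location, which may not match your exact model): a common piece of added structure is to force the component means into increasing order, so only one of the permuted modes remains reachable. The transform below is a plain NumPy sketch of that idea, not a specific PyMC API.

```python
import numpy as np

def ordered(raw):
    # Map unconstrained parameters to a strictly increasing vector:
    # the first coordinate is free, each later gap is exp(raw_i) > 0.
    out = np.empty_like(raw)
    out[0] = raw[0]
    out[1:] = raw[0] + np.cumsum(np.exp(raw[1:]))
    return out

mus = ordered(np.array([0.3, -1.2, 0.5]))
print(np.all(np.diff(mus) > 0))  # means are forced into increasing order
```

Because every draw of `raw` maps to ordered means, the "label-swapped" mode simply no longer exists in the transformed space.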
Personally, I prefer the second option. @AustinRochford recently wrote a blog post on how to do this that I really like, which you can find here. However, he looks at factor analysis, so I'm not sure how well it translates to your case. There is also some more advice on mixture models in this thread.
I don't immediately see how `pm.Potential` or `pm.transform.ordered` would be of any use here.