Label switching for multicomponent model

Hi, can anyone help? Suppose we have a two-component model

y(t) = A1 cos(w1 t) + B1 sin(w1 t) + A2 cos(w2 t) + B2 sin(w2 t) + e(t),

where e(t) is Gaussian noise, e(t) ~ N(0, sigma^2), and suppose the true parameters are A1 = B1 = 5, A2 = B2 = 3, w1 = 2, w2 = 1.5.

I want to estimate A1, B1, A2, B2, w1, w2, and sigma^2. I use the nutpie sampler in PyMC with priors A1, B1, A2, B2 ~ N(0, sigma = 10), w1, w2 ~ U(0, pi), and sigma^2 ~ InverseGamma(2, 2). The problem is label switching: sometimes I get the correct estimates, and sometimes the two components are swapped, e.g. for chain 1 I get A1 = B1 = 3, A2 = B2 = 5, w1 = 1.5, w2 = 2. Is there any solution?
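For concreteness, the setup above might look roughly like this in PyMC (a minimal sketch; the simulated data, the time grid `t`, the noise level, and the seed are assumptions for illustration):

```python
import numpy as np
import pymc as pm

# Simulate data with the true values described above (grid and noise level are illustrative)
rng = np.random.default_rng(42)
t = np.linspace(0, 10, 200)
y_obs = (5 * np.cos(2 * t) + 5 * np.sin(2 * t)
         + 3 * np.cos(1.5 * t) + 3 * np.sin(1.5 * t)
         + rng.normal(0, 1.0, size=t.size))

with pm.Model() as model:
    # Priors as stated in the question
    A1 = pm.Normal("A1", mu=0, sigma=10)
    B1 = pm.Normal("B1", mu=0, sigma=10)
    A2 = pm.Normal("A2", mu=0, sigma=10)
    B2 = pm.Normal("B2", mu=0, sigma=10)
    w1 = pm.Uniform("w1", lower=0, upper=np.pi)
    w2 = pm.Uniform("w2", lower=0, upper=np.pi)
    sigma2 = pm.InverseGamma("sigma2", alpha=2, beta=2)

    # Two-component sinusoidal mean
    mu = (A1 * pm.math.cos(w1 * t) + B1 * pm.math.sin(w1 * t)
          + A2 * pm.math.cos(w2 * t) + B2 * pm.math.sin(w2 * t))
    pm.Normal("y", mu=mu, sigma=pm.math.sqrt(sigma2), observed=y_obs)

    # Requires nutpie to be installed; sampling this model as-is can show
    # the label switching described above, since the two components are exchangeable
    idata = pm.sample(nuts_sampler="nutpie")
```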

Welcome!

You can find discussion about enforcing orderings if you use the search functionality. For example, here and here.


Michael Betancourt wrote a really nice case study on identifying mixtures:

https://betanalpha.github.io/assets/case_studies/identifying_mixture_models.html

Ordering works well, for example forcing A1 < A2, and perhaps surprisingly it doesn’t change the inferences you care about. You can code this directly by parameterizing A1, defining a new parameter A2_raw, and taking A2 = A1 + exp(A2_raw); in PyMC you then need to put a prior on A2_raw. Alternatively, you can define A2 = A1 + B, where B is any positive-constrained variable, such as one with a lognormal or gamma prior. A sketch of both options follows.
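In PyMC that might look like the following (a minimal sketch; the prior on A2_raw and the name `gap` are illustrative choices, not part of the original suggestion):

```python
import pymc as pm

with pm.Model() as ordered_model:
    A1 = pm.Normal("A1", mu=0, sigma=10)

    # Option 1: A2 = A1 + exp(A2_raw), with a prior on the unconstrained A2_raw
    A2_raw = pm.Normal("A2_raw", mu=0, sigma=1)
    A2 = pm.Deterministic("A2", A1 + pm.math.exp(A2_raw))

    # Option 2: A2 = A1 + gap, where gap has an explicitly positive prior
    # gap = pm.Gamma("gap", alpha=2, beta=1)
    # A2 = pm.Deterministic("A2", A1 + gap)
```

Either way, A2 is constrained to be larger than A1, which removes the relabeling symmetry.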

A second approach is to try to relate the chains post hoc. This is a mess and I wouldn’t recommend it.

Another solution is to focus on posterior predictive quantities that don’t depend on the labels. For example, if you do any kind of held-out prediction, such as cross-validation over the data, and only care about the posterior predictive log likelihood, the labels don’t matter. The main difficulty is with adapting algorithm parameters and with convergence monitoring, but you can monitor the overall log density as a proxy, and parameters like your sigma won’t vary with the labeling since they aren’t label-specific. I don’t think Michael covers this approach in his case study.
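A rough sketch of checking label-invariant quantities, assuming the `model` and `idata` objects from the first code block above (ArviZ is assumed here as the usual companion library):

```python
import arviz as az
import pymc as pm

with model:
    # Pointwise log likelihood, needed for predictive comparisons such as LOO
    pm.compute_log_likelihood(idata)

# Posterior predictive score: invariant to how the two components are labeled
print(az.loo(idata))

# Convergence check on a parameter that is not label-specific
print(az.rhat(idata, var_names=["sigma2"]))
```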


Thank you for replying, sir. I ran into another issue: when I use the parameterization you suggested, A2 = A1 + exp(A2_raw), I get divergences. Also, with the previous setup, the results I get are shown in the trace plot.