Label switching in Hidden Markov Models

j_catulo · June 20, 2022, 11:38am

Hi!

I implemented a Hidden Markov Model with Gaussian emissions to draw posterior samples from multiple sequences of observations. My chains seem to be converging, but I am running into label switching problems. The code and results are below.

Code

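# Imports assumed for PyMC v4 (the `at` alias suggests aesara).
import numpy as np
import pymc as pm
import aesara.tensor as at

# `observations` (the data) and `hmm_logp_value_grad_op` (a custom Op for the
# HMM log-likelihood) are defined elsewhere and not shown in this post.

N_states = 8
coord = {'emissions': np.arange(N_states)}

with pm.Model(coords=coord) as model:

    observations = pm.Data('data', observations, mutable=True)

    # Flat Dirichlet priors: one per row of the transition matrix, one for the initial state.
    Pt = pm.Dirichlet("p_transition", np.ones((N_states, N_states)), shape=(N_states, N_states))
    P0 = pm.Dirichlet("p_init", np.ones((N_states,)), shape=(N_states,))

    logp_initial_state = at.log(P0)
    logp_transition = at.log(Pt)

    # Emission parameters: one mean and one standard deviation per hidden state.
    mu = pm.Normal('mu', mu=[-30, -25, -20, -16, -12, -9, -6, -5], sigma=[2]*5 + [1]*3)
    sigma = pm.InverseGamma('sigma', alpha=40, beta=80, dims='emissions')

    loglike = pm.Potential("hmm_loglike", hmm_logp_value_grad_op(observations, mu, sigma, logp_initial_state, logp_transition))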

Results

[Image: label_switch]

After some research, I found that label switching is a common problem in mixture models and is caused by a symmetry of the likelihood with respect to the model parameters, but the reasoning behind it is still not very clear to me. Can anyone recommend some literature on this topic?

Why does this happen and what can I do to prevent this?
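
For what it's worth, the symmetry can be checked numerically. The sketch below is not the custom Op from the model above; it is a small NumPy forward algorithm written only for illustration, with made-up parameter values and three states instead of eight. It shows that renaming the hidden states (permuting mu, sigma, the initial probabilities, and both the rows and columns of the transition matrix) leaves the log-likelihood unchanged, so the data alone cannot tell the labelings apart.

```python
import numpy as np

def gaussian_logpdf(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

def hmm_loglike(obs, mu, sigma, p_init, p_trans):
    """Forward algorithm in log space for a Gaussian-emission HMM."""
    # log_emis[t, k] = log N(obs[t] | mu[k], sigma[k])
    log_emis = gaussian_logpdf(obs[:, None], mu[None, :], sigma[None, :])
    log_alpha = np.log(p_init) + log_emis[0]
    for t in range(1, len(obs)):
        # Sum over the previous state, shifted by the max for numerical stability.
        m = log_alpha.max()
        log_alpha = m + np.log(np.exp(log_alpha - m) @ p_trans) + log_emis[t]
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())

# Illustrative values only (not taken from the model in this thread).
rng = np.random.default_rng(0)
K = 3
obs = rng.normal(size=50)
mu = np.array([-2.0, 0.0, 2.0])
sigma = np.array([0.5, 1.0, 1.5])
p_init = rng.dirichlet(np.ones(K))
p_trans = rng.dirichlet(np.ones(K), size=K)  # each row sums to 1

# Relabel the states: new state a is old state perm[a].
perm = np.array([2, 0, 1])
ll_original = hmm_loglike(obs, mu, sigma, p_init, p_trans)
ll_permuted = hmm_loglike(
    obs, mu[perm], sigma[perm], p_init[perm], p_trans[np.ix_(perm, perm)]
)
print(ll_original, ll_permuted)  # the two values agree up to floating point
```

With K states there are K! such relabelings, each giving the same likelihood. Unless the prior clearly separates them, different chains can settle on different labelings, which would look like the plot above.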

Thank you in advance.

j_catulo · June 20, 2022, 4:38pm

This post seems to offer a good solution but, as noted there, it requires overwriting the data in the trace object to swap the sampled values for the switched dimensions. Is it possible to edit the data of an ArviZ data structure?

I want to do something like:

idata2 = idata.copy()
idata2.sel(chain=[1]).posterior['mu'][0][:,0] = idata2.sel(chain=[1]).posterior['mu'][0][:,1]

But, by doing this, idata2 is unaltered.
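
A likely reason, as far as I understand xarray: `.sel(chain=[1])` returns a new object (selecting with a list makes a copy), so the assignment modifies that temporary copy and not `idata2`. A minimal sketch of an in-place alternative is below. It uses a small xarray Dataset as a stand-in for `idata2.posterior`, and the dimension name `mu_dim_0` is an assumption about how the component axis is named.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Stand-in for idata2.posterior: 2 chains, 5 draws, 3 components.
# The dimension name "mu_dim_0" is assumed for illustration.
posterior = xr.Dataset(
    {"mu": (("chain", "draw", "mu_dim_0"), rng.normal(size=(2, 5, 3)))},
    coords={"chain": [0, 1], "draw": np.arange(5), "mu_dim_0": np.arange(3)},
)
before = posterior["mu"].values.copy()

# Swap components 0 and 1 in chain 1. Build the reordered values first,
# then assign through .loc on the original object (not on a .sel() result).
perm = [1, 0, 2]
swapped = posterior["mu"].sel(chain=1).values[:, perm]
posterior["mu"].loc[dict(chain=1)] = swapped
```

Building the reordered array before assigning also avoids overwriting column 0 before its old values have been moved to column 1.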

DanWeitzenfeld · June 21, 2022, 8:38pm

Could you use an ordered transform on mu?

j_catulo · June 22, 2022, 10:49am

Thank you for your answer!

I cannot find in the documentation what that function does to my variables. Either way, I implemented it and the inferences were much worse.
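
For reference, my understanding of the ordered transform is that the sampler works on an unconstrained vector and maps it to a strictly increasing one: the first element is kept as is, and each later element is the previous one plus the exponential of an unconstrained value. A NumPy sketch of that mapping (the function names are made up for illustration and are not PyMC API):

```python
import numpy as np

def ordered_backward(y):
    """Map an unconstrained vector y to a strictly increasing vector x."""
    increments = np.empty_like(y, dtype=float)
    increments[0] = y[0]            # first element is unconstrained
    increments[1:] = np.exp(y[1:])  # later gaps are forced to be positive
    return np.cumsum(increments)

def ordered_forward(x):
    """Inverse mapping: from an increasing vector x back to unconstrained y."""
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]
    y[1:] = np.log(np.diff(x))
    return y

y = np.array([0.3, -1.2, 2.0, 0.0])
x = ordered_backward(y)
print(x)  # strictly increasing
```

In PyMC this is usually requested with `transform=pm.distributions.transforms.ordered` on the variable, together with an increasing `initval`. Starting from values that are not increasing is, as far as I know, a common cause of poor results with this transform.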

I think that, in my case, the best way to handle this is by post-processing the trace. I don’t have mid-chain label switching, so I can just relabel my components.
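
A rough sketch of that relabeling, assuming the posterior draws have been pulled out as NumPy arrays and that each chain keeps one labeling throughout. Sorting the states by each chain's posterior mean of mu is my choice of convention here, and `relabel_by_mean` is a made-up helper, not a library function. The same permutation has to be applied to sigma, p_init, and to both the rows and columns of p_transition.

```python
import numpy as np

def relabel_by_mean(mu, sigma, p_init, p_trans):
    """Reorder states within each chain by that chain's posterior mean of mu.

    Shapes: mu, sigma, p_init are (chain, draw, K); p_trans is (chain, draw, K, K).
    Returns relabeled copies; the inputs are left untouched.
    """
    mu, sigma, p_init, p_trans = (
        np.array(a, dtype=float) for a in (mu, sigma, p_init, p_trans)
    )
    for c in range(mu.shape[0]):
        perm = np.argsort(mu[c].mean(axis=0))
        mu[c] = mu[c][:, perm]
        sigma[c] = sigma[c][:, perm]
        p_init[c] = p_init[c][:, perm]
        p_trans[c] = p_trans[c][:, perm][:, :, perm]  # rows, then columns
    return mu, sigma, p_init, p_trans

# Synthetic check: chain 1 holds the same draws as chain 0 with switched labels.
rng = np.random.default_rng(1)
n_draws, K = 200, 3
base_mu = np.array([-3.0, 0.0, 3.0]) + 0.1 * rng.normal(size=(n_draws, K))
base_sigma = np.array([0.5, 1.0, 1.5]) + 0.01 * rng.normal(size=(n_draws, K))
base_p0 = rng.dirichlet(np.ones(K), size=n_draws)
base_P = rng.dirichlet(np.ones(K), size=(n_draws, K))
switch = [1, 2, 0]
mu = np.stack([base_mu, base_mu[:, switch]])
sigma = np.stack([base_sigma, base_sigma[:, switch]])
p_init = np.stack([base_p0, base_p0[:, switch]])
p_trans = np.stack([base_P, base_P[:, switch][:, :, switch]])

mu_r, sigma_r, p_init_r, p_trans_r = relabel_by_mean(mu, sigma, p_init, p_trans)
```

This only makes sense when the component means are well separated. If the labels also switched within a chain, a per-draw method would be needed instead.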
