Baum-Welch example in PyMC3

Dear all,
I wonder if anyone can point me to an example of the Baum-Welch algorithm in PyMC3 that can be used to learn the parameters of a given Hidden Markov Model.
I have been trying to use the Python package hmmlearn for this, but the results do not seem very good: with the code below I do not recover the parameters that I used to generate the sequences in my toy example.
In the toy example below I first initialise an HMM and generate some sequences using known parameters; I then initialise a second HMM and fit it to the generated sequences in order to learn those parameters back. The fitted parameters never get close enough to the originals, even for this simple example, which makes me worried about how this will behave on my more complex data.
I therefore wonder if there is anything similar in PyMC3, since Stan seems to have fairly good coverage of HMMs.

import numpy as np
from copy import deepcopy
from hmmlearn import hmm


def initialize_components(n_components, n_emissions):
    """
    Initialize the start probabilities, transition matrix and emission matrix
    for a model with a fixed number of components and a Multinomial emission
    model with a given number of symbols.
    """
    trans_mat = np.abs(np.random.randn(n_components, n_components))
    trans_mat = (trans_mat.T / trans_mat.sum(axis=1)).T

    emission_mat = np.abs(np.random.randn(n_components, n_emissions))
    emission_mat = (emission_mat.T / emission_mat.sum(axis=1)).T

    start_probs = np.abs(np.random.randn(n_components))
    start_probs /= start_probs.sum()
    return start_probs, trans_mat, emission_mat

def generate_sequences(Pi, T, E, n_components=3, n_features=6, N=200, K=20):
    """
    Initialise an HMM with the given parameters and generate some sequences from it.
    """
    states = []
    seqs = []
    seq_lengths = []

    model = hmm.MultinomialHMM(n_components=n_components)
    model.startprob_ = Pi
    model.transmat_ = T
    model.emissionprob_ = E
    model.n_features = n_features
    print('*** Original parameters *********')
    print('p_init')
    print(model.startprob_)
    print('p_trans')
    print(model.transmat_)
    print('p_emm')
    print(model.emissionprob_)

    for j in range(N):
        seq_length = K
        ### draw one sequence together with its hidden state path
        seq, z = model.sample(seq_length)
        seq_lengths.append(seq_length)
        states.append(z)
        if len(seqs) > 0:
            seqs = np.append(seqs, seq, axis=0)
        else:
            seqs = seq

    ### score the generated data under the true parameters (no refitting here)
    score = model.score(seqs, seq_lengths)
    print('Score: ', score)
    return seqs, states, seq_lengths

def train(hmm_model, Seqs, Lengths, nIteration):
    """
    Train the model on the given data with random restarts and keep the best
    describing parameters found (Baum-Welch, as implemented by hmmlearn's fit).
    """
    best_model = None
    best_trans = None
    best_emissions = None
    best_pi = None
    max_score = -np.inf

    for i in range(nIteration):
        ### random starting point for this restart
        pi, A, B = initialize_components(n_states, n_emissions)
        hmm_model.emissionprob_ = B
        hmm_model.transmat_ = A
        hmm_model.startprob_ = pi
        hmm_model.n_features = n_emissions
        ### keep the starting point set above: by default hmmlearn's fit()
        ### re-initialises startprob_, transmat_ and emissionprob_ itself
        hmm_model.init_params = ''
        hmm_model.fit(Seqs, Lengths)
        this_score = hmm_model.score(Seqs, Lengths)

        if this_score > max_score:
            max_score = this_score
            ### copy the fitted model, otherwise later restarts overwrite it
            best_model = deepcopy(hmm_model)
            ### also keep the starting point that led to the best score
            best_pi = pi
            best_trans = A
            best_emissions = B
        if i % 10 == 0:
            print('^' * 50)
            print('\niteration', i)
            print('best score: ', max_score)

    print(max_score)
    print(best_model.startprob_)
    print(best_model.transmat_)
    print(best_model.emissionprob_)
    return best_model

##################
S = ["S1", "S2", "S3"]
n_states = len(S)
E = [i + 1 for i in range(6)]  ## sides of a die
n_emissions = len(E)
trans_prob = np.array(
    [[0.9,  0.05, 0.05],
     [0.1,  0.8,  0.1],
     [0.05, 0.05, 0.9]]
)
start_prob = np.array([0.8, 0.1, 0.1])

### Let's assume we are throwing 3 six-sided dice: one is fair, with equal probability for each side, and the other two are loaded
em_prob = np.array([[1/4, 3/20, 3/20, 3/20, 3/20, 3/20],
                    [2/15, 2/15, 2/15, 2/15, 2/15, 1/3],
                    [1.0/6, 1.0/6, 1.0/6, 1.0/6, 1.0/6, 1.0/6]])

X, Z, L = generate_sequences(start_prob, trans_prob, em_prob, n_components=3, n_features=6, N=200, K=20)
model2 = hmm.MultinomialHMM(n_components=3, n_iter=200)

fitted_hmm = train(model2, X, L, 100)
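
(Note: the learned states can come back in a different order than the generating ones, so before comparing parameters I align the states first. align_states below is just a small brute-force helper I use for that, relying on fitted_hmm, em_prob, trans_prob and start_prob from the script above.)

from itertools import permutations

def align_states(fitted, true_emission):
    ### find the state relabelling whose emission rows best match the true ones
    n = true_emission.shape[0]
    best_perm, best_err = None, np.inf
    for perm in permutations(range(n)):
        err = np.abs(fitted.emissionprob_[list(perm)] - true_emission).sum()
        if err < best_err:
            best_perm, best_err = list(perm), err
    return best_perm

perm = align_states(fitted_hmm, em_prob)
print(fitted_hmm.transmat_[np.ix_(perm, perm)])  ### compare to trans_prob
print(fitted_hmm.emissionprob_[perm])            ### compare to em_prob
print(fitted_hmm.startprob_[perm])               ### compare to start_prob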

The closest PyMC3 implementation to what Stan offers for HMMs is How to marginalized Hidden Markov Model with categorical?. You can try it and see whether it fits your use case.
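
To give a rough idea of what the marginalised model looks like, here is an untested sketch of a forward-recursion HMM in PyMC3 (the names p_start, p_trans, p_emit and marginal_hmm are mine, and obs here is just the first generated sequence from your script, not something the linked notebook defines):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

n_states, n_symbols = 3, 6
obs = X[:20, 0].astype('int64')  ### e.g. the first generated sequence from the script above

with pm.Model() as marginal_hmm:
    ### Dirichlet priors on the start, transition and emission probabilities
    p_start = pm.Dirichlet('p_start', a=np.ones(n_states))
    p_trans = pm.Dirichlet('p_trans', a=np.ones((n_states, n_states)), shape=(n_states, n_states))
    p_emit = pm.Dirichlet('p_emit', a=np.ones((n_states, n_symbols)), shape=(n_states, n_symbols))

    ### forward recursion in log space: marginalise the hidden states
    log_trans = tt.log(p_trans)
    log_emit = tt.log(p_emit)
    alpha = tt.log(p_start) + log_emit[:, obs[0]]
    for t in range(1, len(obs)):
        alpha = pm.math.logsumexp(alpha[:, None] + log_trans, axis=0).flatten() + log_emit[:, obs[t]]
    pm.Potential('loglike', pm.math.logsumexp(alpha, axis=0))

    trace = pm.sample(1000, tune=1000)

The Python loop unrolls the recursion for a fixed sequence length; for longer or variable-length sequences you would normally replace it with theano.scan.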

Thank you.
I will look into the link you sent, though my initial impression is that it covers algorithms such as forward and Viterbi, but not Baum-Welch.

You are right, it uses the forward method to better accumulate the log_prob, but we don't do that kind of inference there, so Baum-Welch is not implemented.
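
Just for reference, the quantity being accumulated is the usual forward-pass log-likelihood; in plain NumPy it looks roughly like this (the function name and arguments are mine):

import numpy as np
from scipy.special import logsumexp

def forward_loglik(obs, start_prob, trans_prob, em_prob):
    ### log P(obs) for one integer observation sequence, hidden states marginalised out
    log_alpha = np.log(start_prob) + np.log(em_prob[:, obs[0]])
    for o in obs[1:]:
        log_alpha = logsumexp(log_alpha[:, None] + np.log(trans_prob), axis=0) + np.log(em_prob[:, o])
    return logsumexp(log_alpha)

Baum-Welch would additionally need a backward pass and the re-estimation (M-step) updates, which is the part that is not done there.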

But it looks like the section labelled ‘Semi-supervised model’ in the notebook you linked does the same job! Am I right? All I need is to infer the parameters from the observed data, and that gives me the parameters!
I am just trying to figure out how it works, because I am not very fluent with the syntax.
cheers
hashem

Sorry to bother you again.
Just to add that my impression was wrong. For the ‘Semi-supervised’ model to run, one needs both the observed sequences (labelled sequences) and the state sequences, whereas in my case (and for Baum-Welch) all I have are the observed sequences, with no state sequences known a priori.

You can run the unsupervised model, which also uses the forward method. However, as discussed in that thread, inference on this kind of model is extremely difficult in both PyMC3 and Stan. Most likely some EM-like algorithm is needed.
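
If you do want to experiment with an EM approach directly, a minimal single-sequence Baum-Welch in plain NumPy looks something like the sketch below (all names are mine and it is only a sketch, not production code):

import numpy as np
from scipy.special import logsumexp

def baum_welch(obs, n_states, n_symbols, n_iter=50, seed=0):
    ### one random restart of Baum-Welch (EM) for a single integer observation sequence
    obs = np.asarray(obs)
    T = len(obs)
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(n_states))
    A = rng.dirichlet(np.ones(n_states), size=n_states)   ### A[i, j] = P(next state j | state i)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)  ### B[i, k] = P(symbol k | state i)

    for _ in range(n_iter):
        ### E-step: forward and backward passes in log space
        log_A, log_B = np.log(A), np.log(B)
        log_alpha = np.zeros((T, n_states))
        log_beta = np.zeros((T, n_states))
        log_alpha[0] = np.log(pi) + log_B[:, obs[0]]
        for t in range(1, T):
            log_alpha[t] = logsumexp(log_alpha[t - 1][:, None] + log_A, axis=0) + log_B[:, obs[t]]
        for t in range(T - 2, -1, -1):
            log_beta[t] = logsumexp(log_A + log_B[:, obs[t + 1]] + log_beta[t + 1], axis=1)
        log_evidence = logsumexp(log_alpha[-1])

        ### posterior state marginals gamma[t, i] and pairwise marginals xi[t, i, j]
        gamma = np.exp(log_alpha + log_beta - log_evidence)
        xi = np.exp(log_alpha[:-1, :, None] + log_A[None]
                    + log_B[:, obs[1:]].T[:, None, :] + log_beta[1:, None, :] - log_evidence)

        ### M-step: re-estimate pi, A, B from the expected counts
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]

    return pi, A, B, log_evidence

Run on e.g. obs = X[:, 0] from your script with n_states=3 and n_symbols=6, it will only converge to a local optimum, so as with hmmlearn you would still want several random restarts, and the recovered states only match the true ones up to a relabelling.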