Baum-Welch example in PyMC3

Dear all,
I wonder if anyone can point me to an example of the Baum-Welch algorithm in PyMC3 that can be used to learn the parameters of a given Hidden Markov Model.
I have been trying to use the Python package hmmlearn for this, but the results do not seem very good: with the code below I do not recover the parameters that I used to generate the sequences in my toy example.
In the toy example below I first initialise an HMM and generate some sequences using known parameters; I then initialise a second HMM and fit it to the generated sequences in order to learn those parameters back. The fitted parameters never get close enough to the originals, even for this simple example, which makes me worried about how this will behave on my more complex data.
I therefore wonder if there is anything similar in PyMC3, since Stan seems to have fairly good coverage of HMMs.

import numpy as np
from copy import deepcopy
from hmmlearn import hmm


def initialize_components(n_components, n_emissions):
    """
    Initialize the start probabilities, transition matrix and emission matrix
    for a model with a fixed number of components and a Multinomial emission
    model with a given number of symbols.
    """
    trans_mat = np.abs(np.random.randn(n_components, n_components))
    trans_mat = (trans_mat.T / trans_mat.sum(axis=1)).T

    emission_mat = np.abs(np.random.randn(n_components, n_emissions))
    emission_mat = (emission_mat.T / emission_mat.sum(axis=1)).T

    start_probs = np.abs(np.random.randn(n_components))
    start_probs /= start_probs.sum()
    return start_probs, trans_mat, emission_mat

def generate_sequences(Pi, T, E, n_components=3, n_features=6, N=200, K=20):
    """
    Initialise an HMM with the given parameters and generate some sequences from it.
    """
    states = []
    seqs = []
    seq_lengths = []

    model = hmm.MultinomialHMM(n_components=n_components)
    model.startprob_ = Pi
    model.transmat_ = T
    model.emissionprob_ = E
    model.n_features = n_features
    print('*** Original parameters *********')
    print('p_init')
    print(model.startprob_)
    print('p_trans')
    print(model.transmat_)
    print('p_emm')
    print(model.emissionprob_)

    for j in range(N):
        seq_length = K
        ### draw one sequence together with its hidden state path
        seq, z = model.sample(seq_length)
        seq_lengths.append(seq_length)
        states.append(z)
        if len(seqs) > 0:
            seqs = np.append(seqs, seq, axis=0)
        else:
            seqs = seq

    ### score the generated data under the true parameters (no refitting here)
    score = model.score(seqs, seq_lengths)
    print('Score: ', score)
    return seqs, states, seq_lengths

def train(hmm_model, Seqs, Lengths, nIteration):
    """
    Train the model on the given data with random restarts and keep the best
    describing parameters found (Baum-Welch, as implemented by hmmlearn's fit).
    """
    best_model = None
    best_trans = None
    best_emissions = None
    best_pi = None
    max_score = -np.inf

    for i in range(nIteration):
        ### random starting point for this restart
        pi, A, B = initialize_components(n_states, n_emissions)
        hmm_model.emissionprob_ = B
        hmm_model.transmat_ = A
        hmm_model.startprob_ = pi
        hmm_model.n_features = n_emissions
        ### keep the starting point set above: by default hmmlearn's fit()
        ### re-initialises startprob_, transmat_ and emissionprob_ itself
        hmm_model.init_params = ''
        hmm_model.fit(Seqs, Lengths)
        this_score = hmm_model.score(Seqs, Lengths)

        if this_score > max_score:
            max_score = this_score
            ### copy the fitted model, otherwise later restarts overwrite it
            best_model = deepcopy(hmm_model)
            ### also keep the starting point that led to the best score
            best_pi = pi
            best_trans = A
            best_emissions = B
        if i % 10 == 0:
            print('^' * 50)
            print('\niteration', i)
            print('best score: ', max_score)

    print(max_score)
    print(best_model.startprob_)
    print(best_model.transmat_)
    print(best_model.emissionprob_)
    return best_model

##################
S = ["S1", "S2", "S3"]
n_states = len(S)
E = [i + 1 for i in range(6)]  ## sides of a die
n_emissions = len(E)
trans_prob = np.array(
    [[0.9,  0.05, 0.05],
     [0.1,  0.8,  0.1],
     [0.05, 0.05, 0.9]]
)
start_prob = np.array([0.8, 0.1, 0.1])

### Let's assume we are throwing 3 six-sided dice: one is fair, with equal probability for each side, and the other two are loaded
em_prob = np.array([[1/4, 3/20, 3/20, 3/20, 3/20, 3/20],
                    [2/15, 2/15, 2/15, 2/15, 2/15, 1/3],
                    [1.0/6, 1.0/6, 1.0/6, 1.0/6, 1.0/6, 1.0/6]])

X, Z, L = generate_sequences(start_prob, trans_prob, em_prob, n_components=3, n_features=6, N=200, K=20)
model2 = hmm.MultinomialHMM(n_components=3, n_iter=200)

fitted_hmm = train(model2, X, L, 100)
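
(Note: the learned states can come back in a different order than the generating ones, so before comparing parameters I align the states first. align_states below is just a small brute-force helper I use for that, relying on fitted_hmm, em_prob, trans_prob and start_prob from the script above.)

from itertools import permutations

def align_states(fitted, true_emission):
    ### find the state relabelling whose emission rows best match the true ones
    n = true_emission.shape[0]
    best_perm, best_err = None, np.inf
    for perm in permutations(range(n)):
        err = np.abs(fitted.emissionprob_[list(perm)] - true_emission).sum()
        if err < best_err:
            best_perm, best_err = list(perm), err
    return best_perm

perm = align_states(fitted_hmm, em_prob)
print(fitted_hmm.transmat_[np.ix_(perm, perm)])  ### compare to trans_prob
print(fitted_hmm.emissionprob_[perm])            ### compare to em_prob
print(fitted_hmm.startprob_[perm])               ### compare to start_prob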

The closest PyMC3 implementation to what Stan offers for HMMs is How to marginalized Hidden Markov Model with categorical?. You can try it and see whether it fits your use case.
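
To give a rough idea of what the marginalised model looks like, here is an untested sketch of a forward-recursion HMM in PyMC3 (the names p_start, p_trans, p_emit and marginal_hmm are mine, and obs here is just the first generated sequence from your script, not something the linked notebook defines):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

n_states, n_symbols = 3, 6
obs = X[:20, 0].astype('int64')  ### e.g. the first generated sequence from the script above

with pm.Model() as marginal_hmm:
    ### Dirichlet priors on the start, transition and emission probabilities
    p_start = pm.Dirichlet('p_start', a=np.ones(n_states))
    p_trans = pm.Dirichlet('p_trans', a=np.ones((n_states, n_states)), shape=(n_states, n_states))
    p_emit = pm.Dirichlet('p_emit', a=np.ones((n_states, n_symbols)), shape=(n_states, n_symbols))

    ### forward recursion in log space: marginalise the hidden states
    log_trans = tt.log(p_trans)
    log_emit = tt.log(p_emit)
    alpha = tt.log(p_start) + log_emit[:, obs[0]]
    for t in range(1, len(obs)):
        alpha = pm.math.logsumexp(alpha[:, None] + log_trans, axis=0).flatten() + log_emit[:, obs[t]]
    pm.Potential('loglike', pm.math.logsumexp(alpha, axis=0))

    trace = pm.sample(1000, tune=1000)

The Python loop unrolls the recursion for a fixed sequence length; for longer or variable-length sequences you would normally replace it with theano.scan.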

Thank you.
I will look into the link you sent, though my initial impression is that it covers algorithms such as forward and Viterbi, but not Baum-Welch.

You are right, it uses the forward method to better accumulate the log_prob, but we don't do that kind of inference there, so Baum-Welch is not implemented.
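
Just for reference, the quantity being accumulated is the usual forward-pass log-likelihood; in plain NumPy it looks roughly like this (the function name and arguments are mine):

import numpy as np
from scipy.special import logsumexp

def forward_loglik(obs, start_prob, trans_prob, em_prob):
    ### log P(obs) for one integer observation sequence, hidden states marginalised out
    log_alpha = np.log(start_prob) + np.log(em_prob[:, obs[0]])
    for o in obs[1:]:
        log_alpha = logsumexp(log_alpha[:, None] + np.log(trans_prob), axis=0) + np.log(em_prob[:, o])
    return logsumexp(log_alpha)

Baum-Welch would additionally need a backward pass and the re-estimation (M-step) updates, which is the part that is not done there.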

But it looks like the section labelled ‘Semi-supervised model’ in the notebook you linked does the same job! Am I right? All I need is to infer the parameters from the observed data, and that gives me the parameters!
I am just trying to figure out how it works, because I am not very fluent with the syntax.
cheers
hashem

Sorry to bother you again.
Just to add that my impression was wrong. For the ‘Semi-supervised’ model to run, one needs both the observed sequences (labelled sequences) and the state sequences, whereas in my case (and for Baum-Welch) all I have are the observed sequences, with no state sequences known a priori.

You can run the unsupervised model, which also uses the forward method. However, as discussed in that thread, inference on this kind of model is extremely difficult in both PyMC3 and Stan. Most likely some EM-like algorithm is needed.
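
If you do want to experiment with an EM approach directly, a minimal single-sequence Baum-Welch in plain NumPy looks something like the sketch below (all names are mine and it is only a sketch, not production code):

import numpy as np
from scipy.special import logsumexp

def baum_welch(obs, n_states, n_symbols, n_iter=50, seed=0):
    ### one random restart of Baum-Welch (EM) for a single integer observation sequence
    obs = np.asarray(obs)
    T = len(obs)
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(n_states))
    A = rng.dirichlet(np.ones(n_states), size=n_states)   ### A[i, j] = P(next state j | state i)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)  ### B[i, k] = P(symbol k | state i)

    for _ in range(n_iter):
        ### E-step: forward and backward passes in log space
        log_A, log_B = np.log(A), np.log(B)
        log_alpha = np.zeros((T, n_states))
        log_beta = np.zeros((T, n_states))
        log_alpha[0] = np.log(pi) + log_B[:, obs[0]]
        for t in range(1, T):
            log_alpha[t] = logsumexp(log_alpha[t - 1][:, None] + log_A, axis=0) + log_B[:, obs[t]]
        for t in range(T - 2, -1, -1):
            log_beta[t] = logsumexp(log_A + log_B[:, obs[t + 1]] + log_beta[t + 1], axis=1)
        log_evidence = logsumexp(log_alpha[-1])

        ### posterior state marginals gamma[t, i] and pairwise marginals xi[t, i, j]
        gamma = np.exp(log_alpha + log_beta - log_evidence)
        xi = np.exp(log_alpha[:-1, :, None] + log_A[None]
                    + log_B[:, obs[1:]].T[:, None, :] + log_beta[1:, None, :] - log_evidence)

        ### M-step: re-estimate pi, A, B from the expected counts
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]

    return pi, A, B, log_evidence

Run on e.g. obs = X[:, 0] from your script with n_states=3 and n_symbols=6, it will only converge to a local optimum, so as with hmmlearn you would still want several random restarts, and the recovered states only match the true ones up to a relabelling.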