Hidden Markov Model using PyMC4

Hello all!

I am a new Bayesian student. I am also new with PyMC. Now, I am interested in estimating the parameters of Hidden Markov Models using Bayesian inference. For that, I wanted to start with this example:
But, using the new version of PyMC.
This is the code:

pymc_hmm_logL.py (5.8 KB)

Unfortunately, I have the followed error when I try to execute the HMM algorithm. I thought the problem is in the Categorical distribution. But, I am not sure.

ValueError: Incompatible parametrization. Must specify either p or logit_p.

I appreciate any help in this regard, and it will also help me advance in my learning process.

1 Like

@jessegrabowski @ericmjl

Hi Juan!

You can directly add code to your posts by surrounding it with triple backticks (```). This is a lot easier for us to discuss than attaching a script.

Tensorflow isn’t used with PyMC; despite some conversation about it as a possibility, it has never been used. The computational backend is pytensor, which you should be using.

Otherwise, I think it would be good if you could cut this code down to a minimum example that reproduces your error. There’s a lot of support code mixed in with the model code, so it’s not clear to me where I need to help you debug.

I can’t speak to @ericmjl 's specific approach, but we’ve been discussing HMMs quite a bit lately in the context of automatic marginalization and the new(ish) DiscreteMarkovChain distribution. See here for an example, or search the discourse for recent discussion on HMMs.

Hi Jesse!

Thanks for you reply. Here is my code. This first part is for data generation for HMM training.

# Import libraries
import pymc as pm
import arviz as az
import numpy as np
import seaborn as sns
import pytensor.tensor as pt
import tensorflow as tf
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.stats import multinomial
from typing import List

# Here: Data generation
# Analytic solution to obtain equilibrium distribution
def equilibrium_distribution(p_transition):
    n_states = p_transition.shape[0]
    A = np.append(
        arr=p_transition.T - np.eye(n_states),
    b = np.transpose(np.array([0] * n_states + [1]))
    p_eq = np.linalg.solve(
    return p_eq

# Generate a Markov sequence based on p_init and p_transition
def markov_sequence(p_init: np.array, p_transition: np.array, sequence_lenght: int) -> List[int]:
    if p_init is None:
        p_init = equilibrium_distribution(p_transition)
    initial_state = list(multinomial.rvs(1, p_init)).index(1)

    states = [initial_state]
    for _ in range(sequence_lenght - 1):
        p_tr = p_transition[states[-1]]
        new_state = list(multinomial.rvs(1, p_tr)).index(1)
    return states

# Generate de emissions
def gaussian_emissions(states: List[int], mus: List[float], sigmas: List[float]) -> List[float]:
    emissions = []
    for state in states:
        loc = mus[state]
        scale = sigmas[state]
        e = norm.rvs(loc=loc, scale=scale)
    return emissions

def plot_emissions(states, emissions):
    fig, axes = plt.subplots(figsize=(16, 8), nrows=2, ncols=1, sharex=True)

p_transition = np.array(
    [[0.6, 0.2, 0.2],
     [0.1, 0.8, 0.1],
     [0.1, 0.2, 0.7]]

states = markov_sequence(p_init=[1,0,0], p_transition=p_transition, sequence_lenght=1000)
gaussian_ems = gaussian_emissions(states=states, mus=[1,0,-1], sigmas=[0.2,0.5,0.1])

The second part is related with the model:

# Here: Using PyMC for Bayessian inference

class HMMstates():
    def __init__(self, p_transition, p_equilibrium, n_states, *args, **kwargs):
        self.p_transition = p_transition
        self.p_eqilibrium = p_equilibrium
        self.k = n_states
        self.mode = tf.cast(0, dtype='int64')

    def logp(self, x):
        p_eq = self.p_eqilibrium
        p_tr = self.p_transition[x[:-1]]

        # The logp of the initial state
        initial_state_logp = pm.Categorical.dist(p=p_eq).logp(x[0])

        # The logp of the rest of the states
        x_i = x[1:]
        ou_like = pm.Categorical.dist(p=p_tr).logp(x_i)
        transition_logp = tf.sum(ou_like)

        return initial_state_logp + transition_logp

def solve_equilibrium(n_states, p_transition):
    A = np.eye(n_states) - p_transition + np.ones(n_states)
    #p_equilibrium = pm.Deterministic("p_equilibrium", tf.linalg.solve(A.T, tf.ones(n_states)))
    p_equilibrium = pt.linalg.solve(pt.transpose(A), pt.ones((n_states, 1)))
    return p_equilibrium
class HMMGaussianEmissions():
    def __init__(self, states, mu, sigma, *args, **kwargs):
        self.states = states
        # self.rate = rate
        self.mu = mu
        self.sigma = sigma

    def logp(self, x):
        x: observations
        states = self.states
        # rate = self.rate[states]  # broadcast the rate across the states.
        mu = self.mu[states]
        sigma = self.sigma[states]
        return tf.sum(pm.Normal.dist(mu=mu, sigma=sigma).logp(x))
n_states = 3
with pm.Model() as model:
    # Priors for transition matrix
    p_transition = pm.Dirichlet("p_trans", a=np.ones((n_states, n_states)))

    # Solve for the equilibrium state
    #p_equilibrium = pm.Dirichlet('init_probs', a=[20,1,1])
    p_equilibrium = solve_equilibrium(n_states, p_transition)

    # HMM state
    hmm_states = HMMstates(

    # Prior for mu and sigma
    mu = pm.Normal("mu", mu=0, sigma=1, shape=(n_states,))
    sigma = pm.Exponential("sigma", lam=2, shape=(n_states,))

    # Observed emission likelihood
    obs = HMMGaussianEmissions(

with model:
    trace = pm.sample(1000, chains=2, progressbar=True)

I was able to fix the error and run the code without any errors. Now, the problem I have is that the algorithm keeps running and after a couple of hours it does not even show the progress bar. I don’t know if I have something wrong defined in the model or what is the internal process since it shows “NUTS: [p_trans, mu, sigma]” until it shows the progress bar. Thank you for your help.