Using DiscreteMarkovChain with batch dimensions and data of different lengths

Hi. I’m trying to model experimental data from multiple participants using a hidden Markov model based on pymc-extras’ DiscreteMarkovChain. I have a working model that uses a for loop to create a DiscreteMarkovChain distribution, emission matrix, and observed data for each participant, but it feels like quite an unwieldy solution. I’d like to simply use batch dimensions as I would in other models, but each participant has seen a different number of trials, which means each has a different-length Markov chain.
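Roughly, my current for-loop version looks like this (simplified; the variable names, data, and emission model are made up, and the exact pymc-extras import path and the `steps` semantics may vary by version):

import numpy as np
import pymc as pm
from pymc_extras.distributions import DiscreteMarkovChain

# Hypothetical data: per-participant trial counts and observations
trial_counts = [10, 15, 20, 12, 8]
observations = [np.zeros(n, dtype=int) for n in trial_counts]  # placeholder data

with pm.Model() as loop_model:
    transition_p = pm.Dirichlet("transition_p", a=np.ones(3), shape=(3, 3))
    emission_p = pm.Dirichlet("emission_p", a=np.ones(3), shape=(3, 3))

    for i, n in enumerate(trial_counts):
        init_dist = pm.Categorical.dist(p=np.ones(3) / 3)
        # Assuming chain length = steps + 1 (initial state plus `steps` transitions)
        chain = DiscreteMarkovChain(
            f"chain_{i}", P=transition_p, init_dist=init_dist, steps=n - 1
        )
        pm.Categorical(f"obs_{i}", p=emission_p[chain], observed=observations[i])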

The only solution I’ve been able to think of is to pad each vector up to the length of the longest, but the parameter I’m interested in is the transition probability, and the padded values would distort the transition estimates (by implying either more or fewer transitions than the participant actually made). Is there an elegant way to solve this problem, or should I stick with my for loop?

For this you can define your own distribution with CustomDist.

import numpy as np
import pymc as pm
import pytensor
import pytensor.tensor as pt


def dist(init_state, transition_p, end_t, size):
    def step(t, state_tm1, transition_p, end_t):
        # Sample the next state from the row of the transition matrix
        # indexed by each participant's previous state
        state_t = pm.Categorical.dist(p=transition_p[state_tm1])

        # If t >= end_t, we encode the state as -1.
        # We need to use a DiracDelta for PyMC to accept the logp from a `where`
        end_state = pm.DiracDelta.dist(-1, shape=state_t.shape)
        state_t = pt.where(t >= end_t, end_state, state_t)

        return state_t, pm.pytensorf.collect_default_updates(state_t)

    # Iterate over time steps up to the longest chain; shorter chains are
    # frozen at -1 once t reaches their end_t
    states, _ = pytensor.scan(
        fn=step,
        sequences=[pt.arange(end_t.max())],
        outputs_info=[init_state],
        non_sequences=[transition_p, end_t],
        strict=True,
    )
    return states


with pm.Model() as m:
    transition_p = pm.Dirichlet("transition_p", a=np.ones(3), shape=(3, 3))  # 3x3 transition matrix

    init_state = pt.as_tensor([0, 0, 0, 0, 0])  # Could be a prior
    end_t = pt.as_tensor([10, 15, 20, 12, 8])  # Observed 10, 15, 20, ... trials, respectively
    max_t = end_t.max()

    y = pm.CustomDist(
        "y",
        init_state,
        transition_p,
        end_t,
        dist=dist,
        # scan iterates over time, so the output is time-major: (max_t, n_participants)
        shape=(max_t, end_t.shape[0]),
    )

I assumed the data is padded with -1, but you can use whatever encoding you want. Note that beyond each participant’s end_t the states are a deterministic -1 (the DiracDelta), which contributes a constant zero to the logp, so the padding doesn’t distort the transition probability estimates.
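Concretely, a padded observed array matching that encoding could be built like this (hypothetical data; time-major to match the scan output):

import numpy as np

end_t = np.array([10, 15, 20, 12, 8])
max_t = end_t.max()

rng = np.random.default_rng(0)
padded = np.full((max_t, len(end_t)), -1)  # -1 beyond each participant's last trial
for i, n in enumerate(end_t):
    padded[:n, i] = rng.integers(0, 3, size=n)  # placeholder observed states

# Then pass it as observed data:
# y = pm.CustomDist("y", ..., dist=dist, observed=padded)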

Colab notebook with full code and parameter recovery: Google Colab

This model seems to run fast with the numba backend (bleeding-edge versions of pymc/pytensor), but whether it scales to your data I don’t know. Let me know if you find this approach to be much slower than the “wrong” one using the DiscreteMarkovChain from pymc-extras.
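If you want to try the numba backend yourself, one way in recent PyMC versions (assuming pm.sample still accepts compile_kwargs; this may differ across versions) is:

with m:
    idata = pm.sample(compile_kwargs={"mode": "NUMBA"})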

You can read more about using CustomDist to define timeseries: Time Series Models Derived From a Generative Graph — PyMC example gallery

Thank you so much!