Generating samples using mcmc method from PyMc

M24 · February 23, 2024, 7:09am

Hi , I am using a transition matrix of 3 X 3. I am beginner in PyMC and trying to use built-in methods to sample an generate events to develop a time series. For the followoing code , I tried but unable to understand the error. Any help/directions/ references in PyMC would be very helpful.

import numpy as np
import pymc as pm

# Define the transition matrix
transition_matrix = np.array([[0.1, 0.2, 0.7],
                              [0.09, 0.01, 0.9],
                              [0.2, 0.8, 0.0]])

# Define the state of interest
target_state = 2

# Number of iterations
num_iterations = 10000

# Define PyMC model
model = pm.Model()

# Counter for transition to target state
transition_count = 0

with model:
    # Hidden state sequence
    states = [pm.Categorical(f'state_{i}', p=transition_matrix[0]) for i in range(num_iterations)]

    # Generate events based on transition matrix
    for i in range(1, num_iterations):
        states[i] = pm.Categorical(f'state_{i}', p=transition_matrix[states[i:-1]])
        
        # Check if the transition leads to the target state
        if states[i] == target_state:
            transition_count +=1

# Sample from the model
with model:
    trace = pm.sample(num_iterations, chains=1)

# Estimate the probability of transitioning to the target state
estimated_probability = transition_count.get_value() / num_iterations

print("Estimated probability of transitioning to state", target_state, ":", estimated_probability)

I am getting error at Categorical" IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices"

jessegrabowski · February 23, 2024, 7:50am

There are several problems with this model. A PyMC model is not normal python code. Instead, it lazily constructs a computational graph. This graph represents, but does not compute!, the expressions you write down. As a result, something like this block:

        if states[i] == target_state:
            transition_count +=1

is not going to work. In general, loops are never advised. There is a looping construct in PyMC called scan, if you need to do recursive computation. In general, though, scan is like a nuclear bomb, and there are often simpler solutions to explore before you go down that road.

In your case, you are trying to estimate a discrete markov chain, and we have a probability distribution for that in pymc-experimental. You can have a look at the first example in this notebook for a model that seems to be exactly what you’re trying to accomplish here.

One other issue is that you cannot call pm.sample then check a model variable like transition_count. You need to add the symbolic computation to the output of pm.sample using pm.Deterministic. In your case, though, you aren’t estimating anything, because the transition matrix is fixed. If you sample then compute shares of hidden states transitions from state i to state j, it will simply return transition_matrix[i,j]. See the above notebook for an example of using a random variable to estimate a transition matrix.

M24 · February 23, 2024, 8:49am

Hey thanks @jessegrabowski for a detailed response…

Actually, after posting the question even I found out about pymc-experimental, discrete-markov chain.I will check on this pymc-experimetal first. It seems that its a separate package apart from pymc.
Yes , you are correct, Currently I am not estimating, actually I want to generate samples using mcmc.
Yes, I have a fixed transition matrix now . I am trying to understand to generate samples using mcmc using this fixed transition matrix.
Does pymc has any samplers which I can use with this transition matrix.

jessegrabowski · February 23, 2024, 8:55am

You can use pm.sample_prior_predictive if you want to generate forward samples from a fixed model

M24 · February 23, 2024, 8:55am

In the case of pymc-experimental helps us to generate a Discrete Markov chain. But I already have one in the form of transition matrix.

jessegrabowski · February 23, 2024, 9:04am

The code you posted appears to generate samples from a markov chain with a known transition matrix, then computes the total share of state 2 observed. You can accomplish this by making a pmx.DiscreteMarkovChain RV with P=transition_matrix, calling pm.sample_prior_predictive, then computing the share of state 2 in the samples.

M24 · February 23, 2024, 10:11am

Yes, correctly said I am trying to generate samples from the given transition matrix(Markov Chain).

Thanks for the directions. I will try but Whats RV in the pmx.Discrete Markov Chain RV ?

Actually, I am looking into some Markov Chain Monte Carlo methods so that I can reduce the simulation time.

jessegrabowski · February 23, 2024, 10:20am

It’s the distribution pymc_experimental.distributions.timeseries.DiscreteMarkovChain. Again, you can basically just copy-paste the first example in the notebook I linked (cell 8). Replace pm.sample with pm.sample_prior_predictive, and replace P with your transition_matrix. I’m happy to look at your code if you run into issues.

I’m not sure I understand this. MCMC is an algorithm for evaluating integrals, not for simulating systems with discrete Markov dynamics (also called “Markov Chains”). You will not be using MCMC at all for the problem as stated, because you aren’t doing any inference.

M24 · February 23, 2024, 10:33am

Hey @jessegrabowski Thanks for the response. Really appreciate. Actually Ia m looking into this for a while and actually finding it difficult to interpret.
Let me write a code with your given directions and I’ll get back to you for further discussions.

Actually, I am looking MCMC in order to get better sampling methods which might help to generate large number of samples . Ex: I am sure right now but was reading Gibbs . Is the approach correct?

jessegrabowski · February 23, 2024, 10:40am

Why do you want to generate samples? What is your goal?

Gibbs, like all MCMC algorithms, is a way to generate samples from a target distribution whose probability density function is unknown. It might be good to look into some resources that explains the fundamentals of these methods? I like the Statistical Rethinking series.

In your case, the target distribution is known. Under some conditions, discrete markov chains converge to a stationary distribution \psi that solves the equation \psi(I - P)= 0. See here for details. In your case, this distribution is [0.14188124, 0.39936942, 0.45874934]. So without doing any simulation at all, I can tell you that the share of samples in state 2 for a chain of sufficiently long length will be about 45%.

M24 · February 23, 2024, 11:01am

Ok. It goes This way.

I have a transition matrix 3 X 3 and 3 states i.e s0,s1 and s2
inital state for ex : transition_matrix[0]
I am looking for destination :s2
I want to sample in order to get one of the 3 states for 100 simulations

Actually using numpy.random.choice() I am getting the sample but was looking at some built-in packges to work for.

jessegrabowski · February 23, 2024, 11:03am

You could try quantecon.MarkovChain if that’s all you need to do, see the lecture I linked from them above.

M24 · February 23, 2024, 11:42am

Actually from the given transition matrix and current state s0 and target state s2 , I want to perform random sampling to get one of the 3 states for 1000, 10000 simulations

M24 · February 26, 2024, 5:35am

Hi @jessegrabowski , currently yes I have consider a simple example of Markov Chain (3 x 3 matrix with states s0,s1,s2) I am interested to perform random simulations to obtain the states .Ex:
Simulation run=1 , randomly obtain s0
Simulation run=2 , randomly obtain s2 and so on.

I am looking for something like this.

jessegrabowski · February 26, 2024, 5:37am

Did you look a the quantecon lecture I already linked? I believe this resource is more adapted to your needs than PyMC.

M24 · February 26, 2024, 5:59am

Yes, I had a check through quantecon .

But looking at linked file shared by you. I have re-written the code with the help of the reference . Is it possible to have glance through what I have interpreted.

# Define the transition matrix
trans_mat = np.array([[0.1, 0.2, 0.7],
                              [0.09, 0.01, 0.9],
                              [0.2, 0.8, 0.0]])

#number of states: 3
#States: s0,s1 ans s2¶
ax=sb.heatmap(trans_mat, cmap='YlOrRd', annot=True,linewidth=.3)
ax.set(xlabel="", ylabel="")
ax.xaxis.tick_top()

# Use current state as s0 prob list

#Estimate the transition matrix from data, using DiscreteMarkovChain as a likelihood
#(Ref https://github.com/pymc-devs/pymc-experimental/blob/main/notebooks/discrete_markov_chain.ipynb)
current_state=0

def generate_chains(P, steps, n_chains=1):
    output = np.empty((n_chains, steps), dtype="int64")

    x0 = pm.draw(pm.Categorical.dist(p=trans_mat[0], shape=(n_chains,)))
    output[:, 0] = x0

    for t in range(1, steps):
        output[:, t] = [
            np.random.choice(range(P.shape[0]), p=P[output[i, t - 1]].ravel()).astype(int)
            for i in range(n_chains)
        ]

    return output.squeeze()

chains = generate_chains(trans_mat, 10, n_chains=2)

with pm.Model() as model:
    x0 = pm.Categorical.dist(np.ones(3) / 3, size=(100,))
    P = pm.Dirichlet("P", a=[1, 1, 1], size=(3,))
    discrete_mc = DiscreteMarkovChain("MarkovChain", P=trans_mat, init_dist=x0, observed=chains)
    idata = pm.sample()

az.summary(idata)

RANDOM_SEED = 58
rng = np.random.default_rng(RANDOM_SEED)


with pm.Model() as model:
    x0 = pm.Categorical.dist(np.ones(3) / 3, size=(100,))
    P = pm.Dirichlet("P", a=[1, 1, 1], size=(3,))
    discrete_mc = DiscreteMarkovChain("MarkovChain", P=trans_mat, init_dist=x0, observed=chains)
    idata = pm.sample_prior_predictive(samples=50, random_seed=rng)

az.summary(idata)

Actually, I was trying to understand sample and sample_prior.
But we have written manually for generate chains code.
What does actually DiscreteMarkov Chain code section does?

M24 · February 26, 2024, 6:01am

Actually I have to explore MCMC methods . So was looking into MCMC

other reason : generate a markov chain for large number of simulations

M24 · February 26, 2024, 6:04am

Thanks for the statistical lecture linked. found very helpful.
But I didn’t understand your distribution [0.14188124, 0.39936942, 0.45874934].Why its required.
Actually doing simulation , I want to generate random states either s0 or s1 or s2 and then wanted to count them .

jessegrabowski · February 26, 2024, 6:17am

Your generate_chains function is not needed. It is used in the notebook to show that we recover the true (unknown) transition matrix. Since you just want to generate samples from a known transition matrix, you can just do that:

with pm.Model() as model:
    x0 = pm.Categorical.dist(trans_mat[0]) # Or whatever you want the chain to start at.
    discrete_mc = DiscreteMarkovChain("MarkovChain", P=trans_mat, init_dist=x0, steps=99)
    idata = pm.sample_prior_predictive(samples=50, random_seed=rng)

DiscreteMarkovChain is a random variable. It’s the same as any other random variable in PyMC. You can check out the getting started resources in the examples gallery if you want to understand how PyMC works more deeply.
MCMC has nothing to do with your problem as stated
\psi = [0.14188124, 0.39936942, 0.45874934] is the stationary distribution of a markov chain with the transition matrix you provided. If all you want to do is count states, a chain of length N is expected to have N\psi_i instances of state i.

Again, you should just be using mc = qe.MarkovChain(trans_mat).

M24 · February 27, 2024, 1:03am

Correctly mentioned.
With the Trans matrix would perform some large simulations and count states, a chain of length N is expected to have Nψi. instances of state i (that is what required)
I assumed PyMC had some built-in functions so thats the reason into it.
I looked into quantecon which has basic functionalities .
"MCMC has nothing to do with your problem as stated "— Actually , using markov chain (transition matrix) and monte carlo to perform simulations in order to create a chain with random states.I was looking for this.

Topic		Replies	Views
Continuous Time Markov Chain Questions	1	1451	February 18, 2019
Sample method of pymc3 Questions	4	670	July 15, 2018
pm.MCMC in pymc3 or in an other library? Questions	10	1340	March 25, 2021
Problem using emissions as likelihood for hidden Markov model version agnostic	6	66	January 8, 2025
Custom Categorical Distribution - ensure bounded candidates v5 modeling	13	388	March 31, 2024

Generating samples using mcmc method from PyMc

Related topics