# Help understanding pm.categorical

Hello all!

I’m trying to model windspeed as a Markov process - where the state of the windspeed can be one of 20 states, each of which has a average windspeed and std dev.
This is then fed into a normal distribution to generate a sample windspeed.
At each point in time, the state has a probability of changing into a new state given by the transition probability.

What I don’t understand is that in the trace, I end up with 20 traces for windspeed, looks like one per starting windspeed (trace[“wind_speeds”].T).
How do I set the initial state, and only run a trace using that initial state?

Thanks for being patient if this is a simple question!

``````import numpy as np
import pymc3 as pm
from scipy import stats

# Set the random seed for reproducibility
np.random.seed(42)

# Generate transition probabilities using a Normal distribution
def gen_prob(index):
x = [stats.norm.pdf(i - index) for i in range(20)]
x /= sum(x)
return x

# Set the transition probabilities for the Markov chain
transition_probabilities = [gen_prob(i) for i in range(20)]

# Set the average wind speeds for each state
mean_speeds = np.arange(20)

# Set the standard deviations for each state
std_speeds = np.logspace(-1,0.5,20)

# Create a Markov chain with 3 states
model = pm.Model()
with model:
mean_speeds = pm.Data('mean_speeds', mean_speeds)
std_speeds = pm.Data('std_speeds', std_speeds)

# Create a categorical variable with 3 possible states
state = pm.Categorical('state', p=transition_probabilities, shape=(20,))

#     # Create a normal distribution for the wind speeds
wind_speeds = pm.Normal('wind_speeds', mu=mean_speeds[state], tau=std_speeds[state], shape=(1,20))

# Generate a sequence of 1000 random wind speeds using the Markov chain
with model:
trace = pm.sample(1000)
``````

Welcome!

I think the problem might be that you’re not sampling sequences, but rather independent draws. To implement a markov chain, you need to use `scan`, which allows for recursive computation. It’s also a bit messy to pass RandomVariables though the scan, you have to do some tricks to register them to the model. I followed this post as a guide to re-write your code as follows:

``````import numpy as np
import pymc as pm
from scipy import stats

# Set the random seed for reproducibility
np.random.seed(42)

# Generate transition probabilities using a Normal distribution
def gen_prob(index):
x = [stats.norm.pdf(i - index) for i in range(20)]
x /= sum(x)
return x

# Set the transition probabilities for the Markov chain
transition_probabilities = at.stack([gen_prob(i) for i in range(20)])

# Set the average wind speeds for each state
mean_speeds = np.arange(20)

# Set the standard deviations for each state
std_speeds = np.logspace(-1,0.5,20)

# Create a Markov chain with 3 states
model = pm.Model()
with model:
mean_speeds = pm.Data('mean_speeds', mean_speeds, mutable=False)
std_speeds = pm.Data('std_speeds', std_speeds, mutable=False)

initial_state = pm.Categorical('x0', p=np.full(20, 1/20))
rng = aesara.shared(np.random.default_rng())

def transition(previous_state, transition_probs, old_rng):
p = transition_probs[previous_state]
next_rng, next_state = pm.Categorical.dist(p = p, rng=old_rng).owner.outputs
return next_state, {old_rng: next_rng}

markov_chain, updates = aesara.scan(transition,
non_sequences=[transition_probabilities, rng],
outputs_info=[initial_state],
n_steps=30)

model.register_rv(markov_chain, name='markov_chain')

# Create a normal distribution for the wind speeds
wind_speeds = pm.Normal('wind_speeds', mu=mean_speeds[markov_chain], tau=std_speeds[markov_chain])

# Generate 1000 random draws of wind speed sequences of length 30 using the Markov chain
with model:
``````

The important changes are:

1. stacking your transition probabilities into a matrix variable recognized by the PyMC backend (I’m using Aesara but this has just recently been forked to PyTensor, and I noticed you’re on PyMC3, so you’d be using theano. Both of us need to update our PyMC installs)
2. Drawing a single `initial_state` from a single distribution with 20 possible outcomes. I believe this gets to your initial question – `size` gives the size of the output, not the dimensionality of the support. Your categorical distribution has support over 20 numbers, but we only want to draw 1, so `size=1`.
3. Make an `rng` object. The back end needs to pass this along through the recursive computation, so we need to provide it.
4. Made a function to step the markov chain. Inside that function we make a new `Categorical` distribution, but we have to use some tricks because we 1. want to update the state of the `rng` and 2. don’t want to keep all these intermediate distributions on the computational graph.
5. Run a `scan` to recursively set the next state based on the previous state
6. Manually add the whole markov chain as a single variable to the model with `model.register`.
7. Use `pm.sample_prior_predictive` (since you didn’t show the model any ground truth data, it doesn’t make sense to use `pm.sample`), and provide the `updates` in `compile_kwargs`.

Note that the number of steps in the markov chain is dictated by `n_steps`, NOT by the number of draws in `pm.sample_prior_predictive`!

2 Likes