I would like to implement a hierarchical HMM with the following structure.
- There are two layers of hidden states, n_i and q_i, and one layer of observed emissions V_i.
- There are two possible values for the n_i, two possible values for the q_i, and two possible values for the V_i.
- There is a time-independent transition matrix T_n governing the hidden states n_i.
- The values of n_i govern a switching process that selects one of two possible transition matrices, T_q,0 or T_q,1 (i.e. T_q,n_i), governing the hidden states q_i.
- There is a time-independent emission matrix E that governs the emitted values V_i.
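For concreteness, here is the generative process I have in mind, sketched in plain NumPy. All matrix values below are placeholders, and I am assuming that the emissions depend only on q_i and that the current n_t selects the q transition matrix (rather than n_{t-1}).

```python
import numpy as np

rng = np.random.default_rng(0)
steps = 100

T_n = np.array([[0.9, 0.1],    # placeholder transition matrix for the n chain
                [0.2, 0.8]])
T_q = np.array([[[0.7, 0.3],   # T_q,0: governs q when n_i = 0
                 [0.3, 0.7]],
                [[0.5, 0.5],   # T_q,1: governs q when n_i = 1
                 [0.1, 0.9]]])
E = np.array([[0.95, 0.05],    # placeholder emission matrix, rows indexed by q_i
              [0.10, 0.90]])

n = np.zeros(steps, dtype=int)
q = np.zeros(steps, dtype=int)
for t in range(1, steps):
    n[t] = rng.choice(2, p=T_n[n[t - 1]])
    q[t] = rng.choice(2, p=T_q[n[t], q[t - 1]])  # n_t switches the q transition matrix
V = np.array([rng.choice(2, p=E[q_t]) for q_t in q])
```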
Is it possible to specify such a model in PyMC?
My latest attempt is based on threads like Adding Conditional Transition Probabilities to a Hidden Markov Model - #4 by jessegrabowski, which define a CustomDist(…, dist=markov_chain) where markov_chain() calls a scan routine. This attempt handles both the n layer and the q layer within the same scan routine.
import pymc as pm
import pytensor as pt
from pymc.pytensorf import collect_default_updates

with pm.Model() as model:
    def step(P_n, P_nq, x_tm1):
        """
        Scan step; arguments arrive in the order
        (sequences, outputs_info, non_sequences).
        P_n and P_nq are the per-step (2, 2) and (2, 2, 2)
        slices of the transition-matrix stacks.
        """
        x_tm1_n, x_tm1_q = x_tm1
        x_t_n = pm.Categorical.dist(p=P_n[x_tm1_n])
        x_t_q = pm.Categorical.dist(p=P_nq[x_tm1_n, x_tm1_q])
        x_t = pt.tensor.stack([x_t_n, x_t_q])
        return x_t, collect_default_updates(x_t)

    def markov_chain(x0, P_n_stack, P_nq_stack, shape=None):
        states, _ = pt.scan(fn=step,
                            sequences=[P_n_stack, P_nq_stack],
                            outputs_info=[x0],
                            non_sequences=None)
        return states

    # T_n_stack has shape (steps, num_states_n, num_states_n) = (steps, 2, 2).
    # It is made by repeating the `T_n` matrix `steps` times.
    # T_nq_stack has shape (steps, num_states_n, num_states_q, num_states_q) = (steps, 2, 2, 2).
    # It is made by stacking the `T_q,0` and `T_q,1` matrices and then repeating that stack `steps` times.
    x0_n = pm.Categorical.dist(p=pi_n.squeeze())
    x0_q = pm.Categorical.dist(p=pi_q.squeeze())
    x0_nq = pt.tensor.stack([x0_n, x0_q])
    xi_nq = pm.CustomDist(
        "xi_nq",
        x0_nq,
        T_n_stack,
        T_nq_stack,
        dist=markov_chain,
    )
    # Emissions and likelihood function not shown
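To make the stack-construction comments concrete, here is how I build the two stacks, sketched in NumPy (in the model they are PyTensor tensors built the same way; the matrix values are placeholders):

```python
import numpy as np

steps = 100
T_n = np.array([[0.9, 0.1], [0.2, 0.8]])    # placeholder values
T_q0 = np.array([[0.7, 0.3], [0.3, 0.7]])
T_q1 = np.array([[0.5, 0.5], [0.1, 0.9]])

# (steps, 2, 2): the single T_n matrix repeated once per step.
T_n_stack = np.broadcast_to(T_n, (steps, 2, 2))

# (steps, 2, 2, 2): stack T_q,0 and T_q,1 along a new axis indexed by n_i,
# then repeat that (2, 2, 2) block once per step.
T_nq = np.stack([T_q0, T_q1])
T_nq_stack = np.broadcast_to(T_nq, (steps, 2, 2, 2))
```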
I have another attempt in which I separately define a CustomDist for the n Markov chain and q Markov chain, and use the outputs of the n Markov chain to select the series of T_q,n_i matrices that are used in the q Markov chain.
with pm.Model() as model:
    def step(P, x_tm1):
        """
        Scan step; arguments arrive in the order
        (sequences, outputs_info, non_sequences).
        P is the per-step (2, 2) slice of the transition-matrix stack.
        """
        x_t = pm.Categorical.dist(p=P[x_tm1])
        return x_t, collect_default_updates(x_t)

    def markov_chain(x0, Ps, shape=None):
        states, _ = pt.scan(fn=step,
                            sequences=[Ps],
                            outputs_info=[x0],
                            non_sequences=None)
        return states

    # T_n_stack has shape (steps, num_states_n, num_states_n) = (steps, 2, 2).
    # It is made by repeating the `T_n` matrix `steps` times.
    x0_n = pm.Categorical.dist(p=pi_n.squeeze())
    xi_n = pm.CustomDist(
        "xi_n",
        x0_n,
        T_n_stack,
        dist=markov_chain,
    )

    # T_nq_stack has shape (steps, num_states_n, num_states_q, num_states_q) = (steps, 2, 2, 2).
    # It is made by stacking the `T_q,0` and `T_q,1` matrices and then repeating that stack `steps` times.
    # Pair each step index with its n state to select the T_q,n_i matrix
    # for that step; the result has shape (n_steps, 2, 2).
    T_q_stack = T_nq_stack[pt.tensor.arange(n_steps), xi_n]
    x0_q = pm.Categorical.dist(p=pi_q.squeeze())
    xi_q = pm.CustomDist(
        "xi_q",
        x0_q,
        T_q_stack,
        dist=markov_chain,
    )
    # Emissions and likelihood function not shown
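The matrix-selection step in this second attempt has the following NumPy analogue (the matrix values and the xi_n draw are placeholders):

```python
import numpy as np

steps = 4
rng = np.random.default_rng(0)

# One (2, 2, 2) block per step; axis 1 is indexed by the n state.
# dirichlet draws make every row a valid probability vector.
T_nq_stack = rng.dirichlet(np.ones(2), size=(steps, 2, 2))
xi_n = np.array([0, 1, 1, 0])  # a hypothetical draw of the n chain

# Pair each step index with its n state to pick one (2, 2) matrix per step.
T_q_stack = T_nq_stack[np.arange(steps), xi_n]  # shape (steps, 2, 2)
```

Note that a plain `T_nq_stack[:, xi_n]` would instead produce shape (steps, steps, 2, 2), which is why the paired-index form is needed rather than a slice-then-reshape.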
Both of the above attempts raise the following error when sampled.
ValueError: Random variables detected in the logp graph: {categorical_rv{"(p)->()"}.out, categorical_rv{"(p)->()"}.out}.
This can happen when mixing variables from different models, or when CustomDist logp or Interval transform functions reference nonlocal variables.