Thanks for your reply, @bob-carpenter.
> Alternatively, is there a way to combine the hidden states n and q into a single regular HMM? Then you can apply all the usual HMM tricks like the forward algorithm for the likelihood or forward-backward to fully marginalize for even more efficient training. I don’t know what the state of that is in PyMC, but if there’s an HMM likelihood then it should be able to at least use the forward algorithm to efficiently calculate likelihoods.
If the two possible n states and two possible q states were joined into four joint states 0:(0,0), 1:(0,1), 2:(1,0), 3:(1,1), then the model I am seeking to implement is similar to that of Fig. 6a of the IEEE paper linked above.
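As a sketch of the joint-state construction: if the two binary chains happened to evolve independently (which is *not* the case in my model, where they are coupled, but it shows the bookkeeping), the 4x4 joint transition matrix would just be the Kronecker product of the two 2x2 matrices. The matrices below are made-up placeholders, not my actual parameters:

```python
import numpy as np

# Hypothetical 2x2 transition matrices for the n and q chains (made-up values)
T_n = np.array([[0.9, 0.1],
                [0.2, 0.8]])
T_q = np.array([[0.7, 0.3],
                [0.4, 0.6]])

# Joint states ordered 0:(0,0), 1:(0,1), 2:(1,0), 3:(1,1).
# For independent chains, the joint transition matrix is the Kronecker product;
# a coupled model would instead fill these 16 entries from shared parameters.
T_joint = np.kron(T_n, T_q)

assert T_joint.shape == (4, 4)
assert np.allclose(T_joint.sum(axis=1), 1.0)  # rows still normalize
```

In the coupled case the 16 entries are no longer outer products, but the same 4-state indexing applies, which is what lets the standard HMM machinery run on the joint chain.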
I have successfully implemented this “joint state” model in Pomegranate, which offers convenient access to the forward-backward algorithm. However, I am ultimately working toward a larger, more complex model for which a more expressive language like PyMC or NumPyro seemed better suited and easier to scale. For example, the single transition matrix of the “joint state” model has 4x4=16 elements, but in my case those elements aren’t all independent, so PyMC could help me pool some of them and reduce the number of parameters. That is what I was trying to accomplish by splitting the 4x4 matrix into a two-layer hierarchical model that still encodes 16 possible transitions, but with fewer parameters.
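For concreteness, here is a minimal NumPy sketch of the forward algorithm on the joint chain, computing the marginal log-likelihood that the discrete states are summed out of (the 2-state example values are made up, just to show the call):

```python
import numpy as np
from scipy.special import logsumexp

def forward_loglik(log_T, log_init, log_emis):
    """Log-likelihood of an observation sequence via the forward algorithm.

    log_T:    (S, S) log transition matrix (rows normalize)
    log_init: (S,)   log initial state distribution
    log_emis: (N, S) log p(observation_t | state s) for each time t and state s
    """
    alpha = log_init + log_emis[0]  # forward message at t = 0
    for t in range(1, len(log_emis)):
        # marginalize the previous state, then fold in the new emission
        alpha = logsumexp(alpha[:, None] + log_T, axis=0) + log_emis[t]
    return logsumexp(alpha)

# Made-up 2-state example, just to show the signature:
T = np.array([[0.9, 0.1], [0.3, 0.7]])
init = np.array([0.6, 0.4])
emis = np.array([[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]])
ll = forward_loglik(np.log(T), np.log(init), np.log(emis))
```

In a PPL, the pooling would happen upstream: the 16 entries of `log_T` would be built from a smaller set of shared random variables before being passed to a likelihood like this.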
> Is there a way you could write a graphical model out where the nodes represent random variables such that each variable is conditionally independent given the nodes that point into it? Then it’ll be easy to see what the notation means in terms of defining a likelihood and where it fits into the HMM literature, which is vast.
I hope the following diagram helps answer your question. To the right of the diagram, I’ve sketched how each n_i and q_i node is distributed given the previous nodes that point into it. For q_i, I tried mocking up a “switch distribution” (relevant link: Combining models using switch - #2 by ricardoV94). I’ve also included a likelihood based on the emissions V_i.
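To make the “switch” idea concrete without committing to a particular PPL API, here is a generative NumPy sketch where the distribution of q_i is selected by the previous n state. The dependence structure and all parameter values are illustrative assumptions, not the actual model from the diagram:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (made up for illustration):
# p_q[s] = P(q_i = 1 | n_{i-1} = s) -- the two branches of the "switch"
p_q = np.array([0.2, 0.8])
T_n = np.array([[0.9, 0.1],
                [0.2, 0.8]])  # transitions for the n chain

n = [0]
q = [0]
for i in range(1, 10):
    n.append(int(rng.choice(2, p=T_n[n[-1]])))
    # switch: q_i's Bernoulli probability is selected by n_{i-1}
    q.append(int(rng.random() < p_q[n[i - 1]]))
```

In PyMC the same branching would be expressed with something like `pm.math.switch` inside the model, with the emission likelihood on V_i conditioned on the resulting joint state.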
I appreciate your help!
