I am new to Bayesian modeling and to PyMC, and I am trying to implement a Hidden Markov Model (HMM) with Gaussian emissions in PyMC3, with the aim of predicting the next observation (x_{N+1}) of a given sequence of observations (x_1, x_2, …, x_N).
I was able to estimate the parameters of the model and get posterior samples from multiple short sequences, but now I want to predict the next observation of new/unseen sequences. I read here that, to obtain out-of-sample predictions in a logistic regression model (for example), we can give new predictors to the model using shared variables and then use the function pm.sample_posterior_predictive to obtain the posterior predictive distribution.
However, I cannot understand how I can apply this to my problem. I read here that the posterior predictive distribution is the data the model is expecting to see (X*) after seeing the dataset X. In other words, it is the distribution of observations the model would expect for a new experiment, given that we observed the results of previous experiments (in this case, other sequences). So, in my problem, I think it does not make sense to switch out the observed dataset for the unseen sequences using shared variables and sample from the posterior predictive. I think it would give me the observations the model would expect to see, instead of predicting the next observations of each new sequence.
In HMMs we need the last hidden state of the sequence to “propagate” the model forward and predict the next observations. However, I do not understand how I can implement this using PyMC3.
Can anyone help me?
I am sorry if this is a very basic question, but I am new to this and I am really trying to learn!
Not really an answer to your question, but you may want to check out pymc3-hmm, which was developed by @brandonwillard, one of the core contributors to Aesara.
Thank you for your comment.
I used that library for everything I implemented so far! It helped me a lot. It has fully implemented distributions and step methods that we can use in PyMC3 models. However, I think it does not have any feature that allows me to solve my problem.
Thank you anyway!
You should be able to write a function outside of PyMC that generates random draws according to your HMM and a fixed set of parameters. To get new predictions you run this function with the inferred parameters (e.g., using the mean, or using a bunch of posterior draws to propagate uncertainty).
The logistic model is a trivial case where the same function used for defining the PyMC model can be easily used for doing posterior predictive as well, but this is not the only way. Once you have posterior draws you are no longer forced to keep working with PyMC.
The first part of this notebook shows a simple function that takes random draws from a specific HMM given a set of parameters: "How to wrap a JAX function for use in PyMC" (PyMC documentation).
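To make the idea concrete, here is a minimal NumPy sketch of such a function for a Gaussian HMM. It is not taken from the notebook; the function name `simulate_gaussian_hmm`, the argument names, and the example parameter values are all assumptions made for illustration.

```python
import numpy as np

def simulate_gaussian_hmm(pi0, A, mu, sigma, n_steps, rng):
    """Draw one sequence of hidden states and observations from a Gaussian HMM.

    pi0   : (K,) initial state probabilities
    A     : (K, K) transition matrix, A[i, j] = P(z_t = j | z_{t-1} = i)
    mu    : (K,) emission mean of each state
    sigma : (K,) emission standard deviation of each state
    """
    K = len(pi0)
    states = np.empty(n_steps, dtype=int)
    states[0] = rng.choice(K, p=pi0)
    for t in range(1, n_steps):
        # The next state depends only on the current state (Markov property).
        states[t] = rng.choice(K, p=A[states[t - 1]])
    # Each observation is Gaussian with the parameters of its hidden state.
    obs = rng.normal(mu[states], sigma[states])
    return states, obs

# Illustrative (made-up) parameter values for a two-state model.
rng = np.random.default_rng(0)
pi0 = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
mu = np.array([-2.0, 3.0])
sigma = np.array([0.5, 1.0])
states, obs = simulate_gaussian_hmm(pi0, A, mu, sigma, n_steps=50, rng=rng)
```

In practice the fixed parameter values above would be replaced by the posterior mean or by individual posterior draws.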
Thank you! This helps a lot!
I will still need to predict the hidden states that are associated with the new observations, so probably I will have to implement something like the Viterbi Algorithm. Then, with the last hidden state of the sequence and with the inferred parameters, I will be able to obtain new predictions.
I will try to implement this. Thanks!
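For reference, one possible sketch of this step. Note that it uses the forward (filtering) recursion rather than the Viterbi algorithm: instead of a single most likely last state, it computes the probability of each last state given the whole sequence, and then pushes that through the transition matrix. The names `filter_last_state` and `predict_next` and the parameter values are hypothetical, and this is only one way to do it.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Gaussian density of x under each state's (mu, sigma)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def filter_last_state(obs, pi0, A, mu, sigma):
    """Forward recursion: P(z_N = k | x_1, ..., x_N) for each state k."""
    alpha = pi0 * normal_pdf(obs[0], mu, sigma)
    alpha /= alpha.sum()
    for x in obs[1:]:
        alpha = (alpha @ A) * normal_pdf(x, mu, sigma)
        alpha /= alpha.sum()  # normalise at every step to limit underflow
    return alpha

def predict_next(obs, pi0, A, mu, sigma, rng):
    """Sample x_{N+1} for one sequence and one fixed set of parameters."""
    # Distribution of the next hidden state: filtered last state times A.
    next_state_probs = filter_last_state(obs, pi0, A, mu, sigma) @ A
    z_next = rng.choice(len(mu), p=next_state_probs)
    return rng.normal(mu[z_next], sigma[z_next]), next_state_probs

# Illustrative (made-up) parameters and an unseen sequence.
rng = np.random.default_rng(1)
pi0 = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
mu = np.array([-2.0, 3.0])
sigma = np.array([0.5, 1.0])
new_sequence = np.array([-2.1, -1.8, -2.3, 2.9, 3.4])

x_next, next_state_probs = predict_next(new_sequence, pi0, A, mu, sigma, rng)
expected_next = next_state_probs @ mu  # mean of the predictive mixture
```

If a hard state assignment is preferred, the Viterbi path can be used instead, with the row of the transition matrix for the last decoded state giving the next-state probabilities.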
I was able to implement a function that receives the estimated parameters and then predicts the next observation of unseen sequences. So far I have only used the mean of the inferred parameters, but I would like to use a bunch of posterior draws to propagate uncertainty. Is there any function that allows me to get draws from the inferred posterior? Or will I need to approximate the obtained posterior distribution with a known distribution (for example, normal or half-normal)?
If you have a function that works with the mean of your posterior, you can reuse the same function with a single draw from the posterior. If you call this function for 100 or so draws from your posterior (or, even better, all of them if sampling is fast enough), it should output distributions that nicely reflect your posterior uncertainty for future observations.
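A rough sketch of that loop is below. The posterior draws are faked with NumPy so the example runs on its own; in a real analysis the `post` dictionary would be filled from the sampler output. The predictor `predict_next` and all names and values are assumptions for illustration, not part of any library.

```python
import numpy as np

def predict_next(obs, pi0, A, mu, sigma, rng):
    """Hypothetical single-draw predictor: filter the states, then sample x_{N+1}."""
    def pdf(x):
        # Gaussian density up to a constant, which cancels when normalising.
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / sigma
    alpha = pi0 * pdf(obs[0])
    alpha /= alpha.sum()
    for x in obs[1:]:
        alpha = (alpha @ A) * pdf(x)
        alpha /= alpha.sum()
    z_next = rng.choice(len(mu), p=alpha @ A)
    return rng.normal(mu[z_next], sigma[z_next])

rng = np.random.default_rng(2)
n_draws, K = 200, 2

# Stand-in for posterior draws; the first axis indexes the draw.
post = {
    "pi0": rng.dirichlet(np.ones(K), size=n_draws),                      # (n_draws, K)
    "A": np.stack([rng.dirichlet([9.0, 1.0], size=n_draws),
                   rng.dirichlet([2.0, 8.0], size=n_draws)], axis=1),    # (n_draws, K, K)
    "mu": rng.normal([-2.0, 3.0], 0.1, size=(n_draws, K)),               # (n_draws, K)
    "sigma": np.abs(rng.normal([0.5, 1.0], 0.05, size=(n_draws, K))),    # (n_draws, K)
}

new_sequence = np.array([-2.1, -1.8, -2.3, 2.9, 3.4])

# One prediction per posterior draw: the spread reflects parameter uncertainty
# as well as the randomness of the next state and of the emission.
predictions = np.array([
    predict_next(new_sequence, post["pi0"][i], post["A"][i],
                 post["mu"][i], post["sigma"][i], rng)
    for i in range(n_draws)
])
lower, upper = np.percentile(predictions, [2.5, 97.5])
```

A histogram of `predictions`, or the interval `(lower, upper)`, then summarises the predictive distribution of the next observation for that sequence.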
Thank you for your answer.
I probably didn’t explain my question very clearly, but I think I now understand what I have to do. I know that I can call this function for every draw I take from my posterior; what I was having difficulty with was taking the draws from the posterior.
I sampled 2 chains with 1000 draws each. So, now, I have to randomly take draws from the sampled chains, for each parameter, and then use the draws in my function. Finally, I will have a distribution that reflects the uncertainty related to my prediction. Am I correct?
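That is broadly the idea, with one caveat worth spelling out: each posterior draw is a joint sample of all parameters, so the same draw index should be used for every parameter rather than picking indices independently per parameter, otherwise the posterior correlations between parameters are lost. The sketch below uses NumPy arrays of shape (chain, draw, ...) as a stand-in for the sampler output; the variable names are assumptions. As far as I know, indexing a PyMC3 trace by variable name (for example `trace["mu"]`) already returns the draws with the chains concatenated, in which case the reshape step is not needed.

```python
import numpy as np

rng = np.random.default_rng(3)
n_chains, n_draws, K = 2, 1000, 2

# Stand-in for sampler output with shape (chain, draw, K).
mu_samples = rng.normal([-2.0, 3.0], 0.1, size=(n_chains, n_draws, K))
sigma_samples = np.abs(rng.normal([0.5, 1.0], 0.05, size=(n_chains, n_draws, K)))

# Merge the chains so that each row is one posterior draw.
mu_flat = mu_samples.reshape(-1, K)        # (2000, K)
sigma_flat = sigma_samples.reshape(-1, K)  # (2000, K)

# Pick draw indices once and reuse them for every parameter,
# so that each selected row is a joint draw of (mu, sigma).
idx = rng.choice(mu_flat.shape[0], size=100, replace=False)
mu_sub = mu_flat[idx]
sigma_sub = sigma_flat[idx]
```

With only 2000 draws in total, it is usually just as easy to skip the subsampling and loop over all of them.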