Hi Nicolo!
The short and unhelpful answer is that you need to use a predictive model. Here’s an article that explains all about predictive models. The tl;dr is that in PyMC, you’re allowed to recycle posteriors inside new models. This is useful because you can tweak parts of your model to get it to do new and interesting things, like recover marginalized variables, do counterfactual analysis, and, importantly, forecast.
When you fit the HMM, the scan performs a sequence of one-step-ahead predictions. Looking at the model again:
```python
y = pm.ConstantData("y", dta_hamilton, dims="dates")
yobs = pm.ConstantData("yobs", dta_hamilton.iloc[4:], dims=["obs_dates"])

# ... other stuff

result, updates = pytensor.scan(
    AR_step,
    sequences=[
        {"input": hidden_states, "taps": [0, -1, -2, -3, -4]},
        {"input": y, "taps": [-1, -2, -3, -4]},
    ],
    non_sequences=[state_mus, ar_coefs],
)
sigma = pm.HalfCauchy("sigma", 0.8)
obs = pm.Normal("y_hat", mu=result, sigma=sigma, observed=yobs, dims=["obs_dates"])
```
So the data appears twice: once as `y`, which is the full dataset, and again as `yobs`, which drops the first 4 observations. That offset is what makes each scan step a one-step-ahead prediction.
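To see the alignment concretely, here is a small numpy sketch of what the scan computes. The `ar_step` function and every number are made-up stand-ins for the model's actual `AR_step` and posteriors, chosen only to show the indexing:

```python
import numpy as np

# Hypothetical stand-in for AR_step: mean of the current hidden state
# plus an AR(4) term on the previous four observations
def ar_step(state, y_lags, state_mus, ar_coefs):
    return state_mus[state] + ar_coefs @ y_lags

y = np.array([1.0, 2.0, 1.5, 1.8, 2.2, 2.0, 1.9])
hidden = np.array([0, 0, 1, 1, 0, 0, 1])
state_mus = np.array([0.5, -0.5])
ar_coefs = np.array([0.4, 0.3, 0.2, 0.1])

# Mimic the scan's taps: step t sees hidden[t] and y[t-1], ..., y[t-4],
# so the first prediction lines up with y[4] -- exactly why the observed
# variable is dta_hamilton.iloc[4:]
mu = np.array([
    ar_step(hidden[t], y[t - 4:t][::-1], state_mus, ar_coefs)
    for t in range(4, len(y))
])
```

With 4 lags, `mu` has `len(y) - 4` entries, one per element of `yobs`.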
This is nice for fitting but useless for forecasting. For forecasting, we want to use the last 4 observations to kick off a recursive cycle of input data → step forward → output data → input data → step forward …
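That recursive cycle is easy to sketch in plain Python (with a made-up regime-switching AR(4) step and illustrative numbers, not the fitted posteriors):

```python
import numpy as np

# Illustrative values only -- in the real model these come from the posterior
state_mus = np.array([0.5, -0.5])
ar_coefs = np.array([0.4, 0.3, 0.2, 0.1])
future_states = np.array([0, 1, 1, 0, 0])  # pretend draw from the Markov chain
last_obs = [1.8, 2.2, 2.0, 1.9]            # last 4 observed values

history = list(last_obs)
forecasts = []
for s in future_states:
    lags = history[-1:-5:-1]                         # y[t-1], ..., y[t-4]
    y_next = state_mus[s] + ar_coefs @ np.array(lags)
    forecasts.append(y_next)                         # output data...
    history.append(y_next)                           # ...becomes input data
```

Each forecasted value is appended to the history and fed back in as a lag for the next step; that is the loop the rewritten scan has to express.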
To do this, we need to rewrite the model. Here’s how it looks, with some comments that I hope help to explain what is being done and why:
```python
# Number of steps forward to forecast
H = 20

# Generate date coordinates for the forecasts; this makes it easier to plot
# them together with the observed data. pd.date_range is left-inclusive, so
# we want to drop the first date (that's data).
coords.update(
    {
        "forecast_dates": pd.date_range(
            dta_hamilton.index[-1], freq=dta_hamilton.index.freq, periods=H + 1
        )[1:]
    }
)

# Notice that we're making a new model! I'm calling this one forecast_model,
# but it just needs to be new.
with pm.Model(coords=coords) as forecast_model:
    # Grab the last 4 observations as the first inputs to our forecasts
    last_obs = pm.ConstantData("last_obs", dta_hamilton.iloc[-4:])

    # Variables that we want to be sampled from the posterior we obtained
    # during fitting should use pm.Flat distributions. This is not strictly
    # necessary -- pm.sample_posterior_predictive matches on the name of the
    # random variables, and if it finds a match, it will totally ignore the
    # model variable. BUT, pm.Flat will raise an error if it doesn't find a
    # match, which is desirable in our case.
    P = pm.Flat("P", size=(2, 2))
    hidden_states = pm.Flat("hidden_states", dims=["dates"])

    # I found that hidden_states was sampled as float64, which raised an
    # error, so I had to add this intermediate variable.
    hidden_states_int = pm.Deterministic(
        "hidden_states_int", hidden_states.astype("int64")
    )

    # A DiracDelta is a distribution with only a single value. DiscreteMarkovChain
    # needs an initial distribution, and we will tell it to always begin from the
    # last estimated hidden state.
    initial_hidden = pm.DiracDelta.dist(hidden_states_int[-1], shape=(1,))

    # Sample a new sequence of hidden states going forward from the end of the
    # data. I say steps=H + 2 because that's what ended up working...
    forecast_states = DiscreteMarkovChain(
        "forecast_states", P=P, init_dist=initial_hidden, steps=H + 2
    )

    # We lose the first 4 hidden states when we scan (because there are 4 lags),
    # so we need to prepend the initial hidden state to get the ball rolling.
    all_hidden = pm.Deterministic(
        "all_hidden", pt.concatenate([initial_hidden, forecast_states])
    )

    # More pm.Flat, because we just want to sample these from the posterior
    state_mus = pm.Flat("state_mus", dims=["states"])
    ar_coefs = pm.Flat("coefs", size=order, dims=["ar_params"])

    # New scan. Notice that now we're using outputs_info instead of sequences --
    # this is what signals that we're doing a recursive computation. Otherwise
    # things are pretty much the same; we even recycle the AR_step function.
    result, updates = pytensor.scan(
        AR_step,
        sequences=[{"input": all_hidden, "taps": [0, -1, -2, -3, -4]}],
        outputs_info=[{"initial": last_obs, "taps": [-1, -2, -3, -4]}],
        non_sequences=[state_mus, ar_coefs],
    )

    # More pm.Flat!
    sigma = pm.Flat("sigma")
    mu = pm.Deterministic("mu", result, dims=["forecast_dates"])

    # We actually want to sample this variable, so we need to 1) use a real
    # distribution (NOT pm.Flat!), and 2) give it a unique name (one that
    # wasn't in the parent model)
    forecast = pm.Normal("forecast", mu=result, sigma=sigma, dims=["forecast_dates"])

    # Call pm.sample_posterior_predictive, providing the idata that contains
    # our posteriors
    idata_forecast = pm.sample_posterior_predictive(idata, var_names=["forecast"])
```
Here’s how the forecasts look (HDI shaded from 50% (darkest) to 99% (lightest)):
I might have flubbed the indices, so make sure to double-check all of that.