A general question here.
I’m working with a forecasting model where each individual input consists of lags of the response variable (let’s say six lags in this case) and the response variable is the next 12 steps forecasted. See example data below for how this data looks.
import numpy as np
x = np.random.random((21,6))
y = np.random.random((21,12))
However, I’d like to know if six lags is the right number or if it should be 10 or 12 or 2. Is there a Bayesian way to quickly test how each lag count corresponds to a target variable of a different size? Something like a correlation coefficient but…with credible intervals?
I think lag selection is usually done with shrinkage priors? You specify a large number of lags, then use something like a horseshoe prior to strongly bias the coefficients towards zero. See for example here. There is an example implementation in PyMC here. I also do a modified version here, which has been suggested to be more compatible with NUTS. I’m not sure mine is implemented correctly.
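For reference, this is one common way to write a horseshoe prior on the lag coefficients. The symbols are my own notation rather than anything taken from the linked posts: $\beta_j$ is the coefficient on lag $j$, $\lambda_j$ is its local scale, $\tau$ is the global scale, and $\tau_0$ is a hyperparameter you would have to choose.

$$
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}\left(0,\ \lambda_j^2 \tau^2\right), \qquad
\lambda_j \sim \mathrm{C}^{+}(0, 1), \qquad
\tau \sim \mathrm{C}^{+}(0, \tau_0)
$$

The global scale $\tau$ pulls all coefficients towards zero, while the heavy-tailed local scales $\lambda_j$ let individual lags escape that shrinkage when the data support them.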
Why are you thinking about correlations between the lags and the forecasts? What would that tell you? If the data generating process is stationary, the auto-covariance matrix should not be a function of time, so the correlation between times t and t-h is the same for any t and a fixed h. Thus the auto-correlations between the forecasts and any lags ought to be the same as the corresponding in-sample auto-correlations, I think.
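A quick way to see the stationarity point is to simulate a stationary series and compare sample auto-correlations on two disjoint stretches of it. This is only a sketch: the AR(1) process, the coefficient 0.7, and the helper names `simulate_ar1` and `sample_autocorr` are assumptions made for illustration.

```python
import numpy as np

def simulate_ar1(n, phi=0.7, seed=0):
    """Simulate a stationary AR(1) series: y[t] = phi * y[t-1] + noise."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    # Draw the first value from the stationary distribution of the process.
    y[0] = rng.normal(scale=1.0 / np.sqrt(1.0 - phi**2))
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.normal()
    return y

def sample_autocorr(y, h):
    """Sample correlation between y[t] and y[t-h], for h >= 1."""
    return np.corrcoef(y[h:], y[:-h])[0, 1]

series = simulate_ar1(20_000)
first_half, second_half = series[:10_000], series[10_000:]

# For a stationary process the two halves should agree up to sampling noise,
# and for an AR(1) both should be close to phi**h.
for h in (1, 2, 3):
    print(h, sample_autocorr(first_half, h), sample_autocorr(second_half, h), 0.7**h)
```

If the two halves disagree badly on real data, that is a hint the series is not stationary and the lag structure may be drifting over time.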
Ultimately, I’m running multiple experiments with multiple lags. I used the term “correlation” loosely in my original post (probably not a good idea considering the audience).
But with multiple items needing a forecast, and all of them having different lengths of demand history, I was curious if there were some way to test (quickly) and see how many lags gave the best result. Your post was helpful and the modified version notebook you shared is clear and to the point.
Thank you for the help.
It might just come down to running a boatload of models and computing MAPE/RMSE/whatever for different specifications, especially if you have forecasts for lots of products that might all respond differently to the number of lags. But I think shrinkage is a good principled way to handle the problem.
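For what it's worth, the brute-force version can be sketched in plain NumPy. Everything here is an assumption for illustration: the simulated AR(2) series, the ordinary least-squares model, the 50-window holdout, and the helper names `make_windows` and `holdout_rmse`. A real comparison would swap in your own model and error metric.

```python
import numpy as np

def make_windows(series, n_lags, horizon):
    """Turn a 1-D series into (last n_lags values, next `horizon` values) pairs."""
    n = len(series) - n_lags - horizon + 1
    X = np.stack([series[i : i + n_lags] for i in range(n)])
    Y = np.stack([series[i + n_lags : i + n_lags + horizon] for i in range(n)])
    return X, Y

def holdout_rmse(series, n_lags, horizon=12, n_test=50):
    """Fit least squares on early windows, score RMSE on the last n_test windows."""
    X, Y = make_windows(series, n_lags, horizon)
    X = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    # Leave a gap of horizon - 1 windows so training targets never
    # overlap the time points used as test targets.
    n_train = len(X) - n_test - (horizon - 1)
    coef, *_ = np.linalg.lstsq(X[:n_train], Y[:n_train], rcond=None)
    resid = Y[-n_test:] - X[-n_test:] @ coef
    return float(np.sqrt(np.mean(resid**2)))

# Simulated AR(2) demand history, standing in for one product.
rng = np.random.default_rng(0)
series = np.zeros(600)
for t in range(2, 600):
    series[t] = 0.5 * series[t - 1] + 0.3 * series[t - 2] + rng.normal()

scores = {p: holdout_rmse(series, p) for p in (1, 2, 4, 6, 12)}
for p, rmse in scores.items():
    print(f"{p:>2} lags: holdout RMSE = {rmse:.3f}")
```

Because the last `n_test` windows always end at the end of the series, every lag count is scored on the same target values, which keeps the comparison fair.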
Yes, I think you’re right. I was “grasping at straws” hoping there was a way to avoid that.
Thinking about it a bit more, I’m not sure it makes sense to do a huge battery of model comparisons – the search space is too big. If you have information sharing between time series (i.e. hierarchical parameters), you are going to need a quasi-automatic way to pick the lag order. I’d recommend at the very least putting Laplace priors on the lag coefficients.
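To make the Laplace-prior suggestion concrete: with Gaussian noise of variance sigma^2 and independent Laplace(0, b) priors on the coefficients, the posterior mode is the solution of an L1-penalised least-squares problem with penalty sigma^2 / b. The sketch below finds that mode with a simple iterative soft-thresholding loop. It only gives a point estimate, not credible intervals (for those you would sample the model), and the simulated AR(2) data, the penalty value of 200, and the function names are all assumptions for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink each entry of z towards zero by t, clipping at zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def laplace_map(X, y, penalty, n_iter=2000):
    """Minimise 0.5 * ||y - X b||^2 + penalty * ||b||_1 by iterative soft-thresholding."""
    # Step size 1 / L, where L is the largest eigenvalue of X^T X.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        beta = soft_threshold(beta - step * grad, step * penalty)
    return beta

# Simulated series where only the first two lags matter.
rng = np.random.default_rng(1)
n, max_lag = 2000, 12
series = np.zeros(n)
for t in range(2, n):
    series[t] = 0.5 * series[t - 1] + 0.3 * series[t - 2] + rng.normal()

# Column k-1 of X holds lag k of the series; y is the one-step-ahead target.
X = np.column_stack([series[max_lag - k : n - k] for k in range(1, max_lag + 1)])
y = series[max_lag:]

beta = laplace_map(X, y, penalty=200.0)
for lag, b in enumerate(beta, start=1):
    print(f"lag {lag:>2}: {b: .3f}")
```

The idea is the one described above: offer the model more lags than you expect to need and let the prior push the unneeded ones towards zero, rather than refitting once per candidate lag order.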
I also don’t think you need to ever contemplate more lags than the seasonal order of the model. For example if you have weekly data, the max lag I’d consider is 4, otherwise I’d consider it monthly seasonality and handle it with Fourier bases or something. Daily data I’d consider up to 7, monthly up to 12, etc.
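By Fourier bases I mean sine/cosine regressors at harmonics of the seasonal period, added as extra columns alongside the lags. A minimal sketch, where the function name `fourier_features`, the period of 52, and the three harmonics are assumptions picked for illustration:

```python
import numpy as np

def fourier_features(t, period, n_terms):
    """Sine/cosine pairs for harmonics 1..n_terms of the given period."""
    k = np.arange(1, n_terms + 1)
    angles = 2.0 * np.pi * np.outer(t, k) / period
    # Columns: sin for each harmonic, then cos for each harmonic.
    return np.column_stack([np.sin(angles), np.cos(angles)])

t = np.arange(104)  # e.g. two years of weekly observations
X_season = fourier_features(t, period=52.0, n_terms=3)
print(X_season.shape)  # (104, 6)
```

More harmonics allow a wigglier seasonal shape, so `n_terms` is itself something you could put a shrinkage prior on.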
tl;dr: treat autoregressive terms like a garbage can for unexplainable, short-term effects. If you can explicitly model other structural elements (like seasonality, trend, cycle, etc), do that instead of relying on autoregression.