Predicting out-of-sample for autoregressive models

gkbk · March 21, 2018, 4:50pm

Hi All,

Is there an easy way to make out-of-sample predictions for autoregressive models? I’m running some forecasting models with two AR components (one is seasonal) and I can’t find a simple way to generate estimates. Any help would be appreciated.

colcarroll · March 22, 2018, 1:55am

Do you mean like sample_ppc? This is a good guide for that.

gkbk · March 24, 2018, 12:25pm

Thanks, @colcarroll !

It’s possible that I haven’t set up my AR coefficients correctly for these purposes. But since each new observation depends on the previous N observations, I can’t seem to forecast with PPC: I can’t pass y_{t-N} to a shared variable, because it won’t have been estimated yet for the future y_t I’m trying to predict.

In other words, if my latest in-sample time point is t_100 and the expectation for my likelihood is y_t = a + phi*y_{t-7} , then I have to estimate t_101 before I can get to t_108, which I’m not sure I can do with sample_ppc.
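To make the dependency concrete, here is a minimal sketch of the recursion for the mean only, using plain NumPy. The function name `forecast_mean` and the values of `a` and `phi` are made up for illustration; they are not taken from any fitted model.

```python
import numpy as np

def forecast_mean(y, a, phi, lag=7, steps=8):
    """Iterate y_t = a + phi * y_{t-lag} forward from the end of y."""
    path = list(y)  # observed history; each forecast is appended to it
    for _ in range(steps):
        # path[-lag] is an observed value at first, and an earlier
        # forecast once we are more than `lag` steps past the data
        path.append(a + phi * path[-lag])
    return np.array(path[len(y):])

y = np.arange(1.0, 101.0)  # stand-in data for t_1 .. t_100
fc = forecast_mean(y, a=0.5, phi=0.9, lag=7, steps=8)
```

Here `fc[0]` (t_101) depends only on observed data, while `fc[7]` (t_108) is computed from `fc[0]`, which is why the forecasts have to be generated in order.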

junpenglao · March 24, 2018, 4:29pm

I don't think you can do it using sample_ppc, as AR and AR1 do not have a random method.

You will have to write a generative function and index into the posterior samples (each point in the trace) to generate the PPC.
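As a rough sketch of what such a generative function could look like for the seasonal model y_t = a + phi*y_{t-7} plus Gaussian noise: the code below takes posterior draws as plain NumPy arrays and simulates one future path per draw. The names `simulate_forecasts`, `a_draws`, `phi_draws` and `sigma_draws` are assumptions for illustration; with a PyMC3 trace they would presumably come from something like `trace['a']`, depending on how the variables are named in the model. The draws used here are fake.

```python
import numpy as np

def simulate_forecasts(y, a_draws, phi_draws, sigma_draws, lag=7, steps=14, seed=None):
    """Simulate one future path per posterior draw; returns (n_draws, steps)."""
    rng = np.random.default_rng(seed)
    n_obs, n_draws = len(y), len(a_draws)
    # Every row starts from the same observed history
    paths = np.empty((n_draws, n_obs + steps))
    paths[:, :n_obs] = y
    for t in range(n_obs, n_obs + steps):
        # Mean uses each draw's own parameters and its own simulated past
        mean = a_draws + phi_draws * paths[:, t - lag]
        paths[:, t] = rng.normal(mean, sigma_draws)
    return paths[:, n_obs:]

# Fake posterior draws, standing in for values pulled from a trace
rng = np.random.default_rng(0)
y = rng.normal(size=100)
a_draws = rng.normal(0.0, 0.1, size=500)
phi_draws = rng.uniform(0.5, 0.9, size=500)
sigma_draws = np.abs(rng.normal(1.0, 0.1, size=500))

forecasts = simulate_forecasts(y, a_draws, phi_draws, sigma_draws, steps=14, seed=1)
lower, upper = np.percentile(forecasts, [2.5, 97.5], axis=0)  # 95% band per step
```

Percentiles over the first axis of `forecasts` give a predictive interval for each future time step.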

maxliving · November 29, 2018, 2:10pm

I believe I have a basic working version of this. Assumes your trace has a rho and scale. This also allows for multiple observations for the same date, which is often a case I’m working with. But if you want to just have one observation per date you could just remove the date_idx parts.

    import numpy as np

    def predict_outofsample(trace, date_idx):
        """
        trace: a pymc3 MultiTrace object
        date_idx: np.ndarray with shape (N_obs), indicating for each observation what date it corresponds to
            (so you can have multiple observations on the same day that will have the same prediction)
        """
        samples = []
        horizon = np.max(date_idx)
        # Simulate one AR(1) path per posterior draw
        for point in trace.points():
            rho, scale = point['rho'], point['scale']
            # Initial state, then `horizon` further steps (indices 0..horizon)
            thetas = [np.random.normal(loc=0, scale=scale)]
            for i in range(horizon):
                thetas.append(rho*thetas[-1] + np.random.normal(loc=0, scale=scale))
            samples.append(thetas)
        # Shape (n_draws, N_obs): look up each observation's date in every path
        return np.array(samples)[:, date_idx]

twiecki · November 30, 2018, 3:44pm

Alternatively, you can just append NaNs to your data for the period you want to predict. These will be interpreted as missing values, and HMC will generate a posterior predictive for them during inference. But it will be slower than your current approach.
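A minimal sketch of the data preparation for this, assuming a 1-D series of observations. The helper `pad_with_missing` is hypothetical, and the model itself is left out: the masked array would be passed as `observed=` in the model, and whether a particular AR distribution handles that well may depend on the PyMC3 version, which is not tested here.

```python
import numpy as np

def pad_with_missing(y, n_future):
    """Append n_future NaNs to y and mask them so they count as missing."""
    padded = np.concatenate([np.asarray(y, dtype=float),
                             np.full(n_future, np.nan)])
    # masked_invalid masks every NaN/inf entry
    return np.ma.masked_invalid(padded)

y_obs = np.array([1.2, 0.7, 1.9, 2.4])
y_padded = pad_with_missing(y_obs, n_future=3)
```

The number of future steps (`n_future`) is fixed when the data are built, so it has to be chosen before sampling.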

maxliving · November 30, 2018, 4:09pm

Thanks Thomas! Also then I would need to know at time of inference how many future dates I wanted to predict, right?

twiecki · December 3, 2018, 9:19am

Yes, that’s right.
