(Also, I did read the API docs on the seasonal components.)
I became convinced…
[Edit: I shortened my question a lot to save time] Basically, is the recommended approach to inject 0’s for the missing days in my data (Mondays and Tuesdays) when estimating seasonality, or should I just reduce the season length in the frequency and time seasonality components to match my data (e.g. 261 days per year instead of 365, since I only have 261 days of data)? And does it become impossible to forecast Mondays and Tuesdays without those injected 0’s?
Definitely don’t put 0. You can put np.nan and just let the filter interpolate the missing values, keeping the seasonal lengths at 365 for annual and 7 for weekly. But if you drop those days instead, you definitely need to set the seasonal lengths to 261 for annual and 5 for weekly; otherwise the filter won’t be able to learn the right pattern.
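If it helps, here’s a minimal pandas sketch of the two options (the dates, weekmask, and values are made up for illustration):

```python
import numpy as np
import pandas as pd

# toy series observed only Wednesday-Sunday (no Mondays or Tuesdays)
obs = pd.Series(
    np.random.normal(size=261),
    index=pd.bdate_range("2023-01-04", periods=261, freq="C",
                         weekmask="Wed Thu Fri Sat Sun"),
)

# Option A: reindex to a full daily calendar so the missing days become NaN,
# and keep the seasonal lengths at 7 (weekly) and 365 (annual)
full_index = pd.date_range(obs.index.min(), obs.index.max(), freq="D")
data_nan = obs.reindex(full_index)

# Option B: keep only the observed rows (no NaNs, no zeros), and shrink the
# seasonal lengths to 5 (weekly) and 261 (annual) to match
data_dropped = obs
```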
You can check whether an approach is right by making seasonal plots for each option and verifying that the seasonal pattern at a given frequency looks right in the data.
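For example, something like this (using the data_nan series from the sketch above) gives a quick day-of-week seasonal plot:

```python
import matplotlib.pyplot as plt

# quick seasonal plot: average value by day of week (NaNs are skipped by mean)
data_nan.groupby(data_nan.index.dayofweek).mean().plot(kind="bar")
plt.xlabel("day of week (0 = Monday)")
plt.show()
```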
Somehow I think I’m still not following the intended approach for using NaNs for missing data together with the seasonal components.
The first image below shows a failure to capture the weekly pattern. The second image shows a perfect capture of the weekly and quarterly, with residual noise in the quarterly because, I assume, I haven’t included a measurement error component yet.
If I put NaNs back into my data, I’ll get a seasonal output like this:
The other difference between the two images is that in the first I parameterize the seasonal components as (7, 365) and use data containing NaNs for the missing days, while in the second I use only my observed data (no NaNs, no zeros) and parameterize the seasonal components with 5 days per week and 261 days per year.
After your reply, I believed I could set my seasonal components to (7, 365) with NaNs for the missing data, and I thought the NaNs would be handled auto-magically by Kalman filter interpolation. But now I’m not sure what to do. Could you advise a little further?
Also, suppose I handle the NaNs and seasonal components correctly, but my model still has a hard time resolving the weekly variance with interpolated Mondays and Tuesdays. Would you then recommend injecting 0’s in place of the NaNs and adding an exogenous indicator variable called “Days Open”, to sidestep the Kalman interpolation creating weird variance patterns?
I’m asking because before I came to this forum, I noticed that variance kept flip-flopping between the weekly and quarterly components, with the two moving in tandem. I began to believe the weekly variance was being improperly estimated, causing the model to fail to resolve the patterns accurately.
n=4 on the annual component might be too expressive. As you increase n, the Fourier bases include higher and higher frequency components, so they can capture more ‘local’ variation. I suggest you keep n=1 or n=2 for the annual component: n=1 restricts you to pure sine waves, while n=2 allows for a “big peak, small peak” pattern. Usually I just want a pure wave in the annual component.
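Roughly what I mean, as a small numpy sketch (not the library’s internals): with n harmonics you get 2·n Fourier features, and anything beyond n=1 lets the annual shape bend away from a pure wave.

```python
import numpy as np
import matplotlib.pyplot as plt

def fourier_basis(t, season_length, n):
    # 2*n Fourier features: sine/cosine pairs for harmonics 1..n
    k = np.arange(1, n + 1)
    arg = 2 * np.pi * np.outer(t, k) / season_length
    return np.column_stack([np.sin(arg), np.cos(arg)])

t = np.arange(365)
# n=1 can only produce a pure wave; n=2 can produce a "big peak, small peak" shape
plt.plot(t, fourier_basis(t, 365, 1) @ np.array([1.0, 0.3]), label="n=1")
plt.plot(t, fourier_basis(t, 365, 2) @ np.array([1.0, 0.3, 0.6, -0.2]), label="n=2")
plt.legend()
plt.show()
```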
Also be really picky about setting the variance of the innovation priors; I’ve found that these models are extremely sensitive to this. I don’t have a good solution for it yet, but there are some R2D2-style priors out there that could be helpful someday.
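In the meantime, even plain tight priors on the innovation sigmas help a lot. A sketch (the variable names here are hypothetical; use whatever innovation parameters your components actually expect):

```python
import pymc as pm

with pm.Model() as m:
    # hypothetical names: match these to the innovation sigmas your
    # trend/seasonal components register
    sigma_weekly = pm.Gamma("sigma_weekly", alpha=2, beta=50)    # mean 0.04, concentrated near zero
    sigma_annual = pm.Gamma("sigma_annual", alpha=2, beta=200)   # even tighter
```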
You could complement that with the implicit priors approach presented by Buerkner at StanCon. I haven’t tried it yet, but my intuition is that it could be very helpful for improving priors of AR-style models.
I’m actually working on including both of these methods in one of our time series tutorials, but I keep getting delayed on it.
This approach is quite exciting. I agree it could be useful for improving priors in time series models. There are known bijective transformations that can be used to constrain SARIMA and VARMA parameters to be stationary/invertible, but they are computationally burdensome (recursive, of course), especially in the gradients. We could use them forward to get some draws from a stationary version of a user’s priors, then do prior preconditioning, remove the expensive transformations, and do NUTS.
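As a sketch of the “use them forward” idea (this is the standard partial-autocorrelation parameterization, not anything currently in PyMC): push unconstrained draws through tanh to get partial autocorrelations, then through a Durbin-Levinson-style recursion to AR coefficients, which are stationary by construction.

```python
import numpy as np

def pacf_to_ar(pacf):
    # Durbin-Levinson-style recursion: partial autocorrelations in (-1, 1)
    # -> coefficients of a stationary AR(p) process
    pacf = np.asarray(pacf, dtype=float)
    phi = np.zeros_like(pacf)
    for k in range(len(pacf)):
        prev = phi[:k].copy()
        phi[:k] = prev - pacf[k] * prev[::-1]
        phi[k] = pacf[k]
    return phi

# forward use: draws from an unconstrained (e.g. normal) prior -> stationary AR(2) draws
rng = np.random.default_rng(0)
raw = rng.normal(size=(1000, 2))
phis = np.array([pacf_to_ar(np.tanh(r)) for r in raw])

# every draw lands inside the AR(2) stationarity triangle
assert np.all(phis[:, 0] + phis[:, 1] < 1)
assert np.all(phis[:, 1] - phis[:, 0] < 1)
assert np.all(np.abs(phis[:, 1]) < 1)
```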
Yeah, I was thinking of doing that for the new sections I’m adding to the structural TS example on PyMC: it’s super hard to get a stationary prior on an AR(2) model, for instance.
Thanks for the input; in the end I did what you suggested and took my annual component to n=2. I believe the confusion between the weekly and annual components has to do with how constrained they both were, plus the fact that Tuesday is a NaN in my data, which has HUGE variance. So if I overly constrain my weekly component, other parts of the model start trying to account for the 5 standard deviations of variance on Tuesdays. Loosening up the weekly and tightening the annual to n=2 seems to have fixed my issues.
Thank you for your hard work on this; I’m really appreciating exogenous variables. Now I just need CUDA so that I don’t have to wait overnight for my model to finish.
GPU sampling should already work with numpyro mode. I haven’t personally had much success with it, but if you have access to multiple devices maybe you will have better results than me. I’m also running on WSL, and I think my drivers might be borked.
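If you want to try it, the call is roughly this (assuming a recent PyMC with numpyro installed and a CUDA build of JAX):

```python
import pymc as pm

with model:  # your already-built statespace PyMC model
    idata = pm.sample(nuts_sampler="numpyro")  # runs on the GPU if JAX can see one
```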
One thing I did have good success with is MAP + Laplace via JAX, on CPU and GPU. That’s in pymc_extras.inference.fit_laplace. There’s currently a bug, but it will be fixed very soon. That might be enough for your application, or at least it will help you iterate on models before you commit to a full overnight run.
No, that tells you to conda install better_optimize.
The bug is that gradient_backend isn’t respected when you call fit_laplace. So you’ll get an error about scan gradients even when you set gradient_backend='jax', which you need to do.
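Once that’s fixed, the call should look roughly like this (I’m assuming it picks the model up from the context manager; check the docstring for the exact signature):

```python
from pymc_extras.inference import fit_laplace

with model:
    # MAP + Laplace approximation; gradient_backend="jax" is the argument
    # the current bug ignores
    idata = fit_laplace(gradient_backend="jax")
```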