Recurring hierarchical partial pooling (at scale)

Hi everyone,

I need to forecast a few hundred time-series and was considering using partial pooling to benefit from cross-learning. However, I can’t fit the model to all time-series at once, as they become available at different times.

My question is whether I would always need to fit the model on all time-series available at a given time, or if there’s a way to reuse an already existing fit at a later point in order to fit the same model on some new time-series.

For example, I might have 1,000 time-series at time A, which I’d use for fitting right away. At a later time B, 50 new time-series become available: Do I need to fit a new model on the 1,050 time-series, or could I use the model from time A for the fitting at time B (which would hopefully be faster)?

Any advice would be appreciated. Thanks!

Expectation propagation might be the proper way to do this, but it is not available in PyMC yet (and I am not sure whether it is available in other PPLs at all). The challenge is that you want to fit the new time series while also updating the posterior you already have for the old ones.

An alternative is to construct a good VI approximation and train it with mini-batches. You can then treat the new time series as new batches.
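To make the "new series as new batches" idea concrete, here is a rough pure-Python sketch. It is not PyMC code and not ADVI: it swaps in a simpler stand-in, mini-batch stochastic-gradient ascent on the marginal likelihood of a toy normal hierarchical model, just to show the warm-start pattern. Everything in it is an assumption for illustration: the model (each series reduced to its mean, with a known noise scale `SIGMA`), the helper names `grad_step`, `train`, `shrink` and `simulate`, and the learning rate and batch size.

```python
import math
import random

SIGMA = 1.0  # assumed known observation noise (a simplification for this sketch)


def grad_step(batch, mu, log_tau, lr):
    """One gradient-ascent step on the marginal log-likelihood of a mini-batch.

    batch: list of (series_mean, n_obs) pairs.
    Marginally, series_mean ~ Normal(mu, tau^2 + SIGMA^2 / n_obs).
    """
    tau2 = math.exp(2.0 * log_tau)
    g_mu, g_lt = 0.0, 0.0
    for ybar, n in batch:
        v = tau2 + SIGMA**2 / n
        r = ybar - mu
        g_mu += r / v
        # chain rule: d v / d log_tau = 2 * tau2
        g_lt += (-0.5 / v + 0.5 * r * r / (v * v)) * 2.0 * tau2
    m = len(batch)
    return mu + lr * g_mu / m, log_tau + lr * g_lt / m


def train(series, mu, log_tau, epochs, batch_size=50, lr=0.05, seed=0):
    """Run mini-batch updates starting from the given (mu, log_tau)."""
    rng = random.Random(seed)
    series = list(series)
    for _ in range(epochs):
        rng.shuffle(series)
        for i in range(0, len(series), batch_size):
            mu, log_tau = grad_step(series[i:i + batch_size], mu, log_tau, lr)
    return mu, log_tau


def shrink(ybar, n, mu, log_tau):
    """Partially pooled estimate of one series' level given population params."""
    tau2 = math.exp(2.0 * log_tau)
    w = tau2 / (tau2 + SIGMA**2 / n)
    return w * ybar + (1.0 - w) * mu


def simulate(k, rng, mu=3.0, tau=2.0, n=20):
    """Toy data: k series, each summarised as (mean, number of observations)."""
    out = []
    for _ in range(k):
        theta = rng.gauss(mu, tau)
        ys = [rng.gauss(theta, SIGMA) for _ in range(n)]
        out.append((sum(ys) / n, n))
    return out


rng = random.Random(42)
series_a = simulate(1000, rng)  # available at time A
series_b = simulate(50, rng)    # arrive at time B

# Time A: fit the population-level parameters from scratch.
mu_a, lt_a = train(series_a, mu=0.0, log_tau=0.0, epochs=200)

# Time B: warm-start from the time-A fit and feed the new series as extra batches.
mu_b, lt_b = train(series_b, mu_a, lt_a, epochs=5)

# Partially pooled estimates for the new series, using the updated parameters.
pooled_b = [shrink(ybar, n, mu_b, lt_b) for ybar, n in series_b]
```

The point is only that the time-B step starts from `(mu_a, lt_a)` instead of refitting all 1,050 series. With a real VI approximation you would carry over the variational parameters in the same way. One caveat: if you only ever feed new batches, the fit slowly drifts toward the newest series, so mixing in some of the old series as additional batches is probably a good idea.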

All in all: https://twitter.com/junpenglao/status/1453298183132651522?s=20
