State Space Models in PyMC

Fair enough that I am mixing together two issues, inference and forecasting. When forecasting, of course all data is used, because the information set for out-of-sample data is all the data. Both methods, the Kalman filter and the GRW, should produce the same result. I will test this when I have time. In addition, outputs based on the Kalman smoothing distribution should be very close to the GRW results – I observed something like that way up in the beginning of this thread. So with the Kalman filter you get expanding-window train-test-split results for free: the forward pass never uses future information, and the backward (smoothing) pass then gives you full-information estimates.
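To make the filtered-versus-smoothed distinction concrete, here is a minimal sketch with a scalar local-level model in plain NumPy (not PyMC – just the textbook recursions, with parameters I made up for illustration). The forward pass at time t only ever touches `y[:t+1]`, which is why truncating the sample leaves the filtered estimates unchanged, while the RTS smoother re-visits every t using the full sample:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a scalar local-level model: x_t = x_{t-1} + w_t,  y_t = x_t + v_t
T, q, r = 100, 0.1, 1.0                      # length, state noise var, obs noise var
x = np.cumsum(rng.normal(0, np.sqrt(q), T))
y = x + rng.normal(0, np.sqrt(r), T)

def kalman_filter(y, q, r, m0=0.0, p0=10.0):
    """Forward pass: filtered mean/var at t uses only y[0..t]."""
    T = len(y)
    m, p = np.empty(T), np.empty(T)          # filtered moments
    mp, pp = np.empty(T), np.empty(T)        # one-step-ahead (predicted) moments
    m_prev, p_prev = m0, p0
    for t in range(T):
        mp[t], pp[t] = m_prev, p_prev + q    # predict
        k = pp[t] / (pp[t] + r)              # Kalman gain
        m[t] = mp[t] + k * (y[t] - mp[t])    # update with y[t] only
        p[t] = (1 - k) * pp[t]
        m_prev, p_prev = m[t], p[t]
    return m, p, mp, pp

def rts_smoother(m, p, mp, pp):
    """Backward pass: smoothed mean/var at t conditions on the full sample."""
    T = len(m)
    ms, ps = m.copy(), p.copy()
    for t in range(T - 2, -1, -1):
        g = p[t] / pp[t + 1]                 # smoother gain
        ms[t] = m[t] + g * (ms[t + 1] - mp[t + 1])
        ps[t] = p[t] + g**2 * (ps[t + 1] - pp[t + 1])
    return ms, ps

m, p, mp, pp = kalman_filter(y, q, r)
ms, ps = rts_smoother(m, p, mp, pp)

# Filtered estimates are unchanged if we truncate the sample at t = 50;
# smoothed estimates are not -- they use future observations.
m_trunc, *_ = kalman_filter(y[:50], q, r)
assert np.allclose(m[:50], m_trunc)
```

The final assertion is the "expanding window for free" point: re-running the filter on any prefix of the data reproduces the filtered path exactly, whereas the smoothed path changes whenever the sample endpoint moves.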

But there is still an issue with respect to inference, and there are reasons why you wouldn’t use future information when doing parameter inference. It depends on the perspective you want to take in your analysis: that of yourself, looking back from the future, or that of a historical agent who is generating the data. In the second context you aren’t ignoring data, you are handling the data in a manner that respects the data generating process. If you want to back-test a trading strategy based on market beta, for example, you need to use market beta estimates that only use information available up to time t, i.e. the Kalman filtered estimates. Otherwise your estimates will be “overfit” to the history that actually occurred, instead of the myriad that might have unfolded, given information at time t. This is the sense in which I meant to use the term “overfit”.
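The market-beta example can be sketched along the same lines. Below is a toy time-varying beta model, where beta follows a Gaussian random walk and the Kalman filter produces the estimate a historical agent could actually have computed at time t; the trading rule and all parameter values are made up purely for illustration. The key discipline is in the last two lines: the position taken over period t+1 is a function of the beta estimate at t, never anything later.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy DGP: r_asset_t = beta_t * r_mkt_t + v_t, with beta_t a random walk.
T, q, r = 250, 1e-3, 1e-4                 # length, state noise var, obs noise var
r_mkt = rng.normal(0.0, 0.01, T)
beta_true = 1.0 + np.cumsum(rng.normal(0, np.sqrt(q), T))
r_asset = beta_true * r_mkt + rng.normal(0, np.sqrt(r), T)

def filtered_beta(r_asset, r_mkt, q, r, m0=1.0, p0=1.0):
    """Kalman filter with time-varying design x_t = r_mkt[t].
    The beta estimate at t uses returns only up to and including t."""
    m, p = m0, p0
    out = np.empty(len(r_asset))
    for t in range(len(r_asset)):
        p += q                            # predict: beta is a random walk
        x = r_mkt[t]
        k = p * x / (x * p * x + r)       # gain for obs r_asset[t] = x*beta + v
        m += k * (r_asset[t] - x * m)     # update with period-t returns only
        p = (1 - k * x) * p
        out[t] = m
    return out

beta_hat = filtered_beta(r_asset, r_mkt, q, r)

# A back-test must act at t+1 on the beta known at t (no lookahead):
position = -np.sign(beta_hat[:-1] - 1.0)  # hypothetical hedging rule
pnl = position * r_asset[1:]
```

Using smoothed betas in `position` instead would leak the very future you are trying to predict into the signal, which is exactly the "overfit to the history that actually occurred" problem.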

These types of situations come up often in macroeconomics and finance. Again, my thinking on the issue is biased because that’s where I work. I can imagine there are similar concerns in other social sciences – election outcome prediction perhaps? – but I don’t have any expertise there so I couldn’t say.
