Hi all,
I am trying to implement a multivariate state-space model in PyMC for compositional time-series data.
Suppose I have multiple related time series observed over a common time index, but some series only begin partway through the sample. For example:
- Series A and B are observed from t=1
- Series C only starts at t=50
Before t=50, Series C is genuinely unobserved/nonexistent.
The paper I am reading proposes the following:
the ‘errors’ for these run-in periods are forced to zero using the formula
The idea is that the state-space recursion can still be written using the full dimensionality of y_{t}, while pre-entry observations remain np.nan.
What I am unsure about is how to handle these missing/pre-entry error components in PyMC so that they neither update the state nor enter the observed-data likelihood/objective.
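To make the question concrete, here is a minimal NumPy sketch of what I understand the masked recursion to be. It is only my reading, not the paper's specification: I assume a simple local-level innovations model per series with independent Gaussian errors, and the names `masked_innovations_filter`, `alpha`, `sigma`, and `level0` are hypothetical.

```python
import numpy as np


def masked_innovations_filter(y, alpha, sigma, level0):
    """Local-level innovations recursion with NaN-masked errors.

    Assumed (hypothetical) setup, not taken from the paper:
    y      : (T, k) array, np.nan where a series is unobserved
    alpha  : (k,) smoothing parameters
    sigma  : (k,) error standard deviations
    level0 : (k,) initial levels
    """
    y = np.asarray(y, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    T, k = y.shape
    level = np.asarray(level0, dtype=float).copy()
    levels = np.empty((T, k))
    loglik = 0.0
    for t in range(T):
        observed = ~np.isnan(y[t])
        # One-step-ahead error, forced to zero where y is missing.
        err = np.where(observed, y[t] - level, 0.0)
        # Only observed components contribute to the Gaussian log-likelihood.
        ll = -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (err / sigma) ** 2
        loglik += ll[observed].sum()
        # A zero error leaves the corresponding state component unchanged.
        level = level + alpha * err
        levels[t] = level
    return levels, loglik


# Toy data: Series C (column 2) enters at t=50, i.e. row index 49.
rng = np.random.default_rng(0)
y = rng.normal(size=(100, 3))
y[:49, 2] = np.nan

levels, loglik = masked_innovations_filter(
    y, alpha=np.full(3, 0.3), sigma=np.ones(3), level0=np.zeros(3)
)
```

In this sketch the state for Series C stays at its initial value until t=50, and the pre-entry rows add nothing to `loglik`. My guess is that the PyMC analogue would run the same masked update inside `pytensor.scan` and add the summed log-likelihood with `pm.Potential`, but I have not tried this.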
What would be the cleanest way to implement this in PyMC? Any guidance or example patterns would be greatly appreciated.
There is a public implementation available on GitHub, but it is not built on PyMC. Admittedly, I haven’t spent a lot of time with this problem and I know very little about PyMC, but having looked at that codebase, I am trying to find an equivalent of its approach in PyMC.