I am modeling a collection of time series. The data are right-censored to varying degrees (about 30% of the data is missing overall). There are some general trends in the data; specifically, most individual time series show a growth period up to a peak level, followed by an exponential decay to zero and/or a sharp drop to zero. The amplitude, the growth/decay rates, and the timing of the switch from growth to decay all differ between individual time series.
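To make the shape concrete, here is a minimal sketch of the kind of curve I have in mind (the function and parameter names are mine, not from my actual model): growth up to a peak time, then exponential decay toward zero.

```python
import numpy as np

def growth_decay(t, amplitude, growth_rate, decay_rate, t_peak):
    """Hypothetical parametric shape: saturating growth up to t_peak,
    then exponential decay back toward zero."""
    t = np.asarray(t, dtype=float)
    # Rising branch, scaled so it reaches `amplitude` exactly at t_peak.
    rising = amplitude * (1 - np.exp(-growth_rate * t)) / (1 - np.exp(-growth_rate * t_peak))
    # Falling branch: exponential decay from the peak.
    falling = amplitude * np.exp(-decay_rate * (t - t_peak))
    return np.where(t < t_peak, rising, falling)

t = np.linspace(0, 20, 201)
y = growth_decay(t, amplitude=5.0, growth_rate=0.4, decay_rate=0.3, t_peak=8.0)
```

The curve is continuous at the peak and stays non-negative, which is one of the properties the random-walk model fails to respect.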
I tried a few different models, including a hierarchical parametric model and a multivariate Gaussian Random Walk model. However, I can't seem to get a good fit to the data.
The flat parametric model works well on most individual time series, but the hierarchical version fits poorly. I am not sure why; my guess is that there is too much between-series variability for a multilevel model to pool effectively.
The Gaussian Random Walk model does not infer the missing data correctly: it imputes the process as continuing around the level of the last observation, whereas it should keep growing until some time point t and then decay to zero. Another issue is that it imputes a lot of the missing data as negative values, which makes no sense for the real-world application. I tried constraining the model to force the likelihood to be located above zero, but that slows the sampling down and makes the fit worse.
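For reference, the zero constraint I tried amounts conceptually to replacing the Normal likelihood with a Normal truncated at zero. A minimal scipy sketch (the values here are illustrative, not from my model):

```python
import numpy as np
from scipy.stats import norm, truncnorm

mu, sigma = 0.5, 1.0  # hypothetical latent level and noise scale

# Truncated-normal likelihood: support restricted to [0, inf).
# scipy's truncnorm takes the bounds in standardized units.
a = (0.0 - mu) / sigma
lik = truncnorm(a=a, b=np.inf, loc=mu, scale=sigma)

draws = lik.rvs(size=1000, random_state=0)
# All mass sits above zero, but near the bound the density is inflated
# relative to the untruncated Normal by the renormalization constant.
```

My understanding is that when the latent level is near zero, this renormalization distorts the likelihood quite a bit, which may be part of why the truncated version fits worse.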
At this point, I am contemplating fitting separate flat parametric models for the individual time series. However, that might be computationally and logistically difficult, as I would be looking at fitting ~2k models. The second issue is that I would like to use the time series with less right-censoring to impute the ones with more missing data (either via their fitted parameters or directly from the data), and I am not quite sure how to do that.
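What I mean by the per-series flat fits, as a rough sketch: fit each series on its uncensored prefix and extrapolate over the censored tail. The curve here is a simplified single-phase stand-in and all names are hypothetical; in practice I would loop this over all ~2k series.

```python
import numpy as np
from scipy.optimize import curve_fit

def decay_curve(t, amplitude, decay_rate):
    # Simplified stand-in for the full growth/decay curve.
    return amplitude * np.exp(-decay_rate * t)

rng = np.random.default_rng(1)
t = np.arange(30, dtype=float)

# Synthetic stand-in for one series, right-censored after index 20.
true_y = decay_curve(t, 4.0, 0.2)
observed = true_y[:20] + rng.normal(0.0, 0.05, size=20)

# Fit the flat model on the observed prefix only ...
params, _ = curve_fit(decay_curve, t[:20], observed, p0=[1.0, 0.1])

# ... then extrapolate the fitted curve over the censored tail.
imputed = decay_curve(t[20:], *params)
```

This only uses each series' own data, though, which is exactly the limitation: the heavily censored series get no help from the well-observed ones.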
Does anyone have a suggestion for how I could model this data? Any tips on what I have not tried, or have done wrong, would be greatly appreciated!