CausalPy - Evaluating uncertainty for Interrupted time series

Hi @aabugaev, thanks for the interest in CausalPy. I’ll have a stab at answering your questions. I’ll just drop the image from the page you linked to make this easier.

1 - Cumulative causal impact

There’s a slight misunderstanding in what’s happening here. We aren’t in fact summing intervals to get cumulative ones.

If we were in point-estimate land then you'd have just one predicted time series. To calculate the causal impact you simply subtract the predicted time series from the observed time series. Calculating the cumulative causal impact is then just a matter of taking the cumulative sum of these differences.
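
Just to make that concrete, here's a tiny illustrative sketch of the point-estimate version (made-up numbers, not CausalPy's actual code):

```python
import numpy as np

# observed and predicted cover the post-intervention period
observed = np.array([10.2, 11.5, 12.1, 13.0, 14.2])
predicted = np.array([10.0, 10.8, 11.5, 12.1, 12.8])  # counterfactual forecast

impact = observed - predicted           # causal impact at each time point
cumulative_impact = np.cumsum(impact)   # running total of the impact
```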

The only thing we are doing differently here is to apply that exact algorithm to each MCMC sample. So now, rather than having 1 predicted time series, you might have 1000 for example.

This is done with this line here:

self.post_impact_cumulative is an xarray object and you can read the cumsum API here. So we are effectively doing the same cumulative sum operation you would do on a single time series in point-estimate land, just applied to every MCMC sample at once.
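
If it helps, here is a sketch of that idea with fake posterior samples. I'm assuming a `(chain, draw, obs_ind)` layout like the one ArviZ-style xarray objects use; the dimension names and numbers here are purely illustrative:

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(42)

# fake per-timestep causal impacts for 4 chains x 1000 draws x 50 time points
post_impact = xr.DataArray(
    rng.normal(loc=0.5, scale=1.0, size=(4, 1000, 50)),
    dims=("chain", "draw", "obs_ind"),
)

# cumulative sum along the time dimension only, computed independently
# for every one of the 4 * 1000 posterior samples
post_impact_cumulative = post_impact.cumsum(dim="obs_ind")
```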

2 - Why do Confidence [credible] Intervals grow with time?

Under the Bayesian approach we have credible intervals, which are different from confidence intervals, but I’ll leave that for your third question.

You are right that this is currently just linear regression, so there is no explicit time series modelling going on here. What you'd expect, then, is no increasing uncertainty in the actual post-intervention estimate and causal impact (top and middle plots), and that's exactly what we see.

There is a plan to add actual time series modelling here, in which case you would expect to see increasing uncertainty into the future.

Anyway, the point is that we are seeing an increase in the uncertainty of the cumulative causal impact. When you remember that what's going on is simply cumulative sums computed independently for each MCMC sample, it becomes pretty intuitive: each sample's running total drifts further from the others as more differences are added, so the spread across samples grows over time. But let me know if that is enough to make the penny drop or not.
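
If it helps, here's a little toy demonstration (not CausalPy code, and the numbers are made up): even when the per-timestep impact has roughly constant uncertainty across samples, the interval on the cumulative sum widens with time.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_timesteps = 1000, 50

# fake per-timestep impacts for each "MCMC sample"
impact_samples = rng.normal(loc=0.5, scale=1.0, size=(n_samples, n_timesteps))

# cumulative sum along time, independently for each sample
cumulative = np.cumsum(impact_samples, axis=1)

# width of the 94% interval at the first and last post-intervention time point
lo, hi = np.percentile(cumulative, [3, 97], axis=0)
print(f"interval width at t=1:  {hi[0] - lo[0]:.2f}")
print(f"interval width at t=50: {hi[-1] - lo[-1]:.2f}")  # much wider
```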

3 - Intervals

Time is a bit short on my side, so I can't give a whole primer here. Rather than confidence intervals, Bayesians use credible intervals. And yes, there is also the Bayesian posterior predictive distribution. In short, this doesn't just capture uncertainty in the expected value; it also takes the observation noise / likelihood distribution into account, so it can be thought of as a prediction of what you are likely to observe next.
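
A quick toy sketch of the distinction, with made-up numbers and (for simplicity) a known observation noise:

```python
import numpy as np

rng = np.random.default_rng(1)

# posterior draws for the mean mu -> credible interval for the *expected* value
mu_draws = rng.normal(loc=5.0, scale=0.2, size=10_000)

# adding observation noise gives posterior predictive draws for a *new* observation
sigma = 1.0
ppc_draws = rng.normal(loc=mu_draws, scale=sigma)

credible_interval = np.percentile(mu_draws, [3, 97])
predictive_interval = np.percentile(ppc_draws, [3, 97])
print("94% credible interval for the mean:", credible_interval)
print("94% posterior predictive interval: ", predictive_interval)  # noticeably wider
```

The predictive interval is wider because it combines the uncertainty in the mean with the spread of the likelihood, which is exactly why it's the right thing to compare against individual future observations.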