I was following the CausalPy tutorial below for Interrupted Time Series, using it for a simple pre/post comparison.
The pre-period is the old world, where an old model controlled the outputs; the post-period is the new world, after we rolled out a new version of the model.
I have fitted a very basic model with a simple formula containing just two variables: a time counter (t) and another variable that has an effect on the target KPI.
Is there any documentation to help with interpreting the causal impact output I am getting? That is, I want to be able to trace the mean impact back to the actual KPI uplift (both as a percentage and in monetary value).
Is there a way to specify more than one treatment time for the same model? Basically I am aiming to compare (pre-test vs during-test) and (pre-test vs post-test) impacts in one go.
Any help or documentation you can point me towards would be a big help!
Thank you!
There is a small bit of documentation in the notebook around that cell. result.post_impact contains the causal impact for each post-intervention time point.
In short, it represents the difference between the observed post-intervention data and the posterior predictive distribution. The latter represents what we would have expected to see in the absence of an intervention, because the predictions come from a model whose parameters were learnt from the pre-intervention data.
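In code terms, the idea is simply this (a toy numpy sketch, not CausalPy's internal implementation):

```python
import numpy as np

# Toy illustration: 5 post-period time points and 1000 posterior
# draws of the counterfactual prediction.
rng = np.random.default_rng(0)
observed_y = np.array([110.0, 112.0, 115.0, 113.0, 118.0])
counterfactual_y_hat = rng.normal(100, 2, size=(1000, 5))

# The causal impact is observed minus counterfactual, for each draw,
# giving a full posterior distribution of the impact per time point.
post_impact = observed_y - counterfactual_y_hat  # shape (1000, 5)
print(post_impact.mean(axis=0))  # posterior mean impact per time point
```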
Given that result.post_impact has values for all post-intervention time points, you may wish to summarise or aggregate it in various ways. In the notebook we talked about taking the mean, which equates to the average causal impact over the entire post-intervention period. There is a warning box in the notebook highlighting that this metric might not be the most useful if you have a transient causal impact. So you may instead want to look at the sum, which equates to the total causal impact of the intervention.
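To make that concrete, here is a rough sketch of those aggregations. Note the assumptions: `result` is your fitted InterruptedTimeSeries, the time dimension is named "obs_ind" (check result.post_impact.dims in your version), and `observed_post_total` / `value_per_unit` are your own numbers, not CausalPy attributes.

```python
# Posterior mean of the average per-period causal impact
avg_impact = result.post_impact.mean().item()

# Posterior distribution of the *total* causal impact over the post period:
# sum over time within each posterior draw, then summarise across draws
total_impact = result.post_impact.sum(dim="obs_ind")
print(total_impact.mean().item())
print(total_impact.quantile([0.03, 0.97]).values)  # 94% credible interval

# Monetary uplift: scale the total impact by your value per KPI unit
value_per_unit = 2.5  # hypothetical number
monetary_uplift = total_impact * value_per_unit

# Percentage uplift: the counterfactual total is the observed total minus
# the impact (observed_post_total = your observed post-period KPI sum)
pct_uplift = 100 * total_impact / (observed_post_total - total_impact)
```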
Namely, it allows you to model the intervention time as an unknown latent variable (a parameter of the model). That is, it will give you a posterior distribution over the treatment time.
This is not yet ready, but could you say more about what your ultimate goal would be here?
If you want to evaluate the causal impact during the intervention and after it, then you can just aggregate the post-intervention causal impact over the different time periods you want to look at. result.post_impact is an xarray object, so you can slice out the time periods you want. Does this sound close to what you want?
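Something like this (the coordinate name and dates are placeholders; inspect result.post_impact.coords for your actual ones):

```python
# Split the post-period impact into "during test" and "after rollout" windows
during_test = result.post_impact.sel(obs_ind=slice("2023-01-01", "2023-02-28"))
after_rollout = result.post_impact.sel(obs_ind=slice("2023-03-01", None))

# Total causal impact in each window, summarised over posterior draws
print(during_test.sum(dim="obs_ind").mean().item())
print(after_rollout.sum(dim="obs_ind").mean().item())
```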
I’m writing to give you more details about that PR and check whether it could actually be useful for your case.
At the moment, the InterruptedTimeSeries feature in CausalPy only supports a single treatment time. That might still work for you if your goal is simply to compare two periods separated by a specific point in time, although you'd have to run it separately for each comparison.
However, if what you’re trying to do is estimate when the causal impact starts, or account for uncertainty around multiple possible treatment times given your assumptions about the intervention’s effects, then I highly recommend taking a look at this notebook from the PR.
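To give a flavour of the general idea, here is a minimal standalone PyMC sketch (not the PR's actual implementation; all names and numbers are illustrative): put a prior on the unknown intervention time and let the level shift once it is crossed.

```python
import numpy as np
import pymc as pm

# Synthetic KPI with a trend and a level shift at t = 60
rng = np.random.default_rng(42)
t = np.arange(100)
y = 0.5 * t + (t > 60) * 10 + rng.normal(0, 2, size=100)

with pm.Model():
    # Latent intervention time with a flat prior over the observed range
    switchpoint = pm.DiscreteUniform("switchpoint", lower=0, upper=99)
    beta = pm.Normal("beta", 0, 1)      # pre-existing trend
    delta = pm.Normal("delta", 0, 10)   # level shift at the intervention
    sigma = pm.HalfNormal("sigma", 5)
    mu = beta * t + pm.math.switch(switchpoint <= t, delta, 0.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=y)
    idata = pm.sample()

# idata.posterior["switchpoint"] is then a posterior distribution over
# when the intervention took effect.
```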
To help us better understand your use case, could you share a bit more about your main goal (e.g. detecting a change point, quantifying an effect, etc.)? That would help us guide you more effectively, and improve the tool if needed!
Thanks @drbenvincent, that makes the rationale behind result.post_impact very clear, as well as the ways to use it when interpreting the causal impact. Very valid call-out of the important warning about the effect waning in the post-intervention period; I had been thinking about that too, in terms of where to draw the line (an end_date filter on the query when gathering post-period data) when calculating the impact. Thanks also for linking the PR with the docs improvements!
There has been a slight development at my end since I asked the question, and there are two lines of thought here:
A case where, for a period of time, the population was exposed to multiple iterations of a treatment: essentially a series of A/B tests carried out until the most optimal treatment was found (say iteration v5). Here, slicing the result.post_impact xarray object as @drbenvincent suggested, or running the analysis separately for each comparison as you suggested, solved my problem: I wanted to exclude the during-test period from my post-period data so as to gauge the true causal impact of rolling out iteration v5 only.
The other use case, slightly related to the PR you mention here but not necessarily direct change-point detection, is about spotting and accounting for other unrelated or external treatments that might be at play in the time-series data. Examples: a marketing/discounts campaign, or the release of a new product feature that might explain part of the effect, rather than purely attributing all of the effect to the treatment time, if that makes sense.