New PyMCon Talk Released: Bayesian Causal Modeling by Thomas Wiecki & Ben Vincent

Welcome to the 10th event of the PyMCon Web Series! As part of this series, most events will have both an asynchronous component and a live Q&A.

Speaker: Thomas Wiecki, CEO & founder of PyMC Labs.
Event type: Recorded Talk with Live Q&A
Q&A Date/Time: 2023-09-28T13:00:00Z (subscribe here for email updates)
Register for Q&A: Meetup event (to get the Zoom link)
Website: PyMCon Events · PyMCon Web Series

NOTE: This session is exclusively for Q&A. We kindly request that you watch the recording before joining the event. The event will also be recorded. Subscribe to the PyMC YouTube channel for notifications.

Abstract of the talk:

Causal analysis is rapidly gaining popularity, but why? Machine learning methods might help us predict what’s going to happen with great accuracy, but what’s the value of that if it doesn’t tell us what to do to achieve a desirable outcome? Without a causal understanding of the world, it’s often impossible to identify which actions lead to a desired outcome.

Causal analysis is often embedded in a frequentist framework, which comes with some well-documented baggage. In this talk, Thomas will present how we can super-charge PyMC for Bayesian Causal Analysis by using a powerful new feature: the do operator.

Content

Slides:

Code:

About the Speaker:

Sponsor

We thank our sponsors for supporting PyMC and the PyMCon Web Series. If you would like to sponsor us, contact us for more information.

ADIA Lab is an independent, Abu Dhabi-based laboratory dedicated to basic and applied research in data and computational sciences.
ADIA Lab focuses on societally important topics such as climate change and energy transition, blockchain technology, financial inclusion and investing, decision making, automation, cybersecurity, health sciences, education, telecommunications, and space, by conducting cutting-edge research in Data Science, Artificial Intelligence, Machine Learning, and High-Performance Computing.


Does the do-operator in PyMC work with time series? If I intervene on a state at time t, does this propagate through to later states in the model?

In your example, if I recall correctly, you implemented the do(z) operator with a binary variable. How does this change when z is a continuous variable? I believe this may be relevant to your HelloFresh example, but I am not sure I am making the connection.

What changes to the Bayesian workflow (Gelman et al., 2020) should we consider making when we’re also trying to follow a “causal workflow”?

Hi, super cool talk. When a node is fixed to a certain value with the do operator, what happens to its parent nodes (the ones above it in the hierarchy)? Are they sampled from the prior, or are they simply not used in the model anymore? Thanks!


You would probably need to create a new predictive model to intervene more granularly inside a time series (that is, manually, rather than relying on do to handle it for you). Something like the forecasting examples in this blog post: Out of model predictions with PyMC - PyMC Labs
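To make the time-series question concrete, here is a minimal sketch in plain NumPy (not PyMC's API) of what a manual intervention inside a time series looks like. The AR(1) process, the helper name `simulate_ar1`, and all parameter values are assumptions chosen for illustration. The state at one time step is overwritten with a fixed value, and every later state is then computed from it.

```python
import numpy as np


def simulate_ar1(n_steps, rho, sigma, rng, intervene_at=None, value=None):
    """Simulate x[t] = rho * x[t-1] + noise, optionally forcing x[intervene_at] = value.

    Hypothetical helper for illustration only; it is not part of PyMC.
    """
    x = np.zeros(n_steps)
    for t in range(1, n_steps):
        # Draw the noise first so both runs consume the same random numbers.
        x[t] = rho * x[t - 1] + sigma * rng.normal()
        if intervene_at is not None and t == intervene_at:
            # Intervention: ignore the usual dynamics and fix the state.
            x[t] = value
    return x


rho, sigma, n_steps = 0.8, 0.1, 20

# Same seed for both runs, so the only difference is the intervention.
baseline = simulate_ar1(n_steps, rho, sigma, np.random.default_rng(0))
intervened = simulate_ar1(
    n_steps, rho, sigma, np.random.default_rng(0), intervene_at=10, value=5.0
)

# States before the intervention are identical; later states carry its effect,
# which decays geometrically with rho in this linear model.
effect = intervened - baseline
print(effect[8:14])
```

In this sketch the intervention leaves earlier states untouched and propagates to all later ones because each state is generated from the previous one. A PyMC version would need the time dependence written out in the predictive model in a similar way.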


If they are not connected to the likelihood through any other path, then yes, they will be sampled from the prior (in prior and posterior sampling). There is a kwarg in do to remove such variables: prune_vars=True.

https://www.pymc.io/projects/docs/en/stable/api/model/transform/generated/pymc.model.transform.conditioning.do.html


The type of variable being intervened upon does not change the process. Do you have a specific concern in mind?


EDIT: I replied to the wrong spot so I am moving it to the correct reply.

Hi Ricardo,

Thomas was able to answer this in the discussion. He pointed out that we get so used to thinking about distributions that we forget about the concept of point estimates altogether :laughing:. He also noted that it does not matter whether z is discrete or continuous; however, we do need to remember that do(z) will be a constant value. That clarified the issue for me.

The second question I asked was more clarification about do vs observe in time series problems. I was wondering if we can treat observe as the do operator if we just replace our data with a simulated time series using observe. He pointed out that, no, we cannot do that because observe just replaces the data while maintaining the graph structure, whereas do breaks the graph structure to isolate that point of intervention. That helped clarify the difference between these two operators for me.
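The distinction can be illustrated numerically with a small simulation. This is a plain NumPy sketch, not PyMC's `do` or `observe`, and the confounded model (`u` affects both `z` and `y`) is an assumption chosen for illustration. Conditioning on z near 1 keeps the link between `u` and `z`, while intervening sets z = 1 regardless of `u`.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Structural model (assumed for illustration): u -> z, u -> y, z -> y
u = rng.normal(size=n)
z = u + rng.normal(size=n)
noise_y = rng.normal(size=n)
y = z + 2.0 * u + noise_y

# Observing: keep the graph intact and look only at draws where z is close to 1.
# Those draws tend to have larger u, so y is pulled up by the confounder.
near_one = np.abs(z - 1.0) < 0.05
mean_y_given_z = y[near_one].mean()

# Intervening: cut the u -> z edge by setting z = 1 for every draw,
# then regenerate y from the same u and noise.
z_do = np.ones(n)
y_do = z_do + 2.0 * u + noise_y
mean_y_do_z = y_do.mean()

print(f"E[y | z near 1]  = {mean_y_given_z:.2f}")
print(f"E[y | do(z = 1)] = {mean_y_do_z:.2f}")
```

For this particular model the conditional mean is about 2 in theory, while the interventional mean is about 1, so the two quantities differ whenever a confounder like `u` is present.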

Thanks all, this was great!


If you couldn’t attend the live Q&A session, you can watch the recording on YouTube. Here’s the link:
