Partial Missing Multivariate Observation and What to Do With Them by Junpeng Lao

Talk Abstract

Missing values are pretty common in any real-world data set. While PyMC3 provides convenient automatic imputation, how do we verify that it works, especially when dealing with multivariate observations that have partially missing values? Come to this tutorial to find out!

Junpeng Lao Twitter @junpenglao


All the code is in this notebook:

Junpeng Lao

Junpeng Lao is a PyMC developer and currently a data scientist at Google. He also contributes to TensorFlow Probability and various other open source libraries.

This is a PyMCon 2020 talk

Learn more about PyMCon!

PyMCon is an asynchronous-first virtual conference for the Bayesian community.

We posted all the talks here on Discourse on October 24th, one week before the live PyMCon session, for everyone to see and discuss at their own pace.

If you are available on October 31st, you can register for the live session here! If you are not, don’t worry: all the talks are already available here on Discourse (keynotes will be posted after the conference), and you can network here on Discourse and on our Zulip.

We value the participation of each member of the PyMC community and want all attendees to have an enjoyable and fulfilling experience. Accordingly, all attendees are expected to show respect and courtesy to other attendees throughout the conference and at all conference events. Everyone taking part in PyMCon activities must abide by the PyMCon Code of Conduct. You can report any incident through this form.



I guess my talk did not exactly match the abstract, as I ended up explaining missing value handling in general :sweat_smile:
The funny story is that after I submitted the abstract, I realized we don't need any special treatment to model partially missing multivariate observations: PyMC3 does its magic automatically!
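
To make "imputation as inference" a bit more concrete, here is a minimal sketch in plain Python. It is not PyMC3's implementation and not code from the talk. It assumes a bivariate normal with known mean, standard deviations and correlation, in which case the posterior of a missing second component given the observed first one is the analytic conditional normal. The function names and the example numbers are hypothetical and chosen only for illustration.

```python
import math
import random


def conditional_normal(x1, mu, sigma, rho):
    """Mean and std of x2 | x1 for a bivariate normal.

    mu = (mu1, mu2), sigma = (s1, s2), rho = correlation.
    Assumes all parameters are known (an illustrative simplification).
    """
    mu1, mu2 = mu
    s1, s2 = sigma
    cond_mean = mu2 + rho * (s2 / s1) * (x1 - mu1)
    cond_std = s2 * math.sqrt(1.0 - rho ** 2)
    return cond_mean, cond_std


def impute_rows(rows, mu, sigma, rho, rng):
    """Replace a missing second component (None) with a draw from x2 | x1."""
    completed = []
    for x1, x2 in rows:
        if x2 is None:
            m, s = conditional_normal(x1, mu, sigma, rho)
            x2 = rng.gauss(m, s)
        completed.append((x1, x2))
    return completed


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows = [(0.5, 1.2), (2.0, None), (-1.0, None)]  # None marks a missing entry
completed = impute_rows(rows, mu=(0.0, 1.0), sigma=(1.0, 2.0), rho=0.8, rng=rng)
```

In a full Bayesian model the parameters are unknown too, so the missing entries are sampled jointly with them rather than drawn once from a fixed conditional. The sketch only shows why the observed part of a row carries information about the missing part.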


Feel free to edit it however you’d like!

Thanks! Just to give the conference a bit more of an authentic feel, I will be the one giving a talk that is not the same as the abstract at all :joy:


@junpenglao Really loved the talk and all the details of how to think of missing values from a Bayesian perspective. Interestingly, I am recording a talk for PyData Global in a couple of weeks, which also tries to look at missing value imputation as Bayesian inference. I focus on the issues that simplistic imputation was causing me in my work, and talk about the “iterative imputer” in sklearn that I used to impute (and how the iterative imputer is doing approximate inference in a Bayesian model).

Don’t have a link to the talk yet, but check out the slides if you are interested:

Also, I have added a link to your tutorial in my slides for people to get a deeper understanding of the topic :slight_smile:
