Microbial Cell Counting in a Noisy Environment by Cameron Davidson-Pilon

Talk Abstract

In this LBAM, we’ll introduce the microbiological task of cell counting and walk through all the potential sources of error involved. We’ll model each source of error probabilistically, introduce priors, and then discuss inference on the posterior. Finally, we’ll explore how to extend the model for use in a calibration curve for other instruments. Only basic probability theory is required for this LBAM.
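As a rough illustration of the kind of model described in the abstract (this sketch and all of its numbers are my own assumptions, not taken from the talk): if the true cell concentration gets a Gamma prior and the count observed in a known, diluted volume is Poisson, the model is conjugate and the posterior is available in closed form.

```python
import numpy as np
from scipy import stats

# Hypothetical setup: cells are counted in a small known volume after dilution.
# True concentration c (cells/mL) has a Gamma prior; the observed count in
# volume v at dilution d is Poisson(c * v * d). Gamma-Poisson conjugacy gives
# the posterior directly.
alpha_prior, beta_prior = 2.0, 1e-6   # vague Gamma prior (shape, rate) - assumed
v = 1e-4                              # counted volume in mL - assumed
d = 1e-3                              # dilution factor - assumed
count = 47                            # observed cells - made-up data

# Conjugate update: shape gains the count, rate gains the effective volume.
alpha_post = alpha_prior + count
beta_post = beta_prior + v * d

posterior = stats.gamma(a=alpha_post, scale=1.0 / beta_post)
print(posterior.mean())          # posterior mean concentration (cells/mL)
print(posterior.interval(0.9))   # 90% credible interval
```

In a real analysis the dilution and pipetted volume would themselves be uncertain (that is the point of the talk), which breaks conjugacy and is where a PPL like PyMC comes in.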

Cameron Davidson-Pilon Twitter @cmrn_dp


Cameron Davidson-Pilon

Cameron Davidson-Pilon has worked in many areas of applied statistics, from the evolutionary dynamics of genes to the modeling of financial prices. His contributions to the community include lifelines, an implementation of survival analysis in Python; lifetimes; and Bayesian Methods for Hackers, an open-source and printed book on Bayesian analysis. Formerly Director of Data Science at Shopify, Cameron is now applying data science to food microbiology.

This is a PyMCon 2020 talk

Learn more about PyMCon!

PyMCon is an asynchronous-first virtual conference for the Bayesian community.

We posted all the talks here on Discourse on October 24th, one week before the live PyMCon session, for everyone to see and discuss at their own pace.

If you are available on October 31st you can register for the live session here! If you are not, don’t worry: all the talks are already available here on Discourse (keynotes will be posted after the conference), and you can network here on Discourse and on our Zulip.

We value the participation of each member of the PyMC community and want all attendees to have an enjoyable and fulfilling experience. Accordingly, all attendees are expected to show respect and courtesy to other attendees throughout the conference and at all conference events. Everyone taking part in PyMCon activities must abide by the PyMCon Code of Conduct. You can report any incident through this form.

If you want to support PyMCon and the PyMC community but you can’t attend the live session, consider donating to PyMC

Do you have suggestions to improve PyMCon? We have an anonymous suggestion box waiting for you

Have you enjoyed PyMCon? Please fill our PyMCon attendee survey. It is open to both async PyMCon attendees and people taking part in the live session.

Speaker Tag: @CamDavidsonPilon


This was really neat!

I gotta ask: what gear did you use for your microscope such that your iPhone was able to capture that? I get the impression you’ve managed pretty good bang for your buck gear-wise. Your talk might have just inspired me to take up yet another nerdy hobby.

1 Like

I have a similar project where I need to count the number of bacteria in 100 fields of view. Though the counting is done by a neural network, I think it would be interesting to incorporate Bayesian methods in this case as well. Thanks for the inspiration!

1 Like

Hi Cameron. This is an excellent talk – thanks very much! I am not sure I understand what you were saying near the end about moving uncertainty around like sand. You made it sound like there’s no benefit, as if using a better pipette would not reduce the uncertainty of the estimate. That’s not what you meant to say, is it?


Thanks for the kind words, Allen!

Let me try to explain more carefully here. By using a better pipette, you would reduce the uncertainty. However, that uncertainty has not vanished; it has been concentrated near the point estimate. Like sand under a carpet, we can only move it around, never remove it. Better measurements and more data (usually) pile it into a taller, narrower peak, but the total amount of sand, i.e. the probability mass, is always the same. Graphically:
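To make the sand analogy concrete, here is a hedged numerical sketch (my own made-up numbers, using normal-normal conjugacy rather than anything from the talk): a better pipette means smaller measurement noise, which narrows the posterior, yet the density still integrates to 1; the mass just piles up closer to the estimate.

```python
# Normal measurement model with known noise sd and a weak normal prior.
# Posterior precision = prior precision + n * measurement precision.
def posterior_sd(prior_sd, noise_sd, n_obs):
    precision = 1.0 / prior_sd**2 + n_obs / noise_sd**2
    return precision ** -0.5

# Same prior and sample size; only the pipette (noise_sd) changes.
cheap = posterior_sd(prior_sd=10.0, noise_sd=2.0, n_obs=5)
fancy = posterior_sd(prior_sd=10.0, noise_sd=0.5, n_obs=5)
print(cheap, fancy)   # the fancy pipette yields a narrower posterior
```

So the better pipette genuinely helps decision making (a narrower interval), even though no probability mass ever disappears.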


Great talk! I’m amazed how well "n = 1 observation" worked. I guess it was an observation of a handful of squares.

I work on survey statistics for measuring carbon in agricultural soil (for quantifying carbon credits), so it was fun to see analogs in your talk. I’m interested in Bayesian approaches to survey inference, and it seems your solution could be cast in that way, as a simple random sample of molecules from your beaker.

Thanks for the fun talk.

1 Like

I don’t understand this explanation; I think Allen is right. Using a better pipette, which yields a more concentrated posterior, narrows the credible interval, and that is a good thing for your final decision making.

A cool follow-up analysis could be:

  • we value uncertainty (e.g., the half-width of the 90% interval divided by the estimated concentration) using some functional relationship (e.g., $100 for each percentage point of reduced uncertainty)
  • the fancy pipette costs $X

At what price $X is it worth buying the fancy pipette?
1 Like