The MLDA Multilevel Sampler in PyMC3 by Tim Dodwell, Mikkel Lykkegaard, and Grigorios Mingas

Talk Abstract

This presentation will give you the chance to know more about PyMC3’s new multilevel MCMC sampler, MLDA, and help you use it in practice. MLDA exploits multilevel model hierarchies to improve sampling efficiency compared to standard methods, especially when working with high-dimensional problems where gradients are not available. We will present a step-by-step guide on how to use MLDA within PyMC3, go through its various features and also present some advanced use cases, e.g. employing multilevel PDE-based models written in FEniCS and using adaptive error correction to correct model bias between different levels.

Tim Dodwell Twitter @proftimdodwell Personal website
Mikkel Lykkegaard Personal website
Grigorios Mingas GitHub gmingas


Tim Dodwell

Prof. Tim Dodwell has a personal chair in Computational Mechanics at the University of Exeter, is the Romberg Visiting at Heidelberg in Scientific Computing and holds a 5-year Turing AI Fellowship at the Alan Turing Institute, where he is also an academic lead.

Mikkel Lykkegaard

Mikkel Lykkegaard is a PhD student with the Data Centric Engineering Group and Centre for Water Systems (CWS) at University of Exeter. His research is mainly concerned with Uncertainty Quantification (UQ) for computationally intensive forward models.

Grigorios Mingas

Dr. Grigorios Mingas is a Senior Research Data Scientist at The Alan Turing Institute. He received his PhD from Imperial College London, where he co-designed MCMC algorithms and hardware to accelerate Bayesian inference. He has experience in a wide range of projects as a data scientist.

This is a PyMCon 2020 talk

Learn more about PyMCon!

PyMCon is an asynchronous-first virtual conference for the Bayesian community.

We posted all the talks here on Discourse on October 24th, one week before the live PyMCon session, for everyone to see and discuss at their own pace.

If you are available on October 31st, you can register for the live session here! If you are not, don’t worry: all the talks are already available here on Discourse (keynotes will be posted after the conference), and you can network here on Discourse and on our Zulip.

We value the participation of each member of the PyMC community and want all attendees to have an enjoyable and fulfilling experience. Accordingly, all attendees are expected to show respect and courtesy to other attendees throughout the conference and at all conference events. Everyone taking part in PyMCon activities must abide by the PyMCon Code of Conduct. You can report any incident through this form.

If you want to support PyMCon and the PyMC community but you can’t attend the live session, consider donating to PyMC

Do you have suggestions to improve PyMCon? We have an anonymous suggestion box waiting for you

Have you enjoyed PyMCon? Please fill in our PyMCon attendee survey. It is open to both async PyMCon attendees and people taking part in the live session.

Thanks for the clear presentation! Delayed acceptance MCMC is something I want to learn about, and you did a great job of explaining it. Some ideas after watching the presentation:

  1. Would it be possible to run the coarse subchains in batch, like particle filtering? For example, instead of evaluating one subchain for 10 steps, we run 10 independent subchains for 1 step each. For systems with batch support like TFP, it might improve the runtime greatly.
  2. I like the fact that you can create an approximate coarse model by either subsampling the data or coming up with a cheaper (approximate) likelihood. I am wondering if there are techniques or information we could retain during the MLDA step, to then select the “best” approximate coarse model?
    2.1 Would it be possible to have a coarse model with a free parameter that is trained during the MLDA step? Like plugging in an ADVI approximation that gets trained during sampling as well.

Hi @junpenglao
Thanks for your interest in our sampler.

  1. That is certainly an interesting idea. One consideration would be which of the subchains would be chosen as the proposal. For the chain to be Markovian, there has to be detailed balance. If the subchain were chosen randomly, I don’t think there would be any problem. But if it is chosen by some criterion, I think we would have to work that into the acceptance ratio. It’s a bit late here, so I can’t wrap my head around it now, but I will have a think about it.

  2. The models are hierarchical, so all of the coarse models are used, in turn. Proposals trickle up from the bottom and are evaluated at each level before reaching the top. But any of those levels could be multiple models, I suppose. They could live inside a tt.Op. I don’t think it is possible with the code we have developed now, since a coarse model doesn’t see the finer model, and the fine model essentially only sees the log-probability of the coarse model, not its output. But it’s definitely possible to implement.
    2.1 Yes, that should be possible, but I haven’t tried it myself. The different levels have to have the same free parameters when you specify them, but you could just not make use of that parameter on the higher levels. This would work only for the coarsest level, I think. I don’t know what would happen if some mid-level model had a free parameter that isn’t used by the models below, which are doing the random walks. Then you could end up having some very poor proposals, since that parameter wouldn’t be constrained by the levels below.
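
To make the "proposals trickle up" picture concrete, here is a minimal two-level sketch in plain Python. It is not the PyMC3 implementation: the two log-densities, the function names and the `subsampling_rate` argument are all hypothetical stand-ins chosen for illustration. A short random-walk Metropolis subchain runs on the coarse log-density, and its end point is offered to the fine level, which accepts or rejects it with the standard two-level delayed acceptance ratio. Note that the fine level only ever uses the coarse model's log-probability.

```python
import math
import random


def log_post_fine(theta):
    # Hypothetical "expensive" target: standard normal, up to a constant.
    return -0.5 * theta ** 2


def log_post_coarse(theta):
    # Hypothetical cheap approximation: shifted, wider normal.
    return -0.5 * ((theta - 0.3) / 1.2) ** 2


def coarse_subchain(theta, n_steps, step_size, rng):
    """Run n_steps of random-walk Metropolis on the coarse log-density."""
    logp = log_post_coarse(theta)
    for _ in range(n_steps):
        cand = theta + rng.gauss(0.0, step_size)
        logp_cand = log_post_coarse(cand)
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < logp_cand - logp:
            theta, logp = cand, logp_cand
    return theta


def two_level_da(n_samples, subsampling_rate=5, step_size=1.0, seed=0):
    """Two-level delayed acceptance; returns fine samples and accept rate."""
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # The end point of the coarse subchain is the fine-level proposal.
        proposal = coarse_subchain(theta, subsampling_rate, step_size, rng)
        # Fine-level ratio: correct the coarse ratio with the fine one.
        log_alpha = (log_post_fine(proposal) - log_post_fine(theta)
                     + log_post_coarse(theta) - log_post_coarse(proposal))
        if math.log(1.0 - rng.random()) < log_alpha:
            theta = proposal
            accepted += 1
        samples.append(theta)
    return samples, accepted / n_samples


if __name__ == "__main__":
    samples, rate = two_level_da(20000)
    mean = sum(samples) / len(samples)
    print(f"fine-level mean: {mean:.3f}, acceptance rate: {rate:.2f}")
```

Even though the coarse density in this toy example is deliberately biased, the fine-level samples should still follow the fine density, because the correction term in `log_alpha` cancels the coarse model's error. With more levels, the same construction would be applied recursively, each level using the one below it as its proposal generator.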


Ah, so more thought goes into the design of the coarse models to make sure the hierarchical nature makes sense. That’s very enlightening, thanks for the answer!

For reference, these are the links to the PRs that have been merged so far: