Affine Invariant Markov chain Monte Carlo (MCMC) Ensemble sampler

I am trying to recreate a model from a journal article that used the affine invariant Markov chain Monte Carlo (MCMC) ensemble sampler. The code was written in MATLAB and I just want to recode it in Python. The model is rather simple. All they did was regress a time series of heart rates on activity (step count) data, with an additional AR(1) (autoregressive) factor that conditions the noise at the current time step on the noise at the previous time step. There is an equivalent package for affine invariant MCMC in Python called emcee v3. However, I am wondering whether there is any additional benefit to using the affine invariant ensemble sampler over HMC or NUTS. Should I bother trying emcee v3, or just implement everything in PyMC? Any advice is appreciated. Thanks a ton!
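For concreteness, the model I have in mind is roughly this (my notation, not necessarily the paper's):

$$\mathrm{HR}_t = \alpha + \beta\,\mathrm{steps}_t + \epsilon_t, \qquad \epsilon_t = \rho\,\epsilon_{t-1} + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \sigma^2)$$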

Nat

Welcome!

Emcee is very easy to use from an API standpoint, but it is just a sampler, not a full-blown probabilistic programming language (PPL). You just build a function that takes parameter values and returns a log posterior probability (likely written in plain Python, perhaps with help from numpy and scipy.stats), pass that function to emcee, and you're pretty much good to go. I don't know of any particular advantages to ensemble sampling (which is not to say there aren't any), but it does have a reasonably well-known problem with moderate- to high-dimensional models. So it really depends on what you need. If you are looking to closely replicate the paper's procedure, emcee seems like a good choice. If you are just looking to recreate the model, but you are interested in a) using a PPL that makes model-building relatively easy and b) a wide variety of sampling tools and convenience functions, then PyMC is much more full-featured.
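To make that concrete, here's a minimal sketch of the emcee v3 workflow for a plain linear regression (variable names, priors, and the simulated data are mine, just for illustration; a real model would add the AR(1) term):

```python
import numpy as np
import emcee

# Simulated stand-ins for your heart-rate and step-count series
rng = np.random.default_rng(0)
steps = rng.poisson(100, size=200).astype(float)
hr = 60 + 0.1 * steps + rng.normal(0, 2, size=200)

def log_prob(theta):
    """Return the log posterior for (alpha, beta, log_sigma)."""
    alpha, beta, log_sigma = theta
    if not -10 < log_sigma < 10:  # crude flat prior with bounds
        return -np.inf
    sigma = np.exp(log_sigma)
    resid = hr - (alpha + beta * steps)
    # Gaussian log likelihood
    return -0.5 * np.sum(resid**2 / sigma**2 + np.log(2 * np.pi * sigma**2))

ndim, nwalkers = 3, 32
p0 = rng.normal([60.0, 0.1, 0.0], 0.1, size=(nwalkers, ndim))
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 5000)
samples = sampler.get_chain(discard=1000, flat=True)  # drop burn-in, flatten walkers
```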

[edit: should have mentioned that the affine invariance is a definite benefit if you have skewed/poorly scaled distributions, but maybe that's evident from the name?]


I want to highlight something @cluhmann said:

You just build a function that takes parameter values and returns a log posterior probability

As in, you don't additionally provide the gradient of the log posterior probability to emcee, like you would have to for HMC or NUTS. This is the big advantage of the Goodman & Weare sampler (emcee). Sometimes, to calculate your likelihood, you have to run some big computer simulation or something that isn't possible to differentiate through. In that case you can't really use HMC, so emcee is a great choice. It's popular in astronomy and physics because of this; lots of big computer simulations.
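For example, emcee is perfectly happy with a likelihood that just wraps a black-box simulator (run_simulation, observed, and noise_sd here are hypothetical placeholders):

```python
import numpy as np

def log_prob(theta):
    # run_simulation is some expensive, non-differentiable black-box model;
    # emcee only ever needs its output, never its gradient
    predicted = run_simulation(theta)
    return -0.5 * np.sum((observed - predicted) ** 2 / noise_sd**2)
```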

If your model is an AR(1), or some variant of it, that's definitely differentiable, so I'd bet you'd get good results with HMC/NUTS. I'm very biased, of course, but I do think the easiest way would be to implement it in PyMC.
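Here's a rough sketch of what that could look like in PyMC (the priors and variable names are mine, not the paper's; hr and steps stand in for your observed arrays):

```python
import pymc as pm

with pm.Model() as model:
    alpha = pm.Normal("alpha", 0, 10)
    beta = pm.Normal("beta", 0, 1)
    rho = pm.Uniform("rho", -1, 1)    # AR(1) coefficient on the noise
    sigma = pm.HalfNormal("sigma", 5)

    mu = alpha + beta * steps
    # Regression with AR(1) errors, written as a conditional likelihood:
    # hr[t] | hr[t-1] ~ Normal(mu[t] + rho * (hr[t-1] - mu[t-1]), sigma)
    pm.Normal(
        "obs",
        mu=mu[1:] + rho * (hr[:-1] - mu[:-1]),
        sigma=sigma,
        observed=hr[1:],
    )
    idata = pm.sample()  # NUTS by default
```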


See, I knew there had to be something more to it. I guess with autodiff being what it is these days, that benefit wasn't super obvious to me (also, I'm not an astrophysicist).


Thank you all so much for your input. The paper said that this ensemble sampler is particularly helpful given the degree of missingness in their dataset. However, I don't find that reason really satisfying in this case. Since emcee is really straightforward to use, I am going to try both to see if there is a real difference. I will follow up with the results in this post 🙂


It is important to note, though, like you said, that in moderate to high dimensions HMC >> emcee. Also, I found an example of using emcee to sample a PyMC3 model: emcee + PyMC3 | Dan Foreman-Mackey
