What sampler should I use for the fastest inference in hierarchical Bayesian MMM?

Hi all,

I’m working on a hierarchical Bayesian Marketing Mix Model in PyMC and I’m trying to figure out which sampler would give me the best performance (fastest convergence and wall-clock time) for this type of model.

The model size can vary:

  • Timestamps ~ 200

  • Geography ~ 4 to 50

  • Features ~ 30
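For concreteness, here's a rough sketch of the data scale at the upper end of those sizes (synthetic data, names are placeholders, not our actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

n_time, n_geo, n_feat = 200, 50, 30   # upper end of the sizes above

# Synthetic feature/media tensor: one slice per (time, geo) pair
X = rng.gamma(shape=2.0, scale=1.0, size=(n_time, n_geo, n_feat))
y = rng.normal(size=(n_time, n_geo))

# A hierarchical MMM with per-geo coefficients partially pooled toward a
# global mean has on the order of this many free parameters
# (geo-level betas plus a global mu/sigma per feature):
n_params = n_geo * n_feat + 2 * n_feat
print(X.shape, y.shape, n_params)   # (200, 50, 30) (200, 50) 1560
```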

Given this structure, I’m wondering:

  1. Which sampler is generally recommended for hierarchical MMMs in PyMC for speed?

    • Our baseline today is NUTS (NumPyro backend) on CPU, which is the fastest method we’ve consistently used so far.
  2. We expected a speedup from moving to a GPU (T4, float32), but so far it has been slower.
    (I’ve seen people report big gains, but I’m not sure that holds for a hierarchical nonlinear MMM of such a “small” size.)

  3. If anyone has benchmarks or practical experience comparing sampler performance for hierarchical MMMs, I’d love to hear about them.

Thanks in advance!

Have you tried nutpie with the Numba backend? That’s usually the fastest option on CPU, and you can usually get away with fewer tuning steps.


Hi, there are some published benchmarks in this article: PyMC-Marketing vs Meridian: Benchmarking Performance, Accuracy & Scalability. The article itself is about pymc-marketing vs Meridian, but it has some nice metrics buried in there: four different PyMC-compatible samplers across four different MMM sizes. NumPyro does really well for trivially small models, and then nutpie takes the lead as the model size increases. Those benchmarks are from CPUs.

GPUs may or may not be faster. Sometimes when people find the GPU is faster, they really just mean that having a big processor is faster. A few operations benefit from the architectural features of GPUs, but not in the same way neural networks do. Here’s a nice article detailing some limitations of GPUs for NUTS sampling: Running Markov Chain Monte Carlo on Modern Hardware and Software.
