Differences of PyMC3 posterior compared to Numpyro and R rethinking packages

ben_slacker · June 8, 2021, 4:54pm

While working on a particular problem from Chapter 7 of Statistical Rethinking, I observed a pretty striking difference in PyMC posterior distributions compared to Numpyro and the original R rethinking package. The original problem is the same one described by roesta07 here which has the original R code. It’s a simple linear model which can be instantiated this way in PyMC3:

with pm.Model() as m7_1:
    a = pm.Normal("a", mu=0.5, sd=1)
    b = pm.Normal("b", mu=0, sd=10)
    log_sigma = pm.Normal("log_sigma", mu=0, sd=1)
    mu = a + b * df_brains["mass_std"]
    brain_std = pm.Normal(
        "brain_std", mu=mu, sd=np.exp(log_sigma), observed=df_brains["brain_std"]
    )
    sample = pm.sample(1000, tune=1000, return_inferencedata=True)

The posterior distribution of all three parameters (a, b, log_sigma) that PyMC3 outputted was different than the Numpyro and R rethinking package. The most obvious difference is when looking at the log_sigma value (or transformed to sigma). Here is looking at PyMC3 vs. running code from the Numpyro repo of Chapter 7.

The difference is clear by looking at the mean for log-sigma:
PyMC3 mean for log-sigma: -1.3848
Numpyro mean for log-sigma: -1.7089
R rethinking mean for log-sigma: -1.7048

The similarity between Nympyro and R’s output leads me to believe that PyMC3 is “wrong”.

As mentioned, the difference is also seen in other parameters (means are the same but variances are higher in both a and b parameters).

Is there a rational reason for PyMC3 showing these differences compared to Numpyro and R rethinking?

(New user restrictions prevent me from adding more links but hope this explanation is sufficient. I’m aware of the PyMC3 repo for Statistical Rethinking, but it does not setup this problem the way the book does.)

ricardoV94 · June 8, 2021, 5:43pm

It seems both the book and the Numpyro example are using the Laplace Approximation, whereas in PyMC3 we are using full MCMC + NUTS. I would expect the PyMC3 posterior to be more accurate than either of those two.

ben_slacker · June 8, 2021, 6:12pm

Oh… yes, that’s it… I re-ran the R code using the rethinking’s ulam package (which does MCMC) and I get back a mean for log-sigma that is closer to PyMC3: -1.3879. The a and b parameters using ulam also show a variance more in line with PyMC3.

Thanks for the quick response and helping me clear up this mystery @ricardoV94!

Topic		Replies	Views
Any reason why the HPD in Pymc3 differs from what the code in Statistical Rethinking gets? Questions	2	494	August 26, 2018
Pymc3 produces different results than Stan/NumPyro v3	8	1253	March 15, 2022
Is there any way to to calculate lppd (pymc3)? Questions	1	739	June 8, 2021
Divergent behavior between Kruschke's BESTmcmc and PyMC3 Questions	2	535	December 16, 2018
Converting Model from Stan Questions	15	3133	June 27, 2017

Differences of PyMC3 posterior compared to Numpyro and R rethinking packages

Related topics