Slow sampling speed with newer versions of PyMC

I am experiencing a really weird issue where newer versions of PyMC are significantly slower than an older version of PyMC. Here is a reproducible example:

  • There is an environment.yml (statistical-rethinking-2023.yml) file in this repo that will set up a conda environment with PyMC version 5.8.0

  • In the same repo there is a “Lecture 18 - Missing Data.ipynb” notebook where code block number 25 samples in 1 minute for me on macOS 14.4.1 using the conda environment that is set up with the environment.yml file from above

  • Now if I run conda update --all, this installs PyMC version 5.15.0. Code block number 25 from the aforementioned notebook then samples extremely slowly, with an ETA of 15 hours until completion. I have let this sample for 10 minutes to make sure it is not just an issue with the ETA

I am wondering if this is just an issue on MacOS? Can someone else reproduce this?

Installing old versions of PyMC and then trying to update them is not recommended as it can lead to version conflicts and other issues with your python environment.

The official installation instructions are here. In this case, you may need other packages in order to run the Rethinking notebooks. You can try and figure out what those packages are and install them into the environment you have freshly created and installed PyMC into. Alternatively, you can ask the person responsible for those notebooks for a more up-to-date environment specification.

Hey @cluhmann, thank you for your response. I had already tried installing PyMC into a new conda environment using the linked instructions. I just tried again with a fresh install, following the instructions in the link you provided. I am still experiencing slow sampling. There aren’t any external samplers being used in this example either. Are you unable to reproduce the issue that I am experiencing?

A couple of questions. First, when you import pymc in a fresh python session, are there any warnings or errors? If so, what are they? Second, what is the output when you run conda list in the relevant environment?

@cluhmann Thank you for helping me with this. I get no warning messages when importing pymc. Here is the output from conda list:

Ok, that doesn’t look terrible at a quick glance. Do you have an example of a model that is sampling slowly? Something smaller would be easier to work with.

Yeah, there is a model already set up in this notebook in code block 25. Everything is already set up; all you have to do is execute the code blocks. When I install PyMC using the official instructions, sampling is extremely slow. However, when I use the environment.yml file that installs version 5.8.0, sampling finishes in 1 minute.

I can also pull out the relevant code and paste it here if you prefer?

I’m a bit suspicious of generic openblas being installed alongside the (correct) arm64 libblas. I’d run the test_blas script with:

python `python -c "import os, pytensor; print(os.path.dirname(pytensor.__file__))"`/misc/

And make sure the right version is getting picked (check blas__ldflags in the output)
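As a complement to the script, the same setting can be read directly from Python. This is only a minimal sketch: it assumes PyTensor exposes the value as `pytensor.config.blas__ldflags` (the name mentioned above), and the helper name `blas_ldflags` is made up for illustration. It returns None if PyTensor is not installed in the active environment.

```python
import importlib.util


def blas_ldflags():
    """Return PyTensor's configured BLAS linker flags, or None if PyTensor is missing."""
    if importlib.util.find_spec("pytensor") is None:
        return None  # PyTensor is not installed in this environment
    import pytensor

    # Assumption: the config attribute is named blas__ldflags, as in the post above.
    return pytensor.config.blas__ldflags


flags = blas_ldflags()
print("blas__ldflags:", flags)
```

An empty string here would suggest that no BLAS library was picked up at all, which is worth ruling out before looking at anything else.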

Thank you @jessegrabowski for your suggestion. I ran the command you suggested and got:


All of this is correct (or at least it matches what I get on my macbook). I’ll run the notebook you’re talking about a bit later and see if I get the same slow sampling.

In the meantime, can you try sampling the model with an alternative backend (either jax or numba) and let me know if that’s also slow?
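One way to compare backends side by side is a small timing harness like the sketch below. The helper `time_backends` and the stand-in `fake_run` are hypothetical names used only for illustration. With a real model, the callable would be something like `lambda name: pm.sample(nuts_sampler=name)` inside the model context, assuming the optional backend packages (numpyro, nutpie) are installed.

```python
import time


def time_backends(run, backends):
    """Call run(name) for each backend; record elapsed seconds, or the exception raised."""
    timings = {}
    for name in backends:
        start = time.perf_counter()
        try:
            run(name)
            timings[name] = time.perf_counter() - start
        except Exception as err:  # keep going so one failing backend does not hide the rest
            timings[name] = err
    return timings


def fake_run(name):
    """Stand-in for sampling; a real run would call pm.sample(nuts_sampler=name)."""
    if name == "broken":
        raise NotImplementedError("op not implemented for this backend")


timings = time_backends(fake_run, ["fast", "broken"])
for name, result in timings.items():
    print(name, result)

# With a real model (not executed here):
# with naive_imputation_model:
#     time_backends(lambda name: pm.sample(nuts_sampler=name), ["pymc", "numpyro", "nutpie"])
```

Recording the exception instead of re-raising it keeps the comparison going when one backend lacks an implementation for some operation.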

Thank you @jessegrabowski. I tried using numpyro and nutpie. Numpyro samples in around 6 minutes; however, for the model in code block 36 it complains about some “Not Implemented” function. Nutpie samples the model but is extremely slow. These models are sampled in around 2 - 3 minutes using the native pymc sampler when I set up the environment using the environment.yml file that is included in the repo.

Well, I can report slow sampling on my end. If there really is a performance regression, it’d be good to know exactly which PR caused it. You said it runs fast on 5.8. I looked through the changelogs, and 5.9.1 has a PR that directly touched MvN. Otherwise I don’t see any obvious candidates, though that doesn’t mean it wasn’t something else. If you’re willing to keep testing, I’d ask you to try 5.9 and 5.9.1. If those are both good, see if you can find the newest version that samples quickly for you.

Try it just with conda create -n pymc_env "pymc==5.9.0" (rather than updating the environment yaml from that rethinking repo), because the environment in that repo has a ton of other packages that I’d rather not have interfere with the speed tests.
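The search for the newest fast version can be organised as a bisection, which needs fewer fresh environments than trying every release in order. The sketch below assumes a single regression point (every version before it is fast, every version after it is slow). `newest_fast_version` and the `pretend_fast` predicate are hypothetical: in practice the predicate is a manual step (create the environment, sample, note the speed), and the pretend answer is not a real result.

```python
def newest_fast_version(versions, is_fast):
    """Bisect versions (sorted oldest to newest) for the newest one where is_fast is True.

    Assumes all fast versions come before all slow ones. Returns None if none is fast.
    """
    lo, hi = 0, len(versions)  # invariant: versions[:lo] are fast, versions[hi:] are slow
    while lo < hi:
        mid = (lo + hi) // 2
        if is_fast(versions[mid]):
            lo = mid + 1
        else:
            hi = mid
    return versions[lo - 1] if lo > 0 else None


# Versions mentioned in this thread; the full release list would go here.
versions = ["5.8.0", "5.9.0", "5.9.1", "5.15.0"]


def pretend_fast(version):
    """Made-up stand-in for a manual speed test; not a measured result."""
    return version in {"5.8.0", "5.9.0"}


newest = newest_fast_version(versions, pretend_fast)
print(newest)
```

If the slowdown turns out not to depend on the PyMC version at all, the single-regression assumption fails and the result of a bisection like this is not meaningful.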

Thank you @jessegrabowski. Yes, I am willing to do that testing today and will report back here.

Okay, @jessegrabowski, it might not be the PyMC version that is the culprit. I installed a fresh environment using conda create -n pymc_env "pymc==5.8.0". This matches the PyMC version that samples quickly when installed from the env.yml file in the repo. Sampling in this fresh environment is still very slow. Here is the output from conda list for the fresh env:

Here is the output from conda list from the environment that samples quickly:

I have tried downgrading the following packages from the fresh env to match the fast sampling env but none of them made the sampling fast:


I am not sure what other packages are relevant here?
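Rather than guessing which packages matter, the two conda list outputs can be diffed mechanically. The sketch below assumes the usual whitespace-separated layout of conda list (name, version, build, channel) with `#` comment lines. The function names and the package names and versions in the example are made up for illustration and are not taken from the real environments.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from the text output of `conda list`."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) >= 2:
            build = fields[2] if len(fields) > 2 else ""
            packages[fields[0]] = (fields[1], build)
    return packages


def diff_envs(slow_text, fast_text):
    """Return (name, slow_entry, fast_entry) for every package that differs."""
    slow, fast = parse_conda_list(slow_text), parse_conda_list(fast_text)
    return [
        (name, slow.get(name), fast.get(name))
        for name in sorted(set(slow) | set(fast))
        if slow.get(name) != fast.get(name)
    ]


# Made-up example input, not the real environments from this thread.
slow_example = """
# Name                    Version                   Build  Channel
examplelib                2.0.0                h111_0    conda-forge
sharedlib                 1.4.2                h222_0    conda-forge
"""
fast_example = """
# Name                    Version                   Build  Channel
examplelib                1.9.0                h333_0    conda-forge
sharedlib                 1.4.2                h222_0    conda-forge
onlyfast                  0.1.0                py_0      conda-forge
"""

rows = diff_envs(slow_example, fast_example)
for name, slow_entry, fast_entry in rows:
    print(f"{name}: slow={slow_entry} fast={fast_entry}")
```

Comparing the build string as well as the version matters here, since two BLAS builds can share a version number and still differ in which implementation they link.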

The relevant packages should only be pymc, pymc-base, pytensor, pytensor-base, and the blas library.

What if you try the apple accelerate blas? conda install libblas=*=*accelerate*

I just tried the accelerate blas by installing it in the fresh environment but sampling is still slow. Would there be any packages that are used specifically for MvN in PyMC?

@ricardoV94 @aseyboldt

For ease of reference, here is a copy+pastable snippet of the model we’re trying to figure out:

import pymc as pm
import pytensor.tensor as pt
import pandas as pd
import numpy as np

df = pd.read_csv('', delimiter=';')
distance_matrix = pd.read_csv('', delimiter=';')
distance_matrix.columns =

df.dropna(subset='brain', inplace=True)
distance_matrix = distance_matrix.loc[df.index, :].loc[:, df.index].copy()

D_mat = (distance_matrix / distance_matrix.max())

def log_standardize(x):
    x = np.log(x)
    return (x - x.mean()) / x.std()

coords = {'primate': df['name'].values}
with pm.Model(coords=coords) as naive_imputation_model:
    G_obs = log_standardize(df.group_size.values)
    M_obs = log_standardize(df.body.values)
    B_obs = log_standardize(df.brain.values)
    # Priors
    alpha = pm.Normal("alpha", 0, 1)
    beta_G = pm.Normal("beta_G", 0, 0.5)
    beta_M = pm.Normal("beta_M", 0, 0.5)

    # Phylogenetic distance covariance prior, L1-kernel function
    eta_squared = pm.TruncatedNormal("eta_squared", 1, .25, lower=.001)
    rho = pm.TruncatedNormal("rho", 3, .25, lower=.001)
    K = pm.Deterministic('K', eta_squared * pt.exp(-rho * D_mat))

    # Naive imputation for G, M
    G = pm.Normal("G", 0, 1, observed=G_obs, dims='primate')
    M = pm.Normal("M", 0, 1, observed=M_obs, dims='primate')

    # Likelihood for B
    mu = alpha + beta_G * G + beta_M * M
    pm.MvNormal("B", mu=mu, cov=K, observed=B_obs)

    naive_imputation_inference = pm.sample()

Thank you for all of your help @jessegrabowski. I really appreciate it.

Hey @jessegrabowski, I think I figured out the problem. Would you mind trying this on your end to confirm it works?

1). conda create -n pymc_env "pymc==5.8.0"
2). Sample the model, confirm it is slow, then interrupt the kernel
3). conda install libopenblas=0.3.24 libsqlite=3.43.0 (you may need to restart Jupyter/VS Code)
4). Sample the model and confirm it is fast

I can also confirm that this works for me with the latest version of PyMC