Non-centering results in lower effective sample size

I am playing around with a simple multi-level model and finding that when I non-center the parameters, the number of effective samples is much lower than when I use a centered parameterization. This is the opposite of what I was expecting and I am wondering if I am making a mistake or if this something that happens sometimes and I should not worry about it.

The data that I am using is from Richard McElreath’s Statistical Rethinking course and can be found here: .

The code for the model is basically just taken from the Statistical Rethinking Course (see week video from 15-Feb 11 here, but I non-centered the parameters. It models the probability of survival of tadpoles in different ponds (also referred to as tanks). Each row of the dataset corresponds to a different pond. There are small and large ponds and they contain different numbers of tadpoles which is taken into account by the variable n below. The fraction of tadpoles that survive in each pond is modeled by a Binomial distribution.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pymc3 as pm
import arviz as az

df1 = pd.read_csv('reedfrogs.csv', sep=';')
N_TANKS = len(df1)

with pm.Model() as model_1_1:
    pond = pm.Data('pond', df1.index.values.astype('int'))
    n = pm.Data('n', df1.density.values.astype('int'))
    survival_obs = pm.Data('survival_obs', df1.surv)
    a_bar = pm.Normal('a_bar', mu=0, sigma=1.5)
    sigma = pm.Exponential('sigma', lam=1)
    # This parametrization does not give a warning
    # a = pm.Normal('a', mu=a_bar, sigma=sigma, shape=N_TANKS)
    # use non-centered variables for better sampling
    # this gives me a warning, but I don't undertsand why:
    # "The number of effective samples is smaller than 25% for some parameters"
    z = pm.Normal('z', mu=0, sigma=1, shape=N_TANKS)
    a = pm.Deterministic('a', a_bar + z*sigma)
    p = pm.invlogit(a[pond])
    survival = pm.Binomial('survival', n=n[pond], p=p, observed=survival_obs)
    trace_1_1 = pm.sample(2000, tune=2000, return_inferencedata=True, random_seed=RANDOM_SEED, chains=4)

If anyone can shed some light on this, I’d be very grateful.

1 Like

I think Mike Betancourt’s recent tweet is a pretty good summary:

TLdr is that, non-centered parameterization does not always leads to better MCMC samples, as it interacts with the hyper priors and data. In general, with informative enough data, centered parameterization is better (“enough” not just the size of data, but how much it constrains posterior)

1 Like

Thanks! This is very helpful.

Do you know if there is a paper or blog post associated with the plot from the tweet? I could not find one, but after your answer put my on the right track I found this paper which discusses the point you made in more detail .