Convergence with noncentered hierarchical model

NateAM · April 21, 2021, 3:56pm

I have an issue where a noncentered hierarchical model suffers from lots of divergences while the centered version works just fine. This is counter to my expectations so I suspect I’m probably implementing the noncentered variables incorrectly. I create a dataset:

np.random.seed(123)

idist = scipy.stats.norm(loc=10, scale=2)

sdist = scipy.stats.norm(loc=10, scale=0.5)

noise = scipy.stats.norm(loc=0, scale=0.1)

nsamples = 4

intercepts = idist.rvs(nsamples)
slopes = sdist.rvs(nsamples)

point_min = 2
point_range = 10

npoints = [int(point_range * np.random.rand()) + point_min for _ in range(nsamples)]

density_min = 0.90
density_range = 0.20

densities = [np.sort(density_range * np.random.rand(n) + density_min) for n in npoints]
density_mean = np.mean(np.hstack(densities))
density_std = np.std(np.hstack(densities))

normdensities = [(d - density_mean)/density_std for d in densities]

moduli = [s * (d - 0.5 * density_range + density_min) + i + noise.rvs(d.size) for d, s, i in zip(densities,     slopes, intercepts)]
moduli_mean = np.mean(np.hstack(moduli))
moduli_std  = np.std(np.hstack(moduli))

normmoduli = [(m - moduli_mean)/moduli_std for m in moduli]

So I can specify my centered model as
with pymc3.Model() as noncentered_model:

    m = pymc3.Normal('m', mu=0, sd=1)

    b = pymc3.Normal('b', mu=0, sd=1)

    zb = pymc3.Normal('zb', mu=0, sd=1, shape=(nsamples,))

    bi = pymc3.Deterministic('bi', b + zb)

    noise = pymc3.HalfNormal('noise', sd = 1)

_ynorm = [None for _ in range(nsamples)]

_y = [None for _ in range(nsamples)]

_l = [None for _ in range(nsamples)]

for i in range(nsamples):

    with noncentered_model:
    
        _ynorm[i] = pymc3.Deterministic(f'ynorm_{i+1}', m * normdensities[i]+ bi[i])
        _y[i] = pymc3.Deterministic(f'y_{i+1}', _ynorm[i] * moduli_std + moduli_mean)
    
        _l[i] = pymc3.Normal(f'l{i+1}', mu = _ynorm[i], sd=noise, observed = normmoduli[i])

This approach fails with MANY divergences. Specifying bi directly (bi = pymc3.Normal(‘bi’, mu=b, sd=1, shape=(nsamples,))) works great.

Does anyone see where I’m going wrong?

OriolAbril · April 22, 2021, 10:57pm

I don’t have the references at hand, but the noncentered parametrization is not universally better, if it were we’d use it everywhere and probably even have tried to have pymc3 use it under the hood even when not specified explicitly. Depending on the model and data you’ll have a funnel geometry and sampling problems when using the noncentered parametrization instead of when using the centered one. In these cases the centered one generally samples efficiently and without problem.

NateAM · April 23, 2021, 2:53pm

@OriolAbril Okay! From all that I had read I was under the impression that centered approaches were for chumps and that you always wanted to use centered distributions so this is really helpful. Thanks!

Alex_K · April 26, 2021, 7:55pm

Michael Betancourt discusses the trade offs here. Interestingly, in this section Betancourt uses a heuristic criteria to assign parameters with fewer observations (< 25) a non-centered parameterization and the rest a centered parameterization. In his toy example, the mixed parameterization is better suited (i.e. shows no divergences) than the uniformly centered or non-centered parameterizations.

I’m still trying to wrap my head around it myself, but this is something I am trying to apply to my own problem at the moment.

NateAM · April 26, 2021, 10:58pm

Thanks for that reference @Alex_K (and welcome to the forum!). I’ll go through that carefully.

jhrcook · July 16, 2021, 9:02pm

Thanks for sharing this post! I really wanted to experiment with the mixed-parameterization that Betancourt shows in Stan, but in PyMC3, so I put together this blog post. It’s a bit long, so I copied the model-building function below so you can see how I accomplished it. Did you ever implement this type of model, and if so, how did you code it? I feel like mine is a bit clunky.

def mixed_parameterization(
    d: dict[str, Any],
    cp_idx: np.ndarray,
    ncp_idx: np.ndarray,
    draws: int = 10000,
    tune: int = 1000,
) -> tuple[pm.Model, az.InferenceData, dict[str, np.ndarray]]:
    n_cp = len(np.unique(cp_idx))
    n_ncp = len(np.unique(ncp_idx))
    with pm.Model() as mixp_model:
        mu = pm.Normal("mu", 0, 5)
        tau = pm.HalfNormal("tau", 5)

        theta_cp = pm.Normal("theta_cp", mu, tau, shape=n_cp)
        eta_ncp = pm.Normal("eta_ncp", 0, 1, shape=n_ncp)
        theta_ncp = pm.Deterministic("theta_ncp", mu + tau * eta_ncp)

        _theta = list(range(d["K"]))
        for i, t in enumerate(cp_idx):
            _theta[t] = theta_cp[i]
        for i, t in enumerate(ncp_idx):
            _theta[t] = theta_ncp[i]

        theta = pm.Deterministic("theta", pm.math.stack(_theta))

        y = pm.Normal("y", theta[d["idx"]], d["sigma"], observed=d["y"])

        trace = pm.sample(
            draws=draws,
            tune=tune,
            chains=4,
            cores=4,
            return_inferencedata=True,
            random_seed=RANDOM_SEED,
            target_accept=0.95,
        )
        y_post_pred = pm.sample_posterior_predictive(
            trace=trace, random_seed=RANDOM_SEED
        )
    return mixp_model, trace, y_post_pred

For context, d is a data dictionary where "K" is the number of individual groups, cp_idx and ncp_idx are the indices of the individuals that should get a centered and non-centered parameterization, respectively.

Alex_K · July 24, 2021, 4:15pm

I did not try to implement the mixed-parameterization, but I will check out your post! I’m still a relative newcomer to pymc3, so I don’t think I would know the best (or a better) way to do it at this point. I’ll experiment with your code and let you know if I learn anything.

Topic		Replies	Views
Hierarchical Linear Regression Model -- why is it not converging? Questions	3	1562	March 14, 2018
Hierarchical Binomial Model - Not converging Questions	2	887	February 1, 2019
Extending the Hierarchical Linear Regression Examples Questions	6	737	November 27, 2024
Divergences in a non-centered implementation of the normal/normal hierarchal model v5 modeling	2	558	July 24, 2022
Mixed parameterization of hierarchical model Questions	0	392	July 16, 2021

Convergence with noncentered hierarchical model

Related topics