Sampler initialization error with a model containing an LKJCholeskyCov distribution

Hi, as reported initially in this thread (Dynamic shaping, "round" function, JAX, and a "few" more questions - #21 by caclav), the code below, which is essentially identical to this example (LKJ Cholesky Covariance Priors for Multivariate Normal Models — PyMC example gallery), raises an error when the sampler's initvals parameter is initialized with model.initial_point().

import numpy as np
import pymc as pm

# configuration
RANDOM_SEED = 12345
rng = np.random.default_rng(RANDOM_SEED)
N = 10000

# true parameters
mu_actual = np.array([1.0, -2.0])
sigmas_actual = np.array([0.7, 1.5])
Rho_actual = np.array([[1.0, -0.4], [-0.4, 1.0]])
Sigma_actual = np.diag(sigmas_actual) @ Rho_actual @ np.diag(sigmas_actual)

# simulated data generation
x = rng.multivariate_normal(mu_actual, Sigma_actual, size=N)

# PyMC Model definition
coords = {"axis": ["y", "z"],
          "axis_bis": ["y", "z"],
          "obs_id": np.arange(N)}
with pm.Model(coords=coords,) as model:
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol", n=2, eta=2.0, sd_dist=pm.Exponential.dist(1.0, shape=2)
    )
    cov = pm.Deterministic("cov", chol.dot(chol.T), dims=("axis", "axis_bis"))
    
    mu = pm.Normal("mu", 0.0, sigma=1.5, dims="axis")
    obs = pm.MvNormal("obs", mu, chol=chol, observed=x, dims=("obs_id", "axis"))

init_pts = model.initial_point()
if False:  # set to True to keep the internal key: AN ERROR IS RAISED IN THAT CASE
    pass
else:  # the internal key is deleted: THE ISSUE DOESN'T HAPPEN IN THIS CASE
    del init_pts["chol_cholesky-cov-packed__"]
    # the line below is optional
    # init_pts['chol'] = np.array([0.70931675, -0.61463295,  1.36559879])

# NUTS MCMC sampling and inference - THE ERROR HAPPENS HERE!
with model:
    idata = pm.sample(initvals=init_pts,
        idata_kwargs={"dims": {"chol_stds": ["axis"],
                               "chol_corr": ["axis", "axis_bis"]}},
    )

The error is:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Traceback (most recent call last):

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/spyder_kernels/py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File ~/Bureau/detector_work/IPC/_bug_test.py:49
    idata = pm.sample(initvals=init_pts,

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/pymc/sampling/mcmc.py:682 in sample
    initial_points, step = init_nuts(

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/pymc/sampling/mcmc.py:1327 in init_nuts
    initial_points = _init_jitter(

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/pymc/sampling/mcmc.py:1204 in _init_jitter
    ipfns = make_initial_point_fns_per_chain(

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/pymc/initial_point.py:86 in make_initial_point_fns_per_chain
    make_initial_point_fn(

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/pymc/initial_point.py:134 in make_initial_point_fn
    sdict_overrides = convert_str_to_rv_dict(model, overrides or {})

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/pymc/initial_point.py:47 in convert_str_to_rv_dict
    initvals[rv] = model.rvs_to_transforms[rv].backward(initval, *rv.owner.inputs)

  File ~/InstProg/anaconda3/envs/myenv/lib/python3.10/site-packages/pymc/distributions/transforms.py:157 in backward
    return pt.set_subtensor(value[..., self.diag_idxs], pt.exp(value[..., self.diag_idxs]))

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

I hope I haven’t made any stupid mistakes in this code sample and that I won’t waste your time.

(By the way, I think that in the example mentioned - which is extremely useful for a project I'm working on! - the rng_seeder parameter passed to pm.Model is obsolete in recent versions of PyMC.)
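
For reference, here is how I seed now instead (a quick sketch, assuming recent PyMC where seeding moved from model construction to sampling time):

with model:
    idata = pm.sample(random_seed=RANDOM_SEED)  # instead of pm.Model(..., rng_seeder=...)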

Thank you for your help!

My apologies, I should've read your question more carefully before answering. As you mentioned, the problem is with the initial values you assigned. I can run the model with no issues using the sampler's default arguments:

with pm.Model(coords=coords,) as model:
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol", n=2, eta=2.0, sd_dist=pm.Exponential.dist(1, shape=2)
    )
    cov = pm.Deterministic("cov", chol.dot(chol.T), dims=("axis", "axis_bis"))

    mu = pm.Normal("mu", 0.0, sigma=1.5, dims="axis")
    obs = pm.MvNormal("obs", mu, chol=chol, observed=x, dims=("obs_id", "axis"))

with model:
    idata = pm.sample(1000)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [chol, mu]
100.00% [8000/8000 00:26<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 46 seconds.

import arviz as az
az.summary(idata)
                  mean     sd  hdi_3%  ...  ess_bulk  ess_tail  r_hat
mu[y]            0.995  0.007   0.983  ...    3905.0    3155.0    1.0
mu[z]           -1.977  0.015  -2.007  ...    3709.0    3044.0    1.0
chol[0]          0.708  0.005   0.699  ...    4736.0    3434.0    1.0
chol[1]         -0.606  0.014  -0.632  ...    4359.0    3084.0    1.0
chol[2]          1.362  0.010   1.345  ...    5933.0    3117.0    1.0
chol_corr[0, 0]  1.000  0.000   1.000  ...    4000.0    4000.0    NaN
chol_corr[0, 1] -0.406  0.008  -0.422  ...    4562.0    3382.0    1.0
chol_corr[1, 0] -0.406  0.008  -0.422  ...    4562.0    3382.0    1.0
chol_corr[1, 1]  1.000  0.000   1.000  ...    3124.0    3434.0    1.0
chol_stds[0]     0.708  0.005   0.699  ...    4736.0    3434.0    1.0
chol_stds[1]     1.491  0.011   1.472  ...    5046.0    3070.0    1.0
cov[y, y]        0.501  0.007   0.488  ...    4736.0    3434.0    1.0
cov[y, z]       -0.429  0.012  -0.449  ...    3899.0    3118.0    1.0
cov[z, y]       -0.429  0.012  -0.449  ...    3899.0    3118.0    1.0
cov[z, z]        2.223  0.031   2.167  ...    5046.0    3070.0    1.0

Results seem sensible; maybe you don't really need to assign initial values. Is there a specific reason for using them?

Thank you for investigating where this error comes from. Yes, according to the PyMC example mentioned above and the documentation of LKJCholeskyCov (see the sd_dist parameter), shape should be equal to n. Indeed, it seems there is an issue with the initialization based on model.initial_point(), which, unless I am mistaken, shouldn't happen.

Yes, I’m sorry, I edited my answer a second time :sweat_smile: . I didn’t read your question carefully. So I was just asking whether you actually need to assign initial values.

So, just to properly answer your question: the problem is the name assigned to the LKJ prior. When you use init_pts = model.initial_point(), you get the internal (transformed) name.

init_pts = model.initial_point()

init_pts
Out[39]: {'chol_cholesky-cov-packed__': array([0., 0., 0.]), 'mu': array([0., 0.])}
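
To see why that key fails: judging from the last frames of your traceback, the sampler does try to handle the transformed name, but the transform's backward() (transforms.py:157 in the traceback) then indexes your raw numpy initval with its symbolic diag_idxs, and numpy rejects that. A minimal sketch of the failing pattern, with diag_idxs as a hypothetical symbolic stand-in:

import numpy as np
import pytensor.tensor as pt

value = np.zeros(3)                  # the numpy initval for 'chol_cholesky-cov-packed__'
diag_idxs = pt.ivector("diag_idxs")  # stand-in for the transform's symbolic indices
value[..., diag_idxs]                # raises the same IndexError as in your traceback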

Though the documentation states that the names of transformed variables should be used, it actually works with the name of the assigned variable:

init_pts = {'chol': np.array([1, 1, 1]), 'mu': np.array([1, 1])}  # note: I use 1 as the initial value; with 0 it may break

with model:
    idata = pm.sample(initvals=init_pts,
        idata_kwargs={"dims": {"chol_stds": ["axis"],
                               "chol_corr": ["axis", "axis_bis"]}},
    )
    
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [chol, mu]
100.00% [8000/8000 00:54<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 74 seconds.

Thank you again @Simon for taking the time to understand where the error in this code sample came from and for the solution you provided!

This is also what I ended up doing, i.e. using the names of the untransformed parameters in the initialization dictionary supplied to the sampler, even though the documentation suggests otherwise.

Of course the most important thing is that it works, but it is still a bug I think, albeit a minor one.
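
In case it helps anyone else: if you do want to start from model.initial_point(), the sketch below converts its transformed-space dict into one keyed by the untransformed names, which the sampler accepts. It only uses the calls visible in the traceback (model.rvs_to_transforms, transform.backward); the f-string that rebuilds the transformed key is my assumption about PyMC's internal naming (it matches the 'chol_cholesky-cov-packed__' key shown above), so treat this as a workaround sketch rather than an official API.

import numpy as np
import pytensor.tensor as pt

def initvals_in_rv_space(model):
    # Map model.initial_point() (keyed by transformed names) to a dict
    # keyed by the untransformed RV names, as suggested in this thread.
    point = model.initial_point()
    initvals = {}
    for rv in model.free_RVs:
        transform = model.rvs_to_transforms[rv]
        if transform is None:
            initvals[rv.name] = point[rv.name]
            continue
        # assumed naming convention, e.g. 'chol_cholesky-cov-packed__'
        key = f"{rv.name}_{transform.name}__"
        # Wrap the numpy value in a tensor first: calling backward() on a
        # raw numpy array is exactly what raised the IndexError above.
        value = transform.backward(pt.as_tensor_variable(point[key]), *rv.owner.inputs)
        initvals[rv.name] = np.asarray(value.eval())
    return initvals

init_pts = initvals_in_rv_space(model)
with model:
    idata = pm.sample(initvals=init_pts)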
