Is PyMC capable of performing blocked sampling, where the blocks consist of subsets of a single PyMC variable?
Here’s a concrete example that I was hoping to test:
import pymc as pm
import numpy as np

D = 6

# define block covariance
Σ = np.block(
    [
        [0.8 * np.ones((3, 3)) + 0.2 * np.eye(3), 0.1 * np.ones((3, 3))],
        [0.1 * np.ones((3, 3)), 0.8 * np.ones((3, 3)) + 0.2 * np.eye(3)],
    ]
)

with pm.Model() as model:
    θ = pm.MvNormal("θ", mu=np.zeros(D), cov=Σ, shape=D)

with model:
    block1 = pm.NUTS(vars=[θ[0:3]])
    block2 = pm.NUTS(vars=[θ[3:6]])
    trace = pm.sample(
        tune=1000,
        draws=2000,
        step=[block1, block2],
        random_seed=42,
    )
Unfortunately, the function get_value_vars_from_user_vars() raises a ValueError:
ValueError: The following variables are not random variables in the model: ['None']
The NUTS example might not be so compelling (NUTS handles the correlations above easily without blocking). However, I’m seeking to study blocked sampling schemes with some black-box models that have high parameter correlation that decomposes into a handful of blocks, so this capability would be really helpful, perhaps essential.
Thanks 
I suggest you first try throwing NUTS at your model, including nutpie’s advanced tuning routines: sampling-options – Nutpie
Breaking into distinct samplers can be done with quite some manual work. I wouldn’t start there.
Got it, thanks very much for the quick reply!
Unfortunately my model likelihood is black-box, so I can’t throw NUTS at this one. That said, it’s helpful to know there’s no easy way to do it.
You can build the model with Potentials instead, as described in the last part of this notebook: rosetta_stone_pymc_stan.ipynb · GitHub
Otherwise there may be properties of the mvnormal that allow you to split the components arbitrarily.
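Indeed, the Gaussian conditioning identities give exactly that: conditional on one block, the other block is again multivariate normal, with mean shifted through Σ₂₁Σ₁₁⁻¹ and the Schur complement as covariance. A NumPy-only sketch of a two-block Gibbs scheme for the Σ from the question (illustrative, not PyMC API):

```python
import numpy as np

# block covariance from the question
Sigma = np.block(
    [
        [0.8 * np.ones((3, 3)) + 0.2 * np.eye(3), 0.1 * np.ones((3, 3))],
        [0.1 * np.ones((3, 3)), 0.8 * np.ones((3, 3)) + 0.2 * np.eye(3)],
    ]
)
mu = np.zeros(6)
idx1, idx2 = slice(0, 3), slice(3, 6)


def conditional(mu, Sigma, a, b, x_b):
    """Mean and covariance of x_a given x_b for a joint Gaussian."""
    cond_mu = mu[a] + Sigma[a, b] @ np.linalg.solve(Sigma[b, b], x_b - mu[b])
    cond_cov = Sigma[a, a] - Sigma[a, b] @ np.linalg.solve(Sigma[b, b], Sigma[b, a])
    return cond_mu, cond_cov


rng = np.random.default_rng(42)
x = np.zeros(6)
draws = []
for _ in range(5000):
    # update block 1 given block 2, then block 2 given the new block 1
    m1, S1 = conditional(mu, Sigma, idx1, idx2, x[idx2])
    x[idx1] = rng.multivariate_normal(m1, S1)
    m2, S2 = conditional(mu, Sigma, idx2, idx1, x[idx1])
    x[idx2] = rng.multivariate_normal(m2, S2)
    draws.append(x.copy())
draws = np.asarray(draws)
```

Because the within-block updates are exact joint draws, only the weak cross-block coupling (0.1) has to be handled by the Gibbs alternation, so the chain mixes quickly here.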
Building the logp by hand is slightly different than doing custom blocked sampling though.
We don’t have a good tutorial for writing a custom step sampler, but PyMC is written to be extendable here.
For completeness, here is some code showing the Potential approach:
import pymc as pm
import pytensor.tensor as pt

with pm.Model() as m:
    x1 = pm.Flat("x1", shape=(3,))
    x2 = pm.Flat("x2", shape=(2,))
    x = pm.math.concatenate([x1, x2])
    pm.Potential(
        "mv_normal_density",
        pm.logp(pm.MvNormal.dist(mu=pt.arange(5), cov=pt.eye(5)), x),
    )

    step = [pm.NUTS(x1), pm.NUTS(x2)]
    idata = pm.sample(step=step)

pm.stats.summary(idata)
# mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
# x1[0] 0.012 1.023 -1.961 1.912 0.017 0.025 3792.0 1361.0 1.0
# x1[1] 1.005 0.998 -0.850 2.877 0.018 0.019 3226.0 1289.0 1.0
# x1[2] 1.999 1.017 0.028 3.834 0.018 0.015 3371.0 1561.0 1.0
# x2[0] 2.998 0.952 1.250 4.821 0.021 0.015 1989.0 1519.0 1.0
# x2[1] 4.015 0.996 2.056 5.785 0.024 0.017 1784.0 1404.0 1.0
And here is a version using some new code that I pushed in Derive logprob for Split operation by ricardoV94 · Pull Request #7875 · pymc-devs/pymc · GitHub, which allows actually splitting an RV and registering the pieces as separate variables in the model:
import pymc as pm
import pytensor.tensor as pt

with pm.Model() as m:
    x = pm.MvNormal.dist(mu=pt.arange(5), cov=pt.eye(5))
    x1, x2 = pt.split(x, splits_size=(3, 2), n_splits=2)
    x1 = m.register_rv(x1, "x1", initval=[0, 0, 0])
    x2 = m.register_rv(x2, "x2", initval=[0, 0])

    step = [pm.NUTS([x1]), pm.NUTS([x2])]
    idata = pm.sample(step=step)

pm.stats.summary(idata)
# mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
# x1[0] -0.009 1.089 -1.941 2.131 0.015 0.017 5112.0 3032.0 1.0
# x1[1] 0.996 1.125 -0.996 3.191 0.014 0.014 6114.0 2859.0 1.0
# x1[2] 2.001 1.086 -0.002 4.013 0.014 0.011 6034.0 2871.0 1.0
# x2[0] 2.991 1.109 0.846 4.968 0.016 0.012 4660.0 3056.0 1.0
# x2[1] 3.993 1.122 1.879 6.089 0.017 0.012 4479.0 2554.0 1.0