Hi PyMC Community,
I’m trying to use SMC sampling in PyMC v5 to compute marginal likelihoods for Bayesian model comparison, but I’m running into persistent issues. I have four different models, and in all cases, SMC sampling either runs extremely slowly (with beta staying at 0 for over an hour) or throws errors. These same models work perfectly fine with NUTS sampling.
My Goal
I want to use SMC to compute marginal likelihoods and Bayes factors for model comparison between four competing models.
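For context, this is roughly the comparison I plan to run once I have a log marginal likelihood for each model. It is only a sketch: the helper names and the numbers are placeholders I made up, not output from my models, and I am assuming equal prior probability for all four models.

```python
import math


def log_bayes_factor(log_ml_a, log_ml_b):
    """Log Bayes factor of model A over model B from log marginal likelihoods."""
    return log_ml_a - log_ml_b


def posterior_model_probs(log_mls):
    """Posterior model probabilities under equal prior model probabilities.

    Subtracts the maximum before exponentiating (log-sum-exp trick) so that
    very negative log marginal likelihoods do not underflow to zero.
    """
    largest = max(log_mls)
    weights = [math.exp(value - largest) for value in log_mls]
    total = sum(weights)
    return [weight / total for weight in weights]


# Placeholder log marginal likelihoods for four hypothetical models.
example_log_mls = [-100.0, -102.0, -105.0, -101.0]

log_bf_first_vs_second = log_bayes_factor(example_log_mls[0], example_log_mls[1])
model_probs = posterior_model_probs(example_log_mls)
```

Working in log space seems safer here because the marginal likelihoods themselves would be far too small to represent directly.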
The Problem
When I try to use SMC sampling, I encounter one of two issues:
- The sampling runs extremely slowly - beta remains at 0 for over an hour with no progress
- It throws a NotImplementedError
Code Example
Here’s one of my models that works perfectly with NUTS:
import pymc as pm

# Assume Theta_normalized is my observed data with shape (n_samples, num_features_theta)
with pm.Model() as skewed_model:
    # One set of skew-normal parameters per feature
    mu = pm.Uniform('mu', lower=0, upper=1, shape=num_features_theta)
    omega = pm.HalfNormal('omega', sigma=1, shape=num_features_theta)
    alpha = pm.Normal('alpha', mu=0, sigma=2, shape=num_features_theta)
    # One observed variable per feature column
    for i in range(num_features_theta):
        pm.SkewNormal(
            f'obs_feature_{i}',
            mu=mu[i],
            sigma=omega[i],
            alpha=alpha[i],
            observed=Theta_normalized[:, i]
        )
NUTS sampling works fine:
with skewed_model:
    trace = pm.sample(8000, tune=4000, nuts={'max_treedepth': 18},
                      target_accept=0.9, chains=8, cores=32)
    ppc = pm.sample_posterior_predictive(trace)
SMC sampling fails:
with skewed_model:
    trace_smc = pm.sample_smc(2000, chains=4)
Error Message:
NotImplementedError Traceback (most recent call last)
Cell In[29], line 2
1 with skewed_model:
----> 2 trace_smc = pm.sample_smc(2000, chains=4)
Questions
- Are there known issues with SMC sampling in PyMC v5, particularly with models containing multiple observed variables or SkewNormal/MvNormal distributions?
- What could cause beta to remain at 0 for extended periods?
- Are there specific model structures or distributions that are incompatible with SMC sampling?
- What alternatives would you recommend for computing marginal likelihoods in PyMC v5?
Any guidance would be greatly appreciated! I’m happy to provide more details about my models or environment if needed.
Thank you!