Hi there! I recently switched from stan to pymc, mainly because of nutpie's speed compared to stan's sampler.
I was playing around with estimating some basic mixture models where the data come from two normal distributions with different means and standard deviations. I was trying to recreate the stan example shown here regarding a finite mixture model in pymc. Specifically, I am using pm.Potential() because I eventually want to specify mixtures of arbitrarily many processes, and pm.Potential() seemed like the way to go if I need flexibility. The model is the following:
import pymc as pm
import numpy as np
import arviz as az

# Simulate some mixture of two normals
np.random.seed(42)
y_obs = np.concatenate([
    np.random.normal(-1, 2, size=150),
    np.random.normal(3, 1, size=350)
])

with pm.Model() as model:
    # means and SDs
    mu1 = pm.Normal('mu1', mu=-1, sigma=2)
    mu2 = pm.Normal('mu2', mu=3, sigma=2)
    sigma1 = pm.HalfNormal('sigma1', sigma=2)
    sigma2 = pm.HalfNormal('sigma2', sigma=2)
    # Mixing weights
    w1 = pm.Beta('w1', alpha=3, beta=7)
    # Log-likelihood for each component (one value per observation)
    logp1 = pm.logp(pm.Normal.dist(mu=mu1, sigma=sigma1), y_obs)
    logp2 = pm.logp(pm.Normal.dist(mu=mu2, sigma=sigma2), y_obs)
    # Mixture log-likelihood using log-sum-exp
    log_mix = pm.math.logsumexp(
        [pm.math.log(w1) + logp1,
         pm.math.log(1 - w1) + logp2],
        axis=0
    )
    # Add to model logp via Potential
    pm.Potential('mixture_logp', log_mix.sum())
    # Sample posterior
    trace = pm.sample(1000)
az.summary(trace)
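For reference, the quantity that the Potential adds can be checked outside of pymc. The following is only a minimal NumPy sketch: the helper names `normal_logpdf` and `mixture_loglik` are made up for illustration, and the parameter values plugged in are simply the ones used to simulate the data. It compares the log-sum-exp form against the direct (less numerically stable) computation.

```python
import numpy as np

def normal_logpdf(y, mu, sigma):
    """Elementwise log density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

def mixture_loglik(y, w1, mu1, sigma1, mu2, sigma2):
    """Sum over observations of log(w1 * N1(y) + (1 - w1) * N2(y))."""
    a = np.log(w1) + normal_logpdf(y, mu1, sigma1)
    b = np.log1p(-w1) + normal_logpdf(y, mu2, sigma2)
    # logaddexp(a, b) = log(exp(a) + exp(b)), computed stably per observation
    return np.logaddexp(a, b).sum()

# Simulated data in the same spirit as the model above (illustrative only)
rng = np.random.default_rng(42)
y = np.concatenate([rng.normal(-1, 2, size=150), rng.normal(3, 1, size=350)])

stable = mixture_loglik(y, 0.3, -1.0, 2.0, 3.0, 1.0)

# Direct computation on the probability scale, for comparison
naive = np.log(
    0.3 * np.exp(normal_logpdf(y, -1.0, 2.0))
    + 0.7 * np.exp(normal_logpdf(y, 3.0, 1.0))
).sum()

print(stable, naive)
```

The two numbers should agree to floating-point precision on data like this; the log-sum-exp form matters more when some observations sit far in the tails of a component.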
With the default pymc sampler, the model converges just fine and the parameters are recovered reasonably well. However, once I switch to nutpie and change the penultimate line to
trace = pm.sample(1000, nuts_sampler="nutpie")
I consistently experience convergence issues: for each parameter, nutpie seems to sample from two distinct areas of the posterior when I look at the pairs plot with az.plot_pair(trace).
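One thing that may be worth checking is whether the two areas are mirror images of each other under swapping the component labels. The mixture likelihood on its own is invariant under that swap, so only the priors distinguish component 1 from component 2. Whether this is what nutpie is running into here is not established; the NumPy sketch below (helper name `mixture_loglik` is hypothetical, data and parameter values are illustrative) only demonstrates the invariance itself.

```python
import numpy as np

def mixture_loglik(y, w1, mu1, sigma1, mu2, sigma2):
    """Two-component normal mixture log-likelihood, summed over y."""
    def normal_logpdf(mu, sigma):
        z = (y - mu) / sigma
        return -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

    a = np.log(w1) + normal_logpdf(mu1, sigma1)
    b = np.log1p(-w1) + normal_logpdf(mu2, sigma2)
    return np.logaddexp(a, b).sum()

rng = np.random.default_rng(42)
y = np.concatenate([rng.normal(-1, 2, size=150), rng.normal(3, 1, size=350)])

# Original labelling: component 1 is the low-mean, wide component
original = mixture_loglik(y, w1=0.3, mu1=-1.0, sigma1=2.0, mu2=3.0, sigma2=1.0)

# Swapped labelling: same mixture, components exchanged and weight flipped
swapped = mixture_loglik(y, w1=0.7, mu1=3.0, sigma1=1.0, mu2=-1.0, sigma2=2.0)

print(original, swapped)
```

If the two areas in the pairs plot correspond to (mu1, mu2) and (mu2, mu1) being exchanged, that would be consistent with this symmetry; if not, something else is going on.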
Does anyone have any insight into what may be happening? I am relatively new to pymc and don't use python much, so I apologize if I am missing something obvious. Any suggestions are much appreciated!