Sampling does not start or is very slow while attempting a Mixture Model tutorial


#1

I am new to PyMC3. While learning, I worked through the mixture model tutorial in the book “Probabilistic Programming & Bayesian Methods for Hackers”. I believe I am doing exactly what the tutorial outlines, but my sampling either does not start or is very slow. I tried the variations below without success. Note that the data set has 300 observations. Could someone tell me what I am doing wrong?

import numpy as np
import pymc3 as pm
import theano.tensor as T

# data holds the 300 observations; data.values is the underlying array
with pm.Model() as model:
    p1 = pm.Uniform('p1', 0, 1)
    p2 = 1 - p1
    p = T.stack([p1, p2])
    # one discrete component label per observation
    assign = pm.Categorical('assign', p, shape=data.values.shape[0],
                            testval=np.random.randint(0, 2, data.values.shape[0]))
    sdevs = pm.Uniform('sdevs', 0, 100, shape=2)
    means = pm.Normal('means', mu=np.array([120, 190]), sd=np.array([10, 10]), shape=2)
    # per-observation mean and standard deviation, picked by the label
    means_ = pm.Deterministic("means_", means[assign])
    sdevs_ = pm.Deterministic("sdevs_", sdevs[assign])
    obs = pm.Normal('obs', mu=means_, sd=sdevs_, observed=data.values)

First, I let PyMC decide which routines to use and saw no results in 5 hours:

with model:
    trace = pm.sample(25000)

Multiprocess sampling (4 chains in 4 jobs)
CompoundStep
> NUTS: [means, sdevs, p1]
> BinaryGibbsMetropolis: [assign]
Sampling 4 chains: 0%| | 24/102000 [5:03:59<40738:03:24, 1438.15s/draws]

I also tried to be more prescriptive and attempted the following, but the sampler is very slow:

with model:
    start = pm.find_MAP(model=model)
    step1 = pm.Metropolis(vars=[p, sdevs, means])
    step2 = pm.ElemwiseCategorical(vars=[assign])
    trace = pm.sample(25000, start=start, step=[step1, step2])

logp = -4.7305e+05, ||grad|| = 299.51: 100%|███████████████████████████████████████████| 39/39 [01:37<00:00, 2.51s/it]
C:\Users\bikim\AppData\Local\conda\conda\envs\pymc3p36\lib\site-packages\ipykernel_launcher.py:6: DeprecationWarning: ElemwiseCategorical is deprecated, switch to CategoricalGibbsMetropolis.

Multiprocess sampling (4 chains in 4 jobs)
CompoundStep
> CompoundStep
> > Metropolis: [means]
> > Metropolis: [sdevs]
> > Metropolis: [p1]
> > ElemwiseCategorical: [assign]
Sampling 4 chains: 1%|▎ | 552/102000 [30:35<105:28:23, 3.74s/draws]

This should be an easy problem. I appreciate any insight into what I am doing wrong.


#2

A mixture model parameterized with explicit latent labels is really difficult to sample from. The recommended solution is to rewrite it as a marginalized mixture model. If you search this Discourse, you will find quite a few related discussions.
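
To make "marginalized" concrete, here is a rough NumPy-only sketch of the quantity such a model evaluates: for each observation, the component densities are weighted and summed, so the 300 discrete labels never appear as variables. The function name `marginal_loglike` and its arguments are hypothetical and chosen for illustration; this is not PyMC3's internal code.

```python
import numpy as np


def marginal_loglike(x, weights, means, sdevs):
    """Log-likelihood of x under a Gaussian mixture with the labels summed out."""
    x = np.asarray(x, dtype=float)[:, None]           # shape (n, 1)
    weights = np.asarray(weights, dtype=float)        # shape (K,)
    means = np.asarray(means, dtype=float)            # shape (K,)
    sdevs = np.asarray(sdevs, dtype=float)            # shape (K,)

    # log( w_k * Normal(x_i | mu_k, sigma_k) ) for every point i and component k
    log_terms = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi)
        - np.log(sdevs)
        - 0.5 * ((x - means) / sdevs) ** 2
    )                                                 # shape (n, K)

    # Sum over components using the log-sum-exp trick for numerical stability
    peak = log_terms.max(axis=1, keepdims=True)
    per_point = peak[:, 0] + np.log(np.exp(log_terms - peak).sum(axis=1))
    return per_point.sum()
```

Because this expression depends only on the continuous parameters (weights, means, standard deviations), a gradient-based sampler can work on the whole model instead of alternating with a Gibbs step over the labels.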


#3

Following up on Junpeng’s answer, there’s a good example of marginalized mixture models in the docs: https://docs.pymc.io/notebooks/marginalized_gaussian_mixture_model.html
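
For the model in post #1, a minimal sketch in the spirit of that notebook might look like the following. It assumes PyMC3 3.x, where `pm.NormalMixture` and `pm.Dirichlet` are available; the function name `build_marginalized_model` is hypothetical, and the priors are simply carried over from the original post (a flat Dirichlet over two weights plays the role of `p1` and `1 - p1`).

```python
import numpy as np


def build_marginalized_model(observations):
    """Two-component Gaussian mixture with the labels summed out (PyMC3 3.x assumed)."""
    # Imported here so the function can be defined even where PyMC3 is not installed.
    import pymc3 as pm

    with pm.Model() as model:
        # Mixture weights; a flat Dirichlet replaces p1 and p2 = 1 - p1
        w = pm.Dirichlet('w', a=np.ones(2))
        # Same priors on the component parameters as in the original post
        means = pm.Normal('means', mu=np.array([120.0, 190.0]), sd=10.0, shape=2)
        sdevs = pm.Uniform('sdevs', 0, 100, shape=2)
        # No 'assign' variable: the likelihood is the weighted sum over components
        pm.NormalMixture('obs', w=w, mu=means, sd=sdevs, observed=observations)
    return model
```

Calling `pm.sample` inside `with build_marginalized_model(data.values):` should then let NUTS handle every variable, since none of them is discrete. If the component labels are needed afterwards, they can be recovered from the posterior draws rather than sampled.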


#4

Thank you both (_eigenfoo and junpenglao). It works like a charm now.


#5

Glad to hear!