Weird effective sample size instability

Hello everyone,

I am trying the following, fairly simple experiment with standard pm.sample, but I am confused by the results I see.
My script looks like this:

import pymc3 as pm
from pymc3 import sample, Model, effective_n
import numpy as np

for i in range(10):
    with Model():
        energy = pm.Mixture(
            "GaussianMixture1D",
            w=[1. / 2., 1. / 2.],
            comp_dists=[
                pm.Normal.dist(mu=mu_component, sd=np.sqrt(var_component))
                for mu_component, var_component in
                ((0., 0.1), (4., 0.1))
            ]
        )
        trace = sample(njobs=1, tune=3000, discard_tuned_samples=False,
                       draws=5000, chains=2)

        print(effective_n(trace))

I get the following output:

{'GaussianMixture1D': 7790.2085023290865}
{'GaussianMixture1D': 1.0130707549727525}
{'GaussianMixture1D': 1.0131580129322948}
{'GaussianMixture1D': 7320.439857781025}
{'GaussianMixture1D': 6980.01966073396}
{'GaussianMixture1D': 1.0132495691592036}
{'GaussianMixture1D': 1.0129443086809973} 
{'GaussianMixture1D': 7128.332036857501}
{'GaussianMixture1D': 1.0130164334418752}
{'GaussianMixture1D': 6943.337715887186}

Thanks in advance for clearing up why the effective sample size oscillates so strongly between runs.
I am using the most recent release of pymc3, version 3.5.

Best,
MFreidank

There is label switching happening in your mixture model. If you search this Discourse you will find quite a lot of similar discussions about mixture models.
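One quick way to see what is going on (a sketch, reusing the trace from your script) is to compare the per-chain means: whenever the reported effective sample size collapses to ~1, the two chains are typically stuck in different modes.

import numpy as np

# Sketch: compare per-chain means of the sampled mixture variable.
# If the chains are stuck in different modes, the means will sit
# near 0 and 4 instead of near the overall mean of 2.
for chain in trace.chains:
    values = trace.get_values("GaussianMixture1D", chains=[chain])
    print(chain, values.mean())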


Perhaps the ordered transform could help you? See #5 on Colin’s blog post.
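For reference, the transform is usually attached to a single vector-valued free RV, roughly like this (a sketch; mu here is a hypothetical latent mean vector, not part of your fixed-mean model):

import numpy as np
import pymc3 as pm
import pymc3.distributions.transforms as tr

with pm.Model():
    # tr.ordered constrains the entries of one vector-valued free RV
    # so that mu[0] < mu[1]; the testval must already be increasing.
    mu = pm.Normal("mu", mu=0., sd=10., shape=2,
                   transform=tr.ordered, testval=np.array([-1., 1.]))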

Following the suggestions in Colin’s blog post and elsewhere, I tried to use tr.ordered on my mixture components, but to no avail.
Here is my changed code, which still exhibits label switching (which I can confirm by looking at the traceplots):

import pymc3 as pm
from pymc3 import sample, Model, effective_n
import pymc3.distributions.transforms as tr
import numpy as np
import matplotlib.pyplot as plt


for i in range(3):
    with Model():
        mu1 = pm.Normal.dist(
            mu=0.,
            sd=np.sqrt(0.1),
            transform=tr.ordered,
            testval=np.asarray([-1, 0, 1])
        )
        mu2 = pm.Normal.dist(
            mu=4.,
            sd=np.sqrt(0.1),
            transform=tr.ordered,
            testval=np.asarray([-1, 0, 1])
        )

        energy = pm.Mixture(
            "GaussianMixture1D",
            w=[1. / 2., 1. / 2.],
            comp_dists=[
                mu1, mu2
            ]
        )
        trace = sample(njobs=1, tune=3000, discard_tuned_samples=False,
                       draws=5000, chains=2)

        pm.traceplot(trace)
        plt.show()
        print(effective_n(trace))

Ordering within mu1 and mu2 is not enough; mu1 and mu2 could still switch with each other even though they follow non-exchangeable priors (see Betancourt’s Stan case study Identifying Bayesian Mixture Models, here is the PyMC3 version).
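For completeness, a minimal sketch of an identifiable version in the spirit of that case study, under these assumptions: the component means become a single latent, ordered vector with a non-exchangeable prior centred at 0 and 4 (taken from the original script), and pm.NormalMixture replaces the hand-built pm.Mixture.

import numpy as np
import pymc3 as pm
import pymc3.distributions.transforms as tr

with pm.Model():
    # One named, shape-2 free RV for both component means. tr.ordered
    # enforces mu[0] < mu[1] jointly, and the non-exchangeable prior
    # (centred at 0 and 4) breaks the remaining label symmetry.
    mu = pm.Normal("mu", mu=np.array([0., 4.]), sd=np.sqrt(0.1),
                   shape=2, transform=tr.ordered,
                   testval=np.array([0., 4.]))
    energy = pm.NormalMixture("GaussianMixture1D",
                              w=np.array([0.5, 0.5]),
                              mu=mu, sd=np.sqrt(0.1))
    trace = pm.sample(draws=5000, tune=3000, chains=2)
    print(pm.effective_n(trace))

The key design point is that the ordering constraint acts across the two means inside one free RV, rather than separately inside each component distribution, which is why applying tr.ordered to the individual .dist objects had no effect.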