Using a mixture of multivariate normal distributions as a prior

Following on from Using a Multivariate Normal as a prior is it possible to use a mixture of multivariate normals as a prior? Can the `pm.Mixture` even be used in defining a prior distribution, as it often seems to require `observed` values?

The current `Mixture` class is not optimal for multivariate distributions (although it has gotten much better lately thanks to @lucianopaz). If you are using it as a prior, make sure you specify the shape carefully.
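For intuition about the shapes involved, here is a plain-NumPy sketch (illustrative only, not how `pm.Mixture` is implemented) of a single draw from a two-component multivariate-normal mixture. The means, covariances, and weight are made-up values; the point is that one draw has shape `(2,)`, which is what `shape=2` refers to:

```python
import numpy as np

rng = np.random.default_rng(42)

# Two 2-D components with illustrative (positive-definite) covariances
mus = [np.array([0.0, 0.0]), np.array([0.0, 0.0])]
covs = [np.array([[1.0, 0.5], [0.5, 1.0]]),
        np.array([[1.0, -0.5], [-0.5, 1.0]])]
w = 0.5  # weight of the first component

# One mixture draw: pick a component according to the weights, then sample it
k = 0 if rng.random() < w else 1
sample = rng.multivariate_normal(mus[k], covs[k])
print(sample.shape)  # (2,)
```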

It is possible in the current master branch on GitHub. You can do something like the following:

```python
with pm.Model():
    # covariances must be positive definite (off-diagonal terms strictly less than 1)
    component0 = pm.MvNormal('c0', mu=[0, 0], cov=[[1, 0.5], [0.5, 1]], shape=2)
    component1 = pm.MvNormal('c1', mu=[0, 0], cov=[[1, -0.5], [-0.5, 1]], shape=2)
    w = pm.Beta('w', 1, 1)
    mix = pm.Mixture('mix', w=[w, 1 - w],
                     comp_dists=[component0.distribution, component1.distribution],
                     shape=2)
    # x and y are the predictor and observed data, defined elsewhere
    obs = pm.Normal('obs', mu=mix[0] * x + mix[1], sigma=1, observed=y)
```

Thanks very much. That works very well. Any idea when the next release that includes this will be?

We had a brief discussion last week on whether we could bundle together a nightly release. I'm not sure when that would be, nor when 3.7 will be released. @colcarroll, @twiecki, @ferrine, do you have any updates on the nightly and 3.7 releases?

Thanks.

Sorry to add another question, but do you know if it's also possible to use the mixture within a `pm.Bound` object, to set lower and upper bounds on the distribution?

I think we should push out 3.7 asap.

I haven't tried, but in principle it should be possible. Let us know if you run into any trouble.

I've tried adding bounds and it seems to lead to something a bit odd. Below is some example code:

```python
%matplotlib inline

from matplotlib import pyplot as pl
import numpy as np
import pymc3 as pm

# set some example data (which is uninformative)
x = np.linspace(0, 9, 10)
sigma = 100.
data = sigma * np.random.randn(10)  # broad (uninformative) Gaussian data

mus = [[0., 0.], [5., 5.]]  # means of Gaussians
covs = [np.array([[1., 0.], [0., 1.]]), np.array([[1., 0.9], [0.9, 1.]])]  # covariances

weights = [0.7, 0.3]  # weights of each mode

lowerbounds = [0., 0.]
upperbounds = [5., 5.]

shape = len(mus[0])  # number of dimensions

# set test values in the centre of the bounds
testvals = [2.5, 2.5]

with pm.Model():
    # define bounded MvNormal
    BoundedMvN = pm.Bound(pm.MvNormal, lower=lowerbounds, upper=upperbounds)

    comp_dists = []
    for i, mu, cov in zip(range(shape), mus, covs):
        comp_dists.append(BoundedMvN('c{}'.format(i), mu=mu, cov=cov, shape=shape).distribution)
    mix = pm.Mixture('mix', w=weights, comp_dists=comp_dists, shape=shape, testval=testvals)

    obs = pm.Normal('obs', mu=mix[0] * x + mix[1], sd=sigma, observed=data)

    trace = pm.sample(2000)
```

If I plot the output:

```python
pl.plot(trace['mix'][:, 0], trace['mix'][:, 1], '.')
```

It gives a scatter plot of the `mix` samples, which shows it only seems to have sampled from one of the modes of the mixture.

If I remove the bounds, e.g., using:

```python
with pm.Model():
    comp_dists = []
    for i, mu, cov in zip(range(shape), mus, covs):
        comp_dists.append(pm.MvNormal('c{}'.format(i), mu=mu, cov=cov, shape=shape).distribution)
    mix = pm.Mixture('mix', w=weights, comp_dists=comp_dists, shape=shape, testval=testvals)

    obs = pm.Normal('obs', mu=mix[0] * x + mix[1], sd=sigma, observed=data)

    trace = pm.sample(2000)
```

It samples from both modes, as expected.

Do you have any idea what might be the issue?

I should add that if I plot the individual components `c0` and `c1`, they do seem to be sampled correctly; it's just the mixture `mix` that only shows one component.
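As a sanity check on what the bounded mixture prior should look like, here is a NumPy rejection-sampling sketch using the parameters from the example above. This is illustrative only: it is not how `pm.Bound` works internally, and truncation shifts the effective component weights. The point is simply that both truncated modes should appear in the samples:

```python
import numpy as np

rng = np.random.default_rng(0)

mus = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), np.array([[1.0, 0.9], [0.9, 1.0]])]
weights = [0.7, 0.3]
lower = np.array([0.0, 0.0])
upper = np.array([5.0, 5.0])

# Rejection sampling: draw from the unbounded mixture, keep only in-bounds draws
samples = []
while len(samples) < 2000:
    k = rng.choice(2, p=weights)
    draw = rng.multivariate_normal(mus[k], covs[k])
    if np.all(draw >= lower) and np.all(draw <= upper):
        samples.append(draw)
samples = np.array(samples)

# Both truncated modes should be represented in the accepted samples
near_mode0 = np.sum(np.linalg.norm(samples - mus[0], axis=1) < 2)
near_mode1 = np.sum(np.linalg.norm(samples - mus[1], axis=1) < 2)
print(near_mode0 > 0 and near_mode1 > 0)  # True
```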

And if you use `sample_prior_predictive` instead, do you still get just a single component?

Using `sample_prior_predictive` gives the following error:

```
ValueError: Drawing samples from distributions with array-valued bounds is not supported.
```

Hmm, I'll try to look into the problem when I have some time. The `sample_prior_predictive` part will take some time, so I'll start with the inference part with no data being stuck in a single component.

I'm actually running the snippet you supplied and it works fine for me with a conda-installed pymc3 3.6. It feels weird to say that the supplied example runs too well:

```python
import arviz as az

data = az.from_pymc3(trace)
az.plot_joint(data.posterior, var_names=['mix'], coords={'mix_dim_0': np.array([0, 1])});
```

Playing around a bit more, it is possible NUTS was just having trouble mixing on your first run. I reran with `target_accept=0.99` since there were a lot of divergences, and got a better mix of samples. There are still plenty of divergences, which I assume come from the boundary conditions.

Here's the new marginal plot. The scatter is hard to distinguish from the first plot, but you can see differences in the marginal distributions. I'm using a custom stylesheet here!

Cool! Trying again it looks like I get decent sampling of both modes now. It must have just been that my initial attempt got a bit stuck!

Just a quick note - I think part (or maybe all!) of the reason I originally only got one mode was because the likelihood wasn't entirely uninformative and therefore led to one mode being favoured. If I make the data "noise free" then I think it more consistently returns both modes.
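To illustrate the point with plain NumPy (using the same linear model and mode locations as the example above, with `mix[0]` as the slope and `mix[1]` as the intercept), a quick check shows how little a broad-sigma Gaussian likelihood separates the two modes compared with a tight one:

```python
import numpy as np

rng = np.random.default_rng(1)

x = np.linspace(0, 9, 10)
y = 100.0 * rng.standard_normal(10)  # the broad "uninformative" data

def gauss_loglik(slope, intercept, sigma):
    # Log-likelihood of the linear model y ~ N(slope*x + intercept, sigma),
    # with the additive normalization constants dropped
    resid = y - (slope * x + intercept)
    return -0.5 * np.sum((resid / sigma) ** 2)

# Compare the two mixture modes, (0, 0) and (5, 5), as (slope, intercept) pairs
diff_broad = abs(gauss_loglik(0, 0, 100.0) - gauss_loglik(5, 5, 100.0))
diff_tight = abs(gauss_loglik(0, 0, 1.0) - gauss_loglik(5, 5, 1.0))
print(diff_broad < diff_tight)  # True: broad sigma barely separates the modes
```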
