# Using a mixture of multivariate normal distributions as a prior

**mattpitkin**#1

Following on from "Using a Multivariate Normal as a prior", is it possible to use a mixture of multivariate normals as a prior? Can `pm.Mixture` even be used in defining a prior distribution, as it often seems to require `observed` values?

**junpenglao**#2

The current Mixture class is not optimal for multivariate distributions (although it has gotten much better lately thanks to @lucianopaz). If you are using it as a prior, make sure you specify the shape carefully.

**lucianopaz**#3

It is possible in the current master branch on GitHub. You can do something like the following:

```
with pm.Model():
    # two correlated 2D normal components (covariances must be positive definite)
    component0 = pm.MvNormal('c0', mu=[0, 0], cov=[[1, 0.9], [0.9, 1]], shape=2)
    component1 = pm.MvNormal('c1', mu=[0, 0], cov=[[1, -0.9], [-0.9, 1]], shape=2)
    # a single weight shared between the two components
    w = pm.Beta('w', 1, 1)
    mix = pm.Mixture('mix', w=[w, 1 - w],
                     comp_dists=[component0.distribution, component1.distribution],
                     shape=2)
    # use the mixture as a prior on the slope and intercept of a straight line
    obs = pm.Normal('obs', mu=mix[0] * x + mix[1], sigma=1, observed=y)
```
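Note that `x` and `y` here stand for observed data that the snippet assumes already exist; something like the following placeholder (my addition, not from the post) would make it runnable:

```
import numpy as np

# hypothetical regression data for the linear model above
x = np.linspace(0, 9, 10)
y = 2.5 * x + 1.0 + np.random.randn(10)
```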

**mattpitkin**#4

Thanks very much. That works very well. Any idea when the next release that includes this will be out?

**lucianopaz**#5

We had a brief discussion last week on whether we could bundle together a nightly release. I'm not sure when that would be, nor when 3.7 will be released. @colcarroll, @twiecki, @ferrine, do you have any updates on the nightly and 3.7 releases?

**mattpitkin**#6

Thanks.

Sorry to add another question, but do you know if it's also possible to use the mixture within a `pm.Bound` object, to set lower and upper bounds on the distribution?

**lucianopaz**#8

I haven't tried, but in principle it should be possible. Let us know if you run into any trouble.

**mattpitkin**#9

I've tried adding bounds and it seems to lead to something a bit odd. Below is some example code:

```
%matplotlib inline
from matplotlib import pyplot as pl
import numpy as np
import pymc3 as pm

# set some example data (which is uninformative)
x = np.linspace(0, 9, 10)
sigma = 100.
data = sigma * np.random.randn(10)  # broad (uninformative) Gaussian data

mus = [[0., 0.], [5., 5.]]  # means of the Gaussian components
covs = [np.array([[1., 0.], [0., 1.]]),
        np.array([[1., 0.9], [0.9, 1.]])]  # covariances
weights = [0.7, 0.3]  # weights of each mode
lowerbounds = [0., 0.]
upperbounds = [5., 5.]
shape = len(mus[0])  # number of dimensions

# set test values in the centre of the bounds
testvals = np.add(lowerbounds, np.subtract(upperbounds, lowerbounds) / 2.)

with pm.Model():
    # define a bounded MvNormal
    BoundedMvN = pm.Bound(pm.MvNormal, lower=lowerbounds, upper=upperbounds)
    comp_dists = []
    for i, mu, cov in zip(range(shape), mus, covs):
        comp_dists.append(BoundedMvN('c{}'.format(i), mu=mu, cov=cov,
                                     shape=shape).distribution)
    mix = pm.Mixture('mix', w=weights, comp_dists=comp_dists,
                     shape=shape, testval=testvals)
    obs = pm.Normal('obs', mu=mix[0] * x + mix[1], sd=sigma, observed=data)
    trace = pm.sample(2000)
```

If I plot the output:

```
pl.plot(trace['mix'][:,0], trace['mix'][:,1], '.')
```

The resulting plot shows that it only seems to have sampled from one of the modes of the mixture.

If I remove the bounds, e.g., using:

```
with pm.Model():
    comp_dists = []
    for i, mu, cov in zip(range(shape), mus, covs):
        comp_dists.append(pm.MvNormal('c{}'.format(i), mu=mu, cov=cov,
                                      shape=shape).distribution)
    mix = pm.Mixture('mix', w=weights, comp_dists=comp_dists,
                     shape=shape, testval=testvals)
    obs = pm.Normal('obs', mu=mix[0] * x + mix[1], sd=sigma, observed=data)
    trace = pm.sample(2000)
```

It samples from both modes, as expected.

Do you have any idea what might be the issue?

**mattpitkin**#10

I should add that if I plot the individual components `c0` and `c1`, they do seem to be sampled correctly; it's just the mixture `mix` that only shows one component.

**lucianopaz**#11

And if you use `sample_prior_predictive` instead, do you still get just a single component?
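
For reference (a sketch I've added; it assumes the model above was built in a context assigned to `model`), that check would look something like:

```
with model:
    # draw from the priors alone, ignoring the likelihood
    prior = pm.sample_prior_predictive(samples=1000)

# scatter the two dimensions of the mixture prior
pl.plot(prior['mix'][:, 0], prior['mix'][:, 1], '.')
```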

**mattpitkin**#12

Using `sample_prior_predictive` gives the following error:

```
ValueError: Drawing samples from distributions with array-valued bounds is not supported.
```
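
One possible workaround (my suggestion, not something tried in the thread) is to pass scalar bounds, since the error only objects to array-valued ones; with identical bounds in every dimension the model is unchanged:

```
# scalar bounds are broadcast over every dimension, avoiding the error above
BoundedMvN = pm.Bound(pm.MvNormal, lower=0., upper=5.)
```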

**lucianopaz**#13

Hmm, I'll try to look into the problem when I have some time. The `sample_prior_predictive` part will take a while, so I'll start with the inference part, where sampling with no data gets stuck in a single component.

**colcarroll**#14

I'm actually running the snippet you supplied and it works fine for me with a conda-installed pymc3-3.6. It feels weird to say that the supplied example runs *too* well:

```
import arviz as az

# convert the trace to ArviZ's InferenceData and plot the joint distribution
# of the two mixture dimensions
data = az.from_pymc3(trace)
az.plot_joint(data.posterior, var_names=['mix'], coords={'mix_dim_0': np.array([0, 1])});
```

**colcarroll**#15

Playing around a bit more, it is possible NUTS was just having trouble mixing on your first run. I reran with `target_accept=0.99`, since there were a lot of divergences, and got a better mix of samples. There are still plenty of divergences, which I assume come from the boundary conditions.
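
The exact call wasn't shown; in pymc3 3.6 the setting would presumably be passed through `nuts_kwargs`, something like:

```
# ask NUTS for a higher acceptance rate, i.e. smaller and more careful steps
trace = pm.sample(2000, nuts_kwargs={'target_accept': 0.99})
```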

Here's the new marginal plot. The scatter is hard to distinguish in the first plot, but you can see differences in the marginal distributions. I'm using a custom stylesheet here!

**mattpitkin**#16

Cool! Trying again, it looks like I get decent sampling of both modes now. It must just have been that my initial attempt got a bit stuck!

**mattpitkin**#17

Just a quick note: I think part (or maybe all!) of the reason I originally only got one mode was that the likelihood wasn't entirely uninformative and therefore led to one mode being favoured. If I make the data "noise free" then *I think* it more consistently returns both modes.
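
For example (my reading of "noise free", not code from the thread), replacing the random draw with zeros removes the random tilt of the likelihood towards one mode:

```
# noise-free data: the broad sigma then keeps the likelihood flat over both modes
data = np.zeros(10)
```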