I’m starting to look at mixture models. I have a toy linear regression where I’m trying to infer the same latent parameter while selecting, between two candidate linear operators, the one with the least noise.

Using pm.Mixture, it works as expected:

```
import numpy as np
import pymc3 as pm

np.random.seed(2)
m = 60
n = 3
eps = 0.01
siga = 1
H = np.random.random(size=(m, n))
xt = np.expand_dims(np.arange(1, n + 1), 1)
y = H @ xt + np.expand_dims(np.random.normal(loc=0, scale=eps, size=m), 1)
H1 = 0.1 + H + np.random.normal(0, eps * 10, size=(m, n))  # noisier operator
H2 = H + np.random.normal(0, eps, size=(m, n))             # more accurate operator
nit = 3000

with pm.Model() as model:
    x = pm.Normal("x", mu=xt.max() / 2, sd=siga, shape=(n, 1))
    likemix = []
    for Hi in [H1, H2]:
        likemix.append(pm.Normal.dist(mu=pm.math.dot(Hi, x), sd=eps, shape=m))
    w = pm.Dirichlet("w", a=np.array([2, 2]))
    Y = pm.Mixture("like", w=w, comp_dists=likemix, observed=y)
    trace = pm.sample(nit, tune=1000)

pm.plot_trace(trace)
```

But if I write the problem out in parts, as a sum of weighted likelihoods, the weighting (this time an equivalent Beta distribution) just reverts to its prior.

```
with pm.Model() as model2:
    x = pm.Normal("x", mu=xt.max() / 2, sd=siga, shape=(n, 1))
    a1 = pm.Beta("alpha", alpha=2, beta=2)
    Y = a1 * pm.Normal("y1", mu=pm.math.dot(H1, x), sd=eps, observed=y) + \
        (1 - a1) * pm.Normal("y2", mu=pm.math.dot(H2, x), sd=eps, observed=y)
    trace = pm.sample(nit, chains=3, tune=1000)

pm.plot_trace(trace)
```
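To be clear about what I intended the weights to do: the per-observation mixture likelihood should be the weighted sum of component *densities*, i.e. log(w₁·p₁(y) + w₂·p₂(y)), which is the same as a logsumexp over the weighted log-densities. A quick NumPy/SciPy check of that identity (the observations, weights, and means here are arbitrary, just for illustration):

```
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

rng = np.random.default_rng(0)
y_obs = rng.normal(size=5)      # arbitrary "observations"
w = np.array([0.3, 0.7])        # mixture weights
mus = np.array([0.0, 1.0])      # component means

# per-observation component log-densities, shape (2, 5)
logp = norm.logpdf(y_obs[None, :], loc=mus[:, None], scale=1.0)

# mixture log-likelihood per observation, computed two equivalent ways
direct = np.log((w[:, None] * np.exp(logp)).sum(axis=0))  # log(w1*p1 + w2*p2)
stable = logsumexp(np.log(w)[:, None] + logp, axis=0)     # logsumexp form

assert np.allclose(direct, stable)
```

That mixture of densities is what `pm.Mixture` gives me, and it is clearly not what the weighted sum of observed random variables in `model2` computes.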

What’s going on here? I want to build a more complex mixture model that the pm.Mixture functionality doesn’t accommodate, which is why I want to construct it by hand as above.