4 lines basic mixture model outputs wrong results

hwassner · March 12, 2020, 10:32am

This is my observed bimodal data:

My model is a mixture of 2 uniform distributions (exactly the same ones I used to produce the data), one on [0,5] and the other on [10,15].

My goal is to be able to tell which distribution produced each data sample.
(This should be obvious: if the sample is in [0,5] then it comes from the first distribution, and if the sample is in [10,15] then it comes from the second distribution. There is no overlap.)
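For reference, the assignment I expect can be written down directly, without any inference. This is only a sketch of the expected answer: `expected_component` is a hypothetical helper name, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
data = np.concatenate([
    rng.uniform(low=0, high=5, size=100),
    rng.uniform(low=10, high=15, size=100),
])

def expected_component(x):
    """Hypothetical helper: 0 if x lies in [0, 5], 1 if x lies in [10, 15]."""
    x = np.asarray(x)
    # The two supports do not overlap, so a single threshold is enough.
    return np.where(x <= 5, 0, 1)

labels = expected_component(data)
print(np.bincount(labels))  # number of samples assigned to each component
```

The question is whether the posterior of the mixture weights recovers this same assignment.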

import numpy as np
import pymc3 as pm

# Test data: 100 samples from U(0, 5) followed by 100 samples from U(10, 15)
data = np.concatenate([
    np.random.uniform(low=0, high=5, size=100),
    np.random.uniform(low=10, high=15, size=100),
])

with pm.Model() as model:
    # Mixture components (fixed bounds, no free parameters)
    c1 = pm.Uniform.dist(lower=0, upper=5)
    c2 = pm.Uniform.dist(lower=10, upper=15)
    # One pair of mixture weights per data sample
    w = pm.Dirichlet('w', a=np.array([1, 1]), shape=(len(data), 2))
    mix = pm.Mixture('mix', w=w, comp_dists=[c1, c2], observed=data, shape=len(data))

    trace = pm.sample()

Which seems to sample well:

Since both the data distributions and the two mixture components (c1 & c2) are totally distinct, I expect the mixture weights w to be nicely separated. But they are not! Let’s see:

First, let’s look at the shape of w:

The first coordinate indexes the draws from the sampled distribution, the second coordinate is the observed data sample, and the last coordinate is the 2 component weights.

So let’s choose the first data sample (0), and see the corresponding distribution of w for that sample.

%matplotlib notebook
import matplotlib.pyplot as plt

sample = 0  # index of the data sample to inspect
plt.hist(trace['w'][:, sample, 0], alpha=0.5)  # w[0], blue
plt.hist(trace['w'][:, sample, 1], alpha=0.5)  # w[1], orange
plt.grid(True)

In my understanding of the problem and the model, these two histograms should be separated… How can it be that these two distributions overlap?

Since c2 is uniform on the interval [10,15], the likelihood under c2 is 0 at ~2.24 (the value of this sample), so its log-likelihood is -infinity.
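This arithmetic can be checked outside PyMC3. The sketch below uses `scipy.stats` rather than the model itself, and `mixture_logp` is a hypothetical helper that assumes the textbook mixture density w[0]*p1(x) + w[1]*p2(x); it is not necessarily how `pm.Mixture` evaluates the likelihood internally.

```python
import numpy as np
from scipy import stats

x = 2.24  # approximate value of the sample discussed above

# scipy parameterizes the uniform distribution as [loc, loc + scale]
c1 = stats.uniform(loc=0, scale=5)    # uniform on [0, 5]
c2 = stats.uniform(loc=10, scale=5)   # uniform on [10, 15]

logp_c1 = c1.logpdf(x)  # log(1/5), since x is inside [0, 5]
logp_c2 = c2.logpdf(x)  # -inf, since x is outside [10, 15]
print(logp_c1, logp_c2)

def mixture_logp(x, w):
    """Hypothetical helper: log of the textbook mixture density."""
    with np.errstate(divide="ignore"):  # allow log(0) = -inf without a warning
        return np.log(w[0] * c1.pdf(x) + w[1] * c2.pdf(x))

# Evaluate the mixture log-density for a few weight pairs
for w in [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]:
    print(w, mixture_logp(x, w))
```

Under this formula the zero-density component contributes nothing to the sum, so the result depends only on w[0]. Whether the sampler treats the weights the same way is not verified here.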

How can the mixture hold a non-zero weight on a component whose log-likelihood is -infinity!?

Is there a problem with my understanding, or a problem with the mixture feature?

Any help will be appreciated…
