Issue with a 2-D marginalized GMM

Hi!

I am trying to implement a 2-D marginalized GMM as follows:

p(x_n \mid \pi, \mu, \tau) = \sum_{k=1}^K \pi_k \, \text{Normal}(x_n \mid \mu_k, \tau_k)

p(\pi) = \text{Dirichlet}(\pi \mid \alpha \mathbf{1}_K)

p(\mu_k) = \text{Normal}(\mu_k \mid 0, I)

p(\tau_k) = \text{Gamma}(\tau_k \mid a, b)

My data X has shape (500, 2).

According to an example in pymc3/tests/test_mixture.py:

def test_normal_mixture_nd(self):
    nd, ncomp = 3, 5

    with Model() as model0:
        mus = Normal('mus', shape=(nd, ncomp))
        taus = Gamma('taus', alpha=1, beta=1, shape=(nd, ncomp))
        ws = Dirichlet('ws', np.ones(ncomp))
        mixture0 = NormalMixture('m', w=ws, mu=mus, tau=taus, shape=nd)

    with Model() as model1:
        mus = Normal('mus', shape=(nd, ncomp))
        taus = Gamma('taus', alpha=1, beta=1, shape=(nd, ncomp))
        ws = Dirichlet('ws', np.ones(ncomp))
        comp_dist = [Normal.dist(mu=mus[:, i], tau=taus[:, i])
                     for i in range(ncomp)]
        mixture1 = Mixture('m', w=ws, comp_dists=comp_dist, shape=nd)

    testpoint = model0.test_point
    testpoint['mus'] = np.random.randn(nd, ncomp)
    assert_allclose(model0.logp(testpoint), model1.logp(testpoint))
    assert_allclose(mixture0.logp(testpoint), mixture1.logp(testpoint))

I create the model as follows:

import numpy as np
import pymc3 as pm

# set up model
nd = 2
ncomp = 3
with pm.Model() as model0:
    mus = pm.Normal('mus', shape=(nd, ncomp))
    taus = pm.Gamma('taus', alpha=1, beta=1, shape=(nd, ncomp))
    ws = pm.Dirichlet('ws', np.ones(ncomp))
    mixture0 = pm.NormalMixture('m', w=ws, mu=mus, tau=taus, observed=X)

But it throws an error:

Input dimension mis-match. (input[0].shape[0] = 500, input[1].shape[0] = 2)

Could you guys provide some help with this problem? Many thanks!

An ugly workaround:

# set up model
nd = 2
ncomp = 3
with pm.Model() as model0:
    mus = pm.Normal('mus', shape=(nd, ncomp))
    taus = pm.Gamma('taus', alpha=1, beta=1, shape=(nd, ncomp))
    ws = pm.Dirichlet('ws', np.ones(ncomp))
    mixture0 = pm.NormalMixture('m',
                                w=ws,
                                mu=mus,
                                tau=taus,
                                # dummy data with an explicit trailing axis
                                observed=np.random.randn(500, 2, 1))

This is because the observed values need to be evaluated against each component of comp_dists. Currently we automatically add a dimension on the right for 1-D observed data, but this does not yet work for multivariate observed data…
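For reference, a minimal sketch of applying that workaround to the real data rather than dummy data (assuming X has shape (500, 2) as above): adding a trailing axis of length 1 lets the observed values broadcast against each mixture component.

import numpy as np
import pymc3 as pm

X = np.random.randn(500, 2)  # stand-in for the real data

nd, ncomp = 2, 3
with pm.Model() as model0:
    mus = pm.Normal('mus', shape=(nd, ncomp))
    taus = pm.Gamma('taus', alpha=1, beta=1, shape=(nd, ncomp))
    ws = pm.Dirichlet('ws', np.ones(ncomp))
    # X[:, :, None] has shape (500, 2, 1), which broadcasts against the
    # (nd, ncomp) component parameters to give a (500, 2, 3) logp
    mixture0 = pm.NormalMixture('m', w=ws, mu=mus, tau=taus,
                                observed=X[:, :, None])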

Thanks for the reply. I used your code above, but the inference performance is not good, so I turned to pm.Mixture.

I tried to create a diagonal matrix from the vector tau, but pymc3.math does not seem to have a method for creating a diagonal matrix. Hence I used Theano's method, but it does not seem to be compatible with PyMC3's input types.

The code is as follows, and it throws the error ('AllocDiag only works on vectors', [tau0]).

import theano.tensor as tt

# set up model
K = 3
with pm.Model() as model0:
    pi = pm.Dirichlet('pi', np.ones(K))

    comp_dist = []
    mu = []
    tau = []
    precision = []
    for i in range(K):
        mu.append(pm.Normal('mu%i' % i, 0, 10, shape=2))
        tau.append(pm.Gamma('tau%i' % i, 1, 1, shape=2))

        precision.append(tt.nlinalg.AllocDiag()(tau))
        comp_dist.append(pm.MvNormal.dist(mu=mu[i], tau=precision[i]))

    xobs = pm.Mixture('x_obs', pi, comp_dist, observed=X)

Do you have any suggestions? Thanks!

Using Mixture and NormalMixture is equivalent in this case, which means you will have the same inference problems. What kind of inference problem are you having?

I am using a Gaussian mixture model to perform image segmentation.

More specifically, the data is an image transformed into a pixel matrix of shape (width * height, 3), where 3 is the number of channels. Hence N = width * height and D = 3.
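(For concreteness, that transformation is just a reshape; img here is a hypothetical (height, width, 3) RGB array:)

# flatten an RGB image into an (N, 3) pixel matrix, N = width * height
pixels = img.reshape(-1, 3)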

The problem is twofold: first, ADVI seems to have a "mode collapse" problem, in that it only captures a few clusters; second, the inferred cluster means (a list of (R, G, B) tuples) do not seem to match the original image.

You can find the segmented figure and the original figure below.

[figure: segmented image vs. original image]

But MCMC gives me a rather good result (I used Gibbs sampling from Edward).

However, the plot of ADVI's hist shows that the ELBO has converged. I tried a few things to figure out why ADVI fails to give a good segmented figure:

  1. The optimization is stuck at a local optimum. However, I tried different initializations and increased the number of clusters; it did not help much.

  2. The model assumption is not good enough. But since MCMC works, this should not actually be the problem.

  3. I also tried FullRankADVI and an LKJ prior on the covariance matrix. That did not help much either.

Could you please provide some insights into this task? I really appreciate your help!

Why not use sampling in PyMC3 if sampling is working fine?
It is hard to say without the data and more information, but in general I would try to compare the estimates from MCMC and ADVI, for example by doing a scatter plot of all the estimated parameters (x-axis being MCMC, y-axis being ADVI).
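A minimal sketch of such a comparison (here trace is an MCMC trace and approx is a fitted ADVI approximation for the same model; both names are placeholders):

import matplotlib.pyplot as plt

advi_trace = approx.sample(1000)  # draws from the variational posterior
mcmc_est = trace['mus'].mean(axis=0).ravel()
advi_est = advi_trace['mus'].mean(axis=0).ravel()

# points far from the y = x line show where ADVI disagrees with MCMC
plt.scatter(mcmc_est, advi_est)
lo, hi = mcmc_est.min(), mcmc_est.max()
plt.plot([lo, hi], [lo, hi], 'k--')
plt.xlabel('MCMC estimate')
plt.ylabel('ADVI estimate')
plt.show()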

I tried to use NUTS in PyMC3, but it takes forever.

Thanks for your suggestions on the comparison; I will try it! In fact, I used PSIS to diagnose ADVI, but evaluating the model's joint density p(\theta, X) also takes a lot of time.

Do you have any ideas for speeding up NUTS and PSIS (evaluating the model likelihood)?

You can try profiling the model to check for slowness. If both NUTS and PSIS are slow, it is a strong indication that the logp implementation should be improved.
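For example, a minimal sketch using Model.profile, which wraps Theano's profiler (model0 here refers to the model defined above):

# profile repeated evaluations of the model's log-probability graph;
# the summary shows which Theano ops dominate the runtime
prof = model0.profile(model0.logpt)
prof.summary()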

For NUTS, more informative priors generally help. Otherwise, the causes are case by case, and you need to identify them specifically for your model.
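For instance, a minimal sketch of making the location prior more informative by centering it on the data (the specific numbers are assumptions, not a recommendation):

import numpy as np
import pymc3 as pm

X = np.random.randn(500, 2)  # stand-in for the real data
nd, ncomp = 2, 3

with pm.Model() as model0:
    # center the component means on the data and give them a moderate
    # spread, instead of the default standard-normal prior
    mus = pm.Normal('mus',
                    mu=X.mean(axis=0)[:, None],
                    sd=2. * X.std(axis=0)[:, None],
                    shape=(nd, ncomp))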

Also, mixture models are hard to do inference on in general (there are plenty of discussions about this on Discourse).

Thank you so much! I will try your suggestions.

There is one last question that I have been thinking about for a while: how can I create a diagonal matrix from a vector in PyMC3? That is, I have

tau = pm.Gamma('tau%i' % i, 1, 1, shape=2)

And I want to do

precision = diagonal_matrix(tau) # pseudo-code

I found that pymc3.math only supports the operation of extracting diagonal elements.

This works for me:

import theano.tensor as tt

tau = pm.Gamma('tau', 1, 1, shape=2)
c = tt.nlinalg.alloc_diag(tau)
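Plugged back into the mixture above, that looks roughly like the following. Note that alloc_diag expects a single vector, so inside the loop it must be applied to this component's tau rather than the whole Python list, which is what triggered the AllocDiag error earlier:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

X = np.random.randn(500, 2)  # stand-in for the real data

K = 3
with pm.Model() as model0:
    pi = pm.Dirichlet('pi', np.ones(K))

    comp_dist = []
    for i in range(K):
        mu = pm.Normal('mu%i' % i, 0, 10, shape=2)
        tau = pm.Gamma('tau%i' % i, 1, 1, shape=2)
        # build a 2x2 diagonal precision matrix from this component's tau
        precision = tt.nlinalg.alloc_diag(tau)
        comp_dist.append(pm.MvNormal.dist(mu=mu, tau=precision))

    xobs = pm.Mixture('x_obs', w=pi, comp_dists=comp_dist, observed=X)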

Thanks!