Hierarchical Mixture of Multivariate Normal

Hello!

I’m try code hierarchical clustering model as in the example on stackoverflow

Clustering is done on the parent observations, but each parent observation has some number of individuals. I code simular example with some modifications in this topic. But I find that that sampling has many divergences and almost all parent observations refer to one cluster. Please, tell me what I’m doing wrong?

Any help would be great, thank you!

The code that generates the data:

NUM_OF_FEATURES = 2
NUM_OF_CLUSTERS = 3
NUM_OF_PARENTS = 60
NUM_OF_OBSERVATIONS = 50

B = 5  # mu 
tau = 3 # variances
nprov = 60 #number of parent observations

mu = [[0,0],[0,B],[-B,0]]
true_cov0 = np.array([[1.,0.],[0.,1.]])
true_cov1 = np.array([[1.,0.],[0.,tau**(2)]])
true_cov2 = np.array([[tau**(-2),0],[0.,1.]])

trueprobs = [.4, .3, .3]   #probability of being in each of the three mixtures
prov = np.random.multinomial(2, trueprobs, size=nprov)
v = prov[:,1] + (prov[:,2])*2
numtoeach = 50

vAll = np.tile(v,numtoeach)
ndata = numtoeach*nprov

data = (vAll==0)*(np.random.multivariate_normal(mu[0],true_cov0,ndata)).T \
  + (vAll==1)*(np.random.multivariate_normal(mu[1],true_cov1,ndata)).T \
  + (vAll==2)*(np.random.multivariate_normal(mu[2],true_cov2,ndata)).T
data=data.T

Model:

parents_item_list = range(NUM_OF_PARENTS)
parents_idx_obs_list = np.tile(parents_item_list, NUM_OF_OBSERVATIONS)

with pm.Model() as Model:
    Lower = tt.stack([pm.LKJCholeskyCov('sigma_{}'.format(k), n=NUM_OF_FEATURES, eta=2.,
                                        sd_dist=pm.HalfNormal.dist(sd=1.)) for k in range(NUM_OF_CLUSTERS)])
    Chol = tt.stack([pm.expand_packed_triangular(NUM_OF_FEATURES, Lower[k]) for k in range(NUM_OF_CLUSTERS)])

    Mus = tt.stack([pm.Uniform('Mu_{}'.format(k), lower=-5., upper=5.,
                               shape=NUM_OF_FEATURES) for k in range(NUM_OF_CLUSTERS)])
    
    Weights = pm.Dirichlet('w', np.ones(NUM_OF_CLUSTERS) / NUM_OF_CLUSTERS)
    
    Cluster = pm.Categorical('cluster', p=Weights, shape=NUM_OF_PARENTS)
    
    mean = pm.Deterministic('mean', Mus[Cluster[parents_idx_obs_list]])
    prec = pm.Deterministic('prec', Chol[Cluster[parents_idx_obs_list]])

    MultivariateNormals = [pm.MvNormal.dist(mu=mean[k], chol=prec[k], shape=NUM_OF_FEATURES) for k in range(NUM_OF_CLUSTERS)]

    like = pm.Mixture('like', w=Weights, comp_dists=MultivariateNormals, observed=data)