Hierarchical Gaussian Mixture Regression Model

@Aditya_sai_ram_Nemal

Mea culpa.

Your model is basically correct. It only fails to sample because of using Normals rather than HalfNormals on the standard deviations.

Test data:

import numpy as np
import seaborn as sbn
import pymc3 as pm
import theano.tensor as tt
import pandas as pd

cluster_props = [0.25, 0.75]
n_clusters = len(cluster_props)
n_samples = 80

np.random.seed(213121)

slopes = np.random.standard_t(1., size=(n_clusters,)) * 3
intercepts = np.random.standard_t(1., size=(n_clusters,)) * 5

def mksample(n_meas):
    measurements = np.random.uniform(low=0., high=10., size=(n_meas,))
    clust = np.where(np.random.multinomial(1, pvals=cluster_props) > 0.1)[0]
    clust2 = np.where(np.random.multinomial(1, pvals=cluster_props) > 0.1)[0]
    slope, int_ = slopes[clust] + np.random.normal(0.1), intercepts[clust2] + np.random.normal(0.1)
    y = int_ + slope * measurements + np.random.normal(0, 0.5, size=(n_meas,))
    return pd.DataFrame({'x': measurements, 'y': y})

data = {'sample_%d' % (1+i): mksample(1 + np.random.binomial(9, 0.5)) for i in range(n_samples)}

image

Model:

k=2
with pm.Model() as hm_model:
    # priors
    weight_prior_s = pm.Dirichlet('wps', 2. * np.ones(k))
    weight_prior_i = pm.Dirichlet('wpi', 2. * np.ones(k))
    clust_slope_means = pm.Normal('mu_slope', 0., 5., shape=(k,))
    clust_int_means = pm.Normal('mu_int', 0., 5., shape=(k,))
    slope_offset_err = pm.HalfNormal('se_slope', 2.)
    int_offset_err = pm.HalfNormal('se_int', 2.)
    resid_err = pm.HalfNormal('resid_err', 1.)
    # the mixture model
    slope_prior = pm.NormalMixture('slope_mix', w=weight_prior_s, mu=clust_slope_means, sd=slope_offset_err, shape=(len(data),))
    int_prior = pm.NormalMixture('int_mix', w=weight_prior_i, mu=clust_int_means, sd=int_offset_err, shape=(len(data),))
    # the observed data
    for i, (sample, samdat) in enumerate(data.items()):
        sam_slope = slope_prior[i]
        sam_int = int_prior[i]
        lik_sam = pm.Normal('%s_obs' % sample, sam_int + sam_slope * samdat['x'], resid_err, observed=samdat['y'])
        
    res = pm.sample(150, chains=4, cores=2, nuts_kwargs={'target_accept': 0.8, 'max_treedepth': 10})

It samples quickly and well:

The slope_mix trace has a line that looks weird and all-over-the-place; but this comes from a sample with only one observation:

image

In sum: The following could be occurring for your model

  1. Bad parametrization of standard errors (use HalfNormal instead)
  2. Poor identifiability (no clear clusters for mean/slope; or incorrect # of clusters)
  3. Overparametrization (insufficient # of samples, or measurements per sample for all the parameters)