Hierarchical Gaussian Mixture Regression Model

I think you’ll find that’s what the model does. Each sample is treated as having an intercept or slope from one of k clusters (with unknown mean and (co)variance).

Your model draws each slope/intercept independently for each sample from the mixture model. The way you would include this in a hierarchical model (with replicates) is by allowing offsets for each of the individuals from each of the clusters. This can quickly lead to over-parameterization, since there are an additional n \times k parameters.

If you have the N \times n indicator matrix mapping observations to sample identifiers sample_ids you can do this:

sample_beta_offsets = pm.Normal('per_sample_offsets', 0., 0.1, shape=(n, k, 2))
    beta_sample = pm.Deterministic('per_sample_beta', tt.transpose(tt.dot(tt.transpose(sample_beta_offsets), sample_ids.T)) + beta)
    
    # Each sample has its own (slope, int) for each cluster
    # (N, k, 2)
    # observe (N, 2) values X
    # Multiplying the values for a particular sample gives the cluster mean
    # (i, k, 2) * X[i,:] = mean_i
    
    cluster_means = pm.Deterministic('cl_means', tt.stack([tt.dot(beta_sample[i,:,:], X[i,:]) for i in range(sample_ids.shape[0])]))

but it’s really painful…