I’m trying to fit a hierarchical regression model where the slopes follow a bimodal distribution (a mixture of two normals) rather than being normally distributed around a group mean. Essentially each beta value should come from one of two normal distributions, but I have no prior knowledge of which distribution it comes from.
I’ve tried implementing this using a
NormalMixture as the group beta and estimating the mean of each of its two component distributions. However the component distributions tend to just collapse to the overall mean rather than giving the true bimodal distribution.
This may be a dumb way to approach this, or simply impossible, but I’d appreciate any help!
Here is a minimal example:
import pandas as pd import pymc3 as pm df = pd.read_csv('example_data.csv') with pm.Model() as hierarchical_model: # Means of component distributions mus = pm.Normal('mixture_mus', mu=0.5, sd=0.5, shape=2) # Group-level beta beta_mu = pm.NormalMixture('beta_mu', [0.5, 0.5], mu=mus, sd=0.5) # Betas beta = pm.Normal("beta", mu=beta_mu, sd=1, shape=len(df['level2'].unique())) # Error eps = pm.HalfNormal('eps', 1) # Estimated y values y_est = beta[df['level2'].values] * df['x'] # Likelihood y_like = pm.Normal('likelihood', y_est, sd=eps, observed=df['y']) with hierarchical_model: trace = pm.sample(3000, chains=1, tune=500)
And for reference, the true beta values used to simulate data (which I’d expect the group beta to look like):
example_data.csv (633.6 KB)