Using a mixture is a nice idea. You could try the new MarginalModel to marginalize over the mixture index, which should give better results. I tried a mixture of two Gaussian PDFs plus a deterministic trend (the data looked upward sloping to me outside of the bump region). Here’s what I came up with:
import pandas as pd
import pymc as pm
import arviz as az
import numpy as np
import pytensor.tensor as pt
import matplotlib.pyplot as plt
from pymc_experimental import MarginalModel
import warnings
warnings.filterwarnings('ignore', category=FutureWarning)
df = pd.read_csv('data/Oxygen_Data.csv', header=None)
df.columns = ['x', 'y']

# Standardize both columns
df = (df - df.mean()) / df.std()
coords = {'params': ['intercept', 'slope'],  # order matches the design-matrix columns [1, x]
          'peak': [0, 1],
          'group': ['trend', 'peak_1', 'peak_2'],
          'obs_idx': np.arange(df.shape[0])}
with MarginalModel(coords=coords) as m:
    x_data = pm.ConstantData('x', df.x, dims=['obs_idx'])
    y_data = pm.ConstantData('y', df.y, dims=['obs_idx'])

    # Deterministic trend: design matrix [1, x] times (intercept, slope)
    X = pt.concatenate([pt.ones_like(x_data[:, None]), x_data[:, None]], axis=-1)
    beta = pm.Normal('beta', dims='params')
    mu_trend = X @ beta

    # Two Gaussian bumps; the ordered transform keeps the component means
    # identifiable (more on that below)
    mu_peak = pm.Normal('mu_peak',
                        dims=['peak'],
                        transform=pm.distributions.transforms.ordered,
                        initval=[-1, -0.5])
    sigma_peak = pm.Exponential('sigma_peak', 1, dims=['peak'])
    p_x = pt.exp(pm.logp(pm.Normal.dist(mu=mu_peak, sigma=sigma_peak), x_data[:, None]))

    # Latent group index for each observation: 0 = trend, 1 = peak_1, 2 = peak_2
    logit_p_group = pm.Normal('logit_p_group', dims=['group'])
    group_idx = pm.Categorical('group_idx', logit_p=logit_p_group, dims=['obs_idx'])

    sigma = pm.Exponential('sigma', 1)
    mu = pt.switch(pt.lt(group_idx, 1),
                   mu_trend,
                   pt.switch(pt.lt(group_idx, 2),
                             p_x[:, 0],
                             p_x[:, 1]))

    y_hat = pm.Normal('y_hat',
                      mu=mu,
                      sigma=sigma,
                      observed=y_data,
                      dims=['obs_idx'])
Then you have to know some boilerplate for using MarginalModel, but it’s not so hard. Tutorial is here.
with m:
    # Marginalize the discrete group index so NUTS can handle the whole model
    m.marginalize(['group_idx'])
    idata = pm.sample()

    # Recover posterior draws of the marginalized group assignments, then
    # un-marginalize so posterior predictive sampling includes group_idx
    m.recover_marginals(idata)
    m.unmarginalize(m.marginalized_rvs)
    idata = pm.sample_posterior_predictive(idata, extend_inferencedata=True)
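The table below is just the az.summary output restricted to the peak parameters, i.e. something like:

# Filter to the peak parameters; the other variables (beta, sigma, etc.)
# are omitted here but worth checking too
az.summary(idata, var_names=['mu_peak', 'sigma_peak'])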
Here’s what I got:
                mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
mu_peak[0]    -0.635  0.003  -0.640   -0.631        0.0      0.0    3417.0    2632.0    1.0
mu_peak[1]    -0.546  0.003  -0.551   -0.541        0.0      0.0    4565.0    3245.0    1.0
sigma_peak[0]  0.119  0.002   0.115    0.123        0.0      0.0    3980.0    3139.0    1.0
sigma_peak[1]  0.129  0.002   0.125    0.132        0.0      0.0    4844.0    3142.0    1.0
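Since matplotlib is already imported, here’s a minimal sketch for eyeballing the fit (it assumes the rows of df are sorted by x; sort first if not):

# Overlay the posterior predictive mean on the standardized data
ppc_mean = idata.posterior_predictive['y_hat'].mean(dim=['chain', 'draw'])

fig, ax = plt.subplots()
ax.scatter(df.x, df.y, s=5, alpha=0.5, label='data (standardized)')
ax.plot(df.x, ppc_mean, color='tab:red', lw=2, label='posterior predictive mean')
ax.legend()
plt.show()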
Some failures I hit on the way:
- My first instinct was to use a Dirichlet to parameterize p for the Categorical, but this ends up not letting you use NUTS after you marginalize (@ricardoV94 is this a known bug/limitation?). A sketch of what I tried is after this list.
- The ordered transform on the means of the mixture components ends up being important. I guess this is quite common in mixtures, see this discussion. In this case, when I didn’t include it, the model couldn’t pick out the two modes (I guess due to mode switching, but the modes were so close I didn’t see it in the trace plots [also I didn’t look that carefully]).
- I wanted to have a linear model of X on logit_p, but MarginalModel didn’t support that either. A sketch of what I was going for is also below.
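For concreteness, the Dirichlet version from the first bullet looked roughly like this (a sketch of what would replace the logit_p_group lines in the model above):

# What I tried first, roughly: a flat Dirichlet prior on the group weights
# instead of independent Normals on the logits. This ends up ruling out
# NUTS after marginalizing group_idx.
p_group = pm.Dirichlet('p_group', a=np.ones(3), dims=['group'])
group_idx = pm.Categorical('group_idx', p=p_group, dims=['obs_idx'])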
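And the covariate-dependent weights from the last bullet would have looked something like this (hypothetical sketch; beta_p is a made-up name, and MarginalModel couldn’t marginalize group_idx with this structure):

# Hypothetical: x-dependent group probabilities, one set of regression
# coefficients per group, reusing the design matrix X from the model
beta_p = pm.Normal('beta_p', dims=['params', 'group'])
logit_p = X @ beta_p  # shape (obs_idx, group)
group_idx = pm.Categorical('group_idx', logit_p=logit_p, dims=['obs_idx'])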