Just from looking at the dataset, that looks like a within-subject design, is that correct? (I.e. you measured both comp and incomp for all subjects, so the measurements come in pairs.) A linear model of the log(reaction_times) or a GLM with a log link sounds like the easiest model for a dataset like that (I don't have any clue about reaction time modeling, so take that with a grain of salt). Or do you have specific reasons for your exponential and power model? The linear model would look something like this:
import numpy as np
import pandas as pd
import pymc3 as pm

data = pd.DataFrame({
    'subject_id': [0, 0, 1, 1, 2, 2, 3, 3, 4, 4],
    'comp': [False, True] * 5,
    'age': [12, 12, 13, 13, 14, 14, 25, 25, 40, 40],
    'reaction_time': np.exp(np.random.randn(10)),
})
n_subjects = data.subject_id.nunique()

with pm.Model() as model:
    intercept = pm.Flat('intercept')
    # Non-centered parametrization of the per-subject offsets
    subject_sd = pm.HalfNormal('subject_sd', sd=2)
    raw = pm.Normal('subject_raw', shape=n_subjects)
    subject = pm.Deterministic('subject', subject_sd * raw)
    # Here we assume that age has a linear relationship with log(reaction_time)
    age = pm.Normal('age', sd=2)  # TODO what scale parameter is reasonable?
    comp = pm.Normal('comp', sd=2)
    mu = (
        intercept
        + comp * data.comp
        + age * data.age
        + subject[data.subject_id])
    # Only for convenience, to make residual plots easier
    pm.Deterministic('mu', mu)
    sd = pm.HalfCauchy('sd', beta=2)
    pm.Normal('log_reaction_time', mu=mu, sd=sd, observed=np.log(data.reaction_time))
This corresponds somewhat to the exponential model: since log(reaction_time) is modeled as linear in age, on the original scale you get reaction_time = exp(intercept + comp * comp_i + subject_i) * exp(age * age_i) * noise (where age and comp are the coefficients and age_i, comp_i, subject_i are the values for one measurement), i.e. an exponential dependence on age.
You get the power model by replacing age = pm.Normal('age', sd=2) with log_age = pm.Normal('log_age', sd=2) and then using + log_age * np.log(data.age) instead of + age * data.age in the computation of mu (see the sketch below). That is because a term that is linear in log(age) on the log(reaction_time) scale turns into a power of age on the original scale: reaction_time = exp(intercept + ...) * age_i ** log_age, i.e. a power law in age.
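A minimal sketch of that variant, reusing the data from above (everything except the log_age term is the same as in the first model):

with pm.Model() as power_model:
    intercept = pm.Flat('intercept')
    subject_sd = pm.HalfNormal('subject_sd', sd=2)
    raw = pm.Normal('subject_raw', shape=n_subjects)
    subject = pm.Deterministic('subject', subject_sd * raw)
    log_age = pm.Normal('log_age', sd=2)  # exponent of the power law
    comp = pm.Normal('comp', sd=2)
    mu = (
        intercept
        + comp * data.comp
        + log_age * np.log(data.age)  # linear in log(age) instead of age
        + subject[data.subject_id])
    pm.Deterministic('mu', mu)
    sd = pm.HalfCauchy('sd', beta=2)
    pm.Normal('log_reaction_time', mu=mu, sd=sd, observed=np.log(data.reaction_time))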
Instead of trusting formal model comparison too much (use LOO rather than WAIC), I'd rather look at posterior predictive samples and plot the residuals. That lets you figure out why one model is better.
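Something along these lines, assuming a reasonably recent PyMC3 and ArviZ (exact function names differ a bit between versions, so treat this as a sketch):

import arviz as az
import matplotlib.pyplot as plt

with model:
    trace = pm.sample(1000, tune=1000)
    # posterior predictive samples of log_reaction_time
    post_pred = pm.sample_posterior_predictive(trace)

# LOO estimate; repeat for power_model and compare the two
print(az.loo(az.from_pymc3(trace, model=model)))

# Residuals on the log scale, using the stored 'mu' deterministic
mu_hat = trace['mu'].mean(axis=0)
resid = np.log(data.reaction_time).values - mu_hat

plt.scatter(data.age, resid)
plt.axhline(0, color='grey')
plt.xlabel('age')
plt.ylabel('log-scale residual')
plt.show()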