 Modeling human response time data with an exponential model: model comparison issues

Hi Community,
The graph below displays mean reaction time human data as a function of age in a simple two-choice task with two experimental conditions (compatible and incompatible).

I would like to model these data with N(μ, σ), with either
μ = α * pm.math.exp(-β * Age) + γ [exponential model]
or μ = α * Age**-β + γ [power model]
For each model variant (exponential vs power), I would further like to constrain β to be identical between compatible and incompatible conditions.

For the power model, I computed:

with pm.Model() as model_pow:
α_c = pm.HalfCauchy('α_c', 10)
α_i = pm.HalfCauchy('α_i', 10)
β = pm.Normal('β', 1, 0.5)
γ_c = pm.Normal('γ_c', Mean_RT_comp_Indiv.mean(), .15)
γ_i = pm.Normal('γ_i', Mean_RT_incomp_Indiv.mean(), .15)
σ = pm.HalfNormal('σ', .1)
μ_c = α_c*Years_indiv**-β + γ_c
μ_i = α_i*Years_indiv**-β + γ_i
y_obs_comp = pm.Normal('y_obs_comp', μ_c, σ, observed=Mean_RT_comp_Indiv)
y_obs_incomp = pm.Normal('y_obs_incomp', μ_i, σ, observed=Mean_RT_incomp_Indiv)

if __name__ == "__main__":
with  model_pow:
trace_power = pm.sample (4000, chains = 4, core = 4, tune = 2000, target_accept = .95)
az.plot_trace(trace_power)

For the exponential model, I computed:

with pm.Model() as model_exp:
α_c = pm.HalfCauchy('α_c', 10)
α_i = pm.HalfCauchy('α_i', 10)
β = pm.Normal('β', 1, 0.5)
γ_c = pm.Normal('γ_c', Mean_RT_comp_Indiv.mean(), .15)
γ_i = pm.Normal('γ_i', Mean_RT_incomp_Indiv.mean(), .15)
σ = pm.HalfNormal('σ', .1)
μ_c = α_c*pm.math.exp(-β*Years_indiv) + γ_c
μ_i = α_i*pm.math.exp(-β*Years_indiv) + γ_i
y_obs_comp_exp = pm.Normal('y_obs_comp_exp', μ_c, σ, observed=Mean_RT_comp_Indiv)
y_obs_incomp_exp = pm.Normal('y_obs_incomp_exp', μ_c, σ, observed=Mean_RT_incomp_Indiv)
if __name__ == "__main__":
with  model_exp:
trace_exp = pm.sample (4000, chains = 4, core = 4, tune = 2000, target_accept = .95)
az.plot_trace(trace_exp)

Unfortunately, because I am new on this forum, I can’t incorporate additional files showing trace plots. I see sampling issues and get warnings such as:

There were 13 divergences after tuning. Increase target_accept or reparameterize.
There were 3 divergences after tuning. Increase target_accept or reparameterize.
The acceptance probability does not match the target. It is 0.879040376552122, but should be close to 0.8. Try to increase the number of tuning steps.
There were 18 divergences after tuning. Increase target_accept or reparameterize.
The number of effective samples is smaller than 25% for some parameters.

How can I solve or attenuate these sampling issues?

In addition, I would like to compare the performance of the power vs the exponential model using WAIC. I thus computed:

WAIC = az.compare({‘pow’:trace_power, ‘exp’:trace_exp})

returning the following error:

TypeError Traceback (most recent call last)
in
----> 1 WAIC = az.compare({‘pow’:trace_power, ‘exp’:trace_exp})
2 WAIC

~\Anaconda3\envs\bayes\lib\site-packages\arviz\stats\stats.py in compare(dataset_dict, ic, method, b_samples, alpha, seed, scale)
153 for name, dataset in dataset_dict.items():
154 names.append(name)
–> 155 ics = ics.append([ic_func(dataset, pointwise=True, scale=scale)])
156 ics.index = names
157 ics.sort_values(by=ic, inplace=True, ascending=ascending)

~\Anaconda3\envs\bayes\lib\site-packages\arviz\stats\stats.py in waic(data, pointwise, scale)
987 )
988 if “log_likelihood” not in inference_data.sample_stats:
–> 989 raise TypeError(“Data must include log_likelihood in sample_stats”)
990 log_likelihood = inference_data.sample_stats.log_likelihood
991

TypeError: Data must include log_likelihood in sample_stats

It looks like I need to compute the log likelihood but don’t know how to proceed. Any idea?

Many thanks for your help

Just from looking at the dataset that looks like a within subject design, is that correct? (ie you measured both comp and incomp for all subjects, so the measurements come in pairs). A linear model of the log(reation_times) or a glm with log link sound like the easiest models for a dataset like that (I don’t have any clue about reaction time modeling, so take that with a grain of salt). Or do you have specific reasons for your exponential and power model? The linear model would look something like that:

data = pd.DataFrame({
'subject_id': [0, 0, 1, 1, 2, 2, 3, 3, 4, 4],
'comp': [False, True] * 5,
'age': [12, 12, 13, 13, 14, 14, 25, 25, 40, 40],
'reaction_time': np.exp(np.random.randn(10)),
})
n_subjects = data.subject_id.nunique()

with pm.Model() as model:
intercept = pm.Flat('intercept')

sd = pm.HalfNormal('subject_sd', sd=2)
raw = pm.Normal('subject_raw', shape=n_subjects)
subject = pm.Deterministic('subject', sd * raw)

# Here we assume that the age has a linear relationship with the log(reaction_time)
age = pm.Normal('age', sd=2) # TODO what scale parameter is reasonable?

comp = pm.Normal('comp', sd=2)

mu = (
intercept
+ comp * data.comp
+ age * data.age
+ subject[data.subject_id])

# Only for convenience to make residual plots easier
pm.Deterministic('mu', mu)

sd = pm.HalfCauchy('sd', beta=2)
pm.Normal('log_reaction_time', mu=mu, sd=sd, observed=np.log(data.reaction_time))

This corresponds somewhat to the exponential model, as

\log(r) = \alpha + \beta \cdot \text{age} \\ r = \exp(\beta \cdot \text{age})\exp(\alpha).

You get the power model by replacing age = pm.Normal('age') by log_age = pm.Normal('log_age') and then also do + log_age * np.log(data.age) in the computation of mu. That is because

\log(r) = \alpha + \beta \cdot \log(\text{age}) \\ r = \exp(\alpha) \cdot \text{age}^\beta.

Instead of trusting formal model comparison (use LOO instead of WAIC) too much, I’d rather look at posterior predictive samples and plot the residuum. That allows you to figure out why one model is better.

1 Like

Oh, and you can get a quick idea about which of the two models might be better by looking at a log or a log-log plot. In the exponential model a plot of log(reation_time) vs age should be more or less a line. If it follows a power law you would expect log(reation_time) vs log(age) to be a straight line.

1 Like

Dear aseybolt,
Thank you so much for your help and interesting comments. I very much appreciate.
You are right, it is a within-subject design. Let me give you a little more information about that project. Previous research in developmental psychology has shown that age-related changes in mean RT in various tasks are well described by an exponential function of the form:

y = \gamma + \alpha \exp(-\beta x)

More specifically, it has been shown that (1) the rate of decay \beta remains the same whatever the task, and is equal to .334 (beautiful isn’t it?) (2) asymptotic performance levels are attained at the adult age.

We collected RT data in a decision-making task (within-subject design, two experimental conditions) from children of different ages. I then fit each individual dataset with a popular decision-making model to decompose the data into components of processing (this model has 5 parameters called v, a, ter, max_ampl, tau; each parameter quantifies a specific processing component) and determine whether (1) components of processing also follow an exponential-like function, and (2), whether all these components decay at the same rate as the mean RT data. To remove some noise, I averaged data of children from the same grade.

I thus fit the following model to data:

with pm.Model() as model_exp:
α_v = pm.HalfCauchy('α_v', 2)
α_a = pm.HalfCauchy('α_a', 2)
α_Ter = pm.HalfCauchy('α_Ter', 2)
α_max_ampl = pm.HalfCauchy('α_max_ampl', 2)
α_tau = pm.HalfCauchy('α_tau', 2)
α_c = pm.HalfCauchy('α_c',2)
α_i = pm.HalfCauchy('α_i', 2)

β = pm.Normal('β', 0.334, 0.1)
γ_v= pm.Normal('γ_v', mean_param_per_group_model[-1, 0], .1)
γ_a= pm.Normal('γ_a', mean_param_per_group_model[-1, 1], .1)
γ_Ter= pm.Normal('γ_Ter', mean_param_per_group_model[-1, 2], .1)
γ_max_ampl= pm.Normal('γ_max_ampl', mean_param_per_group_model[-1, 3], .1)
γ_tau= pm.Normal('γ_tau', mean_param_per_group_model[-1, 4], .1)
γ_c = pm.Normal('γ_c', Mean_RT_comp_General[-1], .1)
γ_i = pm.Normal('γ_i', Mean_RT_incomp_General[-1], .1)

σ_v = pm.HalfNormal('σ_v', .1)
σ_a = pm.HalfNormal('σ_a', .1)
σ_Ter = pm.HalfNormal('σ_Ter', .1)
σ_max_ampl = pm.HalfNormal('σ_max_ampl', .1)
σ_tau = pm.HalfNormal('σ_tau', .1)
σ_c = pm.HalfNormal('σ_c', .1)
σ_i = pm.HalfNormal('σ_i', .1)

μ_v = -α_v*pm.math.exp(-β*age_categ) + γ_v
μ_a = α_a*pm.math.exp(-β*age_categ) + γ_a
μ_Ter = α_Ter*pm.math.exp(-β*age_categ) + γ_Ter
μ_max_ampl = α_max_ampl*pm.math.exp(-β*age_categ) + γ_max_ampl
μ_tau = α_tau*pm.math.exp(-β*age_categ) + γ_tau
μ_c = α_c*pm.math.exp(-β*Age_General) + γ_c
μ_i = α_i*pm.math.exp(-β*Age_General) + γ_i

y_v = pm.Normal('y_v', μ_v, σ_v, observed=mean_param[:, 0])
y_a = pm.Normal('y_a', μ_a, σ_a, observed=mean_param[:, 1])
y_Ter = pm.Normal('y_Ter', μ_Ter, σ_Ter, observed=mean_param[:, 2])
y_max_ampl = pm.Normal('y_max_ampl', μ_max_ampl, σ_max_ampl, observed=mean_param[:, 3])
y_tau = pm.Normal('y_tau', μ_tau, σ_tau, observed=mean_param[:, 4])
y_meanRT_comp = pm.Normal('y_meanRT_comp', μ_c, σ_c, observed=Mean_RT_comp)
y_meanRT_incomp = pm.Normal('y_meanRT_incomp', μ_i, σ_i, observed=Mean_RT_incomp)

if __name__ == "__main__":
with  model_exp:
trace_exp = pm.sample (4000, chains = 4, tune = 2000, target_accept = .95)
az.plot_trace(trace_exp)

which fits the data well and suggests that every processing component evolves at the same rate:

I wonder what I could do to provide more convincing evidence in favor of this conclusion. I am thinking of comparing this model with a variant in which the decay rate \beta is free to vary. However, I can’t compute a LOO statistic (I get the following error: “Data must include log_likelihood in sample_stats” when doing az.LOO). In addition, I would appreciate some advice about the best way to analyze residuum from my posterior predictive checks (right now I have just plotted an histogram of residuals, they are normally distributed and centered on 0 so everything looks ok).

Many thanks!

Hi all,

ArviZ is not saving the log_likelihood when more than one observed variable is present. We will try to fix this ASAP.