Hello,
I would like to model some repeated-measures data with Pymc3. The data comes from an experiment where n=17 subjects experienced three conditions [baseline, treatment1, treatment2] and a reaction time was measured for each condition. The final data set has 51 reaction times (3 conditions * 17 subjects).
- I have managed to fit the following varying-intercepts model with the bambi-package:
b_model = bambi.Model(data)
results = b_model.fit(‘reaction_time ~ condition’, random = [‘1|subject_id’], tune = 2000,
samples = 3000, target_accept = 0.9)
- I would like to create the same model (or a replica of the bambi-model) directly with Pymc3, but cannot get it to work. As I’ve understood it a hierarchical model with varying intercept can be specified in Pymc3 using hyperpriors and priors (and in my case only for the intercept). The model I’ve specified so far is:
with pm.Model() as bambi_replica:
# Hyperpriors
mu_intercept = pm.Normal(‘mu_intercept’, mu=0, sd=1)
sd_intercept = pm.HalfNormal(‘sd_intercept’, sd=1)
# Priors
Intercept = pm.Normal(‘Intercept’, mu = mu_intercept, sd=sd_intercept, shape=n_subjects)
reaction_time_sd = pm.HalfNormal(‘reaction_time_sd’, sd = data.reaction_time.std()*2)
condition = pm.Normal(‘condition’, mu = 0, sd = 1, shape=2)
# mu
mu = Intercept[tp_idx] + condition[0]*x_1 + condition[1]*x_2
# Likelihood (sampling distribution) of observations
reaction_time = pm.Normal(‘Y_obs’, mu=mu, sd = reaction_time_sd, observed=reaction_time_data)
Here:
- n_subjects = 17 (since I have 17 subjects)
- x_1 = array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0], dtype=int64) (to identify indexes for treatment 1) - x_2 = array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1], dtype=int64) (to identify indexes for treatment 2) - reaction_time_data = array([2.918456, 3.124924, 3.013115, 3.117612, 2.91747 , 3.157323,
3.101546, 3.595617, 2.961656, 2.960227, 2.537308, 2.908257,
2.971608, 3.074054, 2.870034, 2.867201, 3.149056, 3.186297,
3.065235, 4.028026, 3.068959, 2.902312, 3.276905, 3.398952,
3.286361, 3.105795, 3.748059, 2.893177, 4.078607, 3.412715,
3.849621, 3.440311, 3.416428, 3.169723, 3.256486, 3.014277,
2.745785, 4.160992, 3.226214, 4.321797, 3.05727 , 3.231891,
3.611861, 3.229321, 2.851948, 2.622444, 2.844787, 3.229938,
3.055713, 2.722098, 3.158651]) - tp_idx = array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
dtype=int64)
Questions:
- Would this approach correspond to the varying-intercept model specified with bambi (i.e., ‘reaction_time ~ condition’, random = [‘1|subject_id’])? I understand that the priors may not be specified exactly the same, but I’m mainly interested in if the overall structure of the model corresponds to a varying-intercepts model that could be used to model the data I have.
- With bambi it seems to work fine to fit the model (i.e., the model seems to converge, traceplots look good). The model I specified directly with Pymc3, however, does not seem to converge and I get error-messages looking like below. I can’t understand why this is. Any clues?
There were 104 divergences after tuning. Increase
target_accept
or reparameterize.
The acceptance probability does not match the target. It is 0.7975575845495201, but should be close to 0.95. Try to increase the number of tuning steps.
The estimated number of effective samples is smaller than 200 for some parameters.
All suggestions are appreciated!