Thanks! I’d also like to add some detail about my model in case it makes the situation clearer.
I’m building a shifted lognormal model, where the ndt_i above is used to create a shift parameter for each participant in each time condition. This shift parameter is then multiplied by that participant’s minimum RT.
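To make the intended relationship concrete, here is a small SciPy sketch of how I think about the shift (the numbers are made up, just stand-ins for the model quantities):

from scipy.stats import norm

# toy values standing in for one subject in one time condition (not my real data)
rt_min = 0.25                                   # that participant's minimum RT in seconds
ndt_mean, ndt_sigma, ndt_i_pr = 0.1, 0.5, -0.3  # group mean, group sd, subject-level z-score

# Phi(.) squashes the linear predictor into (0, 1), so the shift never exceeds rt_min
shift = norm.cdf(ndt_mean + ndt_sigma * ndt_i_pr) * rt_min

# the shifted lognormal idea: RT minus the shift should follow LogNormal(mu, sigma)
rt = 0.62
shifted_rt = rt - shift  # this is the quantity the LogNormal likelihood should see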
The model runs well, but the parameter estimates are not what I expected. As you can see in the code snippet below, my parameters of interest are rho_R_mu and rho_R_sigma. These are the correlation parameters that go into the multivariate normal priors related to mu. In other models that don’t include this shift parameter, the estimates match the Stan output exactly. This leads me to believe that the problem comes from how I define or use the shift parameter, e.g. this line:
observed = df_flanker.RT - ndt_i[subj_idx, time_idx].eval()
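If it helps, my (possibly wrong) understanding of what that line does is roughly this:

# ndt_i[...] is a symbolic PyTensor variable; calling .eval() compiles and evaluates it,
# returning a plain NumPy array, so the subtraction happens once, before sampling starts
shift_values = ndt_i[subj_idx, time_idx].eval()
print(type(shift_values))  # <class 'numpy.ndarray'>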
Since the model is somewhat complex, I’ve included only the key lines here:
coords = {"cond": df_flanker['Condition'].unique(),
"time": df_flanker['Time'].unique(),
"subj": df_flanker['subj_num'].unique(),
"trial": df_flanker['trial_idx']}
with pm.Model(coords=coords) as model_3:
# Define correlation matrices
rho_R_mu = pm.LKJCorr('rho_R_mu', n=2, eta=1)
rho_R_sigma = pm.LKJCorr('rho_R_sigma', n=2, eta=1)
triu_idx = np.triu_indices(2, k=1)
mu_corr_upper = pt.set_subtensor(pt.zeros((2, 2))[triu_idx], packed_L_R_mu)
sigma_corr_upper = pt.set_subtensor(pt.zeros((2, 2))[triu_idx], packed_L_R_sigma)
R_mu = pt.eye(2) + mu_corr_upper + mu_corr_upper.T
R_sigma = pt.eye(2) + sigma_corr_upper + sigma_corr_upper.T
L_R_mu = pt.linalg.cholesky(R_mu)
L_R_sigma = pt.linalg.cholesky(R_sigma)
...
ndt_mean = pm.Normal('ndt_mean', mu=0, sigma=1, dims="time")
ndt_sigma = pm.Normal('ndt_sigma', mu=0, sigma=1, dims="time")
ndt_i_pr = pm.Normal('ndt_i_pr', mu=0, sigma=1, dims=["subj","time"])
rv = pm.Normal.dist(0,1)
phi = pm.logcdf(rv, ndt_mean + ndt_sigma * ndt_i_pr)
phi = pm.math.exp(phi)
ndt_i = pm.Deterministic('ndt_i', phi * rt_min, dims=["subj","time"])
...
time_idx = pm.MutableData("time_idx", df_flanker.Time, dims="trial")
subj_idx = pm.MutableData("subj_idx", df_flanker.subj_num, dims="trial")
cond_idx = pm.MutableData("cond_idx", df_flanker.Condition, dims="trial")
likelihood = pm.LogNormal("likelihood",
mu=mu[cond_idx, subj_idx, time_idx],
sigma=sigma[cond_idx, subj_idx, time_idx],
observed=df_flanker.RT - ndt_i[subj_idx, time_idx].eval(),
dims="trial")
I just ran the model, and here’s the timing message (though I’m not sure if this is exactly what you were asking for):
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (3 chains in 3 jobs)
NUTS: [L_R_mu, L_R_sigma, mu_mean_base, mu_sd_base, sigma_mean_base, sigma_sd_base, mu_mean_delta, mu_sd_delta, sigma_mean_delta, sigma_sd_delta, mu_i_base_pr, sigma_i_base_pr, mu_i_delta_pr, sigma_i_delta_pr, ndt_mean, ndt_sigma, ndt_i_pr]
Sampling 3 chains for 1_000 tune and 3_000 draw iterations (3_000 + 9_000 draws total) took 2521 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics