Greetings all,
I’m struggling with fluctuating Rhat values. As far as I know, the default value for `tune` is 1000, but the higher I set `tune`, the higher the Rhat values I get (particularly for the random effects), and vice versa. Is this a problem, or am I missing something? The value that has worked so far is 100; I get similar outputs with `tune = 50`.
Also, the commonly accepted threshold for Rhat was initially 1.05, but Vehtari et al. later recommended Rhat < 1.01. How should I deal with values like 1.03?
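For context, this is how I understand the diagnostic: ArviZ computes the rank-normalized split-Rhat from Vehtari et al. by default, so for well-mixed chains it should sit very close to 1.0. A minimal, self-contained sketch (with toy i.i.d. draws standing in for a real posterior):

```python
import numpy as np
import arviz as az

# Toy posterior: 4 chains x 1000 draws of one parameter, drawn i.i.d.,
# so the chains are perfectly mixed by construction
rng = np.random.default_rng(42)
draws = rng.normal(size=(4, 1000))
idata = az.from_dict(posterior={"theta": draws})

# Rank-normalized split-Rhat (Vehtari et al.); close to 1.0 here
rhat = float(az.rhat(idata)["theta"])
print(rhat)
```

Values like 1.03 on the random effects suggest the chains disagree, which is why I suspect the tuning settings rather than the diagnostic itself.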
```python
import pymc as pm

# Assumes grouped_data, native_language_mapping, aspect_mapping, and
# cefr_mapping are defined earlier

# Updated list of tense-aspect structures
structures = ['Past_Perf_Prog', 'Past_Prog', 'Present_Simple', 'Present_Perf',
              'Present_Perf_Prog', 'Present_Prog', 'Past_Simple', 'Past_Perf']

# CEFR levels paired for modeling
cefr_pairs = [('A1', 'A2'), ('B1', 'B2'), ('C1',)]

with pm.Model() as frequency_model:
    # Hyperprior for global mean
    mu_alpha = pm.Normal('mu_alpha', mu=0.0, sigma=10)

    # Random intercepts for native_language
    alpha_native_lang = pm.Normal('alpha_native_lang', mu=0, sigma=1,
                                  shape=len(native_language_mapping))

    # Random intercepts for aspect_combined
    alpha_aspect = pm.Normal('alpha_aspect', mu=0, sigma=1,
                             shape=len(aspect_mapping))

    # Fixed effects for CEFR levels with custom priors:
    # strong for A1/A2, moderate for B1/B2, weak-to-moderate for C1
    cefr_sigmas = {'A1': 0.25, 'A2': 0.25, 'B1': 0.5, 'B2': 0.5, 'C1': 0.75}
    beta_cefr = {level: pm.Normal(f'beta_cefr_{level}', mu=0, sigma=sigma)
                 for level, sigma in cefr_sigmas.items()}

    # Interaction terms for each tense-aspect structure and CEFR level,
    # collected in a dict (assigning through locals() is fragile and
    # undefined behavior inside functions)
    beta_interaction = {}
    for structure in structures:
        for cefr_pair in cefr_pairs:
            for level in cefr_pair:
                if level in ['A1', 'A2']:
                    if structure in ['Present_Simple', 'Present_Prog', 'Past_Simple']:
                        sigma_value = 0.25  # high confidence for basic structures
                    elif structure in ['Present_Perf', 'Past_Perf', 'Past_Perf_Prog']:
                        sigma_value = 1.5   # low confidence for advanced structures
                    else:
                        sigma_value = 1.0   # moderate confidence for other structures
                elif level in ['B1', 'B2']:
                    if structure in ['Present_Perf', 'Past_Prog', 'Past_Perf']:
                        sigma_value = 0.5   # moderate confidence for intermediate structures
                    else:
                        sigma_value = 0.75  # default for other structures at these levels
                else:  # C1
                    sigma_value = 0.75      # moderate confidence at the advanced level
                beta_interaction[(level, structure)] = pm.Normal(
                    f'beta_{level}_{structure}', mu=0, sigma=sigma_value)

    # Expected value of the linear predictor (log scale)
    mu = (mu_alpha
          + alpha_native_lang[grouped_data['native_language_idx'].values]
          + alpha_aspect[grouped_data['aspect_idx'].values])

    # Adding fixed effects for each CEFR level
    for level in ['A1', 'A2', 'B1', 'B2', 'C1']:
        mu += beta_cefr[level] * (grouped_data['cefr_idx'].values == cefr_mapping[level])

    # Adding interaction terms
    for structure in structures:
        for level in ['A1', 'A2', 'B1', 'B2', 'C1']:
            mu += (beta_interaction[(level, structure)]
                   * (grouped_data['cefr_idx'].values == cefr_mapping[level])
                   * (grouped_data['structure'] == structure).values)

    # Likelihood (Poisson for count/frequency data); pm.math.exp keeps the
    # inverse link inside the computation graph
    obs = pm.Poisson('obs', mu=pm.math.exp(mu), observed=grouped_data['frequency'])

    # Sampling from the model (adjust draws/tune as needed)
    idata = pm.sample(draws=2000, tune=100, chains=4, random_seed=42,
                      progressbar=True, nuts_sampler='pymc',
                      discard_tuned_samples=True,
                      compute_convergence_checks=True,
                      return_inferencedata=True)
```
|parameter|mean|sd|hdi_3%|hdi_97%|mcse_mean|mcse_sd|ess_bulk|ess_tail|r_hat|
|---|---|---|---|---|---|---|---|---|---|
|mu_alpha|4.791|0.643|3.574|5.998|0.045|0.032|206.0|443.0|1.01|
|alpha_native_lang[0]|-0.752|0.313|-1.335|-0.154|0.019|0.013|270.0|523.0|1.03|
|alpha_native_lang[1]|-0.271|0.313|-0.864|0.318|0.019|0.013|270.0|519.0|1.03|
|alpha_native_lang[2]|0.120|0.313|-0.456|0.726|0.019|0.013|270.0|522.0|1.03|
|alpha_native_lang[3]|-0.198|0.313|-0.766|0.417|0.019|0.013|270.0|528.0|1.03|
|alpha_native_lang[4]|-0.985|0.313|-1.575|-0.391|0.019|0.013|270.0|526.0|1.03|
|alpha_native_lang[5]|1.320|0.313|0.734|1.916|0.019|0.014|270.0|526.0|1.03|
|alpha_native_lang[6]|1.682|0.313|1.100|2.283|0.019|0.014|270.0|528.0|1.03|
|alpha_native_lang[7]|0.298|0.313|-0.290|0.891|0.019|0.013|270.0|524.0|1.03|
|alpha_native_lang[8]|0.215|0.313|-0.373|0.810|0.019|0.013|270.0|526.0|1.03|
|alpha_native_lang[9]|-1.334|0.313|-1.930|-0.749|0.019|0.013|270.0|517.0|1.03|
|alpha_aspect[0]|1.430|0.527|0.509|2.509|0.038|0.027|198.0|451.0|1.01|
|alpha_aspect[1]|-2.101|0.527|-3.018|-1.021|0.038|0.027|198.0|453.0|1.01|
|alpha_aspect[2]|-1.960|0.527|-2.874|-0.881|0.038|0.027|198.0|439.0|1.01|
|alpha_aspect[3]|2.643|0.527|1.722|3.722|0.038|0.027|198.0|450.0|1.01|
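To isolate which parameters miss the 1.01 cutoff (here, the `alpha_native_lang` intercepts at 1.03), I filter the summary table programmatically. A sketch using a toy `InferenceData` in place of the real `idata`, which isn't reproducible here:

```python
import numpy as np
import arviz as az

# Toy InferenceData standing in for the fitted model's idata:
# one scalar parameter plus a 10-level random intercept
rng = np.random.default_rng(0)
idata = az.from_dict(posterior={
    "mu_alpha": rng.normal(size=(4, 500)),
    "alpha_native_lang": rng.normal(size=(4, 500, 10)),
})

summary = az.summary(idata)  # same columns as the table above
flagged = summary[summary["r_hat"] > 1.01]  # parameters over the cutoff
print(flagged.index.tolist())
```

With the real posterior this would list exactly the random-effect entries that need attention, which makes it easier to see whether longer tuning helps or hurts them.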