Greetings all,
I am in the middle of a similar discussion with a journal editor, and I am here to find out what "random" and "fixed" actually mean. In second language acquisition studies there is a tradition that no one dares to question: CEFR levels (A1, A2, B1, …) should be treated as fixed effects. However, I am trying to open a discussion on the nature of these levels and how they could alternatively be treated.
Here is my hypothesis:
Although CEFR levels are fixed categories, the variability in tense–aspect usage isn’t constant across them. For example, lower proficiency levels (A1, A2) tend to exhibit more constrained and predictable patterns (i.e., lower variance), whereas higher levels (C1) often show more diverse and flexible language use (i.e., higher variance). Modeling these levels as random effects lets the model accommodate this heteroscedasticity by estimating group-specific variances.
Random = hierarchical, partial pooling, as @bwengals suggested. So, learners at the same CEFR level may exhibit different levels of mastery. I am having a hard time seeing why this rationale is completely inaccurate, as the reviewer suggests. I am curious what others think about this.
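To make the partial-pooling idea concrete, here is a minimal sketch in plain Python. It is not part of my model: the function name `partial_pool`, the group means, the sample sizes, and the two standard deviations are all hypothetical, and it assumes a simple normal-normal setup with known within-level and between-level SDs and an unweighted grand mean.

```python
from statistics import mean


def partial_pool(group_means, group_sizes, sigma_within, tau_between):
    """Shrink each group mean toward the grand mean (normal-normal model).

    Assumes sigma_within (SD of observations inside a level) and
    tau_between (SD of level means around the grand mean) are known.
    """
    # Simplification: unweighted mean of the level means as the grand mean
    grand_mean = mean(list(group_means.values()))
    pooled = {}
    for level, ybar in group_means.items():
        n = group_sizes[level]
        w_data = n / sigma_within**2   # precision contributed by the data
        w_prior = 1 / tau_between**2   # precision contributed by the hierarchy
        pooled[level] = (w_data * ybar + w_prior * grand_mean) / (w_data + w_prior)
    return pooled


# Hypothetical log-frequencies per CEFR level; C1 has very few observations
group_means = {"A1": 0.5, "B1": 1.0, "C1": 3.0}
group_sizes = {"A1": 200, "B1": 200, "C1": 4}

pooled = partial_pool(group_means, group_sizes, sigma_within=1.0, tau_between=0.5)
for level in group_means:
    print(level, group_means[level], round(pooled[level], 3))
```

Levels with many observations keep estimates close to their own mean, while the sparsely observed level is pulled toward the grand mean. That is all "random" means in the hierarchical sense: the level effects share a common distribution instead of being estimated independently.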
# Random effects for CEFR levels
cefr_sds = np.ones(len(cefr_mapping)) * 0.75 # Default SD
# Assign lower SDs for lower CEFR levels based on CEFR descriptions
cefr_specific_sds = {
    "A1": 0.25,
    "A2": 0.25,
    "B1": 0.5,
    "B2": 0.5,
    # C1 and higher keep the default 0.75
}
for cefr_level, sd in cefr_specific_sds.items():
    idx = cefr_mapping.get(cefr_level)
    if idx is not None:
        cefr_sds[idx] = sd
sigma_cefr = pm.HalfNormal("sigma_cefr", sigma=1)
beta_cefr = pm.Normal(
    "beta_cefr", mu=0, sigma=cefr_sds * sigma_cefr, shape=len(cefr_mapping)
)
# CEFR influence on structure-frame association
cefr_influence_means = np.zeros(
    (len(cefr_mapping), len(structure_mapping), len(frame_annotation_mapping))
)
cefr_influence_sds = (
    np.ones((len(cefr_mapping), len(structure_mapping), len(frame_annotation_mapping)))
    * 0.5
)
for cefr_level, cefr_idx in cefr_mapping.items():
    influence_mean = cefr_prior_values.get(cefr_level, 0.0)
    cefr_influence_means[cefr_idx, :, :] = influence_mean
beta_cefr_influence = pm.Normal(
    "beta_cefr_influence",
    mu=cefr_influence_means,
    sigma=cefr_influence_sds,
    shape=(len(cefr_mapping), len(structure_mapping), len(frame_annotation_mapping)),
)
# Linear combination for the expected value
mu = (
    beta_structure[grouped_data["structure_idx"].values]
    + beta_frame[grouped_data["frame_annotation_idx"].values]
    + beta_structure_frame[
        grouped_data["structure_idx"].values,
        grouped_data["frame_annotation_idx"].values,
    ]
    + beta_cefr[grouped_data["cefr_idx"].values]
    + beta_cefr_influence[
        grouped_data["cefr_idx"].values,
        grouped_data["structure_idx"].values,
        grouped_data["frame_annotation_idx"].values,
    ]
)
# Prior for the dispersion parameter
alpha = pm.Gamma("alpha", alpha=2, beta=0.1)
# Likelihood
obs = pm.NegativeBinomial(
    "obs", mu=pm.math.exp(mu), alpha=alpha, observed=grouped_data["frequency"]
)
)
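For anyone who wants to see what the per-level SD multipliers imply before any data are involved, here is a rough prior simulation using only the standard library. It is a sketch, not the model itself: it mimics `sigma_cefr ~ HalfNormal(1)` by taking the absolute value of a standard normal draw, the dictionary name `cefr_sd_multipliers` is my own, and I list C1 explicitly with the default 0.75.

```python
import random
from statistics import pstdev

random.seed(0)

# Multipliers taken from the model above (C1 uses the default 0.75)
cefr_sd_multipliers = {"A1": 0.25, "A2": 0.25, "B1": 0.5, "B2": 0.5, "C1": 0.75}

n_draws = 20000
draws = {level: [] for level in cefr_sd_multipliers}
for _ in range(n_draws):
    # One shared scale per draw, as in the model: HalfNormal(1) == |Normal(0, 1)|
    sigma_cefr = abs(random.gauss(0.0, 1.0))
    for level, mult in cefr_sd_multipliers.items():
        # beta_cefr[level] ~ Normal(0, mult * sigma_cefr)
        draws[level].append(random.gauss(0.0, mult * sigma_cefr))

# Marginal prior spread of each level's coefficient
prior_sd = {level: pstdev(values) for level, values in draws.items()}
for level, sd in prior_sd.items():
    print(level, round(sd, 3))
```

The simulated spread grows from A1 to C1 in proportion to the multipliers. Note that this is the prior spread of each level's coefficient in `beta_cefr`; in the model as written, the dispersion of the observed counts comes from the single shared `alpha` in the negative binomial likelihood.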