What do "Fixed Effects" and "Random Effects" mean to you?

jessegrabowski · December 5, 2024, 5:17am

Curious how different posters here think about these terms. They are notoriously vague. Andrew Gelman offers five definitions of these terms in this paper (which I only know of by way of this video):

1. Fixed effects are constant across individuals, and random effects vary.
2. Effects are fixed if they are interesting in themselves, or random if there is an interest in the underlying population.
3. Fixed effects are estimated from large samples, random effects from small samples.
4. Random effects are realizations of random variables.
5. Fixed effects are estimated using maximum likelihood, random effects exhibit shrinkage.

I was trained in econometrics, where (helpfully!) none of these definitions apply. There, fixed effects are group specific intercepts, and random effects are something like a clustered standard error adjustment. I will spare you the specifics.
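
For anyone who has not met the econometric usage, here is a minimal sketch of "fixed effects as group-specific intercepts". The toy panel and the helper name `within_slope` are made up for illustration. Demeaning x and y inside each group removes the group intercepts, and the slope that remains is the one a regression with one dummy variable per group would report.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """Slope of y on x after removing group-specific intercepts (within estimator)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # per group: sum of x, sum of y, count
    for g, xi, yi in zip(groups, x, y):
        sums[g][0] += xi
        sums[g][1] += yi
        sums[g][2] += 1

    num = den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sx, sy, n = sums[g]
        dx = xi - sx / n  # x demeaned within its group
        dy = yi - sy / n  # y demeaned within its group
        num += dx * dy
        den += dx * dx
    return num / den


# Toy panel (made-up numbers): y = intercept[group] + 2.0 * x, with no noise.
intercepts = {"a": 10.0, "b": -5.0, "c": 0.5}
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0, 0.0, 1.0, 5.0]
y = [intercepts[g] + 2.0 * xi for g, xi in zip(groups, x)]

slope = within_slope(groups, x, y)
print(slope)
```

Because the toy data have no noise, the slope comes back as 2 (up to floating point) even though the three group intercepts are very different.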

So when people ask questions about these terms, I never really understand what is being asked. I’m curious what everyone understands these terms to mean. It’s like a community survey, but nerdier!

ricardoV94 · December 5, 2024, 7:35am

I like bambi’s terminology of common effects and group specific effects: Getting Started – Bambi

Those are the fixed and random effects, respectively, of definition 1.

bwengals · December 5, 2024, 10:13am

I’m gonna be a contrarian and say I very much like the fixed / random terminology.

Random effect: Synonymous with a hierarchical model. Partial pooling is happening. The word “random” is because each member of the group is a sample from some broader population. The sigma parameter you infer that controls the partial pooling tells you about the broader population.
For example, say you have some medical study run at 5 hospitals. The hospital effect is a random effect because the 5 hospitals in the data are a random sample from an infinite number of hospitals.

Fixed effect: The effect is “fixed” by the experimenter. In your medical study say you’re testing three drugs vs. a placebo. This is a fixed effect because you want to measure the effect of drug A, drug B and drug C independently. You don’t use a hierarchical model / partial pooling because it would be a bad idea to think about the three drugs as some random sample from an infinite number of different drugs. You also don’t want to have any notion of, if drug A and B work well and C doesn’t, then that means drug C should get regularized up to the mean of all three drugs, and A and B probably aren’t as good as they seem because of C. You want to treat each drug effect (that the experimenter fixed in advance) independently.

I honestly think that those 5 definitions fit nicely into the definition above.

Definition 5: fixed effects can be estimated with maximum likelihood just fine, while random effects do exhibit shrinkage, so that’s true.
Definition 4: Yes, random effects (the observed groups), are realizations of random variables (the population), so I think that’s true too.
Definition 3: Maybe that’s usually true?
Definition 2: I think that’s often true. You don’t really care about the hospital effect, it’s a nuisance variable. You care about which drug is best.
Definition 1: The same drugs A, B, C are used across all 5 hospitals. The hospital used across each drug varies, necessarily. So I’d agree with that too.

I think the confusion comes from non-Bayesians having a hard time fitting hierarchical models? Also possible that I have retconned the fixed and random effects terminology into something that makes sense only to me?

Random = hierarchical, partial pooling is happening, information sharing. Probably don’t have control over the selection, hence random.
Fixed = not hierarchical, no pooling, effects are independent. Might control these up front.
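
A rough numerical sketch of that contrast, using the hospital example. The numbers and the function names `no_pooling` and `partial_pooling` are made up, and both standard deviations are treated as known here, whereas a real hierarchical model would infer them.

```python
def no_pooling(group_means):
    """'Fixed' in the sense above: each group keeps its own raw mean."""
    return list(group_means)


def partial_pooling(group_means, group_sizes, sigma_y, sigma_group):
    """'Random' in the sense above: shrink each group mean toward the overall mean.

    Uses the normal-normal conditional posterior mean with both standard
    deviations treated as known, which a full hierarchical model would infer.
    """
    total = sum(n * m for n, m in zip(group_sizes, group_means))
    grand_mean = total / sum(group_sizes)  # simple plug-in for the population mean
    shrunk = []
    for m, n in zip(group_means, group_sizes):
        data_precision = n / sigma_y**2
        prior_precision = 1.0 / sigma_group**2
        weight = data_precision / (data_precision + prior_precision)
        shrunk.append(grand_mean + weight * (m - grand_mean))
    return shrunk


# Hypothetical hospital-level mean outcomes and patient counts (made-up numbers).
hospital_means = [1.8, 0.2, 1.1, 0.9, 3.0]
hospital_sizes = [50, 40, 60, 45, 3]

raw = no_pooling(hospital_means)
pooled = partial_pooling(hospital_means, hospital_sizes, sigma_y=2.0, sigma_group=0.5)

for r, p, n in zip(raw, pooled, hospital_sizes):
    print(f"n={n:3d}  no pooling={r:.2f}  partial pooling={p:.2f}")
```

The hospital with only 3 patients gets the smallest weight on its own data, so it is pulled hardest toward the overall mean. The large hospitals barely move.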

jaharvey8 · December 5, 2024, 3:36pm

I’m struggling with this. Can’t fixed effects also be inherently hierarchical? If you have strong reason to believe that drugs A, B, and C are similar (e.g., in chemical composition), can’t A, B, and C all have fixed effects that are hierarchical in nature?

jaharvey8 · December 5, 2024, 3:52pm

To give an example, is the intercept in the radon example a fixed effect or random effect? Prior to this thread I would have called it a fixed effect, but based on the definitions here it seems like it should be considered a random effect.

jessegrabowski · December 5, 2024, 3:55pm

I would also call it a “hierarchical fixed effect”, because I was taught that group-level intercepts are fixed effects. Then we slap on a hierarchy to do shrinkage.

bwengals · December 5, 2024, 9:10pm

In the radon example both the intercepts and the slopes are random effects, because they’re hierarchical over county. There’s an overall global (literally) radon level and the counties are a random sampling of discrete areas, so partial pooling makes sense. I’ll note that the PyMC radon example calls both the intercept and slope random effects.

I think it doesn’t matter to fixed vs random whether it’s an “intercept” (not multiplied by some other covariate) or a “slope” (multiplied by 0 or 1 depending on basement or not). I think the distinction is only:

Random:

\begin{align} \sigma &\sim \text{HalfNormal}(\sigma=1.0) \\ \delta_j &\sim \text{Normal}(0, \sigma) \end{align}

Fixed:

\begin{align} \sigma &= 1.0 \\ \delta_j &\sim \text{Normal}(0, \sigma) \end{align}
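
To see what the two versions imply, here is a sketch under some added assumptions that are not part of the definitions above: group $j$ has $n_j$ observations $y_{ij} \sim \text{Normal}(\mu + \delta_j, \sigma_y)$ with sample mean $\bar{y}_j$, and $\mu$ and $\sigma_y$ are treated as known. Conditional on $\sigma$, the posterior mean of $\delta_j$ is then

\begin{align} \mathbb{E}[\delta_j \mid y, \sigma] &= w_j \, (\bar{y}_j - \mu) \\ w_j &= \frac{n_j / \sigma_y^2}{n_j / \sigma_y^2 + 1 / \sigma^2} \end{align}

In the random version, $\sigma$ is inferred from all the groups together, so the amount of shrinkage $w_j$ is learned from the data. In the fixed version, $\sigma = 1.0$ is set by hand: each $\delta_j$ is still regularized toward zero by its prior, but what the other groups look like has no influence on how much.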

@jaharvey8 with the drug example, I think that’s like definition 2: it depends on what your goal is. I think a fixed effect is better because personally I wouldn’t want to take a drug that was tested once, killed someone, but got its effect pooled up towards the other drugs that were considered similar but happened to be trialed more. I think you’d really want no information sharing, so a fixed effect.

But, if your goal is to figure out whether this drug company is a good company and if you should buy stock in it, then I think a random effect / hierarchical model / shrinkage term makes sense for the drug effect. Like the definition 2 says, your goal is now to learn something about the underlying population of their drugs, and you don’t necessarily care about the precise effect of drug A, B or C.

What would a hierarchical effect that’s fixed mean? What would be “fixed” about it?

jessegrabowski · December 7, 2024, 11:26am

Each unit in the hierarchy is different, sure. But the effect is still “fixed” in the sense that it’s a constant mean offset for that unit that shouldn’t change. It isn’t a slope coefficient for a value that could be intervened on.

bayesian_padawan · December 27, 2024, 6:21pm

Fixed = global, common to all groups or clusters in the data.

Random = related to a group or cluster. Represents effects in data which are not sampled independently.

The root of these terms probably comes from ANOVA. Fixed effects relate to all levels of a factor. Random effects relate to a subset of all factor levels which results from a non-independent sampling process.

However, in my understanding the pairs fixed/global and random/cluster explain it best.

fatihbozdag · February 17, 2025, 7:15pm

Greetings all,

I am in the middle of a similar discussion with a journal editor, and I am here to find out what random and fixed mean anyway. In second language acquisition studies, it is a tradition no one dares to question that CEFR levels (A1, A2, B1, …) should be treated as fixed effects. However, I am trying to open a discussion on the nature of these levels and how they could alternatively be treated.

Here is my hypothesis:

Although CEFR levels are fixed categories, the variability in tense–aspect usage isn’t constant across them. For example, lower proficiency levels (A1, A2) tend to exhibit more constrained and predictable patterns (i.e., lower variance), whereas higher levels (C1) often show more diverse and flexible language use (i.e., higher variance). Modeling these levels as random effects lets the model accommodate this heteroscedasticity by estimating group-specific variances.

Random = hierarchical, partial pooling, as suggested by @bwengals. So, learners at the same CEFR level may exhibit different levels of mastery. I am having a hard time seeing why this rationale is completely inaccurate, as the reviewer suggests. I am curious what others think about this.

# Random effects for CEFR levels
cefr_sds = np.ones(len(cefr_mapping)) * 0.75  # Default SD
# Assign lower SDs for lower CEFR levels based on CEFR descriptions
cefr_specific_sds = {
    "A1": 0.25,
    "A2": 0.25,
    "B1": 0.5,
    "B2": 0.5,
    # C1 and higher keep the default 0.75
}
for cefr_level, sd in cefr_specific_sds.items():
    idx = cefr_mapping.get(cefr_level)
    if idx is not None:
        cefr_sds[idx] = sd

sigma_cefr = pm.HalfNormal("sigma_cefr", sigma=1)
beta_cefr = pm.Normal(
    "beta_cefr", mu=0, sigma=cefr_sds * sigma_cefr, shape=len(cefr_mapping)
)

# CEFR influence on structure-frame association
cefr_influence_means = np.zeros(
    (len(cefr_mapping), len(structure_mapping), len(frame_annotation_mapping))
)
cefr_influence_sds = (
    np.ones((len(cefr_mapping), len(structure_mapping), len(frame_annotation_mapping)))
    * 0.5
)

for cefr_level, cefr_idx in cefr_mapping.items():
    influence_mean = cefr_prior_values.get(cefr_level, 0.0)
    cefr_influence_means[cefr_idx, :, :] = influence_mean

beta_cefr_influence = pm.Normal(
    "beta_cefr_influence",
    mu=cefr_influence_means,
    sigma=cefr_influence_sds,
    shape=(len(cefr_mapping), len(structure_mapping), len(frame_annotation_mapping)),
)

# Linear combination for the expected value
mu = (
    beta_structure[grouped_data["structure_idx"].values]
    + beta_frame[grouped_data["frame_annotation_idx"].values]
    + beta_structure_frame[
        grouped_data["structure_idx"].values,
        grouped_data["frame_annotation_idx"].values,
    ]
    + beta_cefr[grouped_data["cefr_idx"].values]
    + beta_cefr_influence[
        grouped_data["cefr_idx"].values,
        grouped_data["structure_idx"].values,
        grouped_data["frame_annotation_idx"].values,
    ]
)

# Prior for the dispersion parameter
alpha = pm.Gamma("alpha", alpha=2, beta=0.1)

# Likelihood
obs = pm.NegativeBinomial(
    "obs", mu=pm.math.exp(mu), alpha=alpha, observed=grouped_data["frequency"]
)
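
A quick way to check what the `beta_cefr` prior in the snippet implies is to simulate it outside the model. This is only a prior simulation in plain Python, not a fit. The level list (A1 to C1) is an assumption about what `cefr_mapping` contains. In the snippet, the per-level multipliers are fixed constants, so only the common scale `sigma_cefr` is inferred and the relative spread between levels stays at whatever the multipliers say.

```python
import random
import statistics

rng = random.Random(0)

# Fixed per-level multipliers copied from the snippet above (C1 uses the 0.75 default).
# The set of levels is an assumption about the contents of cefr_mapping.
cefr_sds = {"A1": 0.25, "A2": 0.25, "B1": 0.5, "B2": 0.5, "C1": 0.75}

n_draws = 20_000
draws = {level: [] for level in cefr_sds}
for _ in range(n_draws):
    # HalfNormal(sigma=1): absolute value of a standard normal draw.
    sigma_cefr = abs(rng.gauss(0.0, 1.0))
    for level, sd in cefr_sds.items():
        # beta_cefr[level] ~ Normal(0, sd * sigma_cefr)
        draws[level].append(rng.gauss(0.0, sd * sigma_cefr))

prior_sd = {level: statistics.pstdev(vals) for level, vals in draws.items()}
for level, sd in prior_sd.items():
    ratio = sd / prior_sd["C1"]
    print(f"{level}: prior SD of beta_cefr ~ {sd:.2f}, ratio to C1 = {ratio:.2f}")
```

The ratios between levels should land close to the ratios of the hard-coded multipliers (about 1/3 for A1 versus C1), which is one way to see that the level-specific spread here is specified up front and not estimated.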
