Per subject mean centered X and y for fixed effects model

Hi,

My goal is to replicate a causal model from the book *The Effect*, namely the fixed effects model. Does anyone have experience with this model?

The model is a fixed effects model that simply regresses per-subject mean-centered X and y variables. Centering per subject requires an intercept per subject; that far I got. Centering the X variables within the model is where the challenge lies. A progression to a hierarchical model is possible; for now I am looking to just replicate the per-subject de-meaned model, if that is possible.

My current model set up can be found below. I also attached a data file. My understanding is that there are some extra precautions necessary in this setting with regards to de-meaning X.

Cheers,

Sanne

[exercise.csv|attachment](upload://c3Ei7PvGFu7GHA4S8nA5o9HSCoh.csv) (3.6 KB)

import polars as pl
import pymc as pm
import numpy as np


# Data
exercise_raw = (pl
                .read_csv('data/exercise.csv')
                .with_columns(pl.col('IndNo').cast(pl.Int32)))

X = (exercise_raw
     .select('IndNo', 'IntensityOfReminders')
     .with_columns(IndNo=pl.col('IndNo') - pl.lit(1))
     .to_numpy()
     )

y = (exercise_raw
     .select('HealthyEatingScore')
     .to_numpy()
     .flatten()
    )

# Define coordinate values for all dimensions of the data
coords={
    "obs": range(y.shape[0]),
    "individuals": np.unique(X[:,0]),
    "features": ["IntensityOfReminders"]
}

# Define generative model
with pm.Model(coords=coords) as generative_model_2:
    individuals = pm.Data("individuals", X[:, 0].astype(int), dims=["obs"])
    X_data = pm.Data("X", X[:, 1].reshape(-1, 1), dims=["obs", "features"])

    # Model parameters
    alphas = pm.HalfNormal("alphas", sigma=5, dims="individuals")
    betas = pm.Normal("betas", mu=0, sigma=5, dims="features")
    sigma = pm.HalfNormal("sigma", sigma=5)

    # Linear model: per-subject intercept plus a shared slope
    mu = alphas[individuals] + X_data @ betas

    # Likelihood
    likelihood = pm.Normal("outcome", mu, sigma, dims=["obs"])

# Infer parameters conditioned on observed data
with pm.observe(generative_model_2, {"outcome": y }) as inference_model:
    idata = pm.sample(random_seed=1)
    summary = pm.stats.summary(idata, var_names=["alphas", "betas", "sigma"])
    print(summary)

This looks great to me. What were you thinking was wrong?

Hi Jesse,

Thank you for the heads up. I read up on the fixed effects model, and perhaps got a bit dazzled by this blog, which progresses to multi-level models, in combination with the de-meaning approach in *The Effect*.

My model indeed achieves a shared beta, and with an individual intercept term per subject, X indeed does not have to be centered. Mission accomplished!

Cheers,

Sanne

The centering of the X's is the so-called within estimator. It's used in econometrics software to skip estimating the fixed effects via OLS (they're usually just nuisance parameters anyway, and encoding them as dummies blows up the size of the design matrix). But if you include group-level intercepts, you don't have to do this (this is an application of the Frisch-Waugh-Lovell theorem). You could also do the de-meaning in PyMC if you were so inclined. In that case you would estimate only a single intercept, but you would lose all uncertainty estimates related to the group sample means. So you shouldn't do it, except to convince yourself that the math works out as promised.
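That equivalence is easy to check numerically. A sketch with simulated data (nothing from the attached file): the OLS slope from per-group dummy intercepts matches the within-estimator slope, as Frisch-Waugh-Lovell promises:

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_per = 5, 20
g = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=g.size) + g                    # X correlated with group
y = 2.0 * x + 3.0 * g + rng.normal(size=g.size)    # group-specific intercepts

# (1) Dummy-variable OLS: slope plus one intercept per group
D = np.column_stack([x, np.eye(n_groups)[g]])
beta_dummy = np.linalg.lstsq(D, y, rcond=None)[0][0]

# (2) Within estimator: de-mean x and y per group, regress with no intercept
def demean(v):
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]

xw, yw = demean(x), demean(y)
beta_within = (xw @ yw) / (xw @ xw)

# FWL: both recover the same slope estimate
assert np.isclose(beta_dummy, beta_within)
```

The dummy-encoded design matrix `D` is what blows up with many groups; the within transformation gets the same slope from a single column.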

For what it's worth, I strongly recommend this presentation on hierarchical models, which goes into all of this. Basically, the fixed effects model is a limiting case of a hierarchical model, and if you do Bayes in a PPL you never have to think about this horrible terminology ever again :slight_smile:

Thank you for the additional information; I did catch up with McElreath earlier.

The difference in outcome is considerable between the centered (both X and y) and the uncentered solution in PyMC; this is what got me doubting. The centered solution has a 94% HDI of 0.042 to 1.4; the uncentered one is 0.047 to 0.69. Centering upfront throws away information, though. The impact is quite big in this case, but then there are only a handful of observations.

There might also be something going on with prior scaling. Do you still see this difference if you center and scale the feature column first?

At any rate, my intuition (based on nothing at all) is that only the estimated mean of the beta should match between the two cases, but I could be wrong.

Thanks for your additional input. I checked the priors and normalized the y's; the difference is still there. I put it down to the small sample; it was a toy example to start with. Your ideas set me ahead: thank you!
