Understanding how dims='team' is connected to the dataframe in the Rugby example

Hello,

While trying to understand dims and coords, I am currently working through the Rugby example here

The model used in the article is the following:

import pymc as pm
import pytensor.tensor as pt

with pm.Model(coords=coords) as model:
    # constant data
    home_team = pm.ConstantData("home_team", home_idx, dims="match")
    away_team = pm.ConstantData("away_team", away_idx, dims="match")

    # global model parameters
    home = pm.Normal("home", mu=0, sigma=1)
    sd_att = pm.HalfNormal("sd_att", sigma=2)
    sd_def = pm.HalfNormal("sd_def", sigma=2)
    intercept = pm.Normal("intercept", mu=3, sigma=1)

    # team-specific model parameters
    atts_star = pm.Normal("atts_star", mu=0, sigma=sd_att, dims="team")
    defs_star = pm.Normal("defs_star", mu=0, sigma=sd_def, dims="team")

    atts = pm.Deterministic("atts", atts_star - pt.mean(atts_star), dims="team")
    defs = pm.Deterministic("defs", defs_star - pt.mean(defs_star), dims="team")
    home_theta = pt.exp(intercept + home + atts[home_idx] + defs[away_idx])
    away_theta = pt.exp(intercept + atts[away_idx] + defs[home_idx])

    # likelihood of observed data
    home_points = pm.Poisson(
        "home_points",
        mu=home_theta,
        observed=df_all["home_score"],
        dims=("match"),
    )
    away_points = pm.Poisson(
        "away_points",
        mu=away_theta,
        observed=df_all["away_score"],
        dims=("match"),
    )
    trace = pm.sample(1000, tune=1500, cores=4)

I can see that this model uses ‘match’ and ‘team’ dimensions.

However, ‘match’ and ‘team’ do not appear in the dataframe (I would expect them to be columns).

Can you please elaborate on how the ‘match’ and ‘team’ dimensions are connected to the dataset during sampling? Should they be?

Thank you in advance!


The “team” dimension and associated coordinates are created in the cell above:

home_idx, teams = pd.factorize(df_all["home_team"], sort=True)
away_idx, _ = pd.factorize(df_all["away_team"], sort=True)
coords = {"team": teams}
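To see what that cell produces, here is a minimal sketch on a made-up four-match table. The dataframe `df_toy` and its team names are hypothetical stand-ins for `df_all`; only the `pd.factorize` calls mirror the notebook. Note that the two index arrays agree on which integer means which team only because both columns contain the same set of teams.

```python
import pandas as pd

# Hypothetical toy data standing in for df_all (one row per match)
df_toy = pd.DataFrame(
    {
        "home_team": ["Wales", "France", "Ireland", "Wales"],
        "away_team": ["France", "Ireland", "Wales", "Ireland"],
    }
)

# factorize returns integer codes (one per row) and the sorted unique labels
home_idx, teams = pd.factorize(df_toy["home_team"], sort=True)
away_idx, _ = pd.factorize(df_toy["away_team"], sort=True)

# The unique labels become the coordinates of the "team" dimension
coords = {"team": teams}

print(list(teams))        # ['France', 'Ireland', 'Wales']
print(home_idx.tolist())  # [2, 0, 1, 2]
print(away_idx.tolist())  # [0, 1, 2, 1]
```

So “team” is not a column itself: it is the set of unique labels taken from the `home_team` column, and `home_idx` / `away_idx` say which of those labels each match row refers to.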

The “match” dimension is specified with the data:

home_team = pm.ConstantData("home_team", home_idx, dims="match")
away_team = pm.ConstantData("away_team", away_idx, dims="match")

This just names the dimension “match” to indicate that each value in home_team and each value in away_team is taken from a different match.
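The link between the two dimensions is the integer indexing in lines such as `atts[home_idx]`. A small NumPy sketch of that step, with made-up numbers (the values in `atts` are hypothetical, not fitted results), shows a team-length vector being expanded to a match-length vector:

```python
import numpy as np

teams = ["France", "Ireland", "Wales"]  # "team" coordinates, length 3
home_idx = np.array([2, 0, 1, 2])       # one team index per match, length 4

# Hypothetical per-team attack strengths, one value per entry in `teams`
atts = np.array([0.1, -0.3, 0.2])

# Indexing with home_idx picks, for every match, the value of its home team
atts_per_match = atts[home_idx]

print(atts_per_match.shape)     # (4,)
print(atts_per_match.tolist())  # [0.2, 0.1, -0.3, 0.2]
```

In other words, neither dimension needs to be a dataframe column. The dimension names are labels attached to array axes, and the index arrays built from the dataframe are what tie each match to its teams.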


Hello, @cluhmann!
Thanks a lot for your response, it’s clear now. Please excuse my late reply =)