Rewriting Likelihood with Potential Causes the Gradient to Crash

I am just starting with PyMC3. As a first step, I would like to modify the example here: A Hierarchical model for Rugby prediction — PyMC3 3.11.2 documentation

I stripped the model down to its backbone in these lines of code:

df_all = pd.read_csv(pm.get_data("rugby.csv"), index_col=0)
df = df_all[["home_team", "away_team", "home_score", "away_score"]]
teams = df.home_team.unique()
teams = pd.DataFrame(teams, columns=["team"])
teams["i"] = teams.index

df = pd.merge(df, teams, left_on="home_team", right_on="team", how="left")
df = df.rename(columns={"i": "i_home"}).drop("team", axis=1)
df = pd.merge(df, teams, left_on="away_team", right_on="team", how="left")
df = df.rename(columns={"i": "i_away"}).drop("team", axis=1)

observed_home_goals = df.home_score.values
observed_away_goals = df.away_score.values

home_team = df.i_home.values
away_team = df.i_away.values

num_teams = len(df.i_home.drop_duplicates())
num_games = len(home_team)

with pm.Model() as model:
    # global model parameters
    home = pm.Flat("home")

    # team-specific model parameters
    atts_star = pm.Normal("atts_star", mu=0, sigma=10, shape=num_teams)

    # likelihood of observed data
    home_points = pm.Poisson("home_goals", mu=tt.exp(home + atts_star[home_team]), observed=observed_home_goals)
    away_points = pm.Poisson("away_goals", mu=tt.exp(atts_star[away_team]), observed=observed_away_goals)

with model:
    trace = pm.sample(4000, chains=3, tune=1000, return_inferencedata=True, target_accept=0.85, random_seed=123456)

This works fine, but to do what I want I will eventually have to add a Potential term to the model (I think). As a first step, I simply rewrote the likelihood above as a Potential, but I must be doing something wrong. This is the model I wrote:

with pm.Model() as model:
    # global model parameters
    home = pm.Flat("home")

    # team-specific model parameters
    atts_star = pm.Normal("atts_star", mu=0, sigma=10, shape=num_teams)

    # likelihood of observed data (?)
    overall = pm.Potential(
        "result",
        pm.Poisson.dist(mu=tt.exp(home + atts_star[home_team])).logp(observed_home_goals)
        * pm.Poisson.dist(mu=tt.exp(atts_star[away_team])).logp(observed_away_goals),
    )
  
with model:
    trace = pm.sample(4000, chains=3, tune=1000, return_inferencedata=True, target_accept=0.85, random_seed=123456)

When I try to sample from the model, I get the following error:

ValueError: Mass matrix contains zeros on the diagonal. 
The derivative of RV `atts_star`.ravel()[0] is zero.
The derivative of RV `atts_star`.ravel()[2] is zero.

I understand from this post that the way I wrote the Potential term causes a problem in the gradient, but I cannot really understand why, or what I should do instead. Any help understanding the issue is truly appreciated.

The issue may be due to the product of the logps from the Poissons instead of their sum. You should also consider using a weakly informative prior instead of a flat one. Flat priors are generally not recommended; we are in the process of updating the documentation to current best practices, see for example rugby analytics · Issue #49 · pymc-devs/pymc-examples · GitHub

@OriolAbril thanks for your answer.

The issue may be due to the product of the logps from the Poissons instead of their sum

Yes, indeed. What I really wanted was the sum of the log-likelihoods, not the product, which may have caused an overflow that eventually produced the error. If I rewrite the model with the sum, it works. As I suspected, the issue was trivial, but there are a lot of moving pieces for a beginner. Thanks again for having the patience to point this out!
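To make the point concrete for other beginners, here is a minimal illustration with made-up observations and rates, using scipy.stats instead of PyMC3: the joint log-likelihood is the sum of the per-observation logps (equivalently, the log of the product of the likelihoods), while the product of the logps is not a log-likelihood at all.

```python
import numpy as np
from scipy.stats import poisson

# Toy observed counts and Poisson rates (made-up numbers)
obs = np.array([3, 1])
mu = np.array([2.5, 1.2])

logps = poisson.logpmf(obs, mu)

# The joint log-likelihood is the SUM of the per-observation logps,
# which equals the log of the PRODUCT of the likelihoods:
joint_logp = logps.sum()
assert np.isclose(joint_logp, np.log(np.prod(poisson.pmf(obs, mu))))

# Multiplying the logps instead is not a log-likelihood: each logp is
# negative, so the product of two of them comes out positive, and the
# quantity no longer ranks parameter values the way the likelihood does.
bad = np.prod(logps)
assert bad > 0
```

So in the Potential term, the two `.logp(...)` expressions should be added (or wrapped in `tt.sum`), not multiplied.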

You should also consider using a weakly informative prior instead of a flat one

Once I have the full model up and running, I will experiment with different priors. Thanks for the heads-up!
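For intuition on why the prior matters to a gradient-based sampler like NUTS, here is a toy check with scipy (not PyMC3 code): a weakly informative Normal prior contributes a nonzero gradient that pulls the sampler toward plausible values, while a flat prior's log-density is constant, so it contributes nothing.

```python
import numpy as np
from scipy.stats import norm

# Gradient of a standard Normal log-prior, estimated by a
# central finite difference (toy sketch, made-up values).
def grad_logpdf(x, h=1e-6):
    return (norm.logpdf(x + h) - norm.logpdf(x - h)) / (2 * h)

for x in [-2.0, 0.5, 3.0]:
    # For Normal(0, 1), d/dx log p(x) = -x: the prior always
    # pushes the parameter back toward the bulk of its mass.
    assert np.isclose(grad_logpdf(x), -x, atol=1e-4)

# A flat prior has constant log-density, so its gradient is zero
# everywhere and adds no information to the sampler.
```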
