Hello,
Expanding on the example of a hierarchical model for rugby prediction, I ran into an issue.
After sampling from the posterior over the teams' parameters, I now want to estimate the outcome probabilities of a new game.
One way would be to draw random indices from the trace, pick the team indices of the match I want to simulate, and sample away, as in:
n = samples * size
results = np.zeros(n)
# trace, samples, size, home_team_idx and away_team_idx come from the fitted hierarchical model
for i, idx in enumerate(np.random.randint(0, len(trace), samples)):
    with pm.Model() as pred_model:
        # plug one posterior draw into the predictive model
        home = trace["home"][idx]
        intercept = trace["intercept"][idx]
        atts = trace["atts"][idx]
        defs = trace["defs"][idx]
        home_theta = np.exp(intercept + home + atts[home_team_idx] + defs[away_team_idx])
        away_theta = np.exp(intercept + atts[away_team_idx] + defs[home_team_idx])
        home_goals = pm.Poisson('home_goals', mu=home_theta)
        away_goals = pm.Poisson('away_goals', mu=away_theta)
    with pred_model:
        pred_trace = pm.sample(size, progressbar=True)
    results[i*size:(i+1)*size] = pred_trace["home_goals"] - pred_trace["away_goals"]
prob_home = (results > 0).sum() / n
prob_draw = (results == 0).sum() / n
prob_away = (results < 0).sum() / n
However, I want to use a sufficiently large number of samples from the posterior, and for each sample a model has to be built and compiled, which is relatively slow.
I thought I'd be smart and build only one model, putting all the sampled parameters from the posterior into vector-valued Poisson variables via the shape argument, like:
random_idx = np.random.randint(0, len(trace), samples)
with pm.Model() as pred_model:
    # one vector entry per retained posterior draw
    home = trace["home"][random_idx]
    intercept = trace["intercept"][random_idx]
    atts = trace["atts"][random_idx]
    defs = trace["defs"][random_idx]
    home_theta = np.exp(intercept + home
                        + atts[:, home_team_idx] + defs[:, away_team_idx])
    away_theta = np.exp(intercept
                        + atts[:, away_team_idx] + defs[:, home_team_idx])
    home_goals = pm.Poisson('home_goals', mu=home_theta, shape=samples)
    away_goals = pm.Poisson('away_goals', mu=away_theta, shape=samples)
with pred_model:
    pred_trace = pm.sample(size, progressbar=True)
And then get the estimated probabilities through
sampled_home_goals = pred_trace["home_goals"].flatten()
sampled_away_goals = pred_trace["away_goals"].flatten()
results = sampled_home_goals - sampled_away_goals
prob_home = (results > 0).sum() / (samples * size)
prob_draw = (results == 0).sum() / (samples * size)
prob_away = (results < 0).sum() / (samples * size)
The last step isn't particularly elegant, but it should do the job. The model compiles and samples, but the results are way off from what the slower iterative approach produces. For a small number of sampled indices from the trace, say 5, the results are still sensible, but not much beyond that.
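For comparison, the quantity I'm after can presumably also be computed without a second PyMC model at all, by drawing the Poisson counts directly with NumPy from the posterior thetas. A rough, untested sketch of what I mean (reusing trace, samples, size and the team indices from above), which is what I'd expect both approaches to agree with:

import numpy as np

idx = np.random.randint(0, len(trace), samples)
home_theta = np.exp(trace["intercept"][idx] + trace["home"][idx]
                    + trace["atts"][idx][:, home_team_idx]
                    + trace["defs"][idx][:, away_team_idx])
away_theta = np.exp(trace["intercept"][idx]
                    + trace["atts"][idx][:, away_team_idx]
                    + trace["defs"][idx][:, home_team_idx])

# size Poisson draws per retained posterior sample, broadcast over the theta vectors
home_goals = np.random.poisson(home_theta, size=(size, samples))
away_goals = np.random.poisson(away_theta, size=(size, samples))
diff = home_goals - away_goals

prob_home = (diff > 0).mean()
prob_draw = (diff == 0).mean()
prob_away = (diff < 0).mean()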
I'm not sure whether the discrepancy is a bug, or whether there is some check on the shape in the Metropolis sampler that reduces precision, but something doesn't seem quite right. Additionally, I was wondering why the iterative recompilation of my model is so slow; the total time does not appear to scale linearly with the number of iterations.
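On the recompilation question: the only workaround I could think of is to build the predictive model once around theano.shared containers for the posterior parameters and just update their values inside the loop. A rough, untested sketch of that idea (I'm not sure how much work pm.sample still redoes on every call):

import theano

# shared containers, initialised with the first posterior draw
home_s = theano.shared(trace["home"][0])
intercept_s = theano.shared(trace["intercept"][0])
atts_s = theano.shared(trace["atts"][0])
defs_s = theano.shared(trace["defs"][0])

with pm.Model() as pred_model:
    home_theta = pm.math.exp(intercept_s + home_s
                             + atts_s[home_team_idx] + defs_s[away_team_idx])
    away_theta = pm.math.exp(intercept_s
                             + atts_s[away_team_idx] + defs_s[home_team_idx])
    home_goals = pm.Poisson('home_goals', mu=home_theta)
    away_goals = pm.Poisson('away_goals', mu=away_theta)

results = np.zeros(samples * size)
for i, idx in enumerate(np.random.randint(0, len(trace), samples)):
    # swap in a new posterior draw without rebuilding the model graph
    home_s.set_value(trace["home"][idx])
    intercept_s.set_value(trace["intercept"][idx])
    atts_s.set_value(trace["atts"][idx])
    defs_s.set_value(trace["defs"][idx])
    with pred_model:
        pred_trace = pm.sample(size, progressbar=False)
    results[i*size:(i+1)*size] = pred_trace["home_goals"] - pred_trace["away_goals"]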
Thank you very much!