Using a single dimensional distribution to initialize parameters for a multi dimensional distribtution

kiernan9811 · May 30, 2022, 3:13am

Hi, essentially what I am trying to do is the question in the title, for reference here is the problem.

I want to model hockey performance (Credit/inspiration to the rugby tutorial) and I want to make it hierarchical over the seasons. Basically I want to say that a team has an overall attack and defense score (modeled as a normal distribution around zero). I then want to use the teams overall attack and defense scores to initialize a new normal distribution for every season, representing the attack and defense scores of a team for that year.

Quick example is if there is 31 teams and 2 seasons we have a (31,) for the overall attack score then I want a per season attack score of (31, 2). Is there a way that I can do this I get a shape error when I try this.

Any help is really appreciated, thanks!

teams_dims = np.arange(N_TEAMS)
season_dims = np.arange(len(unique_seasons))
coords = {'teams':teams_dims, 'seasons':season_dims}
model = pm.Model(coords=coords)
with model:
    # global parameters
    intercept = pm.Normal('intercept', mu=0, sigma=5)
    
    attack_sigma = pm.HalfNormal('attack_sigma', sigma=2)
    defense_sigma = pm.HalfNormal('defense_sigma', sigma=2)
    
    season_attack_sigma = pm.HalfNormal('season_attack_sigma', sigma=2)
    season_defense_sigma = pm.HalfNormal('season_defense_sigma', sigma=2)
    
    # team specific parameters
    team_attack = pm.Normal('team_attack', mu=0, sigma=attack_sigma, dims='teams')
    team_defense = pm.Normal('team_defense', mu=0, sigma=defense_sigma, dims='teams')
        
    # sesason specific parameters
    team_attack_season = pm.Normal('team_attack_season', mu=team_attack, sigma=season_attack_sigma, dims=('teams','seasons'))
    team_defense_season = pm.Normal('team_defense_season', mu=team_defense, sigma=season_defense_sigma, dims=('teams','seasons'))
    
    # team specific values    
    home_theta = tt.exp(intercept + team_attack_season[home_idx_tensor, season_idx_tensor] - team_defense_season[home_idx_tensor, season_idx_tensor])
    away_theta = tt.exp(intercept + team_attack_season[home_idx_tensor, season_idx_tensor] - team_defense_season[home_idx_tensor, season_idx_tensor])
    
    # model the goal distribution of the given teams
    home_goals = pm.Poisson('home_goals', mu=home_theta, observed=home_goals_tensor)
    away_goals = pm.Poisson('away_goals', mu=away_theta, observed=away_goals_tensor)

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_11968\3492909181.py in <module>
     18 
     19     # sesason specific parameters
---> 20     team_attack_season = pm.Normal('team_attack_season', mu=team_attack, sigma=season_attack_sigma, dims=('teams','seasons'))
     21     team_defense_season = pm.Normal('team_defense_season', mu=team_defense, sigma=season_defense_sigma, dims=('teams','seasons'))
     22 

~\.conda\envs\pymc\lib\site-packages\pymc3\distributions\distribution.py in __new__(cls, name, *args, **kwargs)
    123         else:
    124             dist = cls.dist(*args, **kwargs)
--> 125         return model.Var(name, dist, data, total_size, dims=dims)
    126 
    127     def __getnewargs__(self):

~\.conda\envs\pymc\lib\site-packages\pymc3\model.py in Var(self, name, dist, data, total_size, dims)
   1136             if getattr(dist, "transform", None) is None:
   1137                 with self:
-> 1138                     var = FreeRV(name=name, distribution=dist, total_size=total_size, model=self)
   1139                 self.free_RVs.append(var)
   1140             else:

~\.conda\envs\pymc\lib\site-packages\pymc3\model.py in __init__(self, type, owner, index, name, distribution, total_size, model)
   1667             self.distribution = distribution
   1668             self.tag.test_value = (
-> 1669                 np.ones(distribution.shape, distribution.dtype) * distribution.default()
   1670             )
   1671             self.logp_elemwiset = distribution.logp(self)

ValueError: operands could not be broadcast together with shapes (31,2) (31,)

jessegrabowski · May 30, 2022, 6:29am

Looks like it’s being fussy about the shape of the team_attack and team_defense. It wants these to be column vectors so it can broadcast them across season_attack_sigma. Try:

    team_attack_season = pm.Normal('team_attack_season', mu=team_attack[:, None], sigma=season_attack_sigma, dims=('teams','seasons'))
    team_defense_season = pm.Normal('team_defense_season', mu=team_defense[:, None], sigma=season_defense_sigma, dims=('teams','seasons'))

The [:, None] adds a new dimension to the array, so it becomes N_teams x 1 and the broadcasting works.

For the record it seemed to work as you wrote it in PyMC 4.

Topic		Replies	Views
Help with Hierarchical Model Example Questions	3	499	October 5, 2021
Qualitatively different results depending on shape parameter Questions	1	417	October 21, 2017
Model posteriors are stuck at 0. Need help debugging Questions	3	447	March 28, 2021
Using pm.MvNormal for setting prior distribution Questions	6	964	October 8, 2020
Vector hyperparameter for multivariate normal v5	1	104	May 3, 2024

Using a single dimensional distribution to initialize parameters for a multi dimensional distribtution

Related topics