Alright folks, I could use some help on a project. I want to predict the number of kills for League of Legends players in a match using a multinomial, effectively divvying up the total team kills among the players. To start, I wanted to build a model with a hierarchical player effect. The challenge is that, while the number of players on a team is constant (5), the identity of those players changes from match to match. The closest example I could find was the GitHub issue "NUTS won't sample in Hierarchical multinomial" (Issue #3166, pymc-devs/pymc). But when I try to implement a similar approach, i.e.
blue_team_player_kills_cols = [f'Blue_player_{i}_kills' for i in range(1,6)]
blue_team_player_idx_cols = [f'Blue_player_{i}_idx' for i in range(1,6)]
blue_side_df = df_pivot[blue_team_player_idx_cols + blue_team_player_kills_cols]
with pm.Model(coords=player_coords) as hierarchical_model:
    # (match, 5) array of player indices for the blue side
    player_idx = pm.MutableData(
        "player_idx",
        blue_side_df[blue_team_player_idx_cols].values,
        dims=["match", "player_per_match"],
    )

    # Hyperpriors for group nodes
    mu_a = pm.Dirichlet("alpha", a=np.ones(len(blue_side_df)), dims="match")
    mu_a_inv = pm.math.softmax(mu_a)

    a = pm.MvNormal(
        "a",
        mu=mu_a_inv,
        cov=np.eye(len(blue_side_df)),
        dims=["match", "player_per_match"],
    )
    p = pm.math.softmax(a[player_idx])

    # Data likelihood
    y_obs = pm.Multinomial(
        "y_obs",
        n=blue_side_df[blue_team_player_kills_cols].sum(axis=1),
        p=p,
        observed=blue_side_df[blue_team_player_kills_cols],
    )
I get a shape error when trying to run sample_prior_predictive():
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (34770,3477) and requested shape (3477,3477)
I’ve read a lot of @AlexAndorra's work on elections using a hierarchical multinomial, but it looks like in that case the identity of the groups (i.e., the parties) is constant across observations, and whenever a party did not participate, its vote count was set to 0. In theory you could do the same here, with one column per player in the whole pool, but it seems like a bad choice: most players won't play in most games, so the model would learn that a player's kill count is most often 0. And in future predictions we’d have to set most players' kills to 0 anyway (because they aren’t playing).
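To make concrete what I'm after instead of zero-padding, here is a rough NumPy sketch (not PyMC) of the indexing idea: one effect per player in the pool, looked up per match, then a softmax across just the 5 participants. The pool size, index array, and kill totals are made-up numbers for illustration, and `player_effect` is a hypothetical stand-in for whatever the hierarchical prior would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

n_players = 8  # hypothetical size of the full player pool

# One effect per player in the pool (stand-in for a draw from the prior).
player_effect = rng.normal(0.0, 1.0, size=n_players)

# For each match, the pool indices of the 5 players who actually played
# (made-up values; in my data these would be the Blue_player_{i}_idx columns).
player_idx = np.array([
    [0, 1, 2, 3, 4],
    [2, 3, 5, 6, 7],
    [0, 4, 5, 6, 7],
])

# Fancy indexing picks out only the participants: shape (n_matches, 5).
logits = player_effect[player_idx]

# Softmax across the 5 players within each match (last axis), not across matches.
logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
p = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# Split each match's total team kills among its 5 players.
team_kills = np.array([12, 20, 7])  # made-up totals
kills = np.array([rng.multinomial(n, row) for n, row in zip(team_kills, p)])

print(p.shape, kills.shape)
```

If this is the right idea, I'd guess the PyMC version would give `a` a single "player" dimension and use something like `pm.math.softmax(a[player_idx], axis=-1)`, but that is an assumption on my part and exactly the bit I can't get the shapes to work for.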
Any help is much appreciated. I’m super stuck here.
multinomial_test_data.csv (99.9 KB)