Hello,
I’m having some issues with dimensionality in a multinomial softmax regression I have implemented to predict which player in a football team will score a given goal.
I have successfully implemented the model with several numerical predictors (e.g. number of minutes played, previous average form); however, I am having difficulties with categorical predictors, specifically the position each player plays.
The observations (goals scored) have shape (number of matches x number of players),
e.g. [[0,0,0,1,2,0,0],[0,0,0,1,0,0,0],[0,2,0,1,2,0,0]]
where each row is a match and each column holds the number of goals that player scored in that match.
The predictor data is in the same format:
e.g. [[GK,FW,MF,DF,MF,FW,DF],[GK,FW,FW,DF,MF,FW,DF],[GK,FW,MF,MF,MF,FW,MF]] (also numerically encoded with an index in a separate array)
where each row is a match and each column gives that player’s playing position in that match.
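To make the layout concrete, here is a minimal sketch of the index encoding mentioned above, assuming NumPy and the example labels from this post. The names `positions`, `labels` and `pos_idx` are illustrative only, and the alphabetical coding that `np.unique` produces is just one possible choice.

```python
import numpy as np

# Example position labels: one row per match, one column per player
positions = np.array([
    ["GK", "FW", "MF", "DF", "MF", "FW", "DF"],
    ["GK", "FW", "FW", "DF", "MF", "FW", "DF"],
    ["GK", "FW", "MF", "MF", "MF", "FW", "MF"],
])

# np.unique returns the sorted distinct labels and, for every cell,
# the index of its label; the reshape keeps the (match, player) layout
labels, inverse = np.unique(positions, return_inverse=True)
pos_idx = inverse.reshape(positions.shape)

print(labels)
print(pos_idx)
```

Indexing the label array with `pos_idx` recovers the original labels, so the two arrays together carry the same information as `positions`.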
Previously I set this up by creating a separate boolean array for each position (e.g. a midfield array that is 1 where the position is MF and 0 otherwise) and used the following model in pymc successfully:
import pymc as pm

with pm.Model() as pos_model:
    delta_fw = pm.Normal('delta_fw')
    delta_dc = pm.Normal('delta_dc')
    delta_sub = pm.Normal('delta_sub')  # note: not used in mu_xg below
    delta_amc = pm.Normal('delta_amc')
    delta_fb = pm.Normal('delta_fb')
    delta_dm = pm.Normal('delta_dm')
    delta_mid = pm.Normal('delta_mid')
    # Linear predictor: each boolean position array times its coefficient
    mu_xg = (
        pm.math.dot(fw_arr, delta_fw)
        + pm.math.dot(amc_arr, delta_amc)
        + pm.math.dot(fb_arr, delta_fb)
        + pm.math.dot(dm_arr, delta_dm)
        + pm.math.dot(mid_arr, delta_mid)
        + pm.math.dot(dc_arr, delta_dc)
    )
    p_xg = pm.Deterministic('p_xg', pm.math.softmax(mu_xg, axis=1))
    counts_xg = pm.Multinomial("counts_xg", n=ttens, p=p_xg, shape=(n, k), observed=gls_scored_arr)
    trace = pm.sample(4000, chains=2)
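For context, the boolean arrays described above can be derived from an integer-encoded position array, and the sum of array-times-coefficient terms gives the same linear predictor as indexing a single coefficient vector. The sketch below checks this in plain NumPy; the integer coding, `masks`, and the coefficient values are made up for illustration.

```python
import numpy as np

# Integer-encoded positions (hypothetical coding: 0=DF, 1=FW, 2=GK, 3=MF),
# one row per match and one column per player
pos_idx = np.array([
    [2, 1, 3, 0, 3, 1, 0],
    [2, 1, 1, 0, 3, 1, 0],
    [2, 1, 3, 3, 3, 1, 3],
])
n_pos = 4

# One 0/1 array per position, as in the separate-array approach
masks = [(pos_idx == c).astype(float) for c in range(n_pos)]

# Made-up coefficient values standing in for the sampled deltas
delta = np.array([-0.5, 1.2, -3.0, 0.4])

# Linear predictor built as a sum of mask * coefficient terms
mu_from_masks = sum(mask * d for mask, d in zip(masks, delta))

# Same linear predictor from one vector indexed by position
mu_from_index = delta[pos_idx]

print(np.allclose(mu_from_masks, mu_from_index))  # prints True
```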
However, as I need to increase the complexity of the model, I would like to learn how to use coords and dims so that I don’t have to split the data into a separate array for each position.
E.g. delta = pm.Normal('delta', shape=(nclass))
I’ve searched the Discourse and can’t find anything similar that works, and I have read as much as I can on coords and dims in pymc to no avail, so any help on how to do this would be greatly appreciated!