Hi all,
I’ve been trying to build some pymc3 models for work and for my hobbies.
Currently, I’m working on a skill estimator for a board game (Terraforming Mars, for the curious), and I’m running into convergence (/“anchoring”) issues as well as modelling questions.
Unlike in most games, we cannot simply rely on the total points a player gathers, because the duration of the game varies a lot from game to game and is not necessarily related to the players’ skill levels.
The tournament consists of 3 rounds, with each match seating 3 to 4 players, and I intend to work with the players’ pairwise performance differences.
E.g., a match has 4 players (A, B, C, D) with scores 54, 63, 65, 49. The pairwise performance differences are then: A-B = -9, A-C = -11, A-D = 5, B-C = -2, B-D = 14, C-D = 16.
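For concreteness, those differences are just every unordered pair taken once, e.g.:

from itertools import combinations

scores = {'A': 54, 'B': 63, 'C': 65, 'D': 49}
for p, q in combinations(scores, 2):
    print(f'{p}-{q} = {scores[p] - scores[q]}')
# A-B = -9, A-C = -11, A-D = 5, B-C = -2, B-D = 14, C-D = 16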
My current approach is to model each player’s skill, their performance, and the pairwise differences as follows.
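Spelled out, the generative model is:

skill_p ~ Normal(0, 10)  for each player p
performance_p ~ Normal(skill_p, 5)
diff_{p,q} ~ Normal(performance_p - performance_q, 10),  observed for every pair (p, q) seated at the same table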
The associated code, with the realised performances and the table assignments for just two rounds:
import numpy as np

# Realised scores: 2 rounds x 8 tables x 4 players per table
realised_performance = np.array([[[68, 61, 62, 69],
                                  [65, 85, 95, 72],
                                  [80, 69, 71, 59],
                                  [70, 74, 55, 68],
                                  [61, 79, 60, 63],
                                  [81, 66, 73, 73],
                                  [70, 98, 78, 81],
                                  [66, 49, 61, 54]],
                                 [[58, 92, 80, 81],
                                  [56, 74, 63, 66],
                                  [86, 77, 92, 86],
                                  [67, 64, 54, 56],
                                  [72, 74, 57, 81],
                                  [68, 67, 69, 54],
                                  [74, 82, 93, 72],
                                  [59, 75, 91, 81]]])
# Table assignment: which of the 32 players sit at each of the 8 tables, per round
roundassignment = np.array([[[ 0,  1,  2,  3],
                             [ 4,  5,  6,  7],
                             [ 8,  9, 10, 11],
                             [12, 13, 14, 15],
                             [16, 17, 18, 19],
                             [20, 21, 22, 23],
                             [24, 25, 26, 27],
                             [28, 29, 30, 31]],
                            [[ 1, 29,  4, 24],
                             [21, 18, 14, 11],
                             [15, 26, 22, 16],
                             [31,  9,  2,  7],
                             [ 0, 10, 30, 19],
                             [27, 12, 23,  5],
                             [28,  3, 13, 20],
                             [ 8, 17, 25,  6]]])
player = []   # first player of each pair
oppo = []     # second player (the opponent) of each pair
obsdiff = []  # observed score difference for the pair
for round_idx, round_performance in enumerate(realised_performance):
    for table, match in enumerate(round_performance):
        # Seating for this specific table in this round
        players_table = roundassignment[round_idx, table]
        for i, player_in_table in enumerate(players_table):
            for j, oppo_in_table in enumerate(players_table):
                if i >= j:
                    continue  # keep each unordered pair only once
                player.append(player_in_table)
                oppo.append(oppo_in_table)
                obsdiff.append(match[i] - match[j])
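As a quick sanity check on the bookkeeping (2 rounds × 8 tables × 6 pairs per table = 96 observations), and so the indexing in the model operates on arrays rather than lists:

player = np.asarray(player)
oppo = np.asarray(oppo)
obsdiff = np.asarray(obsdiff)
assert obsdiff.shape == (2 * 8 * 6,)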
Then, in pymc3 I modelled it as follows:
import pymc3 as pm

with pm.Model() as model_all:
    # Latent skill per player; performance is a noisy realisation of skill
    skill = pm.Normal('skill', mu=0, sd=10, shape=32)
    performance = pm.Normal('performance', mu=skill, sd=5, shape=32)
    # Expected difference for each observed pair
    rel_dif = performance[player] - performance[oppo]
    dif = pm.Normal('dif', mu=rel_dif, sd=10, observed=obsdiff)
    trace = pm.sample(1000, tune=1000)
I have tried various options, but I often get acceptance-probability and Rhat warnings. The inferred values also end up nearly identical across players (e.g. posterior means all around 5 with an sd of 0.01, or all around 10 with the same sd). I was hoping the skills would reflect something like ‘this player usually scores 5 more points than that player’.
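For reference, this is how I’m reading off the estimates (pm.summary wraps arviz.summary in recent pymc3 versions):

# Posterior summary per player; this is where the near-identical means show up
print(pm.summary(trace, var_names=['skill']))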
I realize that there are a few oddities in my modelling choices: the performances and differences are really integers, but I currently model them as continuous variables. Likewise, I suspect that modelling the difference as a Normal distribution might not be the ideal option.
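One variation I’ve been considering for that last point, to be less sensitive to the occasional blow-out game, is a heavier-tailed Student-t likelihood in place of the Normal (just a sketch; nu=4 is an arbitrary choice):

dif = pm.StudentT('dif', nu=4, mu=rel_dif, sd=10, observed=obsdiff)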
I could definitely use some pointers on:
- How to stabilize the convergence. Should I, for example, anchor the worst player? (A sketch of what I mean by anchoring follows below.)
- Modelling choices for the difference
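To make the anchoring idea concrete, this is roughly what I had in mind: pin one player’s skill at 0 so that only the 31 relative skills are sampled (untested sketch; anchoring player 0 is an arbitrary choice):

import theano.tensor as tt

with pm.Model() as model_anchored:
    skill_free = pm.Normal('skill_free', mu=0, sd=10, shape=31)
    # Player 0 is pinned at skill 0; everyone else is measured relative to them
    skill = tt.concatenate([tt.zeros(1), skill_free])
    performance = pm.Normal('performance', mu=skill, sd=5, shape=32)
    rel_dif = performance[player] - performance[oppo]
    dif = pm.Normal('dif', mu=rel_dif, sd=10, observed=obsdiff)
    trace = pm.sample(1000, tune=1000, target_accept=0.9)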
Cheers