Winter Olympics medal counts

Hi all,

I built a model for Winter Olympics medal counts. I used a random walk with momentum for team strength, a Bernoulli barrier to determine which teams will win at least one medal, and a Dirichlet-multinomial to allocate medals among the winners. I would appreciate any feedback, and I hope that you find this interesting! This is my first time working with sports data.
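For readers who want to see how the three pieces fit together, here is a minimal simulation sketch. It is not the model from the linked article: the function names, the momentum and noise values, the logistic link for the barrier, and the softmax-times-concentration form of the Dirichlet parameters are all assumptions made for illustration. It also does not force every team that clears the barrier to end up with at least one medal.

```python
import math
import random


def simulate_strength(n_games, momentum=0.5, sigma=0.3, rng=None):
    """Random walk with momentum: each step carries over part of the last step."""
    rng = rng or random.Random(0)
    level, step = 0.0, 0.0
    path = []
    for _ in range(n_games):
        step = momentum * step + rng.gauss(0.0, sigma)  # assumed AR(1) step
        level += step
        path.append(level)
    return path


def simulate_medals(strengths, total_medals, concentration=20.0, rng=None):
    """Bernoulli barrier, then Dirichlet-multinomial allocation among winners."""
    rng = rng or random.Random(0)
    counts = [0] * len(strengths)

    # Barrier: each team clears it with probability logistic(strength) (assumed link).
    winners = [i for i, s in enumerate(strengths)
               if rng.random() < 1.0 / (1.0 + math.exp(-s))]
    if not winners:
        return counts

    # Dirichlet parameters: softmax of strengths scaled by a concentration (assumed form).
    top = max(strengths[i] for i in winners)
    weights = [math.exp(strengths[i] - top) for i in winners]
    total = sum(weights)
    alpha = [concentration * w / total for w in weights]

    # A Dirichlet draw is a set of independent gamma draws, normalized.
    # random.choices accepts unnormalized weights, so the gammas are used directly.
    shares = [rng.gammavariate(a, 1.0) for a in alpha]

    # Multinomial allocation: each medal goes to one winner.
    for i in rng.choices(winners, weights=shares, k=total_medals):
        counts[i] += 1
    return counts
```

A smaller `concentration` makes the simulated medal shares more variable from one Games to the next, which is the dispersion discussed further down the thread.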

Thanks,

Andrew

https://medium.com/@andrewpwalters/forecasting-winter-olympics-medal-counts-84b2c4a7bfaa


Thanks for sharing. That’s a fun model and I enjoyed the exploratory data analysis.

I’m curious whether the dispersion goes down over time. I think it’d signal the games are getting more diverse.

Did the softmax and separate dispersion help with identifiability? The main problem I find with fitting models with Dirichlets like this is identifiability, which is really challenging if you base it all on an underlying regression and want to keep it symmetric for the sake of assigning priors. A traditional way to identify would be to set one team’s scores to all zero and let others adjust around that.
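To make the identifiability issue concrete, here is a small sketch in plain Python (the helper names are made up for illustration). Softmax is unchanged when the same constant is added to every score, so the scores are only determined up to a shift. Pinning one team to zero removes that freedom; centering the scores to sum to zero is a symmetric alternative.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pin_reference(scores, ref=0):
    """Identify by fixing the reference team's score at zero."""
    return [s - scores[ref] for s in scores]


def center(scores):
    """Symmetric alternative: shift the scores so they sum to zero."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


scores = [1.2, -0.4, 0.7]          # made-up team scores
shifted = [s + 5.0 for s in scores]

# All four parameterizations imply the same medal shares.
for variant in (shifted, pin_reference(scores), center(scores)):
    assert all(math.isclose(a, b) for a, b in zip(softmax(scores), softmax(variant)))
```

The pinned version breaks the symmetry between teams, which is what makes symmetric priors awkward; the centered version keeps the symmetry but constrains the scores jointly.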

I would think country GDP would be a good predictor here because Olympic training is a luxury.

This would also be a nice opportunity for some model comparison like the time series with and without momentum (i.e., first and second-order random walks) or even a static model without time series. The number of time series models is basically endless.
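As a rough sketch of the two candidates, here are textbook first- and second-order random walks in plain Python. The function names and defaults are assumptions, and the momentum model in the article may be parameterized differently. With the noise switched off, the first-order walk stays where it is, while the second-order walk keeps extending its most recent trend.

```python
import random


def rw1(n, sigma, x0=0.0, rng=None):
    """First-order random walk: x[t] = x[t-1] + noise."""
    rng = rng or random.Random(0)
    xs = [x0]
    for _ in range(n - 1):
        xs.append(xs[-1] + rng.gauss(0.0, sigma))
    return xs[:n]


def rw2(n, sigma, x0=0.0, x1=0.0, rng=None):
    """Second-order random walk: x[t] = 2*x[t-1] - x[t-2] + noise."""
    rng = rng or random.Random(0)
    xs = [x0, x1]
    for _ in range(n - 2):
        xs.append(2.0 * xs[-1] - xs[-2] + rng.gauss(0.0, sigma))
    return xs[:n]


# Noise-free behaviour shows the structural difference between the two.
flat = rw1(5, sigma=0.0, x0=1.0)            # stays at 1.0
trend = rw2(5, sigma=0.0, x0=0.0, x1=1.0)   # continues 0, 1, 2, 3, 4
```

A static model corresponds to dropping the time index altogether, so the three options form a natural ladder for comparison.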

Speaking of priors, I didn’t see any discussion of that other than the time series. This seems like a great opportunity to do some hierarchical modeling in space and on other covariates like country GDP (total GDP and/or per capita).

Thanks! I tried GDP and GDP per capita before, when I attempted to model the 2020 Summer Games. There are some interesting effects, but for individual countries they are very persistent between games. You tend to find that very free and very authoritarian countries do well, so there is a “U” shape when using a freedom index. For example, the USSR and North Korea did disproportionately well, as does the USA. Some countries, like India, just don’t care to be competitive in the Olympics. I tried using geographic factors to explain why the Nordic countries do so well in the Winter Olympics, using a regional dummy and also latitude. But that doesn’t explain why the Netherlands does so well while Denmark and France do not.

On whether the games are getting more diverse over time: I think that they aren’t. It’s just that the number of teams competing, the number of athletes, and the number of medals awarded are all scaling up at the same rate. So the leading teams win the same proportion of medals and the additional teams pick up a few at the fringes. Overall, however, the games are becoming more competitive among those leading teams: there is less concentration behind one powerhouse like East Germany or the USA.
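One way to separate those two claims is to track two numbers per Games: the share of medals won by the top few teams, and a concentration index over all teams. The sketch below uses a top-k share and a Herfindahl-style sum of squared shares. Both measures and the medal counts are illustrative assumptions, not figures from the article or from real Games.

```python
def top_k_share(counts, k):
    """Fraction of all medals won by the k best-performing teams."""
    ranked = sorted(counts, reverse=True)
    return sum(ranked[:k]) / sum(counts)


def herfindahl(counts):
    """Sum of squared medal shares: 1.0 means one team wins everything."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Made-up medal tables: one powerhouse versus three evenly matched leaders.
powerhouse = [30, 10, 5, 5]
balanced = [15, 15, 15, 5]

print(top_k_share(powerhouse, 3), top_k_share(balanced, 3))  # 0.9 0.9
print(herfindahl(powerhouse), herfindahl(balanced))
```

In this toy example the top three teams hold 90% of the medals in both tables, but the index falls from about 0.42 to about 0.28, which is the pattern described above: the same share for the leaders, with less dominance by any single one.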

I agree alternative and competing identifications would be a good exercise. Thank you for your response!
