Simulating the World Cup Knockout Stage
The knockout stage of the 2010 FIFA World Cup is about to begin in South Africa. At the time of writing, every team has one group stage match remaining, and most teams still have a chance to finish in the top two places in their group and progress to the knockout stage (see the tournament schedule and group stage standings).
There are different approaches to ranking world football teams. The best known is FIFA’s official world rankings, which are derived from points gained and lost in each match according to a heuristic set of rules that generally reward winning against higher-ranked opponents in more important tournaments.
A simple alternative with a more statistical basis is an Elo rating system (described in more detail below). A handy property of Elo rating systems is that they directly provide an estimate of the probability that a given team will perform better than another. We can use that estimate in Mathematica to set up simulations of the knockout stage of the World Cup, which lets us estimate things like the chance of each team winning the tournament. We’ll also generate some nice visualizations of the results, such as the following simulated knockout stage (based on the current top two teams in each group):
Elo rating systems are used in many other sports and games, including international Chess and Go competitions. The World Football Elo Ratings website (www.eloratings.net)* maintains up-to-date Elo ratings for all national football teams. The following table compares the top 10 national football teams according to the official FIFA rankings and the alternative Elo ratings, showing some significant differences.
An Elo rating system works by assuming that the performance of a team in a match is a random value drawn from a certain probability distribution. The location parameter α of the distribution is called the Elo rating for that team (for the distribution used here, the mean is α plus a constant that is the same for every team, so differences between teams are unaffected). The particular shape of the distribution may be freely chosen, but a common choice is ExtremeValueDistribution[α,400/Log[10]]. This is the distribution assumed for the stats we will use from World Football Elo Ratings (see details of how those ratings are calculated).
Here are the expected performance distributions for Slovakia and Brazil, whose latest Elo ratings (at the time of writing) are 1605 and 2100 respectively. The plot shows that Brazil is expected to perform better usually, but not always.
The assumption that performance follows an extreme value distribution leads directly to the probability that one team outperforms another, as a function of their rating difference.
We can verify this by computing the probability that p1 > p2, where p1 and p2 are the performances of two teams, and r1 and r2 are their Elo ratings.
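For readers who want to see where the closed form comes from, here is a hand derivation. It is a sketch that assumes the two performances are independent and share the scale β = 400/ln 10; the symbols f, F, β and u are notation introduced here, not taken from the notebook.

```latex
\begin{aligned}
F(x; r) &= \exp\!\left(-e^{-(x-r)/\beta}\right), \qquad
f(x; r) = \frac{1}{\beta}\, e^{-(x-r)/\beta} \exp\!\left(-e^{-(x-r)/\beta}\right), \qquad
\beta = \frac{400}{\ln 10} \\[4pt]
P(p_1 > p_2) &= \int_{-\infty}^{\infty} f(x; r_1)\, F(x; r_2)\, dx \\
&= \int_{0}^{\infty} e^{r_1/\beta} \exp\!\left(-u \left(e^{r_1/\beta} + e^{r_2/\beta}\right)\right) du
\qquad \left(u = e^{-x/\beta}\right) \\
&= \frac{e^{r_1/\beta}}{e^{r_1/\beta} + e^{r_2/\beta}}
= \frac{1}{1 + e^{-(r_1 - r_2)/\beta}}
= \frac{1}{1 + 10^{-(r_1 - r_2)/400}}
\end{aligned}
```

The result depends only on the rating difference r1 − r2, which is why a constant shift applied to every rating changes nothing.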
To simulate knockout stage matches, we will take this Elo probability of one team outperforming another to be the probability that they win the match—hence the function name WinExpectation.
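As a rough cross-check outside Mathematica, the following Python sketch evaluates the standard Elo closed form and compares it with a brute-force simulation of extreme value performances. It is not the notebook's code: the names win_expectation, sample_performance and monte_carlo_win_rate are chosen here for illustration, and the two performances are assumed to be independent.

```python
import math
import random

BETA = 400 / math.log(10)  # scale parameter quoted in the post


def win_expectation(r1, r2):
    """Closed-form probability that a team rated r1 outperforms one rated r2."""
    return 1 / (1 + 10 ** (-(r1 - r2) / 400))


def sample_performance(rating, rng):
    """Draw one performance from an extreme value (Gumbel) distribution
    with location `rating` and scale BETA, by inverting its CDF."""
    u = rng.random()
    while u == 0.0:  # random() can return exactly 0.0; avoid log(0)
        u = rng.random()
    return rating - BETA * math.log(-math.log(u))


def monte_carlo_win_rate(r1, r2, trials=100_000, seed=0):
    """Fraction of simulated matches in which the r1 team performs better."""
    rng = random.Random(seed)
    wins = sum(
        sample_performance(r1, rng) > sample_performance(r2, rng)
        for _ in range(trials)
    )
    return wins / trials


# Brazil (2100) against Slovakia (1605), the ratings quoted in the post.
exact = win_expectation(2100, 1605)
estimate = monte_carlo_win_rate(2100, 1605)
print(f"closed form: {exact:.3f}, simulated: {estimate:.3f}")
```

With these ratings the closed form gives roughly 0.945 for Brazil, and the simulated frequency should land close to that value.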
We can import the latest ratings directly into Mathematica:
(Due to web traffic, www.eloratings.net may be down during the tournament. You can import ratings from Google’s cached version or from Wikipedia instead—see the downloadable notebook near the end of this post for an example.)
We’ll also define a function TeamElo that looks up the rating of a given team:
It works like this:
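In Python terms, a lookup like TeamElo could be as small as the sketch below. The table holds only the two ratings quoted in this post; a real table would be filled from the imported data. The name team_elo and the error message are illustrative choices, not the notebook's definitions.

```python
# Only the two ratings quoted in the post; a full table would come from the import step.
ELO_RATINGS = {"Brazil": 2100, "Slovakia": 1605}


def team_elo(team, ratings=ELO_RATINGS):
    """Return the Elo rating for `team`, failing loudly on unknown names."""
    try:
        return ratings[team]
    except KeyError:
        raise KeyError(f"no Elo rating available for {team!r}") from None


print(team_elo("Brazil"))
```

Failing loudly on an unknown name is a deliberate choice here: a silently missing rating would otherwise surface much later as a confusing error inside a simulation.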
Now we are in a position to simulate the result of a match between teams a and b.
This SimulateMatch function returns a symbolic representation of a completed match, Match[{a,b},winner], indicating which teams played and who won. We defined a custom appearance for Match objects, which you can see at the bottom of this post.
We can define properties of symbolic objects, such as this simple one:
The winner is not always the same in our random simulated matches.
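The same idea can be sketched in Python. This is an illustration under stated assumptions rather than the notebook's code: Match is modeled as a named tuple, the loser property is a hypothetical example of a derived property, and the ratings are the two values quoted earlier in the post.

```python
import random
from typing import NamedTuple, Tuple

RATINGS = {"Brazil": 2100, "Slovakia": 1605}  # ratings quoted in the post


def win_expectation(r1, r2):
    """Elo probability that a team rated r1 beats a team rated r2."""
    return 1 / (1 + 10 ** (-(r1 - r2) / 400))


class Match(NamedTuple):
    """A completed match: which two teams played and who won."""
    teams: Tuple[str, str]
    winner: str

    @property
    def loser(self):
        # Hypothetical derived property, analogous to defining one on Match[...].
        a, b = self.teams
        return b if self.winner == a else a


def simulate_match(a, b, ratings=RATINGS, rng=random):
    """Pick a winner at random, weighted by the Elo win expectation."""
    p_a = win_expectation(ratings[a], ratings[b])
    winner = a if rng.random() < p_a else b
    return Match((a, b), winner)


rng = random.Random(1)
results = [simulate_match("Brazil", "Slovakia", rng=rng) for _ in range(2000)]
brazil_share = sum(m.winner == "Brazil" for m in results) / len(results)
print(f"Brazil won {brazil_share:.1%} of {len(results)} simulated matches")
```

Brazil wins most of the simulated matches but not all of them, which is the behavior described above.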
Next we’ll simulate each round of the knockout stage.
Some synonyms help standardize country names between the different data sources:
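A minimal Python sketch of such a synonym table follows. The entries are illustrative guesses at the kind of mismatch that occurs between sources; they are assumptions, not the list used in the notebook, and canonical_name is a name chosen here.

```python
# Illustrative entries only: alternative spelling on the left, canonical name on the right.
SYNONYMS = {
    "Korea Republic": "South Korea",
    "Korea DPR": "North Korea",
    "USA": "United States",
    "Côte d'Ivoire": "Ivory Coast",
}


def canonical_name(name, synonyms=SYNONYMS):
    """Map a country name from any source onto one canonical spelling."""
    name = name.strip()
    return synonyms.get(name, name)


print(canonical_name("Korea Republic"))
print(canonical_name("Brazil"))
```

Names without an entry pass through unchanged, so the table only needs to list the mismatches.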
For now we’ll assume that the two teams currently atop each group make it through to the knockout stage. We can import the group stage standings directly from the World Cup website:
The teams coming first and second in each of the groups A through H slot into the knockout draw as follows (see the knockout draw on the official website):
KnockoutDraw gives a nested list encoding the structure of the knockout stage. We can visualize the resulting expression using TreeForm, showing the expected binary tree:
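To make the nested structure concrete, here is a Python sketch that builds the same kind of tree from group standings. The slot pattern in DRAW_SLOTS is an assumption based on my reading of the 2010 draw and should be checked against the official schedule; the placeholder standings and the helper names (knockout_draw, leaves, depth) are hypothetical.

```python
# "A1" means the winner of group A, "B2" the runner-up of group B, and so on.
# Assumed slot pattern; verify against the official knockout draw.
DRAW_SLOTS = (
    ((("A1", "B2"), ("C1", "D2")), (("E1", "F2"), ("G1", "H2"))),
    ((("B1", "A2"), ("D1", "C2")), (("F1", "E2"), ("H1", "G2"))),
)


def knockout_draw(standings, slots=DRAW_SLOTS):
    """Replace each slot code with the team occupying that group position.

    `standings` maps a group letter to its teams in table order."""
    if isinstance(slots, str):
        group, place = slots[0], int(slots[1])
        return standings[group][place - 1]
    return tuple(knockout_draw(standings, child) for child in slots)


def leaves(node):
    """Flatten the tree into the list of teams, in bracket order."""
    if isinstance(node, str):
        return [node]
    return [team for child in node for team in leaves(child)]


def depth(node):
    """Number of rounds needed to reduce this subtree to one winner."""
    return 0 if isinstance(node, str) else 1 + max(depth(c) for c in node)


# Placeholder standings so the sketch does not invent real results.
standings = {g: [f"{g}-first", f"{g}-second"] for g in "ABCDEFGH"}
draw = knockout_draw(standings)
print(len(leaves(draw)), "teams,", depth(draw), "rounds")
```

Each innermost pair is a first-round match, and each level above it pairs the winners from the level below.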
To simulate the first round we just need to evaluate SimulateMatch on each pair of team names sitting at the second lowest level of this tree:
A simulated first round:
For subsequent rounds we simulate new matches between the winners of each pair of matches in the previous round:
(The Sow function doesn’t affect the result. It is used here to accumulate rules indicating how each match depends on preceding matches, which will be handy later.)
The entire knockout stage has four rounds in total:
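The round-by-round logic can be sketched in Python as follows. This is a simplified illustration, not the notebook's code: it works on a flat list of teams in bracket order instead of a nested expression, it skips the Sow bookkeeping, and the team names and ratings are made up purely so the example runs.

```python
import random
from typing import NamedTuple, Tuple


class Match(NamedTuple):
    teams: Tuple[str, str]
    winner: str


def win_expectation(r1, r2):
    """Elo probability that a team rated r1 beats a team rated r2."""
    return 1 / (1 + 10 ** (-(r1 - r2) / 400))


def simulate_match(a, b, ratings, rng):
    p_a = win_expectation(ratings[a], ratings[b])
    return Match((a, b), a if rng.random() < p_a else b)


def first_round(teams, ratings, rng):
    """Pair up adjacent teams in bracket order and play each match."""
    return [simulate_match(teams[i], teams[i + 1], ratings, rng)
            for i in range(0, len(teams), 2)]


def next_round(matches, ratings, rng):
    """Play new matches between the winners of each adjacent pair of matches."""
    return [simulate_match(matches[i].winner, matches[i + 1].winner, ratings, rng)
            for i in range(0, len(matches), 2)]


def simulate_knockout(teams, ratings, rng):
    """Return every round as a list of matches, ending with the final."""
    rounds = [first_round(teams, ratings, rng)]
    while len(rounds[-1]) > 1:
        rounds.append(next_round(rounds[-1], ratings, rng))
    return rounds


# Made-up teams and ratings, for illustration only.
TEAMS = [f"Team {i:02d}" for i in range(1, 17)]
RATINGS = {team: 1600 + 25 * i for i, team in enumerate(TEAMS)}

rounds = simulate_knockout(TEAMS, RATINGS, random.Random(0))
print([len(r) for r in rounds], "final:", rounds[-1][0])
```

With 16 teams the loop produces four rounds of 8, 4, 2 and 1 matches.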
Here is the final match in one random simulation of the whole knockout stage:
Here are the two semi-finals in another simulation:
Every simulation is different. Here is a custom tree plot of a whole knockout stage:
(We used Reap to gather up the rules accumulated by Sow in the NextRound function.)
Here is a simulation starting with a different set of teams, those currently in first and third place in each group:
We certainly aren’t limited to doing one simulation at a time. Here are the winners of 1000 simulated knockout stages using the teams in first and second place in the current group stage standings:
Here are the same results as a bar chart giving the estimated probability of winning the tournament:
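The estimate behind such a chart is just a tally over repeated simulations. The Python sketch below shows the idea on a four-team bracket; the team names and ratings are invented for illustration, and simulate_winner and estimate_win_probabilities are names chosen here, not functions from the notebook.

```python
import random
from collections import Counter


def win_expectation(r1, r2):
    """Elo probability that a team rated r1 beats a team rated r2."""
    return 1 / (1 + 10 ** (-(r1 - r2) / 400))


def simulate_winner(node, ratings, rng):
    """Recursively play out a nested bracket and return the overall winner."""
    if isinstance(node, str):
        return node
    a, b = (simulate_winner(child, ratings, rng) for child in node)
    return a if rng.random() < win_expectation(ratings[a], ratings[b]) else b


def estimate_win_probabilities(draw, ratings, runs=1000, seed=0):
    """Estimate each team's chance of winning from `runs` simulated tournaments."""
    rng = random.Random(seed)
    counts = Counter(simulate_winner(draw, ratings, rng) for _ in range(runs))
    return {team: counts[team] / runs for team in ratings}


# Invented teams and ratings, for illustration only.
RATINGS = {"North": 2000, "South": 1900, "East": 1800, "West": 1700}
DRAW = (("North", "West"), ("South", "East"))

probabilities = estimate_win_probabilities(DRAW, RATINGS)
for team, p in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{team:6s} {p:5.1%} {'#' * round(40 * p)}")
```

With only 1000 runs the estimates carry a sampling error of a percentage point or two, so small differences between teams should not be over-read.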
Using our simulation framework we can explore all sorts of things:
What about a giant knockout tournament containing the top 128 national teams?
Download this notebook to see the source code for these examples and try some of your own! You can use the same notebook to run simulations with the final set of teams participating in the knockout stage, once they are determined.**
* Note: All Elo ratings in this post and notebook are from World Football Elo Ratings, whose primary source of international football data is Advanced Satellite Consulting.
** You can view the notebook in the free Mathematica Player. To run your own simulations you need Mathematica—you can request a free trial.
Read on to see how to set up the visual representation for teams and matches.
We’ll visually represent each team with a labeled national flag, and winning teams get a bit of extra formatting to help them stand out.
Mathematica has country flags built in, as part of CountryData:
(We explicitly defined the flag for England, which is normally considered part of the United Kingdom by CountryData.)
Here’s how the team icons look:
Using the team icons we can define a visual representation of Match objects:
Using Format, we told Mathematica that all symbolic Match objects should be displayed in this way:
Nice software, bizarre teams. Why is an Italy over there? Didn’t you mean Slovakia? :-)
Why is Italy over there????
Some days ago I sent a Demonstration: http://demonstrations.wolfram.com/SupportYourWorldCupTeam/
I substituted the UK flag/name with your ‘England’ flag which is ‘politically more correct’!
I wonder if I should send in an updated version to the Demonstrations site?
Enjoyed your blog,
Erik Mahieu
Nice post. Let’s wait and see what’s going to happen, and whether math can really predict it.
I hope you didn’t study for a degree just for this…
I think that was just one possible outcome, not what they predict. They were using the ELO ratings to estimate the probability of outcomes for games. That is just what their RNG happened to spit out that time.
People are missing the point of this article. It was describing how to simulate the outcome of games with Mathematica and Elo ratings. Just because the examples they gave use teams already eliminated doesn’t make the whole article moot; you’re supposed to be able to redo these simulations with other teams.
I (and I’m assuming most other people) didn’t understand half of what they were talking about when it came to how to simulate it though, so I would love to see somebody who actually knew what they were talking about redo the simulations with current teams.
its shot to hell cuz both the finalists are actually on the same side of the bracket now… germany and paraguay will meet in semis if anything
Whatever…i could write this code with one hand while grinding up coffee beans with the other.
Timely and exciting post as a World Cup Shot.
The author needed to prepare the blog post and predict the 16 knockout teams a few days before the group stage finished. So that’s why Italy is there.
Wolfram technical players, Please keep us surprised!
@Jeff Green It’s easy to say that when the code’s already done for you.
Great post, actually how to rank things and account for uncertainties has lots of applications as for example resource allocation, where priorities need to be ranked and some decision on allocation should be made.
It is quite nice it accounts and quantifies variation in the outcomes, very rich and interesting model
Wow, that looks so much like math, but the graphs are cool.
Dear Andrew,
Please give most likely outcomes of finals match!
I used to compute Elo ratings myself in the seventies, when Arpad E. Elo’s system had just come out. It used a normal distribution, and matches influenced the scoring to a lesser or higher extent depending on the so-called K-factor. In your publication you list an Elo ranking of teams, but it was published some time ago. My request is that you publish the Elo list with actual Elo numbers and that you update the ranking based on recent games played in the World Cup. Could you please compute from there the probabilities that Holland or Spain wins (and Germany versus Uruguay)?
It appears the womens world match is a few days away, might be a chance now to add to this blog?
Nice post
Fantastic information, dude! Thankyou so much for your work!