Analyzing US 2008 Elections with Mathematica
September 24, 2008 — Jeff Hamrick, Special Projects Group
The 2008 United States presidential election is arguably the most interesting US presidential election in my lifetime.
Already, millions of Americans have registered to vote for the first time in their lives.
Regardless of the outcome, America is going to elect either its first black president or its first female vice president.
Both presidential candidates were born outside of the continental United States.
If elected, John McCain will be the oldest person ever to assume the US presidency.
Nor has there ever been such a large age gap between the two major-party presidential candidates.
It’s also the first election you can analyze using Mathematica 6.
As computational exploration continues its relentless march into the social sciences and data becomes more freely available over the internet, people are creating their own election simulations in greater numbers than ever before. While some websites will let you tally up electoral votes or gamble on state-by-state election results, Mathematica 6 makes it easy to write your own electoral-college simulator with just a few lines of code and seed it with different sets of polling data from sources like Mason-Dixon, Rasmussen or Quinnipiac.
The Import function, updated in Mathematica 6.0, makes it remarkably easy to pull the polling data of your choice. Suppose that you want to visualize recent national polling results for the upcoming presidential election. A little internet research reveals that you might Import the data from the following web page:
Most web-based data requires a bit of clean-up to be useful:
Rasmussen didn’t do any polling around the Fourth of July, and we want to delete the “No Polling” data because it won’t plot properly with DateListPlot. That’s why we use the argument “No Polling” in the Position function.
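As a minimal sketch of that clean-up step, assume the imported table has rows of the form {date, Democratic %, Republican %}, with the string "No Polling" standing in for missing values. The sample rows below are invented for illustration and are not Rasmussen's figures; the variable names are likewise hypothetical.

```mathematica
(* Hypothetical sample rows: {date, Democratic %, Republican %}. *)
(* These numbers are placeholders, not actual polling data. *)
raw = {
   {{2008, 7, 2}, 49, 44},
   {{2008, 7, 3}, 48, 45},
   {{2008, 7, 4}, "No Polling", "No Polling"},
   {{2008, 7, 5}, "No Polling", "No Polling"},
   {{2008, 7, 6}, 47, 45}
   };

(* Position returns index paths such as {3, 2}; keep the row index of each. *)
badRows = Union[Position[raw, "No Polling"][[All, 1]]];

(* Delete expects a list of positions, so wrap each row index in a list. *)
clean = Delete[raw, List /@ badRows];

(* Build one {date, value} series per ticket and plot both against the dates. *)
dem = clean[[All, {1, 2}]];
rep = clean[[All, {1, 3}]];
DateListPlot[{dem, rep}, Joined -> True]
```

Dropping whole rows, rather than only the offending entries, keeps every remaining row a complete {date, value, value} triple, which is the shape DateListPlot needs once the columns are split.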
We end up with a Mathematica-generated DateListPlot of national polling results since June 2008. Notice that the Republican ticket pulled ahead after the Republican National Convention of September 1-4. But since new difficulties emerged in the US financial system last week, the Democratic ticket has gained ground and is now running neck and neck with the Republican ticket.
My Wolfram Research colleagues Fred Meinberg and Jason Cawley and I recently wrote an electoral-college simulator in a few hours using imported data, and we’ve made the results of our efforts freely available on the Wolfram Demonstrations Project. Here’s the output from a single simulation, which was created using recent state-by-state polling data.
If you download the live version of the Demonstration and mouse over a state on the map created by the Manipulate, you’ll see a tooltip showing how many electoral votes are allocated to the state and what the recent polling results are. As has become customary, the states predicted to be won by the Republican ticket are colored red and the states predicted to be won by the Democratic ticket blue, though the US is arguably rather purple. A word of caution: the current polling results may disagree with the coloring of the state, because the simulation captures predicted changes in voter preferences between now and election day.
In fact, that’s the basic goal of the simulation. While there is a margin of error in every poll (because a poll randomly samples a subset of a population in order to estimate the true preferences of the whole population), our aim is to map current polling results onto future election results in each state. Those future election results will depend, in large part, upon current polling results, plus some noise that differs from one state to another, plus some random nationwide swing affecting the results in every state.
Increase the “state-specific random swing” factor to turn more and more states into toss-up states, or dial it down (and set the “national random swing” factor to zero) to see each state’s electoral votes awarded to its current poll leader. The state-specific random swing factor determines which states have their electoral votes allocated by biased coin flips after the national random swing is applied. For example, if the national random swing factor is negligible, then coin flips biased in favor of the current polling leader will be used to award electoral votes in states where the current difference in polling results is less than the value you set for this factor.
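The allocation rule just described can be sketched as follows, under stated assumptions. The leader's win probability inside the toss-up band is assumed here to rise linearly from 1/2 at a tied poll to 1 at the edge of the band, and the national swing is assumed to be a single uniform draw added to every state's margin. Neither choice is taken from the Demonstration's source code, and the three "states" are invented.

```mathematica
(* Hypothetical states: {name, electoral votes, poll margin in points}. *)
(* A positive margin means the Democratic ticket leads. Not real data. *)
states = {{"A", 10, 8.}, {"B", 20, -3.}, {"C", 15, 1.}};

(* Assumed bias rule: outside the toss-up band the leader always wins; *)
(* inside it, the leader's chance grows linearly from 1/2 to 1. *)
leaderWinProbability[margin_, band_] :=
  If[band <= 0 || Abs[margin] >= band, 1., 0.5 + 0.5 Abs[margin]/band];

(* One simulated election; returns {Democratic EV, Republican EV}. *)
simulateOnce[states_, band_, national_] :=
  Module[{swing, dem = 0, rep = 0, m, leaderIsDem, leaderWins},
   (* Assumed form: one uniform draw in [-national, national] shared by all states. *)
   swing = national*RandomReal[{-1, 1}];
   Do[
    m = s[[3]] + swing;
    leaderIsDem = m >= 0;
    leaderWins = RandomReal[] < leaderWinProbability[m, band];
    (* The Democratic ticket takes the state if it leads and the leader wins, *)
    (* or if it trails and the leader loses. *)
    If[leaderIsDem === leaderWins, dem += s[[2]], rep += s[[2]]],
    {s, states}];
   {dem, rep}];

(* With both factors at zero, every state goes to its current poll leader. *)
simulateOnce[states, 0, 0]
(* {25, 20} *)
```

Raising the band argument above a state's margin is what turns that state into a toss-up in this sketch; a state polling exactly at the band's edge is still awarded outright.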
The “strength of national random swing” setting simulates a weak or strong national effect: a random effect that moves every state in the same direction and by the same magnitude. If there’s a strong national random swing in the polls between now and November 4, a landslide is more likely:
Repeating the simulation many times lets us study interesting aggregate results. For example, fixing the national random swing factor at six produces the following average numbers of electoral-college votes, by state-specific random swing strength:
Notice that the race remains very tight. The Republican ticket benefits, on average, if states where the polling difference is eleven or more points are turned into toss-up states. Why? Because then strongly Democratic states with large numbers of electoral votes—like California and New York—have those electoral votes awarded on the basis of biased coin flips instead of automatically being awarded to the Democratic ticket. Additionally, notice that there are large downward spikes when the state-specific swing factor is set to about one and about five. Why? Because the Democratic ticket is holding onto a number of states by one- or five-point leads in the polls.
If we keep the national random swing factor at six and fix the state-specific random swing factor at six as well, we can use the source code to determine the percentage of the time that the Democratic ticket wins, and the percentage of the time that the Republican ticket wins, in the electoral college. Notice that the results do not add up to 100%, because there is a small possibility (in the case of this particular run of 10,000 simulations, a 1.1% chance) that there will be a tie in the electoral college, in which case the Twelfth Amendment to the US Constitution will throw the vote to the US House of Representatives.
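A minimal sketch of that kind of tally, again under assumed rules rather than the Demonstration's actual source: the national swing is a uniform draw, the leader's bias inside the toss-up band is linear, and the four-state map is invented so that ties are possible. The names runOnce and toyStates are hypothetical.

```mathematica
(* Hypothetical toy map: {electoral votes, poll margin}; positive = Democratic lead. *)
(* Invented numbers chosen so that a tie is possible; not real polling data. *)
toyStates = {{10, 4.}, {10, -4.}, {5, 1.}, {5, -1.}};

(* Assumed rules: uniform national swing in [-national, national] and a *)
(* linear leader bias inside the toss-up band. *)
(* Returns Democratic minus Republican electoral votes for one run. *)
runOnce[states_, band_, national_] :=
  Module[{swing = national*RandomReal[{-1, 1}]},
   Total[Map[
     Function[s,
      With[{m = s[[2]] + swing},
       With[{p = If[band <= 0 || Abs[m] >= band, 1., 0.5 + 0.5 Abs[m]/band]},
        (* +EV if the Democratic ticket takes the state, -EV otherwise *)
        s[[1]]*If[(m >= 0) === (RandomReal[] < p), 1, -1]]]],
     states]]];

(* BlockRandom keeps the seeding local, so the experiment is repeatable *)
(* without disturbing the session's random state. *)
diffs = BlockRandom[
   SeedRandom[2008];
   Table[runOnce[toyStates, 6, 6], {10000}]];

(* Shares of Democratic wins, Republican wins, and ties. *)
shares = N[{Count[diffs, _?Positive], Count[diffs, _?Negative],
     Count[diffs, 0]}/Length[diffs]]
```

The three shares sum to one, so the two win percentages alone fall short of 100% by exactly the tie share, which is the effect noted above.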
Mathematica’s programming flexibility and the power of symbols like Import and BlockRandom make data-driven social science experiments, and the visualization of concepts in the social sciences, relatively easy. Social science enthusiasts have used Mathematica to investigate various schemes for apportioning seats in the House of Representatives and to model the behavior of juries. There are countless other interesting social science experiments that Mathematica can facilitate.
Feel free to download the notebook for this blog entry and run the simulation code yourself, especially if you’re interested in playing with the very latest polling results.
And hey… while we’re on the subject, if you’re a US citizen, don’t forget to vote this fall.