# The Shape of the Vote: Exploring Congressional Districts with Computation

June 26, 2018 — Brian Wood, Lead Technical Marketing Writer, Document and Media Systems

In the past few decades, the process of redistricting has moved squarely into the computational realm, and with it the political practice of gerrymandering. But how can one solve the problem of equal representation mathematically? And what can be done to test the fairness of districts? In this post I’ll take a deeper dive with the Wolfram Language—using data exploration with Import and Association, built-in knowledge through the Entity framework and various GeoGraphics visualizations to better understand how redistricting works, where issues can arise and how to identify the effects of gerrymandering.

## Rules of Apportionment

In the US House of Representatives, each state is assigned a number of representatives based on its population through the process of apportionment (or reapportionment). On the surface, the rules for this process are simple: each state gets at least one representative, and representative seats must be redistributed at least once per decennial census.

Apportionment has been tried using various mathematical methods throughout history. Since the 1940 Census, representatives have been assigned using the method of equal proportions (the Huntington–Hill method). This means that the next available slot goes to the state with the highest priority , defined as:

 ✕ TraditionalForm[Subscript[A, n]==P/GeometricMean[{n,n-1}]]

… where P is the population of the state and n is the number of districts already assigned to the state. You might recognize the denominator as the geometric mean of and . It’s straightforward to implement symbolically:

 ✕ Priority[pop_,n_]:=N[pop/GeometricMean[{n,n-1}]]

Formula in hand, I’d like to run a simulation to compare to the current apportionment plan. First I’ll pull the 2010 population data from the Wolfram Knowledgebase (excluding the District of Columbia):

 ✕ states=Complement[all US states with District of Columbia administrative divisions["Entities"],{Entity["AdministrativeDivision", {"DistrictOfColumbia", "UnitedStates"}]}];
 ✕ statenames=StringDrop[#["Name"],-15]&/@states;
 ✕ popdata=AssociationThread[statenames->Table[QuantityMagnitude[Dated[s,2010]["Population"]],{s,states}]];
 ✕ RandomChoice[Normal@popdata]

It’s worth noting that these population counts are slightly different from the official reapportionment numbers, which include overseas residents for each state. The discrepancy is too small to make a difference in my apportionment computations, but it could be a topic for a more detailed exploration.

To start my simulation, I give each state one representative. The initial 50 are actually assigned before applying the formula, so I’ll set those initial priority values at Infinity:

 ✕ init=Thread[statenames->∞];

From there, districts are assigned based on successively smaller priority values. Historically, no state has received more than 55 seats, so I’ll set the upper limit at 60:

 ✕ pvalues=Flatten@Table[Normal@Priority[popdata,i],{i,2,60}];

Since only 435 seats are available, the rest can be dropped:

 ✕ app=TakeLargestBy[Join[init,pvalues],Values[#]&,435];

Here’s a function that displays the apportionment data on a map:

 ✕ DistrictWeightMap[apportionment_]:=GeoRegionValuePlot[KeyMap[Interpreter["USState"],apportionment]//Normal,GeoRange->Entity["Country", "UnitedStates"],GeoProjection->"Mercator",GeoLabels->(Text[Style[#4,FontFamily->"Arabic Transparent",White,Medium],#3]&), ImageSize->1200,ColorFunction->(Which[#<.02,GrayLevel[0.6],.02<#<=.15,Darker@Blue,.15<#<=.4,Darker@Orange,#>.4,Darker@Red]&),PlotRange->{0,Length@apportionment}, PlotLegends->Histogram]

My simulation yields the following apportionment counts, which match the current plan exactly:

 ✕ DistrictWeightMap[app]

(I’ve left off Alaska and Hawaii here for easier viewing, but they have one and two districts, respectively.)

The priority formula can be used to simulate the effect of population shifts on the next apportionment stage. For instance, the continuing decline in Illinois population could cause the state to lose a seat in the next cycle—currently, it gets the 423rd seat:

 ✕ Position[Normal[app],"Illinois"][[-1,1]]
 ✕ Priority[#,17]&/@{#,#-40000.}&@popdata["Illinois"]

Here are a few states likely to overtake that particular slot:

 ✕ Select[pvalues,-10000 

Indeed, Texas is likely receive an additional seat (its 37th) based on its explosive growth since 2010:

 ✕ Priority[#,38]&/@{#,#+3000000}&@popdata["Texas"]
 ✕ Take[app,-10]

Here’s a quick function I made to show differences in apportionment counts:

 ✕ DistrictDifferenceMap[newapp_,oldapp_]:=GeoRegionValuePlot[Quiet[Normal@KeyMap[Interpreter["USState"],Merge[{newapp,oldapp},Subtract@@#&]]/.{Subtract[a__]:>a}], GeoProjection->"Mercator", GeoRange->Entity["Country", "UnitedStates"], ImageSize->540, GeoLabels->(Text[Style[#4,"Text",White,10,FontFamily->"Arabic Transparent"],#3]&),ColorRules->({_?Positive->Green,_?Negative->Red,_->Gray})]

If I reapportion using the current population estimates, Texas and North Carolina both gain a seat, while Minnesota and Pennsylvania both lose one:

 ✕ latestpopdata=AssociationThread[statenames->Table[QuantityMagnitude[s["Population"]],{s,states}]];
 ✕ latestpvalues=Flatten@Table[Normal@Priority[latestpopdata,i],{i,2,60}];
 ✕ latestapp=TakeLargestBy[Join[init,latestpvalues],Values[#]&,435];
 ✕ DistrictDifferenceMap[ReverseSort@Counts[Keys@latestapp],ReverseSort@Counts[Keys@app]]

But population growth also affects the apportionment process in other ways. While the number of districts increased steadily for many years, it has remained essentially constant at 435 since 1913 (a limit codified in 1929). As a result, modern representatives have much larger constituencies than in the past—on average, these have nearly quadrupled over the last century:

 ✕ uspophistory=Dated[Entity["Country", "UnitedStates"],All]["Population"];
 ✕ DateListPlot[TimeSeriesWindow[uspophistory,{"1918",Today}]/435.,ColorFunction->"DarkRainbow",PlotRange->Full,PlotTheme->"Detailed"]

Some states also end up with much more populous districts than others. Many have argued that this violates the established “one person, one vote” principle by, for instance, giving voters in Wyoming (with around 500,000 voters per representative) more federal voting power than those in Montana (with about 900,000 voters per representative):

 ✕ popperdist=ReverseSort@Association@Table[Interpreter["USState"][s]->N[popdata[s]/Counts[Keys@app][s]],{s,statenames}];
 ✕ GeoRegionValuePlot[popperdist,GeoProjection->"Mercator",GeoRange->Entity["Country", "UnitedStates"],ColorFunction->"TemperatureMap"]

A Congressional Apportionment Amendment was drafted (but never ratified) that set an initial guideline of “[not] less than one Representative for every forty thousand persons.” Here’s what the breakdown would look like if we used that guideline today:

 ✕ newapp=TakeLargestBy[Join[init,Flatten@Table[Normal[Priority[#,i]&/@popdata],{i,2,1000}]],Values[#]&,Floor[Total[popdata]/40000.]];
 ✕ DistrictWeightMap[newapp]

While the original limitation of 40,000 citizens per representative is perhaps no longer viable (one can imagine how chaotic an 8,000-member legislature would be), adding more seats certainly reduces the population spread among the districts:

 ✕ newpopperdist=ReverseSort@Association@Table[Interpreter["USState"][s]->N[popdata[s]/Counts[Keys@newapp][s]],{s,statenames}];
 ✕ GeoRegionValuePlot[newpopperdist,GeoProjection->"Mercator",GeoRange->Entity["Country", "UnitedStates"],ColorFunction->"TemperatureMap"]

Of course, apportionment is just the first step. Adding more seats would also mean adding more districts—and that would likely make the next stage a lot more complicated.

## Redistricting by the Numbers

Since populations migrate and fluctuate, government officials are constitutionally required to redraw congressional districts following reapportionment. On its surface, this seems straightforward: divide each state into areas of equal population. But the reality can be deceptively complex.

First, using a naive approach, the number of ways to divide a population into equal parts is huge. Suppose you wanted to split a group of 50 people into non-overlapping groups of 10:

 ✕ Times@@Binomial[Range[50.,10,-10],10]/2

This issue scales up with the size of the population; with the current population of the US, the number of ways to divide it into 435 equal districts (ignoring all other constraints) is truly astounding:

 ✕ (Times@@Binomial[Range[#1,#2,-#2],#2]/#2!)&@@{QuantityMagnitude[Entity["Country", "UnitedStates"]["Population"]],435}

Then there’s the problem of actually drawing sensible districts with roughly equal population in each state. Congressional maps are usually drawn and approved by state legislatures, who must meet varying requirements for contiguousness, compactness and other qualities associated with “fair” districts. In a recent Wolfram Community post, Professor Marco Thiel explores a computational approach to drawing unbiased districts; here is how his algorithm splits up Iowa:

The latest district maps are available through the Wolfram Knowledgebase:

 ✕ current=KeyDrop[GroupBy[EntityList["USCongressionalDistrict"],#["USState"]&],{"DistrictOfColumbia",Missing["NotApplicable"],Missing["NotAvailable"]}];

This makes it easy to (roughly) check the equal population requirement; the districts within each state differ by less than one percent, on average:

 ✕ distpop=Table[DeleteMissing[#["Population"]&/@current[s]],{s,statenames}];
 ✕ Mean@Table[If[Length[v]>1,N@StandardDeviation[v]/Mean@v,0.],{v,distpop}]

In some cases, the maps have a nice geometric aesthetic, with shapes that fit together like a children’s puzzle. This type of map tends to follow county lines, only straying when necessary to satisfy the equal population requirement. The quintessential example of this is Iowa:

 ✕ iacounties=EntityClass["AdministrativeDivision", "USCountiesIowa"];
 ✕ Show[GeoListPlot[List/@current["Iowa"],PlotLegends->None],GeoListPlot[iacounties,PlotStyle->Directive[EdgeForm[Blue],FaceForm[Opacity[0]]]]]

Unfortunately, this isn’t the case in most states. By contrast, here is North Carolina’s notoriously jagged map:

 ✕ nccounties=EntityClass["AdministrativeDivision", "USCountiesNorthCarolina"];
 ✕ Show[GeoListPlot[List/@current["NorthCarolina"],PlotLegends->None],GeoListPlot[nccounties,PlotStyle->Directive[EdgeForm[Blue],FaceForm[Opacity[0]]]]]

This kind of irregular shape is considered one of the main indications of deliberate manipulation of districts (and indeed, North Carolina’s map is currently being contested in court), but that’s not to say that every oddly shaped district is gerrymandered. Crooked borders often evolve slowly as the demography of areas change subtly over time.

## Drawing on Experience: Historical Maps

I wanted to dig a bit deeper, so I thought I’d take a look at historical congressional maps that are readily available for viewing and analysis. The TIGER/Shapefile format (with a .zip extension) can be directly imported for a GeoGraphics object containing all the districts combined:

 ✕ Import["http://cdmaps.polisci.ucla.edu/shp/districts001.zip"]

For inspecting individual districts, the files also contain detailed data in key-value pairs. I like using Association in this situation, since it lets me reference elements by name:

 ✕ c1=Association@First@Import["http://cdmaps.polisci.ucla.edu/shp/districts001.zip","Data"];
 ✕ Keys@c1

The "LabeledData" element contains ordered information about individual districts:

 ✕ ld=Association@c1["LabeledData"]; Keys@ld

From there, I can create entries that associate each state name with its district numbers and geometry:

 ✕ entries= <|#[[1]]-><|ToExpression[#[[2]]]->#[[3]]|>|>&/@ Transpose[{ ld["STATENAME"], ld["DISTRICT"], Polygon[Cases[#,_GeoPosition,All]]&/@c1["Geometry"] }];

Lastly, I consolidate all entries from each state:

 ✕ statenames=Union[Keys@entries]//Flatten;
 ✕ districts=Association@Table[Merge[Sort@Select[entries,StringMatchQ[First@Keys@#,s]&],Association],{s,statenames}];

From here, I can easily examine districts on a per-state basis:

 ✕ GeoListPlot[List/@Values@districts["Virginia"],PlotLegends->None]

Here is the original districting map for the entire country, drawn by the First United States Congress:

 ✕ Show@@Table[GeoListPlot[List/@Values@d,PlotLegends->None],{d,districts}]

I put it all together into a function that can import the full district data from any past Congress into an Association for easy exploration:

 ✕ CongressionalMapData[congressnumber_]:= Module[{baseURL="http://cdmaps.polisci.ucla.edu/shp/",raw,ld,entries,statenames},raw=Association@First@Import[baseURL<>"districts"<>StringPadLeft[ToString[congressnumber],3,"0"]<>".zip","Data"]; ld=Association@raw["LabeledData"]; entries= <|#[[1]]-><|ToExpression[#[[2]]]->#[[3]]|>|>&/@Transpose[{ld["STATENAME"],ld["DISTRICT"], Polygon[Cases[#,_GeoPosition,All]]&/@raw["Geometry"]}]; statenames=Union[Keys@entries]//Flatten; Association@Table[Merge[Sort@Select[entries,StringMatchQ[First@Keys@#,s]&],Association],{s,statenames}] ]

Rather than having to reference by Congress number, it’s easier to reference by year:

 ✕ CongressNumber[year_]:=Floor[(year-1787)/2.] CongressionalMapData[year_?(#>1700&)]:=CongressionalMapData[CongressNumber[year]]

Lastly, here’s a function for visualizing all districts in a state:

 ✕ DistrictMap[statedata_]:=GeoListPlot[Table[{s},{s,statedata}],GeoLabels->(Tooltip[#1,FirstPosition[statedata,#1][[1,1]]]&),PlotLegends->None]

This makes it easy to look at individual districts for a given state and year. Looking at the map from 100 years ago, I was surprised to learn that Illinois used to have over 70% more districts:

 ✕ dist1918=CongressionalMapData[1918];
 ✕ N@Length@current["Illinois"]/Length@dist1918["Illinois"]

This included one “at-large” representative that represented the entire state, rather than a particular district or area. In this data, such districts are numbered “0”:

 ✕ GeoGraphics[dist1918["Illinois",0]]

In general, it seems like the population has shifted away from the Midwest region:

 ✕ DistrictDifferenceMap[Length/@current,Length/@dist1918]

Importing the full set of maps took me about 40 minutes and most of the RAM on my laptop:

 ✕ allmaps=Table[CongressionalMapData[cnum],{cnum,114}];

Here’s a complete history of reapportionment counts:

 ✕ frames = Table[{DistrictWeightMap[Length /@ Values /@ allmaps[[i]]], 1789 + 2 i - 1}, {i, 114}]; ListAnimate[Labeled[#1, Style[#2, "Section"], Top] & @@@ frames]

With the full dataset, I can look at the history of districts for a particular state, which can give some insights about its development. New York is an interesting case: it started with 10 districts, and continued gaining seats with population growth until it peaked at 44 districts in the mid-20th century. Since then, it’s been losing seats due to population shifts and the 435-member cap on the House. The map has changed nearly every ten-year cycle, indicating that internal demographics have shifted as well:

 ✕ nydists=Table[{i,allmaps[[CongressNumber[i],"New York"]]},{i,1793,2013,10}];
 ✕ ListAnimate[ Labeled[DistrictMap[#2], Style[ToString[#1] <> ": " <> Capitalize@IntegerName[Length[#2]] <> " Districts", "Section"], Top] & @@@ nydists, AnimationRepetitions -> 1, AnimationRunning -> False]

New Hampshire sits on the other end of the spectrum, having gone through only minimal changes since its original map. It actually kept the same two-district plan for the eight cycles between 1880 and 1960. The simplest explanation is that, unlike New York, this state’s demographics have remained fairly constant (and its population growth average):

 ✕ nhdists=Table[{i,allmaps[[CongressNumber[i],"New Hampshire"]]},{i,1793,2013,10}];
 ✕ ListAnimate[ Labeled[DistrictMap[#2], Style[ToString[#1] <> ": " <> Capitalize@IntegerName[Length[#2]] <> If[Length[#2] == 1, " District", " Districts"], "Section"], Top] & @@@ nhdists, AnimationRepetitions -> 1, AnimationRunning -> False]

The maps also illustrate some notable historical events. When the American Civil War broke out, Virginia seceded from the Union. But a group of Unionists in the northwestern part of the state broke from this decision, taking three districts from the state to form West Virginia:

 ✕ GeoListPlot[{Values[CongressionalMapData[1859]["Virginia"]], Values[CongressionalMapData[1863]["West Virginia"]]}]

And of course, when the war was over, the two states remained separate:

 ✕ GeoListPlot[{Values[CongressionalMapData[1869]["Virginia"]], Values[CongressionalMapData[1869]["West Virginia"]]}]

After the war, population counts grew in many southern states because of freed slaves, giving them more national voting power:

 ✕ dist1859=CongressionalMapData[1859]; dist1873=CongressionalMapData[1873];
 ✕ DistrictDifferenceMap[Length/@dist1873,Length/@dist1859]

In the late 20th century, some states started adjusting maps to create majority-minority districts designed to ensure appropriate representation and voting power for minority groups (as required by the Voting Rights Act of 1965). Opponents of this practice claim that it constitutes racial gerrymandering; in some cases, the Supreme Court has agreed.

For instance, after gaining three seats in 1990, Texas attempted to draw new majority-minority districts to represent both Hispanic and African American voters. In Bush v. Vera, the court ruled that two of the new districts (the 29th and 30th) and one newly manipulated district (the 18th) violated compactness principles too severely:

 ✕ dist1993=CongressionalMapData[1993]; Row@Table[Labeled[GeoGraphics[{Green,dist1993["Texas",i]},ImageSize->150],Style[i,"Text",Darker@Green,Bold],Top],{i,{18,29,30}}]

Legislators were forced to redraw the maps:

 ✕ dist1997=CongressionalMapData[1997]; Row@Table[Labeled[GeoGraphics[{Green,dist1997["Texas",i]},ImageSize->150],Style[i,"Text",Darker@Green,Bold],Top],{i,{18,29,30}}]

This indicates that while some level of affirmative racial gerrymandering may be acceptable, the shape of a district must still be sensible. Of course, plenty of minority-majority districts exist naturally because of concentrated minority populations. Many of these are in southern regions with large African American populations:

 ✕ mm=Import["https://en.wikipedia.org/wiki/List_of_majority-minority_United_States_congressional_districts","Data"]; aalist=mm[[1,1,4,3;;27]]; GeoRegionValuePlot[Table[ current[[StringDelete[aalist[[d,3]]," "],aalist[[d,4]]]]->Quantity[aalist[[d,2]]],{d,Length@aalist}],GeoRange->{{40.,25.}, {-95.,-75.}},GeoProjection->"Mercator"]

There are also a number of southwest regions with Hispanic/Latino-majority districts:

 ✕ hisplist=mm[[1,1,6,3;;27]]; GeoRegionValuePlot[Table[ current[[StringDelete[hisplist[[d,3]]," "],hisplist[[d,4]]]]->Quantity[hisplist[[d,2]]],{d,Length@hisplist}],GeoRange->{{38,25},{-120,-95}},GeoProjection->"Mercator"]

Maps are sometimes adjusted to preserve “communities of interest,” such as separating rural and urban populations or keeping a major business in the same district as its workers. States with large metropolitan areas can be especially difficult to draw lines for because of the high population density in those areas. Here in Illinois, the city of Chicago makes up a full 21% of the population:

 ✕ N[Entity["City", {"Chicago", "Illinois", "UnitedStates"}]["Population"]/Entity["AdministrativeDivision", {"Illinois", "UnitedStates"}]["Population"]]

A look at the map shows that the city itself sprawls across nearly half the state’s 18 districts in order to distribute that population:

 ✕ Show[GeoListPlot[List/@Most[current["Illinois"]],PlotLegends->None], GeoGraphics[{FaceForm[Directive[Opacity[1.],Black]],EdgeForm[White],Entity["City", {"Chicago", "Illinois", "UnitedStates"}]["Polygon"]}],GeoRange->Entity["City", {"Chicago", "Illinois", "UnitedStates"}]]

Looking at historic maps of the area, it’s clear this wasn’t always the case. Just after the Civil War, the state had 14 districts, with Chicago mostly enclosed in just one:

 ✕ dist1865=CongressionalMapData[1865]; Length@dist1865["Illinois"]
 ✕ Show[GeoListPlot[List/@Values@dist1865["Illinois"],PlotLegends->None], GeoGraphics[{FaceForm[Directive[Opacity[1.],Black]],EdgeForm[White],Dated[Entity["City", {"Chicago", "Illinois", "UnitedStates"}],1823]["Polygon"]}],GeoRange->Entity["City", {"Chicago", "Illinois", "UnitedStates"}]]

From this perspective, it’s also clear that the current Illinois districts fall into the “jagged” category, which is often the case when mapmakers start using more complex factors to draw their maps. Since modern redistricting is often done using heavy-duty GIS software with detailed maps and high-resolution census data overlays, it can be difficult to tell what the reasoning was for districts’ shapes—or what kinds of manipulation might have taken place. But there’s growing concern that these manipulations might be damaging to the democratic process.

## Gerrymandering and the Supreme Court

Throughout America’s 200-year history, countless legislative bodies (and at least one independent commission) have been accused of partisan gerrymandering—and some of these accusations have gone all the way to the Supreme Court. But to explore this issue effectively, I need to look at more than just maps.

I found comprehensive election data in PDF format from the Clerk of the House. I tried various methods for importing these; in the end I created a package that uses string patterns to sort through election information:

 ✕ < 

The package allows me to import election data by state and year (starting in 1998) as a Dataset:

 ✕ ildata=RepresentativeVotesDataset["Illinois",2014]

To simplify analysis of elections, I also made a function to extract results for the two major parties (Republican and Democratic) in a given election cycle:

 ✕ PartyVotes[electiondata_]:=With[{votes=GroupBy[Select[electiondata,StringMatchQ[#["Party"],"Republican"|"Democrat"]&],"District"]},Table[<|#["Party"]->(#["Votes"])&/@Normal@votes[i,All]|>,{i,Length@votes}]]

In Illinois, it’s well known that Chicago is often in political opposition to much of the (rural) rest of the state—and with the current map, many argue that the city holds undue power in the legislature. In the 2014 election, Democrats won 12 of the state’s 18 seats (66%) with just 51% of the total popular vote:

 ✕ ilvotes=PartyVotes[ildata]; Total@ilvotes/Total@ildata[[All,"Votes"]]//N
 ✕ Show[GeoRegionValuePlot[Thread[Most[current["Illinois"]]->(KeySort@N[#/Total[#]]&/@ilvotes)[[All,1]]],ColorFunction->(Blend[{Red,Blue},#]&), PlotRange->{0,1}], GeoGraphics[{FaceForm[Directive[Opacity[1.],Green]],EdgeForm[White],Entity["City", {"Chicago", "Illinois", "UnitedStates"}]["Polygon"]}]]

And aside from a few “purple” bi-state areas, the irregular districts in Chicago appear to tip the balance for Democrats. While no case has been brought forth in Illinois, most critics point to the earmuff-shaped fourth district as a prime example of extreme gerrymandering:

 ✕ GeoGraphics[{Green,Polygon@current[["Illinois",4]]}]

Shape-based arguments have historically dominated in gerrymandering cases, and it’s easy to see why—literally. Anyone can look and get a general sense of how complex a shape is. But there are also some geometric tests for measuring the compactness of a district. In his Community post, Marco Thiel tests out a few techniques that involve computing the ratio of a region’s area to the area of a circumscribed shape:

The range considered acceptable for each test can be subjective, but each measure gives a value between 0 and 1. Looking at the distribution of each test among the states, you can get a good sense of what’s average:

 ✕ Multicolumn[{CloudGet["https://wolfr.am/vF9vSryp"], CloudGet["https://wolfr.am/vF9vSHh1"], CloudGet["https://wolfr.am/vF9vT2uh"], CloudGet["https://wolfr.am/vF9vTegh"]}, ItemSize -> Full]

Here are some of the least compact districts in the country, according to Marco’s computations:

Application of these and similar geometric tests has led several courts to strike down district maps that lack compactness (like in Texas). But there’s no single way to measure compactness, and some odd shapes are due to natural boundaries and other non-political factors.

Aside from that, both legislators and courts have been reluctant to make any strong statements about partisan gerrymandering because of the inherent political implications: any law or ruling that seems to favor a particular party could be highly criticized. So the fact that the Supreme Court took two cases on this topic (in addition to one on racial gerrymandering) is a pretty big deal.

The first case, Gill v. Whitford, takes a practical approach to the problem: if partisan gerrymandering is the issue, they reason, perhaps it needs a partisan-based solution. Originating in a Wisconsin state court, the plaintiffs presented a case in October 2017 based on a new measure of partisan bias proposed by Nicholas Stephanopoulos and Eric McGhee called efficiency gap. The formula is best summarized as the difference in the total number of wasted votes for each party—including votes cast for a losing candidate and surplus votes cast for a winning candidate—over the total votes cast:

 ✕ TraditionalForm[EG==(HoldForm@(Subscript[lost, A]+Subscript[surplus, A])-HoldForm@(Subscript[lost, B]+Subscript[surplus, B]))/(total votes)]

By assuming equal population per district and a two-party system, this formula is conveniently reduced to the difference between a party’s seat margin (percentage of seats over 50%) and twice its vote margin:

 ✕ TraditionalForm[EG=="seat margin" - 2 *"vote margin"]

From the data I collected, I can easily compute the seat margins and vote margins:

 ✕ SeatMargin[electiondata_]:=With[{pv=PartyVotes[electiondata]},N@(Counts[Flatten@Keys[TakeLargest[#,1]&/@pv]]-Length@pv/2)/Length@pv] VoteMargin[electiondata_]:=N@#/Total[#]&@Merge[PartyVotes[electiondata],Total]-.5

For congressional districts, the efficiency gap is given in seats. Here's an implementation of the simplified efficiency gap formula with positive numbers indicating a Democratic advantage and negative indicating a Republican advantage:

 ✕ EfficiencyGap[electiondata_]:=Length[GroupBy[electiondata,"District"]] *(KeySort[SeatMargin[electiondata]]-2 KeySort[VoteMargin[electiondata]])

From a legal standpoint, the argument is that the “wasted” votes constitute a violation of the voters’ rights under the Equal Protection Clause. According to the authors, an advantage of two or more seats could indicate district manipulation in a given state. The paper points out a few states with large congressional efficiency gaps in recent cycles:

 ✕ Table[With[{data=GroupBy[RepresentativeVotesDataset[state,{1998,2016}],"Year"]},DateListPlot[Transpose[{DateRange[{1998},{2016},2yr],Table[EfficiencyGap@data[[i]],{i,Length@data}][[All,1]]}],PlotTheme->"Scientific"]],{state,{"Michigan","Michigan","North Carolina","Ohio","Pennsylvania","Texas","Virginia"}}]

Although Gill v. Whitford deals with state legislative districts, Wisconsin’s congressional districts seem to show a strong trend toward Republican dominance over the past 20 years as well:

 ✕ widata=GroupBy[RepresentativeVotesDataset["Wisconsin",{1998,2016}],"Year"];
 ✕ DateListPlot[Transpose[{DateRange[{1998},{2016},2yr],Table[EfficiencyGap@widata[[i]],{i,Length@widata}][[All,1]]}],PlotTheme->"Scientific"]

You can see this effect on the maps, where many previously contentious areas now run more solidly red:

 ✕ widists=Table[CongressionalMapData[i]["Wisconsin"],{i,2000,2016,4}];
 ✕ wivotes=Table[PartyVotes[RepresentativeVotesDataset["Wisconsin",i]],{i,2000,2016,4}];
 ✕ Grid[{Text/@Range[2000,2016,4], Table[GeoRegionValuePlot[Thread[Values[widists[[i]]]->(KeySort@N[#/Total[#]]&/@wivotes[[i]])[[All,1]]], PlotLegends->None, ColorFunction->(Blend[{Red,Blue},#]&), ImageSize->100],{i,Length@widists}]}]

In Benisek v. Lamone (coming from Maryland), the legal argument instead hinges on the First Amendment: casting a vote is considered a form of expression (i.e. speech), and the claim is that the offending district “dilutes the votes” of Republican voters, thus reducing the value of those voters’ speech. While this case presents no particular standard for computing the extent of partisan gerrymandering, it does provide a fresh legal route for applying any standard that might be instated. Either way, the efficiency gap test shows a rather prominent Democratic trend in Maryland:

 ✕ mddata=GroupBy[RepresentativeVotesDataset["Maryland",{1998,2016}],"Year"];
 ✕ DateListPlot[Transpose[{DateRange[{1998},{2016},2yr],Table[KeySort@EfficiencyGap@mddata[[i]],{i,Length@mddata}][[All,1]]}],PlotTheme->"Scientific"]

And a look at the district in question shows that its latest map is far from compact:

 ✕ mddists=Table[CongressionalMapData[i]["Maryland"],{i,2000,2016,8}];
 ✕ mdvotes=Table[PartyVotes[RepresentativeVotesDataset["Maryland",i]],{i,2000,2016,8}];
 ✕ Grid[{Text/@Range[2000,2016,8],GeoGraphics/@Transpose[{GeoStyling[Blend[{Red,Blue},#]]&/@(KeySort@N[#/Total[#]]&/@mdvotes[[All,6]])[[All,1]],mddists[[All,6]]}]}]

The Supreme Court also picked up another case in January 2018, this time about racial gerrymandering. Like Bush v. Vera, the case comes from Texas, but this time it’s centered around negative racial gerrymandering. Republican lawmakers are appealing a lower court’s ruling that the state’s latest maps discriminate against racial minorities. The efficiency gap doesn’t exactly translate to this case, but one could conceive of a similar measure based on the wasted votes of racial minorities.

Suffice it to say, the gerrymandering issue is coming to a head. With these three cases combined—as well as recent decisions in North Carolina and Pennsylvania, a ballot initiative in Michigan and all kinds of academic discussions around the country—the stage is set for the Supreme Court to make changes in how redistricting is regulated. Unfortunately, they’ve opted to pass on both partisan gerrymandering cases on technical grounds, so we will likely have to wait until next session to get a major decision.

Gerrymandering is a complex subject with a deep history, and this post only scratches the surface. Exploring with the Wolfram Language helped me pull everything together easily and discover a lot of intricacies I wouldn’t have otherwise found. Now that I've collected all the data in one place, I invite you to do your own exploration. Go find out the history of your district, explore measures of fairness and partition states as you see fit—just don’t forget to go out and vote this November!

RELATED POSTS

 I love this post Brian! Great work! Hopefully Eric Holder will see it! Posted by Jeremy Sykes    June 26, 2018 at 12:57 pm
 Brian This is an excellent article and perfect timing as regards the recent Supreme court ruling as regards gerrymandered Texas districts. Clearly the Supremes needed to have an article like this or they wouldn’t have decided to reject basic democracy by allowing votes to be given unequal representation. Posted by Michael    June 26, 2018 at 1:48 pm
 I love this article. Posted by Raz Kumar    June 30, 2018 at 5:43 am
 Cool. Thanks for showing what wonders Mathematica can do ! Posted by selvaarab    July 3, 2018 at 2:53 am
 It’s worth noting that the districts drawn in the Maryland case aren’t compact because the state itself isn’t compact. Perhaps the compactness metric should compare the district to the circumscribing shape *minus* the points outside the state. Posted by Scott Brickner    July 3, 2018 at 7:36 pm
 Hi Brian, great post. Unfortunately, when I try to download the file I get the following: errorMsg “User does not have access to this file.” Would it be possible to fix it? Many thanks in advance Posted by Ruben Garcia Berasategui    July 10, 2018 at 8:01 pm
 Hi Ruben: I have tested the link and it seems to download fine. This is likely a permissions issue on your computer. I’ll check with a few individuals here at Wolfram and see if they might be able to offer any other possible reasons for you receiving this message. Posted by Wolfram Blog    July 11, 2018 at 1:31 pm
 I had the same issue as Ruben. You can download the notebook if you log into the Wolfram Cloud, provided you have a Wolfram account. Posted by Dave Middleton    July 12, 2018 at 10:56 am
 Thanks for the tip Dave! Now it works. Posted by Ruben Garcia Berasategui    July 12, 2018 at 8:25 pm