The Ultimate Team Generator with the Wolfram Language
Every summer, I play in a recreational Ultimate Frisbee league—just “Ultimate” to those who play. It’s a fun, relaxed, coed league where I tend to win more friends than games.
The league is organized by volunteers, and one year, my friend and teammate Nate was volunteered to coordinate it. A couple weeks before the start of the season, Nate came to me with some desperation in his voice over making the teams. The league allows each player to request to play with up to eight other players—disparagingly referred to as their “baggage.” And Nate discovered that with over 100 players in a league, each one requesting a different combination of teammates, creating teams that would please everyone seemed to become more complicated by the minute.
Luckily for him, the Wolfram Language has a suite of graph and network tools for things like social media. I recognized that this seemingly overwhelming problem was actually a fairly simple graph problem. I asked Nate for the data, spent an evening working in a notebook and sent him the teams that night.
Using the Wolfram Language worked so well that—though it’s been years since I first helped out Nate, and the league coordinator has changed—I can count on an annual email volunteering me to make the teams again. And each year, I’ve been able to dig out my notebook and make teams, regularly adding improvements along the way.
Until Nate showed me his problem, I didn’t realize how tricky a situation this could be. Because baggage requests don’t have to be mutual, you can end up with chains of connected players that are larger than the acceptable size of a team. By just looking at Nate’s spreadsheet, it was nearly impossible to divine which baggage requests needed to be denied to make teams.
In addition to determining which baggage requests to decline, the process involves importing and interpreting datasets, grouping cores of players so that teams have similar metrics and exporting results for the league to distribute.
Some Notes
I’ve anonymized the data here, which was fun to do with Wolfram|Alpha. In only a couple lines of code, I replaced the all of the players’ names with notable people of the same gender from the Wolfram Knowledgebase. You can find the code to create this “dummy data” in the downloadable notebook for this post.
In the graph visualizations, I deliberately omitted the players’ names. I wanted to avoid the taint of giving myself an advantage, as I’m also playing in this league. Typically, I don’t know which team I am on until the very end. If any other players in the league are reading this and have doubts, allow my combined 2016–2018 win-loss record of 7–38 serve as definitive proof that if there is a bias, it is decidedly anti-Bob.
Importing the Data
My first step is to grab the data from the league website. There are two sets of data to import: a list of registered players and a list of baggage requests.
Here I have stored anonymized copies of the player and baggage data as cloud objects:
Engage with the code in this post by downloading the Wolfram Notebook
✕
$urls = {"https://www.wolframcloud.com/obj/bobs/ Ultimate2019DummyPlayers.html", "https://www.wolframcloud.com/obj/bobs/Ultimate2019DummyBaggage.html"}; |
Player List
Since I started work on the Wolfram Data Repository a few years ago, I’ve learned a universal truth: any general, automated data importer will quickly fail to automatically import real data. In the real world, it’s all edge cases.
Naively optimistic nonetheless, I attempt to import the player data directly from the webpage using the automated tools in Import[url,"FullData"].
The failure originates with two columns with checkboxes defined by custom CSS that are not properly captured by the HTML importer. Here is the code that failed:
✕
autoplayers = Import[$urls[[1]], "FullData"][[2, 2, 1]]; Short[autoplayers] |
Interpret the Data
I use these interpretation tools for converting each column into the Wolfram Data Framework (WDF).
✕
interpretRawData[raw_] := interpretRow /@ getDataRows[raw] interpretRow[row_] := MapIndexed[interpretValue[##] &, row] interpretValue[val_, {Key[k_]}] := interpretValue[val, k] interpretValue[val_, "Entered"] := DateObject[{val, {"Month", ".", "Day", ".", "YearShort"}}] interpretValue[val_, "Name"] := ImportString[StringReplace[StringTrim[val], "n" | "t" -> ""], "HTML"] interpretValue[val_, "age" | "pow" | "Mem ID"] := ToExpression[val] interpretValue[val_, "exp" | "skl" | "ath"] := ToExpression[StringDrop[val, 1]] interpretValue[val_, "paid" | "mgr"] := Interpreter["Boolean"][val] interpretValue[val_, _] := val getDataKeys[raw_] := StringReplace[First@First[raw], Whitespace -> ""] getDataRows[raw_] := With[{k = getDataKeys[raw]}, AssociationThread[k -> #] & /@ raw[[2]]] |
This shows that none of the players have paid. Since I paid, I know there’s a problem!
✕
dataFromAutoImport = interpretRawData[autoplayers]; |
✕
Lookup[dataFromAutoImport, "paid"] // Counts |
As in most real-world data problems, some manual data work is required. In the following code, I retrieve the data with a web request and then import it into WDF by manually parsing the HTML and formatting the values.
✕
resp = URLRead[First@$urls] |
✕
html = resp["Body"]; |
The response contains a raw HTML document:
✕
Snippet[html, 4] |
"<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
" http://www.w3.org/TR/html4/loose.dtd " >
< html xmlns = " http://www.w3.org/TR/REC - html40 " >
< head > "
To parse the HTML directly, I need more utilities.
✕
getFirstTable[html_] := First[StringCases[html, "<table" ~~ Shortest[___] ~~ "</table"]] getTableRows[table_] := StringCases[table, "<tr" ~~ Shortest[___] ~~ "</tr"] getTableKeys[rows_] := StringReplace[ StringCases[First[rows], "<th" ~~ Shortest[___] ~~ ">" ~~ k : Shortest[__] ~~ "</th" :> k], "<" ~~ Shortest[___] ~~ ">" -> ""] getRowValues[row_] := StringReplace[ StringCases[row, "<td" ~~ Shortest[___] ~~ ">" ~~ k : Shortest[___] ~~ "</td" :> k], "<" ~~ cont : Shortest[___] ~~ ">" :> If[StringFreeQ[cont, "check"], "", "True"]] importPlayerTable[html_] := With[{rows = getTableRows[getFirstTable[html]]}, With[{keys = getTableKeys[rows]}, interpretRow[AssociationThread[keys -> getRowValues[#]]] & /@ Rest[rows] ] ] |
✕
importeddata = importPlayerTable[html]; |
Now, "paid" shows the correct values.
✕
Lookup[importeddata, "paid"] // Counts |
Dataset
I like working with Dataset. It makes it easy to query the data as well as provides a nice visualization in a table.
✕
playerdata = Dataset[importeddata] |
In case manually parsing HTML was not messy enough, the chummy Ultimate community of St. Louis has another trick. Players know that instead of properly entering their requests into the registration form, they can just email the league coordinator and tell them what they want. To help sort out these emails, I made a utility function for finding players by name or ID.
✕
findPlayer[str_String] := playerdata[ Select[StringContainsQ[#Name, str, IgnoreCase -> True] &]] findPlayer[n_Integer] := playerdata[Select[#["Mem ID"] === n &]] playerPosition[n_Integer] := First@Flatten@FirstPosition[playerdata, n] |
For example, Maria Garcia accidentally registered as a manager (or captain). Then she emailed to say she does not want that responsibility.
✕
findPlayer["Garcia"] |
✕
p = playerPosition[2924] |
✕
importeddata[[p, "mgr"]] = False |
After several other manual adjustments, I recreate the dataset.
✕
playerdata = Dataset[importeddata]; |
Then I get a list of the unique IDs that we will use to create a graph.
✕
ids = Normal[playerdata[All, "Mem ID"]]; |
✕
Length[ids] |
Baggage
For the baggage data, my optimism pays off; the automatic importing works!
✕
rawbaggage = Import[$urls[[2]], "Data"]; |
✕
Length[rawbaggage] |
The data is very simple, just pairs of ID numbers.
✕
Short[rawbaggage] |
✕
edges = DirectedEdge @@@ rawbaggage; |
Create a Graph Including Baggage
Now we’ve reached the interesting part, where the Graph visualization and manipulation makes a hard problem easy.
I start by creating one large graph, where each vertex is a player and each edge is a baggage request. Because baggage requests need not be mutual, it’s a directed graph.
To add more information to the visualization, I style the graph using green for men, orange for women and large vertices for captains. The slightly darker colors represent people who have not yet paid. I label the vertices with the unique member ID ("Mem ID") property.
Initialize Symbols
✕
$numTeams = 8; coreIDs = {}; teams = teamlist = List /@ Range[$numTeams]; selected = 0; |
Graph Styling Tools
✕
teamSummary[tn_] := teamsummary[tn, teams[[tn]]] teamsummary[tn_, team_] := With[{rows = playerdata[Select[MemberQ[team, #["Mem ID"]] &]]}, Style["Team " <> ToString[tn] <> ": " <> ToString[Length[team]] <> " Playersn" <> TextString@Normal@Counts[rows[All, "Gender"]] <> "nMedian Age = " <> ToString[N@Median[rows[All, "age"]]] <> "nMean Power = " <> ToString[N@Mean[rows[All, "pow"]]], 14]] Clear[$genderColors, $bigCaptains, $nameLabels, $idLabels]; $idLabels := ($idLabels = Thread[ids -> ids]); $genderColors := ($genderColors = Normal[playerdata[ All, #["Mem ID"] -> If[! TrueQ[#paid], Darker, Identity]@ Switch[#Gender, "Male", Green, "Female", Orange, _, Print["g" -> #Gender]; Black] &]]) $bigCaptains := ($bigCaptains = Normal[playerdata[All, #["Mem ID"] -> If[TrueQ[#mgr], 1, .4] &]]) $nameLabels := ($nameLabels = Normal[playerdata[ All, #["Mem ID"] -> StringTake[#Name, UpTo[25]] &]]) |
Immediately there’s a big problem: about a third of the eight-team league is connected in a continuous “baggage chain.” This is pretty typical of my yearly experience. In the worst year so far, over half the players were connected to each other.
✕
originalgraph = Graph[ids, edges, VertexLabels -> $idLabels, VertexStyle -> $genderColors, VertexSize -> $bigCaptains] |
Determine Desired Team Shape
Before I start cutting those groups apart, I compute some quick statistics about how the average team should look.
Number of Players
✕
N[Length[ids]/$numTeams] |
Number of Women per Team
✕
N@Length[playerdata[Select[#Gender == "Female" &]]]/$numTeams |
Average Age and Power
✕
{N@playerdata[Mean, "age"], playerdata[Median, "age"]} |
✕
N@playerdata[Mean, "pow"] |
Baggage-Breaking Chains
These simple statistics give me an idea of what size group I can allow to stay. To split the large graph into connected groups that I call “cores,” I use WeaklyConnectedGraphComponents.
✕
connectedgraphs = WeaklyConnectedGraphComponents[originalgraph]; |
✕
VertexCount /@ connectedgraphs |
✕
subgraph = First@connectedgraphs |
Sometimes it’s hard to pick out the exact edges from looking at the graph. That’s where EdgeList comes in handy.
✕
EdgeList[subgraph, 2180 [DirectedEdge] _ | _ [DirectedEdge] 2180] |
Once I gather a list of edges that I think are smart to cut, I use HighlightGraph to verify. The list here includes two edges I picked from the second connected component as well.
✕
cuts = {4032 [DirectedEdge] 7, 2065 [DirectedEdge] 11, 3156 [DirectedEdge] 11, 2180 [DirectedEdge] 3098, 2180 [DirectedEdge] 2851, 2180 [DirectedEdge] 3547, 744 [DirectedEdge] 629, 744 [DirectedEdge] 645}; |
✕
HighlightGraph[subgraph, DirectedEdge @@@ cuts] |
After removing those edges, the graph looks much more manageable.
✕
modifiedgraph = EdgeDelete[originalgraph, DirectedEdge @@@ cuts] |
Group Cores into Teams
I’m done being the bad guy, splitting people up. Now I get to be the nice guy, bringing folks together. This involves grouping the connected subgraphs, or what I call “cores” of players, intelligently to make nice teams.
The problem of how to group those cores into teams is not trivial. There are several dimensions to consider:
- Each team should have similar numbers of men and women
- Each team should have a captain
- Teams should be as evenly matched as possible
The last item is tricky. The registration data contains three subjective, self-evaluation metrics: “athleticism,” “skill” and “experience,” as well as a metric called “power,” which is a linear weighting of the other three. However, these tend to be better measures of humility (or perhaps gamesmanship) than actual ability. The most objective measure that has some bearing on ability is age. This year, players in the league range from 15 to 58. Experience shows that teams made up of too many players at either the young or old ends of the range tend to be less competitive.
I’ve played around with methods to automate this process by optimizing a utility function, but have not had success yet. Maybe I’ll do that in a year or two and write a follow-up. Maybe you, dear reader, will do it for me. For now, I’ll show you the notebook GUI I made for manually sorting cores into teams.
✕
coreIDs = WeaklyConnectedComponents[modifiedgraph]; |
For a Fun Visual: CommunityGraphPlot
✕
CommunityGraphPlot[modifiedgraph, Labeled[#, Style[ToString[Length[#]] <> " Players", 24]] & /@ Take[coreIDs, $numTeams], VertexLabels -> $idLabels, VertexStyle -> $genderColors, VertexSize -> $bigCaptains, ImageSize -> 700] |
Create a GUI for Building Teams from Cores
First, I define a list of buttons for selecting teams, showing the team graph on each button.
✕
dynamicTeamGraphButtons[fullgraph_] := Dynamic[Button[ With[{team = Flatten[coreIDs[[teamlist[[#]]]]]}, Subgraph[fullgraph, team, PlotLabel -> teamsummary[#, team], VertexLabels -> $idLabels, VertexStyle -> $genderColors, VertexSize -> $bigCaptains, ImageSize -> 200] ], selected = #, Appearance -> If[selected == #, "Pressed", Automatic]], TrackedSymbols :> {teamlist, selected}] & /@ Range[$numTeams] |
To remove a core from the selected team:
✕
moveCore[tn_, selected_] := (teamlist = teamlist /. tn -> Nothing) /; MemberQ[teamlist[[selected]], tn] |
To add a core to the selected team:
✕
moveCore[tn_, selected_] := (teamlist = teamlist /. tn -> Nothing; AppendTo[teamlist[[selected]], tn]) |
To create a brief summary grid for a core:
✕
coreInfoSmall[ids_] := Module[{rows = playerdata[Select[MemberQ[ids, #["Mem ID"]] &]], gender}, gender = Lookup[Normal@Counts[rows[All, "Gender"]], {"Female", "Male"}, 0]; Grid[{{"f", Style[gender[[1]], Orange]}, {"m", Style[gender[[2]], Green]}, {"a", N@rows[Median, "age"]}, {"p", N@rows[Mean, "pow"]}}, Spacings -> 0 ]] |
To create a button for adding/removing a core:
✕
coreButtons[coreids_] := Dynamic[Button[Row[{#, " ", coreInfoSmall[coreids[[#]]]}], moveCore[#, selected], Appearance -> Which[ MemberQ[teamlist[[selected]], #], "Pressed", MemberQ[Flatten[teamlist], #], "Palette", True, "DialogBox" ]], TrackedSymbols :> {teamlist, selected}] & /@ Range[Length[coreids]] |
And the full GUI:
✕
makeGUI[graph_, coreids_] := Panel@Grid[{{Grid[ Partition[dynamicTeamGraphButtons[graph], UpTo[Ceiling[$numTeams/2]]]]}, {coreButtons[coreids]}}] |
With this, we have the GUI. Initially, the largest eight cores are each put into a team. On top, the GUI shows the current state of the teams with graphs and brief summaries of the important metrics. There’s a graph for each team along with the number of men and women, and then two statistics, the median age and mean power. The graphs and summaries are buttons that can be used to select a team from which to add or remove cores of players.
Beneath that is a row of buttons for the cores. Each button shows the same metrics as the team summaries. This makes it pretty easy to smartly match up teams with cores that will bring them closer to the league-wide average.
✕
makeGUI[modifiedgraph, coreIDs] |
I start by adding cores to teams to make sure each team has a captain and at least four women. For similar cores like 18 and 19—each of which have one woman, one man and no captains—I choose which core goes on which team, in order to even out the median age.
✕
makeGUI[modifiedgraph, coreIDs] |
Finally, I add the men to the teams so that each team has approximately the same number, while also trying to level the age and power values.
✕
makeGUI[modifiedgraph, coreIDs] |
Results
Now I’ve grouped all the players into teams! I reformat those lists from core IDs to player IDs.
The variable teamlist is a list of all the cores for each team.
✕
teamlist |
I turn that into a list of player IDs for each team.
✕
getTeam[tl_] := Flatten[coreIDs[[tl]]] |
✕
teams = getTeam /@ teamlist |
I always double-check that no one was missed and no one was mistakenly included by comparing the team lists with the original list of player IDs.
✕
Complement[Sort[Flatten[teams]], Sort[ids]] |
✕
Complement[Sort[ids], Sort[Flatten[teams]]] |
Analysis
For some basic sanity tests, I make a dataset for each team.
✕
teamdatasets = Function[{core}, playerdata[Select[MemberQ[core, #["Mem ID"]] &]]] /@ teams; |
Then I find the captains for each team. Team five is missing a captain, so someone will need to be volunteered.
✕
Dataset[Association[{"Captain" -> #[ Select[#["mgr"] &] /* (StringRiffle[#, "; "] &), "Name"], Normal[#[Counts, "Gender"]]}] & /@ teamdatasets] |
Next, I compare age and each self-evaluation metric with box whisker charts. Team seven may have a minor athleticism deficit, but overall the parity is good.
✕
makechart[param_] := BoxWhiskerChart[(Labeled[#, Mean[DeleteMissing@#], Bottom] &[ N@ToExpression[ StringReplace[ ToString[Normal[#[All, param]]], {"A" -> "", "E" -> "", "S" -> ""}]] /. Null -> Missing[]]) & /@ teamdatasets, PlotLabel -> param, ImageSize -> Large, ChartLabels -> Range[6]] |
✕
makechart /@ {"age", "pow", "ath", "skl"} |
See the Real (Fake) Names
Now that I’m happy with the teams, I can finally view the names without fear of biasing myself.
✕
Framed[Subgraph[originalgraph, teams[[#]], VertexLabels -> $nameLabels, ImageSize -> 500, PlotLabel -> "Team " <> ToString[#]]] & /@ Range[$numTeams] |
Wrapping Up with Export
When I’m done, I export the teams as CSV files and email them to the coordinator. Personally, I’d rather export them to cloud objects and send the links out. But some people love email attachments.
✕
CreateDirectory[ dir = FileNameJoin[{NotebookDirectory[], "slua2019/v7"}], CreateIntermediateDirectories -> True]; dir |
✕
files = With[{roster = teamdatasets[[#]]}, Export[FileNameJoin[{dir, "team" <> ToString[#] <> ".csv"}], roster]] & /@ Range[$numTeams]; zip = CreateArchive[dir] SendMail[<|"To" -> "leaguecoordinator@example.com", "Subject" -> "Check these teams", "Attachments" -> zip|>] |
And that’s how the Ultimate teams are made step by step. Of course, this isn’t limited to building Ultimate teams; you can apply this method to help organize other groups of people with complicated systems or “baggage.” There are several steps—like sorting cores and importing and exporting data—where the Wolfram Language provides a convenient tool, and one step (splitting baggage chains) where the Wolfram Language turns an overwhelming problem into a simple task. What do you think, Mr. Bird?
✕
ResourceFunction["BirdSay"][ Column@{"That's cool!", First@WebImageSearch["Ultimate Frisbee", "Thumbnails", MaxItems -> 1]}] |
Start building your own teams with Wolfram|One, the first fully cloud-desktop hybrid, integrated computation platform. |
If there is a thing that could save this World and this civilization, it is TEAMWORK at its basic factor.
I can see clearly how a lot of human problems can be solved by synergisticaly teaming people. This article, written as a curiosity, its only a face of a multifaceted part of psicological science that hold power and depth to change history. It has done it and it will keep doing it.
Thanks for sharing this development. I really love the fact that I have stumbled upon it!