In past blog posts, we’ve talked about the Wolfram Language’s built-in, high-level functionality for 3D printing. Today we’re excited to share an example of how some more general functionality in the language is being used to push the boundaries of this technology. Specifically, we’ll look at how computation enables 3D printing of very intricate sugar structures, which can be used to artificially create physiological channel networks like blood vessels.
Let’s think about how 3D printing takes a virtual design and brings it into the physical world. You start with some digital or analytical representation of a 3D volume. Then you slice it into discrete layers, and approximate the volume within each layer in a way that maps to a physical printing process. For example, some processes use a digital light projector to selectively polymerize material. Because the projector is a 2D array of pixels that are either on or off, each slice is represented by a binary bitmap. For other processes, each layer is drawn by a nozzle or a laser, so each slice is represented by a vector image, typically with a fixed line width.
In each case, the volume is represented as a stack of images, which, again, is usually an approximation of the desired design. Greater fidelity can be achieved by increasing the resolution of the printer—that is, the smallest pixel or thinnest line it can create. However, there is a practical limit, and sometimes a physical limit to the resolution. For example, in digital light projection a pixel cannot be made much smaller than the wavelength of the light used. Therefore, for some kinds of designs, it’s actually easier to achieve higher fidelity by modifying the process itself. Suppose, for example, you want to make a connected network of cylindrical rods with arbitrary orientation (there is a good reason to do this—we’ll get to that). Any process based on layers or pixels will produce some approximation of the cylinders. You might instead devise a process that is better suited to making this shape.
One type of 3D printing, termed fused deposition modeling, deposits material through a cylindrical nozzle. This is usually done layer by layer, but it doesn’t have to be. If the nozzle is translated in 3D, and the material can be made to stiffen very quickly upon exiting, then you have an elegant way of making arbitrarily oriented cylinders. If you can get new cylinders to stick to existing cylinders, then you can make very interesting things indeed. This nonplanar deposition process is called direct-write assembly, wireframe printing or freeform 3D printing.
Things that you would make using freeform 3D printing are best represented not as solid volumes, but as structural frames. The data structure is actually a graph, where the nodes of the graph are the joints, and the edges of the graph are the beams in the frame. In the following image, you’ll see the conversion of a model to a graph object. Directed edges indicate the corresponding beam can only be drawn in one direction. An interesting computational question is, given such a frame, how do you print it? More precisely, given a machine that can “draw” 3D beams, what sequence of operations do you command the machine to perform?
First, we can distinguish between motions where we are drawing a beam and motions where we are moving the nozzle without drawing a beam. For most designs, it will be necessary to sometimes move the nozzle without drawing a beam. In this discussion, we won’t think too hard about these nonprinting motions. They take time, but, at least in this example, the time it takes to print is not nearly as important as whether the print actually succeeds or fails catastrophically.
We can further define the problem as follows. We have a set of beams to be printed, and each beam is defined by two joints, {a, b}. Give a sequence of beams and a printing direction for each beam (i.e. a→b or b→a) that is consistent with the following constraints:
1) Directionality: for each beam, we need to choose a direction so that the nozzle doesn’t collide with that beam as it’s printed.
2) Collision: we have to make sure that as we print each beam, we don’t hit a previously printed beam with the nozzle.
3) Connection: we have to start each beam from a physical surface, whether that be the printing substrate or an existing joint.
Let’s pause there for a moment. If these are the only three constraints, and there are only three axes of motion, then finding a sequence that is consistent with the constraints is straightforward. To determine whether printing beam B would cause a collision with beam A, we first generate a volume by sweeping the nozzle shape along the path coincident with beam B to form the 3D region R. If RegionDisjoint[R, A] is False, then printing beam B would cause a collision with beam A. This means that beam A has to be printed first.
Here’s an example from the RegionDisjoint reference page to help illustrate this. Red walls collide with the cow and green walls do not:
cow=ExampleData[{"Geometry3D","Cow"},"MeshRegion"];
w1=Hyperplane[{1,0,0},0.39]; w2=Hyperplane[{1,0,0},0.45]; 
wallColor[reg_,wall_]:=If[RegionDisjoint[reg,wall],Green,Red] 
Show[cow,Graphics3D[{{wallColor[cow,w1],w1},{wallColor[cow,w2],w2}}],PlotRangePadding->.04]
Mimicking the logic from this example, we can make a function that takes a swept nozzle and finds the beams that it collides with. Following is a Wolfram Language command that visualizes nozzle-beam collisions. The red beams must be drawn after the green one to avoid contact with the blue nozzle as it draws the green beam:
HighlightNozzleCollisions[,{{28,0,10},{23,0,10}}] 
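For readers who want the gist outside the Wolfram Language, here is a minimal Python sketch of the same precedence test. It approximates the swept nozzle volume as a capsule of radius `clearance` around beam B's axis (the post uses the exact swept nozzle shape via RegionDisjoint); `segment_distance` and `must_precede` are hypothetical helper names, not functions from the actual printer software:

```python
# Hypothetical helpers sketching the precedence test. We approximate the
# swept nozzle volume as a capsule of radius `clearance` around beam B's
# axis; if beam A comes closer than that, A must be printed before B.

def _clamp(t, lo=0.0, hi=1.0):
    return max(lo, min(hi, t))

def segment_distance(p1, q1, p2, q2):
    """Minimum distance between 3D segments p1-q1 and p2-q2 (Ericson's method)."""
    sub = lambda a, b: tuple(x - y for x, y in zip(a, b))
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    d1, d2, r = sub(q1, p1), sub(q2, p2), sub(p1, p2)
    a, e, f = dot(d1, d1), dot(d2, d2), dot(d2, r)
    EPS = 1e-12
    if a <= EPS and e <= EPS:
        s = t = 0.0
    elif a <= EPS:
        s, t = 0.0, _clamp(f / e)
    else:
        c = dot(d1, r)
        if e <= EPS:
            s, t = _clamp(-c / a), 0.0
        else:
            b = dot(d1, d2)
            denom = a * e - b * b
            s = _clamp((b * f - c * e) / denom) if denom > EPS else 0.0
            t = (b * s + f) / e
            if t < 0.0:
                t, s = 0.0, _clamp(-c / a)
            elif t > 1.0:
                t, s = 1.0, _clamp((b - c) / a)
    c1 = tuple(p + s * d for p, d in zip(p1, d1))
    c2 = tuple(p + t * d for p, d in zip(p2, d2))
    return dot(sub(c1, c2), sub(c1, c2)) ** 0.5

def must_precede(beam_a, beam_b, clearance=1.0):
    """True if printing beam B would sweep the nozzle through beam A."""
    return segment_distance(*beam_a, *beam_b) < clearance
```

The capsule is a deliberately crude stand-in; the important point is that the pairwise test yields the precedence relation ("A before B") that drives everything that follows.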
For a printer with three axes of motion, it isn’t particularly difficult to compute collision constraints between all the pairs of beams. We can actually represent the constraints as a directed graph, with the nodes representing the beams, or as an adjacency matrix, where a 1 in element (i, j) indicates that beam i must precede beam j. Here’s the collision matrix for the bridge:
A feasible sequence exists, provided this precedence graph is acyclic. At first glance, it may seem that a topological sort will give such a feasible sequence; however, this does not take the connection constraint into consideration, and therefore non-anchored beams might be sequenced. Somewhat surprisingly, TopologicalSort can often yield a sequence with very few connection violations. For example, in the topological sort, only the 12th and 13th beams violate the connection constraint:
ordering=TopologicalSort[AdjacencyGraph[collisionMatrix]] (* collisionMatrix: the 135×135 sparse collision matrix (2832 specified elements) *)
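TopologicalSort is doing standard work here. For illustration, here is a self-contained Python sketch using Kahn's algorithm, run on a toy five-beam precedence graph rather than the 135-beam bridge from the post:

```python
# Kahn's algorithm: repeatedly emit a node with no unprinted predecessors.
# Edge (u, v) means "beam u must be printed before beam v".
from collections import deque

def topological_sort(n, edges):
    succ = {i: [] for i in range(n)}
    indeg = [0] * n
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    queue = deque(i for i in range(n) if indeg[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("precedence graph has a cycle; no feasible sequence")
    return order

# Toy precedence graph: beams 0 and 1 must precede 2; 2 precedes 3; 1 precedes 4.
order = topological_sort(5, [(0, 2), (1, 2), (2, 3), (1, 4)])
```

As the post notes, this respects collisions but not connection, which is why a plain topological sort can schedule a beam before anything exists to anchor it to.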
Instead, to consider all three aforementioned constraints, you can build a sequence in the following greedy manner. At each step, print any beam such that: (a) the beam can be printed starting from either the substrate or an existing joint; and (b) all of the beam’s predecessors have already been printed. There’s actually a clever way to speed this up: go backward. Instead of starting at the beginning, with no beams printed, figure out the last beam you’d print. Remove that last beam, then repeat the process. You don’t have to compute collision constraints for a beam that’s been removed. Keep going until all the beams are gone, then just print in the reverse removal order. This can save a lot of time, because this way you never have to worry about whether printing one beam will make it impossible to print a later beam due to collision. For a three-axis printer this isn’t a big deal, but for a four- or five-axis robot arm it is.
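The backward strategy can be sketched in a few lines of Python. Here `removable(beam, remaining)` is a hypothetical predicate standing in for the real collision and connection checks; it answers "could this beam be the last one printed, given what remains?":

```python
# Sketch of the backward greedy strategy: repeatedly "unprint" a beam that
# could legally have been printed last, then reverse the removal order.
# `removable(beam, remaining)` is a hypothetical stand-in for the real
# collision/connection checks.

def backward_sequence(beams, removable):
    remaining = set(beams)
    removal_order = []
    while remaining:
        candidates = [b for b in remaining if removable(b, remaining)]
        if not candidates:
            raise ValueError("stuck: no beam can be printed last")
        b = candidates[0]
        remaining.remove(b)
        removal_order.append(b)
    return removal_order[::-1]  # print in reverse removal order

# Toy example: three beams stacked vertically; only the topmost remaining
# beam can be the last one printed.
sequence = backward_sequence({0, 1, 2}, lambda b, remaining: b == max(remaining))
```

The greedy loop never backtracks, which is exactly why the cantilever constraint introduced below breaks it.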
So the assembly problem under collision, connection and directionality constraints isn’t that hard. However, for printing processes where the material is melted and solidifies by cooling, there is an additional constraint. This is shown in the following video:
See what happened? The nozzle is hot, and it melts the existing joint. Some degree of melting is unfortunately necessary to fuse new beams to existing joints. We could add scaffolding or try to find some physical solution, but we can circumvent it in many cases by computation alone. Specifically, we can find a sequence that is not only consistent with collision, connection and directionality constraints, but that also never requires a joint to simultaneously support two cantilevered beams. Obviously some things, like the tree we tried to print previously, are impossible to print under this constraint. However, it turns out that some very intimidating-looking designs are in fact feasible.
We approach the problem by considering the assembly states. A state is just the set of beams that has been assembled, and contains no information about the order in which they were assembled. Our goal is to find a path from the start state to the end state. Because adjacent states differ by the presence of a single beam, each path corresponds to a unique assembly sequence. For small designs, we can actually generate the whole graph. However, for large designs, exhaustively enumerating the states would take forever. For illustrative purposes, here’s a structure where the full assembly state is small enough to enumerate. Note that some states are unreachable or are a dead end:
Note that, whether you start at the beginning and go forward or start at the end and work backward, you can find yourself in a dead end. These dead ends are labeled G and H in the figure. There might be any number of dead ends, and you may have to visit all of them before you find a sequence that works. You might never find a sequence that works! This problem is actually NP-complete—that is, you can’t know if there is a feasible sequence without potentially trying all of them. The addition of the cantilever constraint is what makes the problem hard. You can’t say for sure if printing a beam is going to make it impossible to assemble another beam later. What’s more, going backward doesn’t solve that problem: you can’t say for sure if removing a beam is going to make it impossible to remove a beam later due to the cantilever constraint.
The key word there is “potentially.” Usually you can find a sequence without trying everything. The algorithm we developed searches the assembly graph for states that don’t contain cantilevers. If you get to one of these states, it doesn’t mean a full sequence exists. However, it does mean that if a sequence exists, you can find one without backtracking past this particular cantilever-free state. This essentially divides the problem into a series of much smaller NP-complete graph search problems. Except in contrived cases, these can be solved quickly, enabling construction of very intricate models:
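A rough Python sketch of this kind of state-space search with dead-end memoization follows (the actual algorithm also exploits cantilever-free checkpoint states, which this toy version omits). `legal_last(b, state)` is a hypothetical predicate bundling the collision, connection, directionality and cantilever constraints:

```python
# Toy state-space search with dead-end memoization. A state is the frozen
# set of beams assembled; we search backward from the complete structure.
# `legal_last(b, state)` is a hypothetical constraint predicate.

def find_sequence(all_beams, legal_last):
    dead_ends = set()  # states already proven infeasible

    def search(state, acc):
        if not state:
            return acc[::-1]  # everything removed; reverse to get print order
        if state in dead_ends:
            return None
        for b in state:
            if legal_last(b, state):
                result = search(state - {b}, acc + [b])
                if result is not None:
                    return result
        dead_ends.add(state)  # remember this dead end
        return None

    return search(frozenset(all_beams), [])

# Same stacked-tower toy as before: only the topmost beam may be last.
seq = find_sequence({0, 1, 2}, lambda b, state: b == max(state))
```

Unlike the greedy loop, this version backtracks when it hits a dead end, and the `dead_ends` set keeps it from re-exploring states already proven infeasible.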
FindFreeformPath[,Monitor->Full]


So that mostly solves the problem. However, further complicating matters is that these slender beams are about as strong as you might expect. Gravity can deform the construct, but there is actually a much larger force attributable to the flow of material out of the nozzle. This force can produce catastrophic failure, such as the instability shown here:
However, it turns out that intelligent sequencing can solve this problem as well. Using models developed for civil engineering, it is possible to compute at every potential step the probability that you’re going to break your design. The problem then becomes not one of finding the shortest path to the goal, but of finding the safest path to the goal. This step requires inversion of large matrices and is computationally intensive, but with the Wolfram Language’s fast builtin solvers, it becomes feasible to perform this process hundreds of thousands of times in order to find an optimal sequence.
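The "safest path" idea has a standard shortest-path reformulation: if each printing step succeeds independently with probability p, then maximizing the product of step probabilities over a sequence is the same as minimizing the sum of -log(p), which an ordinary Dijkstra search handles. A minimal Python sketch over a toy state graph (the probabilities are illustrative, not from the structural model in the post):

```python
# Dijkstra over a state graph whose edge weights are -log(success probability).
# The minimum-weight path then maximizes the overall probability of success.
import heapq
import math

def safest_path(graph, start, goal):
    """graph[u] = list of (v, p), with p the success probability of step u->v."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            path = [u]
            while u != start:
                u = prev[u]
                path.append(u)
            return path[::-1], math.exp(-d)  # path, overall success probability
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, p in graph[u]:
            nd = d - math.log(p)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None, 0.0

# Toy graph: the route through A (0.9 * 0.9 = 0.81) beats the one through
# B (0.99 * 0.8 = 0.792), even though B's first step looks safer.
toy = {"start": [("A", 0.9), ("B", 0.99)],
       "A": [("goal", 0.9)],
       "B": [("goal", 0.8)],
       "goal": []}
path, prob = safest_path(toy, "start", "goal")
```

The toy also shows why greedily taking the safest next step is not enough: the overall product is what matters.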
So that’s the how. The next question is, “Why?” Well, the problem is simple enough. Multicellular organisms require a lot of energy. This energy can only be supplied by aerobic respiration, a fancy term for a cascade of chemical reactions. These reactions use oxygen to produce the energy required to power all higher forms of life. Nature has devised an ingenious solution: a complex plumbing system and an indefatigable pump delivering oxygen-rich blood to all of your body’s cells, 24/7. If your heart doesn’t beat at least once every couple of seconds, your brain doesn’t receive enough oxygen-rich blood to maintain consciousness.
We don’t really understand superhighlevel biological phenomena like consciousness. We can’t, as far as we can tell, engineer a conscious array of cells, or even of transistors. But we understand pretty well the plumbing that supports consciousness. And it may be that if we can make the plumbing and deliver oxygen to a sufficiently thick slab of cells, we will see some emergent phenomena. A conscious brain is a long shot, a functional piece of liver or kidney decidedly less so. Even a small piece of vascularized breast or prostate tissue would be enormously useful for understanding how tumors metastasize.
The problem is, making the plumbing is hard. Cells in a dish do self-organize to an extent, but we don't understand such systems well enough to tell a bunch of cells to grow into a brain. Plus, as noted, growing a brain sort of requires attaching it to a heart. Perhaps if we understand the rules that govern the generation of biological forms, we can generate them at will. We know that with some simple mathematical rules, one can generate very complex, interesting structures—the stripes on a zebra, the venation of a leaf. But going backward, reverse-engineering the rule from the form, is hard, to say the least. We have mastered the genome and can program single cells, but we are novices at best when it comes to predicting or programming the behavior of cellular ensembles.
An alternative means of generating biological forms like vasculature is a bit cruder—just draw the form you want, then physically place all the cells and the plumbing according to your blueprint. This is bioprinting. Bioprinting is exciting because it reduces the generation of biological forms into a set of engineering problems. How do we make a robot put all these cells in the right place? These days, any sentence that starts with “How do we make a robot...” probably has an answer. In this case, however, the problem is complicated by the fact that, while the robot or printer is working, the cells that have already been assembled are slowly dying. For really big, complex tissues, either you need to supply oxygen to the tissue as you assemble it or you need to assemble it really fast.
One approach of the really fast variety was demonstrated in 2009. Researchers at Cornell used a cotton candy machine to melt-spin a pile of sugar fibers. They cast the sugar fibers in a polymer, dissolved them out with water and made a vascular network in minutes, albeit with little control over the geometry. A few years later, researchers at the University of Pennsylvania used a hacked desktop 3D printer to draw molten sugar fibers into a lattice and show that the vascular casting approach was compatible with a variety of cell-laden gels. This was more precise, but not quite freeform. The next step, undertaken in a collaboration between researchers at the University of Illinois at Urbana–Champaign and Wolfram Research, was to overcome the physical and computational barriers to making really complex designs—in other words, to take sugar printing and make it truly freeform.
We’ve described the computational aspects of freeform 3D printing in the first half of this post. The physical side is important too.
First, you need to make a choice of material. Prior work has used glucose or sucrose—things that are known to be compatible with cells. The problem with these materials is twofold: One, they tend to burn. Two, they tend to crystallize while you’re trying to print. If you’ve ever left a jar of honey or maple syrup out for a long time, you can see crystallization in action. Crystals will clog your nozzle, and your print will fail. Instead of conventional sugars, this printer uses isomalt, a low-calorie sugar substitute. Isomalt is less prone to burning or crystallizing than other sugar-like materials, and it turns out that cells are just as OK with isomalt as they are with real sugar.
Next, you need to heat the isomalt and push it out of a tiny nozzle under high pressure. You have to draw pretty slowly—the nozzle moves about half a millimeter per second—but the filament that is formed coincides almost exactly with the path taken by the nozzle. Right now it’s possible to draw filaments anywhere from 50 to 500 micrometers in diameter, a very nice range for blood vessels.
So the problems of turning a design into a set of printer instructions, and of having a printer that is sufficiently precise to execute them, are more or less solved. This doesn’t mean that 3D-printed organs are just around the corner. There are still problems to be solved in introducing cells in and around these vascular molds. Depending on the ability of the cells to self-organize, dumping them around the mold or flowing them through the finished channels might not be good enough. In order to guide development of the cellular ensemble into a functional tissue, more precise patterning may be required from the outset; direct cell printing would be one way to do this. However, our understanding of self-organizing systems increases every day. For example, last year researchers reproduced the first week of mouse embryonic development in a petri dish. This shows that in the right environment, with the right mix of chemical signals, cells will do a lot of the work for us. Vascular networks deliver oxygen, but they can also deliver things like drugs and hormones, which can be used to poke and prod the development of cells. In this way, bioprinting might enable not just spatial but also temporal control of the cells’ environment. It may be that we use the vascular network itself to guide the development of the tissue deposited around it. Cardiologists shouldn’t expect a 3D-printed heart for their next patients, but scientists might reasonably ask for a 3D-printed sugar scaffold for their next experiments.
So to summarize, isomalt printing offers a route to making interesting physiological structures. Making it work requires a certain amount of mechanical and materials engineering, as one might expect, but also a surprising amount of computational engineering. The Wolfram Language provides a powerful tool for working with geometry and physical models, making it possible to extend freeform bioprinting to arbitrarily large and complex designs.
To learn more about our work, check out our papers: a preprint regarding the algorithm (to appear in IEEE Transactions on Automation Science and Engineering), and another preprint regarding the printer itself (published in Additive Manufacturing).
This work was performed in the Chemical Imaging and Structures Laboratory under the principal investigator Rohit Bhargava at the University of Illinois at Urbana–Champaign.
Matt Gelber was supported by fellowships from the Roy J. Carver Charitable Trust and the Arnold and Mabel Beckman Foundation. We gratefully acknowledge the gift of isomalt and advice on its processing provided by Oliver Luhn of Südzucker AG/BENEO-Palatinit GmbH. The development of the printer was supported by the Beckman Institute for Advanced Science and Technology via its seed grant program.
We also would like to acknowledge Travis Ross of the Beckman Institute Visualization Laboratory for help with macrophotography of the printed constructs. We also thank the contributors of the CAD files on which we based our designs: GrabCAD user M. G. Fouché, 3D Warehouse user Damo and Bibliocas user limazkan (Javier Mdz). Finally, we acknowledge Seth Kenkel for valuable feedback throughout this project.
The story started with a conversation about data with some of the Bloodhound team, which is trying to create a 1000 mph car. I offered to spend an hour or two looking at some sample data to give them some ideas of what might be done. They sent me a curious binary file that somehow contained the output of 32 sensors recorded from a single subsonic run of the ThrustSSC car (the current holder of the world land speed record).
The first thing I did was code the information that I had been given about the channel names and descriptions, in a way that I could easily query:
channels={"SYNC"->"Synchronization signal","D3fm"->"Rear left active suspension position","D5fm"->"Rear right active suspension position","VD1"->"Unknown","VD2"->"Unknown","L1r"->"Load on front left wheel","L2r"->"Load on front right wheel","L3r"->"Load on rear left wheel","L4r"->"Load on rear right wheel","D1r"->"Front left displacement","D2r"->"Front right displacement","D4r"->"Rear left displacement","D6r"->"Rear right displacement","Rack1r"->"Steering rack displacement rear left wheel","Rack2r"->"Steering rack displacement rear right wheel","PT1fm"->"Pitot tube","Dist"->"Distance to go (unreliable)","RPM1fm"->"RPM front left wheel","RPM2fm"->"RPM front right wheel","RPM3fm"->"RPM rear left wheel","RPM4fm"->"RPM rear right wheel","Mach"->"Mach number","Lng1fm"->"Longitudinal acceleration","EL1fm"->"Engine load left mount","EL2fm"->"Engine load right mount","Throt1r"->"Throttle position","TGTLr"->"Turbine gas temperature left engine","TGTRr"->"Turbine gas temperature right engine","RPMLr"->"RPM left engine spool","RPMRr"->"RPM right engine spool","NozLr"->"Nozzle position left engine","NozRr"->"Nozzle position right engine"};
SSCData[]=First/@channels; 
SSCData[name_,"Description"]:=Lookup[channels,name,Missing[]]; TextGrid[{#,SSCData[#,"Description"]}&/@SSCData[],Frame->All]
Then on to decoding the file. I had no guidance on format, so the first thing I did was pass it through the 200+ fully automated import filters:
DeleteCases[Map[Import["BLK1_66.dat",#]&,$ImportFormats],$Failed] 
Thanks to the automation of the Import command, that only took a couple of minutes to do, and it narrowed down the candidate formats. Knowing that there were 32 channels, and repeatedly visualizing the results of each import and transformation to see if they looked like real-world data, I quickly tumbled on the following:
MapThread[Set,{SSCData/@SSCData[],N[Transpose[Partition[Import["BLK1_66.dat","Integer16"],32]]][[All,21050;;-1325]]}];
Row[ListPlot[SSCData[#],PlotLabel->#,ImageSize->170]&/@SSCData[]]
The ability to automate all 32 visualizations without worrying about details like plot ranges made it easy to see when I had gotten the right import filter and combination of Partition and Transpose. It also let me pick out the interesting time interval quickly by trial and error.
OK, data in, and we can look at all the channels and immediately see that SYNC and Lng1fm contain nothing useful, so I removed them from my list:
SSCData[] = DeleteCases[SSCData[], "SYNC" | "Lng1fm"];
The visualization immediately reveals some very similar-looking plots—for example, the wheel RPMs. It seemed like a good idea to group them into similar clusters to see what would be revealed. As a quick way to do that, I used an idea from social network analysis: to form graph communities based on the relationship between individual channels. I chose a simple family relationship—streams with a squared correlation of at least 0.4, weighted by the correlation strength:
correlationEdge[{v1_,v2_}]:=With[{d1=SSCData[v1],d2=SSCData[v2]}, If[Correlation[d1,d2]^2<0.4,Nothing,Property[UndirectedEdge[v1,v2],EdgeWeight->Correlation[d1,d2]^2]]];
edges = Map[correlationEdge, Subsets[SSCData[], {2}]]; CommunityGraphPlot[Graph[ Property[#, {VertexShape -> Framed[ListLinePlot[SSCData[#], Axes -> False, Background -> White, PlotRange -> All], Background -> White], VertexLabels -> None, VertexSize -> 2}] & /@ SSCData[], edges, VertexLabels -> Automatic], CommunityRegionStyle -> LightGreen, ImageSize -> 530]
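The construction can be mimicked in pure Python: connect two channels when their squared Pearson correlation clears the 0.4 threshold, then take connected components as a crude stand-in for CommunityGraphPlot's community detection. The channel names and data below are invented toy values, not the ThrustSSC channels:

```python
# Toy reconstruction of the correlation graph: edges where the squared
# Pearson correlation is at least 0.4; connected components as a crude
# stand-in for community detection.

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def clusters(channels, threshold=0.4):
    names = list(channels)
    adj = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if correlation(channels[a], channels[b]) ** 2 >= threshold:
                adj[a].add(b)
                adj[b].add(a)
    seen, comps = set(), []
    for name in names:          # flood-fill the connected components
        if name in seen:
            continue
        stack, comp = [name], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

data = {"rpm1": [0, 1, 2, 3, 4],     # two perfectly correlated streams...
        "rpm2": [0, 2, 4, 6, 8],
        "noise": [5, -3, 4, -2, 1]}  # ...and one uncorrelated one
comps = clusters(data)
```

Squaring the correlation makes the threshold symmetric in sign, so strongly anti-correlated channels cluster together too, which is usually what you want for "these sensors are measuring related things."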
I ended up with three main clusters and five uncorrelated data streams. Here are the matching labels:
CommunityGraphPlot[Graph[ Property[#, {VertexShape -> Framed[Style[#, 7], Background -> White], VertexLabels -> None, VertexSize -> 2}] & /@ SSCData[], edges, VertexLabels -> Automatic], CommunityRegionStyle -> LightGreen, ImageSize -> 530]
Generally it seems that the right cluster is speed related and the left cluster is throttle related, but perhaps the interesting one is the top, where jet nozzle position, engine mount load and front suspension displacement form a group. Perhaps all are thrust related.
The most closely aligned channels are the wheel RPMs. Having all wheels going at the same speed seems like a good thing at 600 mph! But RPM1fm, the front-left wheel, is the least correlated. Let’s look more closely at that:
TextGrid[ Map[SSCData[#, "Description"] &, MaximalBy[Subsets[SSCData[], {2}], Abs[Correlation[SSCData[#[[1]]], SSCData[#[[2]]]]] &, 10]], Frame -> All]
I have no units for any instruments and some have strange baselines, so I am not going to assume that they are calibrated in an equivalent way. That makes comparison harder. But here I can call on some optimization to align the data before we compare. I rescale and shift the second dataset so that the two sets are as similar as possible, as measured by the Norm of the difference. I can forget about the details of optimization, as FindMinimum takes care of that:
alignedDifference[d1_,d2_]:=With[{shifts=Quiet[FindMinimum[Norm[d1-(a d2+b),1],{a,b}]][[2]]},d1-(a #+b&/.shifts)/@d2];
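For comparison, here is the same alignment idea in Python, using ordinary least squares (which has a closed form) in place of the 1-norm that FindMinimum minimizes above; the function name mirrors the Wolfram one but is otherwise a hypothetical stand-in:

```python
# Least-squares stand-in for alignedDifference: fit a and b so that
# a*d2 + b best matches d1, then return the residual d1 - (a*d2 + b).
# (The post minimizes the 1-norm numerically; least squares is closed-form.)

def aligned_difference(d1, d2):
    n = len(d1)
    m1, m2 = sum(d1) / n, sum(d2) / n
    var2 = sum((x - m2) ** 2 for x in d2)
    cov = sum((x - m2) * (y - m1) for x, y in zip(d2, d1))
    a = cov / var2 if var2 else 0.0   # least-squares slope
    b = m1 - a * m2                   # least-squares intercept
    return [y - (a * x + b) for x, y in zip(d2, d1)]

# d1 is exactly 2*d2 + 1, so the aligned difference should vanish.
resid = aligned_difference([1, 3, 5, 7], [0, 1, 2, 3])
```

The 1-norm version in the post is more robust to the short sensor glitches mentioned later, which is a good reason to prefer it on real data.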
Let’s look at a closely aligned pair of values first:
ListLinePlot[MeanFilter[alignedDifference[SSCData["RPM3fm"],SSCData["RPM4fm"]],40],PlotRange>All,PlotLabel>"Difference in rear wheel RPMs"] 
Given that the range of RPM3fm was around 0–800, you can see that there are only a few brief events where the rear wheels were not closely in sync. I gradually learned that many of the sensors seem to be prone to very short glitches, and so probably the only real spike is the briefly sustained one in the fastest part of the run. Let’s look now at the front wheels:
ListLinePlot[MeanFilter[alignedDifference[SSCData["RPM1fm"],SSCData["RPM2fm"]],40],PlotRange>All,PlotLabel>"Difference in front wheel RPMs"] 
The differences are much more prolonged. It turns out that desert sand starts to behave like liquid at high velocity, and I don’t know what the safety tolerances are here, but that front-left wheel is the one to worry about.
I also took a look at the difference between the front suspension displacements, where we see a more worrying pattern:
ListLinePlot[MeanFilter[alignedDifference[SSCData["D1r"],SSCData["D2r"]],40],PlotRange>All,PlotLabel>"Difference in front suspension displacements"] 
Not only is the difference a larger fraction of the data ranges, but you can also immediately see a periodic oscillation that grows with velocity. If we are hitting some kind of resonance, that might be dangerous. To look more closely at this, we need to switch paradigms again and use some signal processing tools. Here is the Spectrogram of the differences between the displacements. The Spectrogram is just the magnitude of the discrete Fourier transforms of partitions of the data. There are some subtleties about choosing the partitioning size and color scaling, but by default that is automated for me. We should read it as time along the x axis, frequency along the y axis, and darker values are greater magnitude:
Spectrogram[alignedDifference[SSCData["D1r"],SSCData["D2r"]],PlotLabel->"Difference in front suspension displacements"]
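Conceptually the Spectrogram is easy to reproduce: partition the signal into blocks and take the magnitude of each block's discrete Fourier transform. A minimal pure-Python version, without the windowing, overlap and color scaling that the built-in adds automatically:

```python
# Minimal spectrogram: |DFT| of consecutive non-overlapping partitions.
import cmath
import math

def dft_magnitudes(block):
    """Magnitude of the DFT of one partition, non-negative frequencies only."""
    n = len(block)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * j / n)
                    for j, x in enumerate(block)))
            for k in range(n // 2)]

def spectrogram(signal, block_size):
    """One row of frequency magnitudes per consecutive block of the signal."""
    blocks = [signal[i:i + block_size]
              for i in range(0, len(signal) - block_size + 1, block_size)]
    return [dft_magnitudes(b) for b in blocks]

# A pure tone with period 8 should peak in frequency bin 1 of an 8-sample block.
tone = [math.sin(2 * math.pi * i / 8) for i in range(16)]
spec = spectrogram(tone, 8)
```

Stacking the rows and mapping magnitude to darkness gives exactly the time-frequency picture described above: a vibration whose frequency drifts over the run traces out a drifting dark line.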
We can see the vibration as a dark line from 2000 to 8000, and that its frequency seems to rise early in the run and then fall again later. I don’t know the engineering interpretation, but I would suspect that this reduces the risk of dangerous resonance compared to constant frequency vibration.
It seems like acceleration should be interesting, but we have no direct measurement of that in the data, so I decided to infer that from the velocity. There is no definitive accurate measure of velocity at these speeds. It turned out that the Pitot measurement is quite slow to adapt and smooths out the features, so the better measure was to use one of the wheel RPM values. I take the derivative over a 100sample interval, and some interesting features pop out:
ListLinePlot[Differences[SSCData["RPM4fm"], 1, 100], PlotRange -> {-100, 80}, PlotLabel -> "Acceleration"]
The acceleration clearly goes up in steps and there is a huge negative step in the middle. It only makes sense when you overlay the position of the throttle:
ListLinePlot[ {MeanFilter[Differences[SSCData["RPM4fm"],1,100],5], MeanFilter[SSCData["Throt1r"]/25,10]}, PlotLabel->"Acceleration vs Throttle"]
Now we see that the driver turns up the jets in steps, waiting to see how the car reacts before he really goes for it at around 3500. The car hits peak acceleration, but as wind resistance builds, acceleration falls gradually to near zero (where the car cruises at maximum speed for a while before the driver cuts the jets almost completely). The wind resistance then causes the massive deceleration. I suspect that there is a parachute deployment shortly after that to explain the spikiness of the deceleration, and some real brakes at 8000 bring the car to a halt.
I was still pondering vibration and decided to look at the load on the suspension from a different point of view. This wavelet scalogram turned out to be quite revealing:
WaveletScalogram[ContinuousWaveletTransform[SSCData["L1r"]],PlotLabel->"Suspension frequency over time"]
You can read it the same as the Spectrogram earlier, with time along the x axis and frequency on the y axis. But scalograms have a nice property of estimating discontinuities in the data. There is a major pair of features at 4500 and 5500, where higher-frequency vibrations appear and then we cross a discontinuity. Applying the scalogram requires some choices, but again, the automation has taken care of some of those choices by choosing a MexicanHatWavelet[1] out of the dozen or so wavelet choices and the choice of 12 octaves of resolution, leaving me to focus on the interpretation.
I was puzzled by the interpretation, though, and presented this plot to the engineering team, hoping that it was interesting. They knew immediately what it was. While this run of the car had been subsonic, the top edge of the wheel travels forward at twice the speed of the vehicle. These features turned out to detect when that top edge of the wheel broke the sound barrier and when it returned through the sound barrier to subsonic speeds. The smaller features around 8000 correspond to the deployment of the physical brakes as the car comes to a halt.
There is a whole sequence of events that happen in a data science project, but broadly they fall into: data acquisition, analysis, deployment. Deployment might be setting up automated report generation, creating APIs to serve enterprise systems or just creating a presentation. Having only offered a couple of hours, I only had time to format my work into a slide show notebook. But I wanted to show one other deployment, so I quickly created a dashboard to recreate a simple cockpit view:
CloudDeploy[ With[{data = AssociationMap[ Downsample[SSCData[#], 10] &, {"Throt1r", "NozLr", "RPMLr", "RPMRr", "Dist", "D1r", "D2r", "TGTLr"}]}, Manipulate[ Grid[List /@ { Grid[{{ VerticalGauge[data[["Throt1r", t]], {-2000, 2000}, GaugeLabels -> "Throttle position", GaugeMarkers -> "ScaleRange"], VerticalGauge[{data[["D1r", t]], data[["D2r", t]]}, {-1000, 2000}, GaugeLabels -> "Displacements"], ThermometerGauge[data[["TGTLr", t]] + 1600, {0, 1300}, GaugeLabels -> Placed[ "Turbine temperature", {0.5, 0}]]}}, ItemSize -> All], Grid[{{ AngularGauge[data[["RPMLr", t]], {0, 2000}, GaugeLabels -> "RPM L", ScaleRanges -> {1800, 2000}], AngularGauge[data[["RPMRr", t]], {0, 2000}, GaugeLabels -> "RPM R", ScaleRanges -> {1800, 2000}] }}, ItemSize -> All], ListPlot[{{data[["Dist", t]], 2}}, PlotMarkers -> Magnify["", 0.4], PlotRange -> {{0, 1500}, {0, 10}}, Axes -> {True, False}, AspectRatio -> 1/5, ImageSize -> 500]}], {{t, 1, "time"}, 1, Length[data[[1]]], 1}]], "SSCDashboard", Permissions -> "Public"]
In this little meander through the data, I have made use of graph theory, calculus, signal processing and wavelet analysis, as well as some classical statistics. You don’t need to know too much about the details, as long as you know the scope of tools available and the concepts that are being applied. Automation takes care of many of the details and helps to deploy the data in an accessible way. That’s multiparadigm data science in a nutshell.
In this video, Yule describes how the power and flexibility of the Wolfram Language were essential in creating Alex, a centralized hub for accessing and maintaining his team’s computational knowledge:
Consultants at Assured Flow Solutions use a variety of computations for analyzing oil and gas production issues involving both pipeline simulations and real-world lab testing. Yule’s first challenge was to put all these methods and techniques into a consistent framework—essentially trying to answer the question “How do you collect and manage all this intellectual property?”
Prior to Alex, consultants had been pulling from dozens of Excel spreadsheets scattered across network drives, often with multiple versions, which made it difficult to find the right tool for a particular task. Yule started by systematically replacing these with faster, more robust Wolfram Language computations. He then consulted with subject experts in different areas, capturing their knowledge as symbolic code to make it usable by other employees.
Yule deployed the toolkit as a cloud-accessible package secured using the Wolfram Language’s built-in encoding functionality. Named after the ancient Library of Alexandria, Alex quickly became the canonical source for the company’s algorithms and data.
Utilizing the flexible interface features of the Wolfram Language, Yule then built a front end for Alex. On the left is a pane that uses high-level pattern matching to search and navigate the available tools. Selected modules are loaded in the main window, including interactive controls for precise adjustment of algorithms and parameters:
Yule included additional utilities for copying and exporting data, loading and saving settings, and reporting bugs, taking advantage of the Wolfram Language’s file- and email-handling abilities. The interface itself is deployed as a standalone Wolfram Notebook using the Enterprise CDF standard, which provides access to all the company’s intellectual property without requiring a local Wolfram Language installation.
This centralization of tools has completely changed the way Assured Flow Solutions views data analytics and visualizations. In addition to providing quick, easy access to the company’s codebase, Alex has greatly improved the speed, accuracy and consistency of results. And using the Wolfram Language’s symbolic framework adds the flexibility to work with any kind of input. “It doesn’t matter if you’re loading in raw data, images, anything—it all has the same feel to it. Everything’s an expression in the Wolfram Language,” says Yule.
With the broad deployment options of the Wolfram Cloud, consultants can easily share notebooks and results for internal collaboration. They have also begun deploying instant APIs, allowing client applications to utilize Wolfram Language computations without exposing source code.
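As a rough sketch of what deploying such an instant API can look like in the Wolfram Language (the function and parameter names here are hypothetical illustrations, not Assured Flow Solutions’ actual code):

```mathematica
(* Deploy a hypothetical computation as an instant API. Clients send
   parameters over HTTP; the Wolfram Language source stays in the cloud. *)
CloudDeploy[
 APIFunction[{"diameter" -> "Number", "viscosity" -> "Number"},
  computeFlowRate[#diameter, #viscosity] &, (* hypothetical internal function *)
  "JSON"],
 "FlowRateAPI",
 Permissions -> "Private"]
```

The resulting CloudObject URL can then be called from any client application without exposing the underlying code.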
Overall, Yule prefers the Wolfram Language to other systems because of its versatility—or, as he puts it, “the ability to write one line of code that will accomplish ten things at once.” Its unmatched collection of built-in algorithms and connections makes it “a really powerful alternative to things like Excel.” Combining this with the secure hosting and deployment of the Wolfram Cloud, Wolfram technology provides the ideal environment for an enterprise-wide computation hub like Alex.
Find out more about Andrew Yule and other exciting Wolfram Language applications on our Customer Stories pages.
FOALE AEROSPACE is the brainchild of astronaut Dr. Mike Foale and his daughter Jenna Foale. Mike is a man of many talents (pilot, astrophysicist, entrepreneur) and has spent an amazing 374 days in space! Together with Jenna (who is currently finishing her PhD in computational fluid dynamics), he was able to build a complex machine learning system at minimal cost. All their development work was done in-house, mainly using the Wolfram Language running on the desktop and a Raspberry Pi. FOALE AEROSPACE’s system, which it calls the Solar Pilot Guard (SPG), is a solar-charged probe that identifies and helps prevent loss-of-control (LOC) events during airplane flight. Using sensors to detect changes in acceleration and air pressure, the system calculates the probability of each data point (an instance in time) being in-family (normal flight) or out-of-family (non-normal flight/possible LOC event), and issues the pilot voice commands over a Bluetooth speaker. The system uses classical functions to interpolate the dynamic pressure changes around the airplane axes; then, through several layers of Wolfram’s automatic machine learning framework, it assesses when LOC is imminent and instructs the user on the proper countermeasures to take.
You can see the system work its magic in this short video on the FOALE AEROSPACE YouTube channel. As of the writing of this blog, a few versions of the SPG system have been designed and built: the 2017 version (discussed extensively in a Wolfram Community post by Brett Haines) won the bronze medal at the Experimental Aircraft Association’s Founder’s Innovation Prize. In the year since, Mike has been working intensely to upgrade the system from both a hardware and a software perspective. As you can see in the following image, the 2018 SPG has a new streamlined look, and is powered by solar cells (which puts the “S” in “SPG”). It also connects to an off-the-shelf Bluetooth speaker that sits in the cockpit and gives instructions to the pilot.
While the probe required some custom hardware and intense design to be so easily packaged, the FOALE AEROSPACE team used off-the-shelf Wolfram Language functions to create a powerful machine learning–based tool for the system’s software. The core of the 2017 system was a neural network–based classifier (built using Wolfram’s Classify function), which enabled the classification of flight parameters into in-family and out-of-family flight (possible LOC) events. In the 2018 system, the team layered several supervised and unsupervised machine learning functions together, resulting in a semi-automated pipeline for dataset creation and classification. The final deployment is again a classifier that distinguishes in-family from out-of-family (LOC) flight, but this time in a more automatic and robust way.
To build any type of machine learning application, the first thing we need is the right kind of data. In the case at hand, what was needed was actual flight data—both from normal flight patterns and from non-normal flight patterns (the latter leading to LOC events). To do this, one would need to set up the SPG system, start recording with it and take it on a flight. During this flight, it would need to sample both normal flight data and non-normal/LOC events, which means Mike needed to intentionally make his aircraft lose control, over and over again. If this sounds dangerous, it’s because it is, so don’t try this at home. During such a flight, the SPG records acceleration and air pressure data across the longitudinal, vertical and lateral axes (x, y, z). From these inputs, the SPG can calculate the acceleration along the axes, the sideslip angle (β—how much it is moving sideways), the angle of attack (α—the angle between the direction of the nose and the horizontal reference plane) and the relative velocity (of the airplane relative to the air around it)—respectively, Ax, Ay, Az, β, α and Vrel in the following plot:
A plot of the flight used as the training set. Note that the vertical axis is inverted so a lower value corresponds to an increase in quantity.
Connecting the entire system straight to a Raspberry Pi running the Wolfram Language made gathering all this data and computing with it ridiculously easy. Looking again at the plot, we notice that there is a phase of almost-steady values (up to 2,000 on the horizontal axis) and a phase of fluctuating values (2,000 onward). Our subject matter expert, Mike Foale, says that these correspond to runway and flight time, respectively. Now that we have some raw data, we need to process and clean it up in order to learn from it.
Taking the same dataset, we first remove any data that isn’t interesting (for example, anything before the 2,000th data point). Now we can replot the data:
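A minimal sketch of that trimming step, assuming each channel is stored as a list of samples in an association (the variable names are illustrative, not the team’s actual code):

```mathematica
(* Keep only the in-flight portion of each channel by dropping the
   first 2000 samples, which correspond to time on the runway. *)
flightData = <|"Ax" -> axData, "Ay" -> ayData, "Az" -> azData,
   "beta" -> betaData, "alpha" -> alphaData, "Vrel" -> vrelData|>;
trimmed = Drop[#, 2000] & /@ flightData;
ListLinePlot[Values[trimmed], PlotLegends -> Keys[trimmed]]
```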
In the 2017 system, the FOALE AEROSPACE team had to manually curate the flight segments that correspond to LOC events to create a dataset. This was a labor-intensive process that became semi-automated in the 2018 system.
We now take the (lightly) processed data and start applying the needed machine learning algorithms to it. First, we will cluster the training data to create in-family and out-of-family clusters. To assess which clusters are in-family and which are out-of-family, we will need a human subject matter expert. We will then train the first classifier using those clusters as classes. Next, we take a new dataset and, using the first classifier we made, filter out any in-family events (normal flight). Finally, we will cluster the filtered data (with some subject matter expert help) and use the resulting clusters as classes in our final classifier. This final classifier will be used to indicate LOC events while in flight. A simplified plot of the process is given here:
We start by taking the processed data and clustering it (an unsupervised learning approach). Following is a 3D plot of the clusters resulting from the use of FindClusters (specifying that we want to find seven clusters). As you can see, the automatic color scheme is very helpful in visualizing the results. Mike, using his subject matter expertise, assesses that groups 1, 2, 3, 6 and 7 represent normal flight data. Group 5 (pink) is the LOC group, and group 4 (red) is high-velocity normal flight:
To distinguish the LOC cluster from the others, Mike needed to choose more than two cluster groups. After progressively increasing the number of clusters with FindClusters, seven clusters were chosen to reduce the overlap of LOC group 5 from the neighboring groups 1 and 7, which are normal. A classifier trained with clearly distinguishable data will perform better and produce fewer false positives.
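The clustering-then-classifying steps described above can be sketched schematically like this (an outline under assumed variable names, not the team’s actual code; the expert’s group assignments follow the plot above):

```mathematica
(* 1. Cluster the training flight into seven groups; a subject matter
   expert then labels each cluster. *)
clusters = FindClusters[trainingData, 7];
labeled = <|"Normal" -> Join @@ clusters[[{1, 2, 3, 6, 7}]],
   "LOC" -> clusters[[5]], "HighSpeed" -> clusters[[4]]|>;

(* 2. Train the first classifier on the expert-labeled clusters. *)
c1 = Classify[labeled];

(* 3. Filter a later flight: discard points classified as normal. *)
candidates = Select[newFlightData, c1[#] =!= "Normal" &];

(* 4. Re-cluster the filtered points (again expert-checked) to train
   the final in-flight classifier. *)
finalClusters = FindClusters[candidates, 7];
```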
Using this clustered data, we can now train a classifier that will classify in-family flight data and out-of-family flight data (Low/High α—groups 4, 5). This in-family/out-of-family flight classifier will become a powerful machine learning tool in processing the next flight’s data. Using the Classify function and some clever preprocessing, we arrive at the following three-class classifier. The three classes are normal flight (Normal), high-α flight (High) and low-α flight (Low).
We now take data from a later flight and process it as we did earlier. Here is the resulting plot of that data:
Using our first classifier, we now classify the data as representing an in-family flight or an out-of-family flight. If a point is in-family (normal flight), we exclude it from the dataset, as we are only looking for out-of-family instances (representing LOC events). With only non-normal data remaining, let’s plot the probability of that data being normal:
It is interesting to note that more than half of the remaining data points have less than a 0.05 probability of being normal. Taking this new, refined dataset, we apply another layer of clustering, which results in the following plot:
We now see two main groups: group 3, which Mike explains as corresponding to thermaling; and group 1, which is the high-speed flight group. Thermaling is the act of using rising air columns to gain altitude. This involves flying in circles inside the air column (at speeds so slow it’s close to a stall), so it’s not surprising that β has a wide distribution during this phase. Groups 1 and 6 are also considered to be normal flight. Group 7 corresponds to LOC (a straight stall without sideslip). Groups 4 and 5 are imminent stalls with sideslip, leading to a left or right incipient spin, and are considered to be LOC. Group 2 is hidden under group 1 and is very high-speed flight close to the structural limits of the aircraft, so it’s also LOC.
Using this data, we can construct a new, second-generation classifier with three classes: low α (U), high α (D) and normal flight (N). These letters refer to the action required by the pilot—U means “pull up,” D means “push down” and N means “do nothing.” It is interesting to note that while the older classifier required days of training, this new filtered classifier needed only hours (and also greatly improved the speed and accuracy of the predictions, and reduced the occurrence of false positives).
As a final trial, Mike went on another flight and maintained a normal flight pattern throughout. He later took the recorded data and, using the second-generation classifier, plotted the probability of it being normal. As we can see here, there were no false positives during this flight:
Mike now wanted to test whether the classifier would correctly predict possible LOC events. He went on another flight and, again, deliberately entered LOC events. Taking the processed data from that flight and plotting the probability of it being normal using the second-generation classifier results in the following final plot:
It is easy to see that some events were not classified as normal, although most of them were. Mike has confirmed these events correspond to actual LOC events.
Mike’s development work is a great demonstration of how machine learning–based applications are going to affect everything that we do, increasing safety and survivability. It is also a great case study showcasing where and why it is so important to keep human subject matter experts in the loop.
Perhaps one of the most striking components of the SPG system is the use of the Wolfram Language on a Raspberry Pi Zero to connect to sensors, record in-flight data and run a machine learning application to compute when LOC is imminent—all on a computer that costs $5. Additional details on Mike’s journey can be found on his customer story page.
Just a few years ago, it would have been unimaginable for any one person to create such complex algorithms and deploy them rapidly in a real-world environment. The recent boom of the Internet of Things and machine learning has been driving great developmental work in these fields, and even after its 30th anniversary, the Wolfram Language has continued to be at the cutting edge of programming. Through its high-level abstractions and deep automation, the Wolfram Language has enabled a wide range of people to use the power of computation everywhere. There are many great products and projects left to be built using the Wolfram Language. Perhaps today is the day to start yours with a free trial of Wolfram|One!
According to the state of California, some 200,000 residents of the state have unsafe drinking water coming out of their taps. While the Safe Drinking Water Data Challenge focuses on California, data science solutions could have impacts and applications for providing greater access to potable water in other areas with similar problems.
The goal of this post is to show how Wolfram technologies make it easy to grab data and ask questions of it. We’ll take a multiparadigm approach, letting those questions drive an exploratory analysis, a quick way to get familiar with the data.
Details on instructional resources, documentation and training are at the bottom of this post.
To get started, let’s walk through one of the datasets that has been added to the Wolfram Data Repository, how to access it and how to visually examine it using the Wolfram Language.
We’ll first define and grab data on urban water supply and production using ResourceData:
uwsdata = ResourceData["California Urban Water Supplier Monitoring Reports"] 
What we get back is a nice structured data frame with several variables and measurements that we can begin to explore. (If you’re new to working with data in the Wolfram Language, there’s a fantastic and useful primer on Association and Dataset written by one of our power users, which you can check out here.)
Let’s first check the dimensions of the data:
uwsdata//Dimensions 
We can see that we have close to 19,000 rows of data with 33 columns. Let’s pull the first column and row to get a sense of what we might want to explore:
uwsdata[1,1;;33] 
(We can also grab the data dictionary from the California Open Data Portal using Import.)
Import["https://data.ca.gov/sites/default/files/Urban_Water_Supplier_Monitoring_Data_Dictionary.pdf"] 
Reported water production seems like an interesting starting point, so let’s dig in using some convenient functions—TakeLargestBy and Select—to examine the top ten water production levels by supplier for the last reporting period:
top10=TakeLargestBy[Select[uwsdata,#ReportingMonth==DateObject[{2018,4,15}]&],#ProductionReported&,10] 
Unsurprisingly, we see very populous regions of the state of California having the highest levels of reported water production. Since we have already defined our top-ten dataset, we can now look at other variables in this subset of the data. Let’s visualize which suppliers have the highest percentages of residential water use with BarChart. We will use the top10 definition we just created and use All to examine every row of the data by the column "PercentResidentialUse":
BarChart[top10[All, "PercentResidentialUse"], ColorFunction -> "SolarColors", ChartLabels -> Normal[top10[All, "SupplierName"]], BarOrigin -> Left]
You’ll notice that I used ColorFunction to indicate higher values as brighter colors. (There are many palettes to choose from.) Just as a brief exploration, let’s look at these supplier districts by population served:
BarChart[top10[All, "PopulationServed"], ColorFunction -> "SolarColors", ChartLabels -> Normal[top10[All, "SupplierName"]], BarOrigin -> Left]
The Eastern Municipal Water District is among the smallest of these in population, but we’re looking at percentages of residential water use, which might indicate there is less industrial or agricultural use of water in that district.
Since we’re looking at safe drinking water data, let’s explore penalties against water suppliers for regulatory violations. We’ll use the same functions as before, but this time we’ll take the top five and then see what we can find out about a particular district with built-in data:
top5= TakeLargestBy[Select[uwsdata,#ReportingMonth==DateObject[{2018,4,15}]&],#PenaltiesRate &,5] 
So we see the City of San Bernardino supplier has the highest penalty rate out of our top five. Let’s start looking at penalty rates for the City of San Bernardino district. We have other variables that are related, such as complaints, warnings and followups. Since we’re dealing with temporal data, i.e. penalties over time, we might want to use TimeSeries functionality, so we’ll go ahead and start defining a few things, including our date range (which is uniform across our data) and the variables we just mentioned. We’ll also use Select to pull production data for the City of San Bernardino only:
dates=With[{sbdata=Select[uwsdata,#SupplierName=="City of San Bernardino" &]},sbdata[All,"ReportingMonth"]//Normal]; 
A few things to notice here. First, we used the function With to combine some definitions into more compact code. We then used Normal to transform the dates to a list so they’re easier to manipulate for time series.
Basically, what we said here is, “With data from the supplier named City of San Bernardino, define the variable dates as the reporting month from that data and turn it into a list.” The more you can see the narrative of your code, the better you can program at the rate of your thought, kind of like regular typing, something the Wolfram Language is very well suited for.
Let’s go ahead and define our penalty-related variables:
{prate, warn, follow, complaints} = With[{sbdata = Select[uwsdata, #SupplierName == "City of San Bernardino" &]}, Normal[sbdata[All, #]] & /@ {"PenaltiesRate", "Warnings", "FollowUps", "Complaints"}];
So we first put our variables in order in curly brackets and used # (called “slot,” though it’s tempting to call it “hashtag”!) as a placeholder for a later argument. So, if we were to read this line of code, it would be something like, “For these four variables, use all rows of the San Bernardino data, make them into a list and define each of those variables with the penalty rate, warnings, follow-ups and complaints columns, in that order, as a list. In other words, extract those columns of data as individual variables.”
Since we’ll probably be using TimeSeries a good bit with this particular data, we can also go ahead and define a function to save us time down the road:
ts[v_]:=TimeSeries[v,{dates}] 
All we’ve said here is, “Whenever we type ts[], whatever comes in between the brackets will be plugged into the right side of the function where v is.” So we have our TimeSeries function, and we went ahead and put dates in there so we don’t have to continually associate a range of values with each of our date values every time we want to make a time series. We can also go ahead and define some style options to save us time with visualizations:
style = {PlotRange -> All, Filling -> Axis, Joined -> False, Frame -> False};
Now with some setup out of the way (this can be tedious, but it’s important to stay organized and efficient!), we can generate some graphics:
With[{tsP=ts[#]&/@{prate,warn,follow,complaints}},DateListPlot[tsP,style]] 
So we again used With to make our code a bit more compact and used our ts[] time series function and went a level deeper by using # again to apply that time series function to each of those four variables. Again, in plain words, “With this variable, take our time series function and apply it to these four variables that come after &. Then, make a plot of those time series values and apply the style we defined to it.”
We can see some of the values are flat along the x axis. Let’s take a look at the range of values in our variables and see if we can improve upon this:
Max[#]&/@{prate,warn,follow,complaints} 
We can see that the penalty rate has a massively higher maximum value than our other variables. So what should we do? Well, we can log the values and visualize them all in one go with DateListLogPlot:
With[{tsP=ts[#]&/@{prate,warn,follow,complaints}},DateListLogPlot[tsP,style]] 
So it appears that the enforcement program didn’t really get into full force until sometime after 2015, and following preliminary actions, penalties started being issued on a massive scale. Penalty-related actions also appear to increase during summer months, perhaps when production is higher, something we’ll examine and confirm a little later. Let’s look at warnings, follow-ups and complaints on their own:
With[{tsP2 = ts[#] & /@ {warn, follow, complaints}}, DateListPlot[tsP2, PlotLegends -> {"Warnings", "Follow-ups", "Complaints"}, Frame -> False]]
We used similar code to the previous graphic, but this time we left out our defined style and used PlotLegends to help us see which variables apply to which values. We can visualize this a little differently using StackedDateListPlot:
With[{tsP2 = ts[#] & /@ {warn, follow, complaints}}, StackedDateListPlot[tsP2, PlotLegends -> {"Warnings", "Follow-ups", "Complaints"}, Frame -> False]]
We see a strong pattern here of complaints, warnings and followups occurring in tandem, something not all too surprising but that might indicate the effectiveness of reporting systems.
So far, we’ve looked at one city and just a few variables in exploratory analysis. Let’s shift gears and take a look at agriculture. We can grab another dataset in the Wolfram Data Repository to very quickly visualize agricultural land use with a small chunk of code:
GeoRegionValuePlot[ResourceData["California Crop Mapping"][GroupBy["County"],Total,"Acres"]] 
We can also visualize agricultural land use a different way using GeoSmoothHistogram with a GeoBackground option:
GeoSmoothHistogram[ResourceData["California Crop Mapping"][GroupBy["County"], Total, "Acres"], GeoBackground -> "Satellite", PlotLegends -> Placed[Automatic, Below]]
Between these two visualizations, we can clearly see that California’s Central Valley has the highest levels of agricultural land use.
Now let’s use our TakeLargestBy function again to grab the top five districts by agricultural water use from our dataset:
TakeLargestBy[Select[uwsdata,#ReportingMonth==DateObject[{2018,4,15}]&],#AgricultureReported &,5] 
So for the last reporting month, we see the Rancho California Water District has the highest amount of agricultural water use. Let’s see if we can find out where in California that is by using WebSearch:
WebSearch["rancho california water district map"] 
Looking at the first link, we can see that the water district serves the city of Temecula, portions of the city of Murrieta and Vail Lake.
One of the most convenient features of the Wolfram Language is the knowledge that’s built directly into the language. (There’s a nice Wolfram U training course about the Wolfram Data Framework you can check out here.)
Let’s grab a map and a satellite image to see what sort of terrain we’re dealing with:
GeoGraphics[Entity["Lake", "VailLake::6737y"], ImageSize -> 600]
GeoImage[Entity["Lake", "VailLake::6737y"], ImageSize -> 600]
This looks fairly rural, which is congruent with our data showing higher levels of agricultural water use. Interestingly, though, it is not in the Central Valley, where agricultural land use is highest, something to note for future exploration.
Let’s now use WeatherData to get rainfall data for the city of Temecula, since it is likely coming from the same weather station as Vail Lake and Murrieta:
temecula=WeatherData[Entity["City", {"Temecula", "California", "UnitedStates"}],"TotalPrecipitation",{{2014,6,15},{2018,4,15},"Month"}]; 
We can also grab water production and agricultural use for the district and see if we have any correlations going on with weather and water use—a fairly obvious guess, but it’s always nice to show something with data. Let’s go ahead and define a legend variable first:
legend = PlotLegends -> {"Water Production", "Agricultural Usage", "Temecula Rainfall"};
ranchoprod=With[{ranchodata=Select[uwsdata,#SupplierName=="Rancho California Water District" &]},ranchodata[All,"ProductionReported"]//Normal]; 
ranchoag = With[{ranchodata = Select[uwsdata, #SupplierName == "Rancho California Water District" &]}, ranchodata[All, "AgricultureReported"] // Normal];
With[{tsR=ts[#]&/@{ranchoprod,ranchoag}},DateListLogPlot[{tsR,temecula},legend,style]] 
We’ve logged some values here, but we could also manually rescale to get a better sense of the comparisons:
With[{tsR=ts[#]&/@{ranchoprod,ranchoag}/2000},DateListPlot[{tsR,temecula},legend,style]] 
And we can indeed see some dips in water production and agricultural use when rainfall increases, indicating that both usage and production are inversely correlated with rainfall and, by definition, usage and production are correlated with one another.
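One quick numerical check of that claim (a sketch, assuming the rainfall series has been reduced to a plain list of monthly magnitudes aligned with the production and usage lists):

```mathematica
(* Negative correlations with rainfall would support the inverse
   relationship; a positive value between production and usage
   would support their co-movement. *)
rain = QuantityMagnitude[Normal[temecula][[All, 2]]];
Correlation[N@rain, N@ranchoprod]
Correlation[N@rain, N@ranchoag]
Correlation[N@ranchoprod, N@ranchoag]
```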
One variable that might be useful to examine in the dataset is whether or not a district is under mandatory restrictions on outdoor irrigation. Let’s use Classify and its associated functions to measure how we can best predict bans on outdoor irrigation to perhaps inform what features water districts could focus on for water conservation. We’ll begin by using RandomSample to split our data into training and test sets:
data = RandomSample@uwsdata;
training=data[[;;10000]]; 
test=data[[10001;;]]; 
We’ll now build a classifier with the outcome variable defined as mandatory restrictions:
c = Classify[training -> "MandatoryRestrictions"]
We have a classifier function returned, and the Wolfram Language automatically chose GradientBoostedTrees to best fit the data. If we were sure we wanted to use something like logistic regression, we could easily specify which algorithm we’d like to use out of several choices.
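Forcing a particular algorithm is just an option setting. For example (a sketch using the same training data):

```mathematica
(* Explicitly request logistic regression instead of the
   automatically chosen gradient boosted trees. *)
cLogistic = Classify[training -> "MandatoryRestrictions",
  Method -> "LogisticRegression"]
```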
But let’s take a closer look at what our automated model selection came up with using ClassifierInformation:
ClassifierInformation[c] 
ClassifierInformation[c,"MethodDescription"] 
We get back a general description of the algorithm chosen and can see the learning curves for each algorithm, indicating why gradient boosted trees was the best fit. Let’s now use ClassifierMeasurements with our test data to look at how well our classifier is behaving:
cm = ClassifierMeasurements[c, test -> "MandatoryRestrictions"]
cm["Accuracy"] 
Ninety-three percent is acceptable for our purposes in exploring this dataset. We can now generate a plot to see what the rejection threshold is for achieving a higher accuracy, in case we want to think about improving upon that:
cm["AccuracyRejectionPlot"] 
And let’s pull up the classifier’s confusion matrix to see what we can glean from it:
cm["ConfusionMatrixPlot" -> {True, False}]
It looks like the classifier could be improved for predicting False. Let’s get the F-score to be sure:
cm["FScore"] 
Again, not too terrible at predicting that at a certain point in time a given location will be under mandatory restrictions for outdoor irrigation based on the features in our dataset. As an additional line of inquiry, we could use FeatureExtraction as a preprocessing step to see if we can improve our accuracy. But in this exploration, we have seen that we could indeed examine the conditions under which a given district might be required to restrict outdoor irrigation, giving us information on what water suppliers or policymakers might want to pay the most attention to in water conservation.
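That preprocessing experiment might look roughly like this (a sketch; the exact column handling depends on the dataset’s structure, and whether it actually helps would need to be confirmed with ClassifierMeasurements as before):

```mathematica
(* Separate predictors from the target column, learn an unsupervised
   feature extractor from the predictors, then retrain on the
   extracted features. *)
predictors = Normal[training[All, KeyDrop["MandatoryRestrictions"]]];
labels = Normal[training[All, "MandatoryRestrictions"]];
fe = FeatureExtraction[predictors];
c2 = Classify[fe[predictors] -> labels];
```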
So far, we’ve looked at some of the top water-producing districts, areas with high penalty rates and how other enforcement measures compare, the impact of rainfall on agricultural water use with some built-in data, and how we might predict which areas will fall under mandatory restrictions on outdoor irrigation—a nice starting point for further explorations.
Think you’re up for the Safe Drinking Water Data Challenge? Try it out for yourself! You can send an email to partnerprogram@wolfram.com and mention the Safe Drinking Water Data Challenge in the subject line to get a license to Wolfram|One. You can also access an abundance of free training resources for data science and statistics at Wolfram U. In case you get stuck, you can check out the following resources, or go over to Wolfram Community and make sure to post your analysis there as well.
Additional resources:
We look forward to seeing what problems you can solve with some creativity and data science with the Wolfram Language.
In order to make this data more accessible and easily computable, we created an internal version of the MGP data using the Wolfram Language’s entity framework. Using this dataset within the Wolfram Language allows one to easily make computations and visualizations that provide interesting and sometimes unexpected insights into mathematicians and their works. Note that for the time being, these entities are defined only in our private dataset and so are not (yet) available for general use.
The search interface to the MGP is illustrated in the following image. It conveniently allows searches based on a number of common fields, such as parts of a mathematician’s name, degree year, Mathematics Subject Classification (MSC) code and so on:
For a quick look at the available data from the MGP, consider a search for the prolific mathematician Paul Erdős made by specifying his first and last names in the search interface. It gives this result:
Clicking the link in the search result returns a list of available data:
Note that related mathematicians (i.e., advisors and advisees) present in the returned database results are hyperlinked. In contrast, other fields (such as school, degree years and so on) are not. Clearly, the MGP catalogs a wealth of information of interest to anyone wishing to study the history of mathematicians and mathematical research. Unfortunately, only relatively simple analyses of the underlying data are possible using a web-based search interface.
For those readers not familiar with the Wolfram Language entity framework, we begin by giving a number of simple examples of its use to obtain information about the "MGPPerson" entities we created. As a first simple computation, we use the EntityValue function to obtain a count of the number of people in the "MGPPerson" domain:
EntityValue["MGPPerson","EntityCount"] 
Note that this number is smaller than the 230,000+ present in the database due to subsequent additions to the MGP. Similarly, we can return a random person:
person=RandomEntity["MGPPerson"] 
Mousing over an “entity blob” such as in the previous example gives a tooltip showing the underlying Wolfram Language representation.
We can also explicitly look at the internal structure of the entity:
InputForm[person] 
Copying, pasting and evaluating that expression gives back the formatted version:
Entity["MGPPerson","94172"] 
We now extract the domain, canonical name and common name of the entity programmatically:
Through[{EntityTypeName,CanonicalName,CommonName}[person]]//InputForm 
We can simultaneously obtain a set of random people from the "MGPPerson" domain:
RandomEntity["MGPPerson",10] 
To obtain a list of properties available in the "MGPPerson" domain, we again use EntityValue:
properties=EntityValue["MGPPerson","Properties"] 
As we did for entities, we can view the internal structure of the first property:
InputForm[First[properties]] 
We can also view the canonical names of all the properties as strings:
CanonicalName[properties] 
The URL of the relevant MGP page is available directly as its own property, which can be queried concisely:
EntityValue[person,"MathematicsGenealogyProjectURL"] 
… with an explicit EntityProperty wrapper:
EntityValue[person,EntityProperty["MGPPerson","MathematicsGenealogyProjectURL"]] 
… or using a curried syntax:
person["MathematicsGenealogyProjectURL"] 
We can also return multiple properties:
person[{"AdvisedBy","Degrees","DegreeDates","DegreeSchoolEntities"}] 
Another powerful feature of the Wolfram Language entity framework is the ability to create an implicitly defined Entity class:
EntityClass["MGPPerson","Surname"->"Nelson"]
Expanding this class, we obtain a list of people with the given surname:
SortBy[EntityList[EntityClass["MGPPerson","Surname"->"Nelson"]],CommonName]
To obtain an overview of data for a given person, we can copy and paste from that list and query for the "Dataset" property using a curried property syntax:
Entity["MGPPerson", "174871"]["Dataset"] 
As a first application, we use the Wolfram Language function NestGraph to produce a ten-generation-deep mathematical advisor tree for mathematician Joanna “Jo” Nelson:
NestGraph[#["AdvisedBy"]&,Entity["MGPPerson", "174871"],10,VertexLabels->Placed["Name",After,Rotate[#,30 Degree,{3.2,0}]&]]
Using an implicitly defined EntityClass, let’s now look up people with the last name “Hardy”:
EntityList[EntityClass["MGPPerson","Surname"->"Hardy"]]
Having found the Hardy we had in mind, it is now easy to make a mathematical family tree for the descendants of G. H. Hardy, highlighting the root scholar:
With[{scholar=Entity["MGPPerson", "17806"]}, HighlightGraph[ NestGraph[#["Advised"]&,scholar,2,VertexLabels->Placed["Name",After,Rotate[#,30 Degree,{3.2,0}]&],ImageSize->Large,GraphLayout->"RadialDrawing"], scholar] ]
A fun example of the sort of computation that can easily be performed using the Wolfram Language is visualizing the distribution of mathematicians based on first and last initials:
Histogram3D[Select[Flatten[ToCharacterCode[#]]&/@Map[RemoveDiacritics@StringTake[#,1]&,DeleteMissing[EntityValue["MGPPerson",{"GivenName","Surname"}],1,2],{2}],(65<=#[[1]]<=90&&65<=#[[2]]<=90)&],AxesLabel->{"given name","surname"},Ticks->({#,#,Automatic}&[Table[{j,FromCharacterCode[j]},{j,65,90}]])]
As one might expect, mathematician initials (as well as those of all people in general) are not uniformly distributed with respect to the alphabet.
The Wolfram Language contains a powerful set of functionality involving geographic computation and visualization. We shall make heavy use of such functionality in the following computations.
It is interesting to explore the movement of mathematicians from the institutions where they received their degrees to the institutions at which they did their subsequent advising. To do so, first select mathematicians who received a degree in the 1980s:
p1980=Select[DeleteMissing[EntityValue["MGPPerson",{"Entity",EntityProperty["MGPPerson","DegreeDates"]}],1,2],1980 
Find where their students received their degrees:
unitransition[person_]:=Module[{ds="DegreeSchoolEntities",advisoruni,adviseeunis},advisoruni=person[ds]; adviseeunis=#[ds]&/@DeleteMissing[Flatten[{person["Advised"]}]]; {advisoruni,adviseeunis}] 
Assume the advisors were local to the advisees:
moves=Union[Flatten[DeleteMissing[Flatten[Outer[DirectedEdge,##]&@@@(unitransition/@Take[p1980,All]),2],2,1]]]; 
Now show the paths of the advisors:
GeoGraphics[{Thickness[0.001],Opacity[0.1],Red,Arrowheads[0.01],Arrow@GeoPath[List@@#]&/@moves},GeoRange->"World",GeoBackground->"StreetMapNoLabels"]//Quiet
We can also perform a number of computations involving mathematical degrees. As with the "MGPPerson" domain, we first briefly explore the contents of the "MGPDegree" domain and show how to access them.
To begin, show a count of the number of theses in the "MGPDegree" domain:
EntityValue["MGPDegree","EntityCount"] 
List five random theses from the "MGPDegree" domain:
RandomEntity["MGPDegree",5] 
Show available "MGPDegree" properties:
EntityValue["MGPDegree","Properties"] 
Return a dataset of an "MGPDegree" entity:
Entity["MGPDegree", "120366"]["Dataset"] 
Moving on, we now visualize the historical numbers of PhDs awarded worldwide:
DateListLogPlot[phddata={#[[1,1]],Length[#]}&/@GatherBy[Cases[EntityValue["MGPDegree",{"Date","DegreeType"}],{_DateObject,"Ph.D."}],First], PlotRange->{DateObject[{#}]&/@{1800,2010},All}, GridLines->Automatic]
We can now make a fit to the number of new PhD mathematicians over the period 1875–1975:
fit=Fit[Select[{#1["Year"],1. Log[2,#2]}&@@@phddata,1875<#[[1]]<1975&],{1,y},y] 
This gives a doubling time of about 1.5 decades:
Quantity[1/Coefficient[fit,y],"Years"] 
Let’s write a utility function to visualize the number of degrees conferred by a specified university over time:
DegreeCountHistogram[school_,bin_,opts___]:=DateHistogram[DeleteMissing[EntityValue[EntityList[EntityClass["MGPDegree","SchoolEntity"->school]],"Date"]], bin,opts]
Look up the University of Chicago entity of the "University" type in the Wolfram Knowledgebase:
Interpreter["University"]["university of chicago"] 
Show the number of degrees awarded by the University of Chicago, binned by decade:
DegreeCountHistogram[Entity["University", "UniversityOfChicago::726rv"],"Decades"] 
... and by year:
DegreeCountHistogram[Entity["University", "UniversityOfChicago::726rv"],"Years",DateTicksFormat->"Year"]
Now look at the national distribution of degrees awarded. Begin by again examining the structure of the data. In particular, there exist PhD theses with no institution specified in "SchoolEntity" but a country specified in "SchoolLocation":
TextGrid[Take[Cases[phds=EntityValue["MGPDegree",{"Entity","DegreeType","SchoolEntity","SchoolLocation"}],{_,"Ph.D.",_Missing,_List}],5],Dividers->All]
There also exist theses with more than a single country specified in "SchoolLocation":
TextGrid[Cases[phds,{_,"Ph.D.",_Missing,_List?(Length[#]!=1&)}],Dividers->All]
Tally the countries (excluding the pair of multiples):
TextGrid[Take[countrytallies=Reverse@SortBy[Tally[Cases[phds,{_,"Ph.D.",_,{c_Entity}}:>c]],Last],UpTo[10]],Alignment->{{Left,Decimal}},Dividers->All]
A total of 117 countries are represented:
Length[countrytallies] 
Download flag images for these countries from the Wolfram Knowledgebase:
Take[flagdata=Transpose[{EntityValue[countrytallies[[All,1]],"Flag"],countrytallies[[All,2]]}],5] 
Create an image collage of flags, with the flags sized according to the number of math PhDs:
ImageCollage[Take[flagdata,40],ImagePadding->3]
As another example, we can explore degrees awarded by a specific university. For example, extract mathematics degrees that have been awarded at the University of Miami since 2010:
Length[umiamidegrees=EntityList[ EntityClass["MGPDegree",{ "SchoolEntity"->Entity["University", "UniversityOfMiami::9c2k9"], "Date"->GreaterEqualThan[DateObject[{2010}]]} ]]]
Create a timeline visualization:
TimelinePlot[Association/@Rule@@@EntityValue[umiamidegrees,{"Advisee","Date"}],ImageSize->Large]
Now consider recent US mathematics degrees. Select the theses written at US institutions since 2000:
Length[USPhDs=Cases[Transpose[{ EntityList["MGPDegree"], EntityValue["MGPDegree","SchoolLocation"], EntityValue["MGPDegree","Date"] }], { th_, loc_?(ContainsExactly[{Entity["Country", "UnitedStates"]}]),DateObject[{y_?(GreaterEqualThan[2000])},___] }:>th ]] 
Make a table showing the top US schools by PhDs conferred:
TextGrid[Take[schools=Reverse[SortBy[Tally[Flatten[EntityValue[USPhDs,"SchoolEntity"]]],Last]],12],Alignment->{{Left,Decimal}},Dividers->All]
Map schools to their geographic positions:
geopositions=Rule@@@DeleteMissing[Transpose[{EntityValue[schools[[All,1]],"Position"],schools[[All,2]]}],1,2]; 
Visualize the geographic distribution of US PhDs:
GeoBubbleChart[geopositions,GeoRange->Entity["Country", "UnitedStates"]]
Show mathematician thesis production as a smooth kernel histogram over the US:
GeoSmoothHistogram[Flatten[Table[#1,{#2}]&@@@geopositions],"Oversmooth",GeoRange->GeoVariant[Entity["Country", "UnitedStates"],Automatic]]
We now make some explorations of the titles of mathematical theses.
To begin, extract theses authored by people with the surname “Smith”:
Length[smiths=EntityList[EntityClass["MGPPerson","Surname"->"Smith"]]]
Create a WordCloud of words in the titles:
WordCloud[DeleteStopwords[StringRiffle[EntityValue[DeleteMissing[Flatten[EntityValue[smiths,"Degrees"]]],"ThesisTitle"]]]] 
Now explore the titles of all theses (not just those written by Smiths) by extracting thesis titles and dates:
tt=DeleteMissing[EntityValue["MGPDegree",{"Date","ThesisTitle"}],1,2]; 
The average string length of a thesis title is remarkably constant over time:
DateListPlot[{#[[1,1]],Round[Mean[StringLength[#[[All,2]]]]]}&/@SplitBy[Sort[tt],First], PlotRange->{DateObject[{#}]&/@{1850,2010},All}]
The longest thesis title on record is this giant:
SortBy[tt,StringLength[#[[2]]]&]//Last 
Motivated by this, extract explicit TeX math fragments appearing in titles:
tex=Cases[ImportString[#,"TeX"]&/@Flatten[DeleteCases[StringCases[#2,Shortest["$"~~___~~"$"]]&@@@tt,{}]],Cell[_,"InlineFormula",___],∞]//Quiet; 
... and display them in a word cloud:
WordCloud[DisplayForm/@tex] 
Extract types of topological spaces mentioned in thesis titles and display them in a ranked table:
TextGrid[{StringTrim[#1],#2}&@@@Take[Select[Reverse[SortBy[Tally[Flatten[DeleteCases[StringCases[#2,Shortest[" "~~((LetterCharacter|"_")..)~~(" space"|" Space")]]&@@@tt,{}]]],Last]], Not[StringMatchQ[#[[1]],(" of "|" in "|" and "|" the "|" on ")~~__]]&],12],Dividers->All,Alignment->{{Left,Decimal}}]
Get all available Mathematics Subject Classification (MSC) category descriptions for mathematics degrees conferred by the University of Oxford and construct a word cloud from them:
WordCloud[DeleteMissing[EntityValue[EntityList[EntityClass["MGPDegree","SchoolEntity"->Entity["University", "UniversityOfOxford::646mq"]]],"MSCDescription"]],ImageSize->Large]
Explore the MSC distribution of recent theses. To begin, we Iconize a list holding the MSC category names that will be used in subsequent examples:
mscnames=List; 
Extract degrees awarded since 2010:
Length[degrees2010andlater=Cases[Transpose[{EntityList["MGPDegree"],EntityValue["MGPDegree","Date" ]}],{th_,DateObject[{y_?(GreaterEqualThan[2010])},___]}:>th]] 
Extract the corresponding MSC numbers:
degreeMSCs=DeleteMissing[EntityValue[degrees2010andlater,"MSCNumber"]]; 
Make a pie chart showing the distribution of MSC category names and numbers:
With[{counts=Sort[Counts[degreeMSCs],Greater][[;;20]]},PieChart[Values[counts],ChartLegends->(Row[{#1,": ",#2," (",#3,")"}]&@@@(Flatten/@Partition[Riffle[Keys@counts,Partition[Riffle[(Keys@counts/.mscnames),ToString/@Values@counts],2]],2])),ChartLabels->Placed[Keys@counts,"RadialCallout"],ChartStyle->24,ImageSize->Large]]
Extract the MSC numbers for theses since 1990 and tally the combinations of {year, MSC}:
msctallies=Tally[Sort[Cases[DeleteMissing[EntityValue["MGPDegree",{"Date","MSCNumber"}],1,2], {DateObject[{y_?(GreaterEqualThan[1990])},___],msc_}:>{y,msc}]]] 
Plot the distribution of MSC numbers (mouse over the graph in the attached notebook to see MSC descriptions):
Graphics3D[With[{y=#[[1]],msc=ToExpression[#[[2]]],off=1/3},Tooltip[Cuboid[{msc-off,y-off,0},{msc+off,y+off,#2}], #[[2]]/.mscnames]]&@@@msctallies,BoxRatios->{1,1,0.5},Axes->True, AxesLabel->{"MSC","year","thesis count"},Ticks->{None,Automatic,Automatic}]
Most students do research in the same area as their advisors. Investigate systematic transitions from MSC classifications of advisors’ works to those of their students. First, write a utility function to create a list of MSC numbers for an advisor’s degrees and those of each advisee:
msctransition[person_]:=Module[{msc="MSCNumber",d="Degrees",advisormsc,adviseemscs,dm=DeleteMissing}, advisormsc=#[msc]&/@person[d]; adviseemscs=#[msc]&/@Flatten[#[d]&/@dm[Flatten[{person["Advised"]}]]]; dm[{advisormsc,{#}}&/@DeleteCases[adviseemscs,Alternatives@@advisormsc],1,2]] 
For example, for Maurice Fréchet:
TextGrid[msctransition[Entity["MGPPerson", "17947"]]/.mscnames,Dividers->All]
Find MSC transitions for degree dates after 1988:
transitiondata=msctransition/@Select[DeleteMissing[ EntityValue["MGPPerson",{"Entity","DegreeDates"}],1,2],Min[#["Year"]&/@#[[2]]]>1988&][[All,1]]; 
transitiondataaccumulated=Tally[Flatten[Apply[Function[{a,b},Outer[DirectedEdge,a,b]], Flatten[Take[transitiondata,All],1],{1}],2]]/.mscnames; 
toptransitions=Select[transitiondataaccumulated,Last[#]>10&]/.mscnames; 
Grid[Reverse[Take[SortBy[transitiondataaccumulated,Last],10]],Dividers->Center,Alignment->Left]
msctransitiongraph=Graph[First/@toptransitions,EdgeLabels->Placed["Name",Tooltip],VertexLabels->Placed["Name",Tooltip],GraphLayout->"HighDimensionalEmbedding"];
With[{max=Max[Last/@toptransitions]}, HighlightGraph[msctransitiongraph,Style[#1,Directive[Arrowheads[0.05(#2/max)^.5],ColorData["DarkRainbow"][(#2/max)^6.],Opacity[(#2/max)^.5],Thickness[0.005(#2/max)^.5]]]&@@@transitiondataaccumulated]] 
Construct a list of directed edges from advisors to their students:
Length[advisorPairs=Flatten[Function[{a,as},DirectedEdge[a,#]&/@as]@@@DeleteMissing[EntityValue["MGPPerson",{"Entity","Advised"}],1,2]]] 
Some edges are duplicated because the same student-advisor relationship exists for more than one degree:
SelectFirst[Split[Sort[advisorPairs]],Length[#]>1&] 
For example:
(EntityValue[Entity["MGPPerson", "110698"],{"AdvisedBy","Degrees"}]/.e:Entity["MGPDegree",_]:>{e,e["DegreeType"]}) 
So build an explicit advisor graph by uniting the {advisor, advisee} pairs:
advisorGraph=Graph[Union[advisorPairs],GraphLayout->None]
The advisor graph contains more than 3,500 weakly connected components:
Length[graphComponents=WeaklyConnectedGraphComponents[advisorGraph]] 
Visualize component sizes on a log-log plot:
ListLogLogPlot[VertexCount/@graphComponents,Joined->True,Mesh->All,PlotRange->All]
Find the size of the giant component (about 190,000 people):
VertexCount[graphComponents[[1]]] 
Find the graph center of the second-largest component:
GraphCenter[UndirectedGraph[graphComponents[[2]]]] 
Visualize the entire second-largest component:
Graph[graphComponents[[2]],VertexLabels->"Name",ImageSize->Large]
Identify the component in which David Hilbert resides:
FirstPosition[VertexList/@graphComponents,Entity["MGPPerson", "7298"]][[1]] 
Show Hilbert’s students:
With[{center=Entity["MGPPerson", "7298"]},HighlightGraph[Graph[Thread[center->AdjacencyList[graphComponents[[1]],center]],VertexLabels->"Name",ImageSize->Large],center]]
As it turns out, the mathematician Gaston Darboux plays an even more central role in the advisor graph. Here is some detailed information about Darboux, whose 1866 thesis was titled “Sur les surfaces orthogonales”:
Entity["MGPPerson", "34254"] ["PropertyAssociation"] 
And here is a picture of Darboux:
Show[WikipediaData["Gaston Darboux","ImageList"]//Last,ImageSize->Small]
Many mathematical constructs are named after Darboux:
Select[EntityValue["MathWorld","Entities"],StringMatchQ[#[[2]],"*Darboux*"]&] 
... and his name can even be used in adjectival form:
StringCases[Normal[WebSearch["Darbouxian *",Method -> "Google"][All,"Snippet"]], "Darbouxian"~~" "~~(LetterCharacter ..)~~" "~~(LetterCharacter ..)]//Flatten//DeleteDuplicates//Column
Many well-known mathematicians are in the subtree starting at Darboux. In particular, in the directed advisor graph we find a number of recent Fields Medal winners. Along the way, we also encounter such figures as Laurent Schwartz, Alexander Grothendieck and Antoni Zygmund:
{path1,path2,path3,path4}=(DirectedEdge@@@Partition[FindShortestPath[graphComponents[[1]],Entity["MGPPerson", "34254"],#],2,1])&/@ {Entity["MGPPerson", "13140"],Entity["MGPPerson", "22738"],Entity["MGPPerson", "43967"],Entity["MGPPerson", "56307"]} 
Using the data from the EntityStore, we build the complete subgraph starting at Darboux:
adviseeedges[pList_]:=Flatten[Function[p,DirectedEdge[Last[p],#]&/@ DeleteMissing[Flatten[{Last[p]["Advised"]}]]]/@pList]
advgenerations=Rest[NestList[adviseeedges,{Null->Entity["MGPPerson", "34254"]},7]];
alladv=Flatten[advgenerations]; 
It contains more than 14,500 mathematicians:
Length[Union[Cases[alladv,_Entity,∞]]]-1
Because it is a complicated graph, we display it in 3D to avoid overcrowded zones. Darboux sits approximately in the center:
gr3d=Graph3D[alladv,GraphLayout->"SpringElectricalEmbedding"]
We now look at the degree centrality of the nodes of this graph in a log-log plot:
ListLogLogPlot[Tally[DegreeCentrality[gr3d]]] 
Let’s now highlight the paths in that graph leading to the Fields Medal winners:
style[path_,color_]:=Style[#,color,Thickness[0.004]]&/@path 
HighlightGraph[gr3d, Join[{Style[Entity["MGPPerson", "34254"],Orange,PointSize[Large]]}, style[path1,Darker[Red]],style[path2,Darker[Yellow]],style[path3,Purple], style[path4,Darker[Green]]]] 
Geographically, Darboux’s descendants are distributed around the whole world:
makeGeoPath[DirectedEdge[e1_,e2_]] := With[{s1=e1["DegreeSchoolEntities"],s2=e2["DegreeSchoolEntities"],d1=e1["DegreeDates"],d2=e2["DegreeDates"],color=ColorData["DarkRainbow"][(Mean[{#1[[1,1,1]],#2[[1,1,1]]}]-1870)/150]&}, If[MemberQ[{s1,s2,d1,d2},_Missing,∞]||s1===s2,{},{Thickness[0.001],color[d1,d2],Arrowheads[0.012],Tooltip[Arrow[GeoPath[{s1[[1]],s2[[1]]}]], Grid[{{"","advisor","advisee"},{"name",e1,e2},Column/@{{"school"},s1,s2}, Column/@{{"degree date"},d1,d2}},Dividers->Center]]}]]
Here are the paths from the advisors’ schools to the advisees’ schools after four and six generations:
GeoGraphics[makeGeoPath/@Flatten[Take[advgenerations,4]],GeoBackground->"StreetMapNoLabels",GeoRange->"World"]//Quiet
GeoGraphics[makeGeoPath /@ Flatten[Take[advgenerations, 6]], GeoBackground -> "StreetMapNoLabels", GeoRange -> "World"] // Quiet
Extract a list of advisors and the dates at which their advisees received their PhDs:
Take[AdvisorsAndStudentPhDDates=SplitBy[Sort[Flatten[Thread/@Cases[EntityValue["MGPDegree",{"Advisors","DegreeType","Date"}],{l_List,"Ph.D.",DateObject[{y_},___]}:>{l,y}],1]],First],5] 
This list includes multiple student PhD dates for each advisor, so select the dates of the first students’ PhDs only:
Take[AdvisorsAndFirstStudentPhDDates=DeleteCases[{#[[1,1]],Min[DeleteMissing[#[[All,2]]]]}&/@AdvisorsAndStudentPhDDates,{_,Infinity}],10] 
Now extract a list of PhD awardees and the dates of their PhDs:
Take[PhDAndDates=DeleteCases[Sort[Cases[EntityValue["MGPDegree",{"Advisee","DegreeType","Date"}],{p_,"Ph.D.",DateObject[{y_},___]}:>{p,y}]],{_Missing,_}],10] 
Note that some advisees have more than one PhD:
Select[SplitBy[PhDAndDates,First],Length[#]>1&]//Take[#,5]&//Column 
For example:
Entity["MGPPerson", "100896"]["Degrees"] 
... who has these two PhDs:
EntityValue[%,{"Date","DegreeType","SchoolName"}] 
While having two PhDs is not unheard of, having three is unique:
Tally[Length/@SplitBy[PhDAndDates,First]] 
In particular:
Select[SplitBy[PhDAndDates,First],Length[#]===3&] 
Select the first PhDs of advisees and make a set of replacement rules to their first PhD dates:
Take[FirstPhDDateRules=Association[Thread[Rule@@@SplitBy[PhDAndDates,First][[All,1]]]],5] 
Now replace advisors by their first PhD years and subtract from the year of their first students’ PhDs:
Take[times=Subtract@@@(AdvisorsAndFirstStudentPhDDates/.FirstPhDDateRules),10] 
The data contains a small number of discrepancies where students allegedly received their PhDs prior to their advisors:
SortBy[Select[Transpose[{AdvisorsAndFirstStudentPhDDates[[All,1]],AdvisorsAndFirstStudentPhDDates/.FirstPhDDateRules}],GreaterEqual@@#[[2]]&],Subtract@@#[[2]]&]//Take[#,10]& 
Removing these problematic points and plotting a histogram reveals the distribution of years between advisors’ and first advisees’ PhDs:
Histogram[Cases[times,_?Positive]] 
We hope you have found this computational exploration of mathematical genealogy of interest. We thank Mitch Keller and the Mathematics Genealogy Project for their work compiling and maintaining this fascinating and important dataset, as well as for allowing us the opportunity to explore it using the Wolfram Language. We hope to be able to freely expose a Wolfram Data Repository version of the MGP dataset in the near future so that others may do the same.
A couple of weeks ago I shared a package for controlling the Raspberry Pi version of Minecraft from Mathematica (either on the Pi or from another computer). You can control the Minecraft API from lots of languages, but the Wolfram Language is very well aligned to this task—both because the rich, literate, multiparadigm style of the language makes it great for learning coding, and because its high-level data and computation features let you get exciting results very quickly.
Today, I wanted to share four fun Minecraft project ideas that I had, together with simple code for achieving them. There are also some ideas for taking the projects further.
If you haven’t already installed the MinecraftLink package, follow the instructions here.
The Minecraft world is made up of blocks of different colors and textures. If we arrange these appropriately, we can use the colors to create grainy pictures. I want to automate this process of converting pictures to Minecraft blocks.
The first step is to start a new world in Minecraft on the Raspberry Pi, then load the MinecraftLink package:
<<MinecraftLink`
If you are using Mathematica on a different computer than the Pi (as I am), you need to connect the two together using the IP address or the name of the Raspberry Pi (your address will be different than mine, shown here). If you are running the code on the Pi, you can skip this step:
MinecraftConnect["10.10.163.22"] 
The MinecraftLink package automatically installs some MinecraftBlock Entity data from the Wolfram Data Repository.
Some of those entities include images that I can analyze to figure out the block’s average color. First, I need to select all the entities that have images available. But I found out the hard way that we have to remove a few of those blocks for various reasons: the transparency of blocks like glass and cobweb just looks bad, and some blocks must be removed because of the game physics of Minecraft. Soft blocks like sand fall off the picture, fire only exists on top of certain blocks and water spreads all over the picture, so those are all removed from the list.
EntityList[Entity["MinecraftBlock", "Image" -> ImageQ]]
available=Complement[EntityList[Entity["MinecraftBlock", "Image" -> ImageQ]],{Entity["MinecraftBlock", "Glass"],Entity["MinecraftBlock", "Leaves"],Entity["MinecraftBlock", "Cobweb"],Entity["MinecraftBlock", "Sand"],Entity["MinecraftBlock", "Gravel"],Entity["MinecraftBlock", "Snow"],Entity["MinecraftBlock", "Fire"],Entity["MinecraftBlock", "WaterStationary"]}]
Here are the images that we have:
Magnify[{#["Image"],#}&/@available,0.6] 
Most blocks (subject to lighting) are the same on all faces, but a few have different textures on their side faces than on their top faces. I plan to look at all blocks from the side, so I want to figure out each block’s average side-face color. To do this, I created the following mask for the position of the side-face pixels of the gold block:
mask = Erosion[ DominantColors[CloudGet["https://wolfr.am/xJ2pPzQS"], 4, "CoverageImage"][[2]], 2] 
Because all the images have the same shape and viewpoint, I can apply that mask to every block to pick out their side-face pixels:
mask RemoveAlphaChannel[Entity["MinecraftBlock","WoodBirch"]["Image"]] 
To make sure I am using a like-for-like measurement, I remove the transparency layer (AlphaChannel) and put them all into the same color space. Then I just ask for the average pixel value and convert that back to an average color (working in HSB color gives more perceptually correct averaging of colors):
averageColor[block_]:=Hue[ImageMeasurements[ColorConvert[RemoveAlphaChannel[block["Image"],LightBlue],"HSB"],"Mean",Masking->mask]]
colors=Map[averageColor,available] 
We can now look at our available palette:
ChromaticityPlot[colors] 
You can see from this plot of colors in the visible color space that we have rather poor coverage of high-saturation colors, and something of a gap around the green/cyan border, but this is what we have to work with.
Now we need a function that takes a color and picks out the block name that is nearest in color. The Wolfram Language already knows about perceptual color distance, so Nearest handles this directly:
getName[color_]:=First[Nearest[MapThread[Rule,{colors,available}],color]]; 
For example, the block that is closest to pure red is wool orange:
getName[Red] 
Now we need a function that will take a picture and drop its resolution to make it more “blocky” and simplify the image to use only the colors that are available to us:
toBlockColors[img_,size_]:=ColorQuantize[ImageResize[img,size],colors]; 
Let’s now apply that to a wellknown picture:
toBlockColors[CloudGet["https://wolfr.am/xJ2fHJWp"], 50] 
Now we just have to count through the pixels of that image, find the name of the block with the nearest color to the pixel and place it in the corresponding place in the Minecraft world:
putPicture[{x0_,y0_,z0_},img_]:= Block[{dims=ImageDimensions[img]}, Do[ MinecraftSetBlock[{dims[[1]]-x+x0,y+y0,z0},getName[RGBColor[ImageValue[img,{x,y}]]]], {x,dims[[1]]},{y,dims[[2]]}]];
Find a big open space...
And run the program on a simple image:
putPicture[{30, 0, 0}, toBlockColors[CloudGet["https://wolfr.am/xJ2fHJWp"], 50]] 
You can use Import to bring images into the system, but the Wolfram Language provides lots of images as part of its Entity system. For example, you can fetch famous works of art:
Entity["Artwork", "AmericanGothic::GrantWood"]["Image"] 
Here is a detail from American Gothic (Grant Wood’s sister) in blocks:
putPicture[{30,0,0},toBlockColors[ImageTake[Entity["Artwork", "AmericanGothic::GrantWood"]["Image"],{25,75},{10,50}],50]] 
You can even, at an incredibly low frame rate, make an outdoor movie theater by streaming frames from your webcam onto the wall. Here is me working on this blog post!
While[True,putPicture[{30,0,0},toBlockColors[CurrentImage[],50]]] 
An interesting challenge would be to reverse this process to generate a map of the Minecraft world. You would need to scan the surface block type at every position in the Minecraft world and use the color map created to color a single pixel of the output map.
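As a rough sketch of that reverse direction—assuming the MinecraftGetBlock and MinecraftGetHeight functions from the MinecraftLink package behave as their names suggest, and reusing the available and colors lists from above—you might build the map one pixel at a time:

```wl
(* Hypothetical sketch: scan the surface block at each {x, z} position
   and color one map pixel with that block type's average color *)
blockColor = AssociationThread[available -> colors];
worldToMap[{xmin_, xmax_}, {zmin_, zmax_}] :=
  Image[Table[
    List @@ ColorConvert[
      Lookup[blockColor,
        (* the block just below the surface height at this position *)
        MinecraftGetBlock[{x, MinecraftGetHeight[{x, z}] - 1, z}],
        Black],  (* fall back to black for unknown block types *)
      "RGB"],
    {z, zmax, zmin, -1}, {x, xmin, xmax}]]
```

Scanning block by block is slow over the network connection, so a small Pause between requests (as in the 3D image example later) would likely be needed in practice.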
This project sounds quite hard, but thanks to the builtin data in the Wolfram Language, it is actually very simple.
Let’s suppose I want to create my home country of the United Kingdom in Minecraft. All I need to do is place a grid of blocks at heights that correspond to heights of the land in the United Kingdom. We can get that data from the Wolfram Language with GeoElevationData:
ListPlot3D[Reverse/@GeoElevationData[Entity["Country", "UnitedKingdom"]],PlotRange->{-10000,20000},Mesh->False]
You will see that the data includes underwater values, so we will need to handle those differently to make the shape recognizable. Also, we don’t need anywhere near as much resolution (GeoElevationData can go to a resolution of a few meters in some places). We need something more like this:
ListPlot3D[Reverse/@GeoElevationData[Entity["Country", "UnitedKingdom"],GeoZoomLevel->3],PlotRange->{0,5000},Mesh->False]
Now let’s make that into blocks. Let’s assume I will choose the minimum and maximum heights of our output. For any given position, I need to create a column of blocks. If the height is positive, this should be solid blocks up to the height, and air above. If the height is negative, then it is solid up to the point, water above that until we reach a given sea level value and then air above that.
createMapColumn[{x_,y_,z_},seaLevel_,min_,max_]:=(MinecraftSetBlock[{{x,min,z},{x,y,z}},"Dirt"];If[y>=seaLevel,MinecraftSetBlock[{{x,y,z},{x,max,z}},Entity["MinecraftBlock", "Air"]], MinecraftSetBlock[{{x,y,z},{x,seaLevel-1,z}},Entity["MinecraftBlock", "WaterStationary"]];MinecraftSetBlock[{{x,seaLevel,z},{x,max,z}},"Air"]]);
Now we just need to create a column for each position in our elevation data.
All the work is in transforming the numbers. The reversing and transposing is to get the coordinates to line up with the compass correctly, QuantityMagnitude gets rid of units, and the rest is vertical scaling:
MinecraftElevationPlot[data0_,{x0_,seaLevel_,z0_}, maxHeight_:5]:= Block[{data=QuantityMagnitude[Reverse[Map[Reverse,Transpose[data0]]]],scale, min,dims}, dims=Dimensions[data]; scale= maxHeight/Max[Flatten[data]]; min= Round[scale*Min[Flatten[data]]]; Do[createMapColumn[{Round[x0+i],Floor[scale data[[i,j]]+seaLevel],z0+j},Round[seaLevel], seaLevel+min,Round[maxHeight+seaLevel]],{i,dims[[1]]},{j,dims[[2]]}]] 
Before we start, we can use the following code to clear a large, flat area to work on and put the camera high in the air above the action:
MinecraftSetBlock[{{-40,-10,-40},{40,0,40}},"Grass"]; MinecraftSetBlock[{{-40,0,-40},{40,50,40}},"Air"];
MinecraftSetCamera["Fixed"]; MinecraftSetCamera[{0,25,0}] 
And now we can place the map:
MinecraftElevationPlot[GeoElevationData[Entity["Country", "UnitedKingdom"],GeoZoomLevel->2],{-15,0,-15},5]
You can just see that the land is higher in mountainous Scotland. You can see that better with the camera in the usual position, but the coastline becomes harder to see.
Alternatively, here is the view of the north ridge of Mount Everest, as seen from the summit:
MinecraftSetCamera["Normal"] MinecraftElevationPlot[GeoElevationData[GeoDisk[Entity["Mountain", "MountEverest"],Quantity[3,"Miles"]],GeoZoomLevel->9],{15,18,15},30]
A nicer version of this might switch materials at different heights to give you snow-capped mountains, or sandy beaches. I will leave that for you to add.
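One way to sketch that idea—with arbitrary cutoffs of my own choosing for the beach and snow line, not values from the original code—is to pick the surface material from the column height:

```wl
(* Hedged sketch: choose a surface material from a column's height;
   the thresholds here are arbitrary assumptions to tune by eye *)
materialForHeight[y_, seaLevel_, max_] :=
  Which[
    y <= seaLevel + 1, "Sand",   (* beaches just above sea level *)
    y >= seaLevel + Round[0.8 (max - seaLevel)], "Snow",  (* near the peaks *)
    True, "Grass"]
```

createMapColumn could then place a single block of materialForHeight[y, seaLevel, max] on top of each dirt column instead of leaving bare dirt.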
If you are unlucky enough to need an MRI or CT scan, then you might end up with 3D image data such as this built-in example:
Show[ExampleData[{"TestImage3D","CThead"}],BoxRatios->1]
This might seem complex, but it is actually a simpler problem than the photo renderer we did earlier, because color isn’t very meaningful in the CT world. Our simplest approach is just to drop the scan’s resolution, and convert it into either air or blocks.
Binarize[ImageResize[ExampleData[{"TestImage3D","CThead"}],{80,80,80}]] 
We can easily find the coordinates of all the solid voxels, which we can use to place blocks in our world:
Position[ImageData[%],1] 
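The binarize-then-Position step translates directly to other languages. Here is a minimal Python sketch (pure standard library; the volume is just nested lists, and the function name is my own) of extracting solid-voxel coordinates from a 3D intensity array:

```python
def solid_voxels(volume, threshold=0.5):
    """Return (x, y, z) indices of voxels brighter than the threshold --
    the analogue of Binarize followed by Position[..., 1]."""
    return [(x, y, z)
            for x, plane in enumerate(volume)
            for y, row in enumerate(plane)
            for z, value in enumerate(row)
            if value > threshold]

# Toy 2x2x2 volume; raising the threshold keeps only the "denser" voxels,
# which is how the bone is later separated from soft tissue.
vol = [[[0.1, 0.9], [0.5, 0.2]],
       [[0.8, 0.3], [0.6, 0.7]]]
```

With the default threshold, `solid_voxels(vol)` keeps four voxels; raising it to 0.75 keeps only the two brightest, mirroring the skull-extraction trick below.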
We can wrap all of that into a single function, and add in an initial position in the Minecraft world. I added the small pause because if you run this code from a desktop computer, it will flood the Minecraft server with more requests than it can handle, and Minecraft will drop some of the blocks:
fixCoordinates[{a_,b_,c_}]:={c,a,b} (*Mapping coordinate systems*) 
minecraftImage3D[img_Image3D,pos_List,block_,threshold_:Automatic]:=( MinecraftSetBlock[{pos,pos+ImageDimensions[img]},"Air"];Map[(Pause[0.01];MinecraftSetBlock[pos+#,block])&,fixCoordinates/@Position[ImageData[Binarize[img,threshold]],1]];) 
And here it is in action with the head:
minecraftImage3D[ ImageResize[ExampleData[{"TestImage3D","CThead"}],{40,40,40}],{0,40,0},"GoldBlock"] 
But one thing to understand with 3D images is that there is information “inside” the image at every level, so if we change the threshold for binarizing then we can pick out just the denser bone material and make a skull:
Binarize[ImageResize[ExampleData[{"TestImage3D","CThead"}],{80,80,80}],0.4] 
minecraftImage3D[ ImageResize[ExampleData[{"TestImage3D","CThead"}],{40,40,40}],{0,40,0},"GoldBlock",0.4] 
An interesting extension would be to establish three levels of density and use the glass block type to put a transparent skin on the skull. I will leave that for you to do. You can find DICOM images on the web that can be imported into the Wolfram Language with Import, but beware—some of those can be quite large files.
The final project is about creating new game behavior. My idea is to create a special block combination that triggers an automatic action. Specifically, when you place a gold block on top of a glowstone block, a large pyramid will be built for you.
The first step is to scan the surface blocks around a specific point for gold and return a list of surface gold block positions found:
scanForGold[{x0_,y0_,z0_}]:=Block[{goldPos={},height}, Do[Pause[0.1]; height=MinecraftGetHeight[{x,z}]; If[MinecraftGetBlock[{x,height-1,z}]===Entity["MinecraftBlock","GoldBlock"],AppendTo[goldPos,{x,height-1,z}]],{x,x0-1,x0+1},{z,z0-1,z0+1}]; goldPos];
Next, we look under each of the gold blocks that we found and see if any have glowstone under them:
checkGoldForGlowstone[goldPos_]:=FirstCase[goldPos,{x_,y_,z_}/;MinecraftGetBlock[{x,y-1,z}]===Entity["MinecraftBlock","GlowstoneBlock"]]
Now we need a function that performs the resulting actions. It posts a message, removes the two special blocks and sets the pyramid:
pyramidActions[found_]:=(MinecraftChat["Building Pyramid"]; MinecraftSetBlock[{found,found-{0,1,0}},"Air"]; MinecraftSetBlock[found-{0,1,0},"GoldBlock",Pyramid[],RasterSize->12]);
We can now put all of that together into one function that scans around the current player and runs the actions on the first matching location. The PreemptProtect is a bit subtle. Because I am going to run this as a background task, I need to make sure that I don’t perform two actions at once, as the messages going back and forth to the Minecraft server may get muddled:
pyramidCheck[]:=PreemptProtect[Block[{found=checkGoldForGlowstone[scanForGold[MinecraftGetTile[]]]},If[Not[MissingQ[found]],pyramidActions[found]]]] 
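The scan-then-check logic is independent of the Minecraft API, so here is a hedged Python sketch of the same trigger detection. Both `world` (a dict mapping (x, y, z) to block names) and `height_of` (standing in for MinecraftGetHeight) are hypothetical stand-ins, not real MinecraftLink calls:

```python
def find_trigger(world, center, height_of):
    """Scan the 3x3 neighborhood of surface columns around center=(x, z)
    for a gold block resting directly on glowstone; return its position,
    or None when the special combination is absent."""
    x0, z0 = center
    for x in range(x0 - 1, x0 + 2):
        for z in range(z0 - 1, z0 + 2):
            y = height_of(x, z) - 1  # topmost solid block in this column
            if (world.get((x, y, z)) == "GoldBlock"
                    and world.get((x, y - 1, z)) == "GlowstoneBlock"):
                return (x, y, z)
    return None
```

A surface gold block with glowstone beneath it triggers the action; a lone gold block does not.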
All that is left is to run this code repeatedly every five seconds:
task=SessionSubmit[ScheduledTask[pyramidCheck[],5]] 
I place the blocks like this...
... walk up within one block of the special column and wait for a couple of seconds, until this happens...
To stop the task, you can evaluate...
TaskRemove[task] 
Well, that’s the end of my short series on Minecraft coding in the Wolfram Language. There are lots of fun knowledge domains and computation areas that I could have injected into Minecraft. I had thought of reading surface blocks and constructing a 3D minimap of the world using the Wolfram Language’s data visualization, or creating a solar system of orbiting planets using the differential equation solver. I also considered creating a terrain generator using 3D cellular automata or fractals. But I do have a day job to do, so I will leave those for others to try. Do post your own project ideas or code on Wolfram Community.
The first step is to make sure that you have all the right components. Make sure that you have the latest version of Raspbian and the Wolfram Language. You do this by connecting your Raspberry Pi to the network, opening the Terminal app and typing the following:
sudo apt-get update
sudo apt-get dist-upgrade
Now open Mathematica on the Pi, or another computer, and type:
PacletInstall["MinecraftLink"] 
… followed by Shift + Return to evaluate it. If all went well, we are ready to start.
The MinecraftLink library adds a small set of new commands to the Wolfram Language for connecting to a running Raspberry Pi Minecraft game.
Start by launching Minecraft on the Raspberry Pi, and then start a fresh game or open an existing one. In the Wolfram Language, load the library by evaluating the following:
<<MinecraftLink`
This extends the Wolfram Language with the following new commands:
?MinecraftLink`* 
You can find documentation on these by evaluating MinecraftHelp[] after you have installed the link.
You can control a Minecraft game running on the Raspberry Pi from the Wolfram Language running on the same Raspberry Pi, or from any other computer that has a network connection to the Pi. If you’re connecting from a different computer, you must now tell the Wolfram Language the name or IP address of the Raspberry Pi where Minecraft is running...
MinecraftConnect["10.10.163.22"] 
You don’t need to do this if both programs are on the same machine, but if you need to reset the connection, you can use MinecraftConnect[] or MinecraftConnect["localhost"].
Let’s test to see if that worked by evaluating the following code:
MinecraftChat["Hello from the Wolfram Language"] 
You should see the message appear briefly in the game chat area:
We need to find out where we are in the Minecraft world. Minecraft uses a simple {x, y, z} coordinate system, where x and z are the horizontal directions (x is left/right if you have just started the game) and y is the vertical direction. If you have started a fresh game, you will be near to {0, 0, 0}. You can see the coordinates in the top-left corner of the screen, but to get them programmatically you can use:
MinecraftGetPosition[] 
We can teleport the character to a new location (in this case, up in the air) with:
MinecraftSetPosition[{0,50,0}] 
If we have just started a game, then 8 blocks in front of us is {0, 0, 8}. But depending on how mountainous the terrain is, that block might be above or below ground. We can find the surface level with:
y=MinecraftGetHeight[{0,8}] 
We can test that by looking at the block at that position. It should be Air.
pos={0,y,8} 
MinecraftGetBlock[pos] 
And the block below it should be something solid:
MinecraftGetBlock[pos-{0,1,0}]
Now we can start building. We can place blocks of any type—for example, "Wood":
MinecraftSetBlock[pos,"Wood"] 
We remove them by just overwriting the block with something else, such as "Air":
MinecraftSetBlock[pos,"Air"] 
But if you want a full undo, you must precede your changes with:
MinecraftSave[] 
And then if you don’t like your changes, you can undo them with:
MinecraftRestore[] 
The list of the 156 available Minecraft block names is in the symbol $MinecraftBlockNames:
Short[$MinecraftBlockNames,5] 
One reason to use the Wolfram Language for this is that it handles all kinds of interesting 2D and 3D objects, and I have set up the SetBlock command to handle these fairly automatically. For example, let’s paint a letter X in the sky in gold.
MinecraftSetBlock[pos,"GoldBlock","X"] 
We can remove it again by replacing it with "Air":
MinecraftSetBlock[pos,"Air","X"] 
By default, rasterized content will be 12 blocks wide, so if you need more detail, you can increase that with an option:
MinecraftSetBlock[pos,"GoldBlock","",RasterSize->50]
MinecraftSetBlock[pos,"Air","",RasterSize->50]
Anything you can create in the Wolfram Language can be made into blocks. Here is a plot of the function Sin[x]:
MinecraftSetBlock[pos,"Dirt",Plot[Sin[x],{x,0,12},Axes->False],RasterSize->18]
MinecraftSetBlock[pos,"Air",Plot[Sin[x],{x,0,12},Axes->False],RasterSize->18]
You can also control the orientation of rasterized images with an option Orientation.
If the content is a 3D geometry, then it will be rasterized in 3D:
MinecraftSetBlock[pos,"Wood",Sphere[],RasterSize->50]
MinecraftSetBlock[pos,"Air",Sphere[],RasterSize->50]
There are lots of 3D geometry primitives, and they can be combined in many ways. Here are some cuboids, a pyramid and a sphere to make a house:
(*Main house frame*)MinecraftSetBlock[{pos,pos+{8,3,8}},"Wood"]; (*Windows*)MinecraftSetBlock[{pos+{1,0,0},pos+{7,3,8}},"Glass"]; (*Make it hollow*)MinecraftSetBlock[{pos+{1,0,1},pos+{7,3,7}},"Air"]; (*Add a doorway*)MinecraftSetBlock[{pos+{4,0,0},pos+{4,1,0}},"Air"]; (*Add a roof*)MinecraftSetBlock[pos+{0,4,0},"WoodPlanksSpruce",Pyramid[],RasterSize->12]; (*Decorate with gold ball*)MinecraftSetBlock[pos+{3,8,2},"GoldBlock",Sphere[],RasterSize->5];
OK, I’m not much of an architect! We can look at our creation from the air by controlling the camera:
MinecraftSetCamera["Fixed"]; MinecraftSetCamera[{0,25,6}]; 
MinecraftSetCamera["Normal"] 
Finally, we can interact with blocks that you hit using the right mouse button while holding a sword. The left mouse button just places and smashes blocks, but the right mouse button creates events that wait for us to read and act on them. You can read these with:
MinecraftHitHistory[] 
This shows that since the game started, I have done two of these special hits, each time on the same block at {–1, 2, 2}, on face number 1 (the top of the block). I am player 1, but there could be multiple players. I can fetch these pieces of information by position and name. For example, MinecraftHitHistory[-1] is the last hit, and we extract its "Position" information and use that coordinate in MinecraftGetBlock to discover the type of block that was most recently hit:
MinecraftGetBlock[MinecraftHitHistory[-1]["Position"]]
And we can clear the data with:
MinecraftClearHits[] 
As a simple example, let’s monitor this list every second and create an explosive “super hit.” I will define the explosion first. It is a function that takes a position and places a large sphere of air at that position:
explosion[event_]:=MinecraftSetBlock[event["Position"]-{2,2,2},"Air",Ball[],RasterSize->8];
Now I create a scheduled task to run every second, and apply that function to the hit history:
task=SessionSubmit[ ScheduledTask[ Map[explosion,MinecraftHitHistory[]];MinecraftClearHits[],1]] 
And now when I strike the ground in front of my house with my sword, using the right mouse button, a huge hole appears...
I can remove the monitoring task with:
TaskRemove[task] 
There are a few more commands in the MinecraftLink package that you can read about in the documentation after you have installed the link.
As well as giving you a simple programming interface to Minecraft, similar to other languages, the Wolfram Language contains hundreds of high-level functions that let you develop much more exciting projects quickly; you might want to check out some of the 3D geometry, 3D image processing and built-in data sources as a starting point.
I will return soon with a few projects of my own.
In the US House of Representatives, each state is assigned a number of representatives based on its population through the process of apportionment (or reapportionment). On the surface, the rules for this process are simple: each state gets at least one representative, and representative seats must be redistributed at least once per decennial census.
Apportionment has been tried using various mathematical methods throughout history. Since the 1940 Census, representatives have been assigned using the method of equal proportions (the Huntington–Hill method). This means that the next available slot goes to the state with the highest priority, defined as:
TraditionalForm[Subscript[A, n]==P/GeometricMean[{n,n-1}]]
… where P is the population of the state and n is the number of districts already assigned to the state. You might recognize the denominator as the geometric mean of n and n – 1. It’s straightforward to implement symbolically:
Priority[pop_,n_]:=N[pop/GeometricMean[{n,n-1}]]
Formula in hand, I’d like to run a simulation to compare to the current apportionment plan. First I’ll pull the 2010 population data from the Wolfram Knowledgebase (excluding the District of Columbia):
states=Complement[EntityClass["AdministrativeDivision","AllUSStatesPlusDC"]["Entities"],{Entity["AdministrativeDivision", {"DistrictOfColumbia", "UnitedStates"}]}];
statenames=StringDrop[#["Name"],-15]&/@states;
popdata=AssociationThread[statenames->Table[QuantityMagnitude[Dated[s,2010]["Population"]],{s,states}]];
RandomChoice[Normal@popdata] 
It’s worth noting that these population counts are slightly different from the official reapportionment numbers, which include overseas residents for each state. The discrepancy is too small to make a difference in my apportionment computations, but it could be a topic for a more detailed exploration.
To start my simulation, I give each state one representative. The initial 50 are actually assigned before applying the formula, so I’ll set those initial priority values at Infinity:
init=Thread[statenames->∞];
From there, districts are assigned based on successively smaller priority values. Historically, no state has received more than 55 seats, so I’ll set the upper limit at 60:
pvalues=Flatten@Table[Normal@Priority[popdata,i],{i,2,60}]; 
Since only 435 seats are available, the rest can be dropped:
app=TakeLargestBy[Join[init,pvalues],Values[#]&,435]; 
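The whole procedure — one guaranteed seat per state, then repeatedly awarding the next seat to the highest-priority state — also fits in a few lines of Python. This is a sketch of the method of equal proportions with made-up populations, not the official Census Bureau implementation:

```python
import math

def apportion(populations, total_seats):
    """Method of equal proportions: every state starts with one seat;
    each remaining seat goes to the state maximizing
    population / sqrt(n * (n - 1)), where n is its next seat number."""
    seats = {state: 1 for state in populations}  # constitutional minimum
    for _ in range(total_seats - len(populations)):
        best = max(populations, key=lambda s: populations[s]
                   / math.sqrt(seats[s] * (seats[s] + 1)))
        seats[best] += 1
    return seats
```

For three states with populations 6, 3 and 1 (in millions, say) sharing six seats, this yields 3, 2 and 1 seats respectively.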
Here’s a function that displays the apportionment data on a map:
DistrictWeightMap[apportionment_]:=GeoRegionValuePlot[KeyMap[Interpreter["USState"],apportionment]//Normal,GeoRange->Entity["Country", "UnitedStates"],GeoProjection->"Mercator",GeoLabels->(Text[Style[#4,FontFamily->"Arabic Transparent",White,Medium],#3]&), ImageSize->1200,ColorFunction->(Which[#<.02,GrayLevel[0.6],.02<#<=.15,Darker@Blue,.15<#<=.4,Darker@Orange,#>.4,Darker@Red]&),PlotRange->{0,Length@apportionment}, PlotLegends->Histogram]
My simulation yields the following apportionment counts, which match the current plan exactly:
DistrictWeightMap[app] 
(I’ve left off Alaska and Hawaii here for easier viewing, but they have one and two districts, respectively.)
The priority formula can be used to simulate the effect of population shifts on the next apportionment stage. For instance, the continuing decline in Illinois’ population could cause the state to lose a seat in the next cycle—currently, it gets the 423rd seat:
Position[Normal[app],"Illinois"][[1,1]] 
Priority[#,17]&/@{#,#-40000.}&@popdata["Illinois"]
Here are a few states likely to overtake that particular slot:
Select[pvalues,10000 
Indeed, Texas is likely to receive an additional seat (its 37th) based on its explosive growth since 2010:
Priority[#,38]&/@{#,#+3000000}&@popdata["Texas"] 
Take[app,10] 
Here’s a quick function I made to show differences in apportionment counts:
DistrictDifferenceMap[newapp_,oldapp_]:=GeoRegionValuePlot[Quiet[Normal@KeyMap[Interpreter["USState"],Merge[{newapp,oldapp},Subtract@@#&]]/.{Subtract[a__]:>a}], GeoProjection->"Mercator", GeoRange->Entity["Country", "UnitedStates"], ImageSize->540, GeoLabels->(Text[Style[#4,"Text",White,10,FontFamily->"Arabic Transparent"],#3]&),ColorRules->({_?Positive->Green,_?Negative->Red,_->Gray})]
If I reapportion using the current population estimates, Texas and South Carolina both gain a seat, while Wisconsin and Pennsylvania both lose one:
latestpopdata=AssociationThread[statenames->Table[QuantityMagnitude[s["Population"]],{s,states}]];
latestpvalues=Flatten@Table[Normal@Priority[latestpopdata,i],{i,2,60}]; 
latestapp=TakeLargestBy[Join[init,latestpvalues],Values[#]&,435]; 
DistrictDifferenceMap[ReverseSort@Counts[Keys@latestapp],ReverseSort@Counts[Keys@app]] 
But population growth also affects the apportionment process in other ways. While the number of districts increased steadily for many years, it has remained essentially constant at 435 since 1913 (a limit codified in 1929). As a result, modern representatives have much larger constituencies than in the past—on average, these have nearly quadrupled over the last century:
uspophistory=Dated[Entity["Country", "UnitedStates"],All]["Population"]; 
DateListPlot[TimeSeriesWindow[uspophistory,{"1918",Today}]/435.,ColorFunction->"DarkRainbow",PlotRange->Full,PlotTheme->"Detailed"]
Some states also end up with much more populous districts than others. Many have argued that this violates the established “one person, one vote” principle by, for instance, giving voters in Wyoming (with around 500,000 voters per representative) more federal voting power than those in Montana (with about 900,000 voters per representative):
popperdist=ReverseSort@Association@Table[Interpreter["USState"][s]->N[popdata[s]/Counts[Keys@app][s]],{s,statenames}];
GeoRegionValuePlot[popperdist,GeoProjection->"Mercator",GeoRange->Entity["Country", "UnitedStates"],ColorFunction->"TemperatureMap"]
A Congressional Apportionment Amendment was drafted (but never ratified) that set an initial guideline of “[not] less than one Representative for every forty thousand persons.” Here’s what the breakdown would look like if we used that guideline today:
newapp=TakeLargestBy[Join[init,Flatten@Table[Normal[Priority[#,i]&/@popdata],{i,2,1000}]],Values[#]&,Floor[Total[popdata]/40000.]]; 
DistrictWeightMap[newapp] 
While the original limitation of 40,000 citizens per representative is perhaps no longer viable (one can imagine how chaotic an 8,000-member legislature would be), adding more seats certainly reduces the population spread among the districts:
newpopperdist=ReverseSort@Association@Table[Interpreter["USState"][s]->N[popdata[s]/Counts[Keys@newapp][s]],{s,statenames}];
GeoRegionValuePlot[newpopperdist,GeoProjection->"Mercator",GeoRange->Entity["Country", "UnitedStates"],ColorFunction->"TemperatureMap"]
Of course, apportionment is just the first step. Adding more seats would also mean adding more districts—and that would likely make the next stage a lot more complicated.
Since populations migrate and fluctuate, government officials are constitutionally required to redraw congressional districts following reapportionment. On its surface, this seems straightforward: divide each state into areas of equal population. But the reality can be deceptively complex.
First, using a naive approach, the number of ways to divide a population into equal parts is huge. Suppose you wanted to split a group of 50 people into non-overlapping groups of 10:
Times@@Binomial[Range[50,10,-10],10]/5!
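The same count can be checked with exact integer arithmetic in Python (using math.comb, available from Python 3.8): choose each group in turn from the people remaining, then divide by the orderings of the interchangeable groups.

```python
from math import comb, factorial

def equal_partitions(n, k):
    """Ways to split n people into unordered, non-overlapping groups of k."""
    groups = n // k
    ways = 1
    remaining = n
    for _ in range(groups):
        ways *= comb(remaining, k)    # pick the next group of k people
        remaining -= k
    return ways // factorial(groups)  # identical groups are interchangeable
```

As a small sanity check, equal_partitions(4, 2) gives 3 (the three ways to split four people into two pairs), and equal_partitions(50, 10) reproduces 50!/(10!^5 · 5!).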
This issue scales up with the size of the population; with the current population of the US, the number of ways to divide it into 435 equal districts (ignoring all other constraints) is truly astounding:
(Times@@Binomial[Range[#1,#2,-#2],#2]/#2!)&@@{QuantityMagnitude[Entity["Country", "UnitedStates"]["Population"]],435}
Then there’s the problem of actually drawing sensible districts with roughly equal population in each state. Congressional maps are usually drawn and approved by state legislatures, who must meet varying requirements for contiguousness, compactness and other qualities associated with “fair” districts. In a recent Wolfram Community post, Professor Marco Thiel explores a computational approach to drawing unbiased districts; here is how his algorithm splits up Iowa:
The latest district maps are available through the Wolfram Knowledgebase:
current=KeyDrop[GroupBy[EntityList["USCongressionalDistrict"],#["USState"]&],{"DistrictOfColumbia",Missing["NotApplicable"],Missing["NotAvailable"]}]; 
This makes it easy to (roughly) check the equal population requirement; the districts within each state differ by less than one percent, on average:
distpop=Table[DeleteMissing[#["Population"]&/@current[s]],{s,statenames}]; 
Mean@Table[If[Length[v]>1,N@StandardDeviation[v]/Mean@v,0.],{v,distpop}] 
In some cases, the maps have a nice geometric aesthetic, with shapes that fit together like a children’s puzzle. This type of map tends to follow county lines, only straying when necessary to satisfy the equal population requirement. The quintessential example of this is Iowa:
iacounties=EntityClass["AdministrativeDivision", "USCountiesIowa"]; 
Show[GeoListPlot[List/@current["Iowa"],PlotLegends->None],GeoListPlot[iacounties,PlotStyle->Directive[EdgeForm[Blue],FaceForm[Opacity[0]]]]]
Unfortunately, this isn’t the case in most states. By contrast, here is North Carolina’s notoriously jagged map:
nccounties=EntityClass["AdministrativeDivision", "USCountiesNorthCarolina"]; 
Show[GeoListPlot[List/@current["NorthCarolina"],PlotLegends->None],GeoListPlot[nccounties,PlotStyle->Directive[EdgeForm[Blue],FaceForm[Opacity[0]]]]]
This kind of irregular shape is considered one of the main indications of deliberate manipulation of districts (and indeed, North Carolina’s map is currently being contested in court), but that’s not to say that every oddly shaped district is gerrymandered. Crooked borders often evolve slowly as the demography of areas change subtly over time.
I wanted to dig a bit deeper, so I thought I’d take a look at historical congressional maps that are readily available for viewing and analysis. The TIGER/Shapefile format (with a .zip extension) can be imported directly as a GeoGraphics object containing all the districts combined:
Import["http://cdmaps.polisci.ucla.edu/shp/districts001.zip"] 
For inspecting individual districts, the files also contain detailed data in key-value pairs. I like using Association in this situation, since it lets me reference elements by name:
c1=Association@First@Import["http://cdmaps.polisci.ucla.edu/shp/districts001.zip","Data"]; 
Keys@c1 
The "LabeledData" element contains ordered information about individual districts:
ld=Association@c1["LabeledData"]; Keys@ld 
From there, I can create entries that associate each state name with its district numbers and geometry:
entries= <|#[[1]]-><|ToExpression[#[[2]]]->#[[3]]|>|>&/@ Transpose[{ ld["STATENAME"], ld["DISTRICT"], Polygon[Cases[#,_GeoPosition,All]]&/@c1["Geometry"] }];
Lastly, I consolidate all entries from each state:
statenames=Union[Keys@entries]//Flatten; 
districts=Association@Table[Merge[Sort@Select[entries,StringMatchQ[First@Keys@#,s]&],Association],{s,statenames}]; 
From here, I can easily examine districts on a per-state basis:
GeoListPlot[List/@Values@districts["Virginia"],PlotLegends->None]
Here is the original districting map for the entire country, drawn by the First United States Congress:
Show@@Table[GeoListPlot[List/@Values@d,PlotLegends->None],{d,districts}]
I put it all together into a function that can import the full district data from any past Congress into an Association for easy exploration:
CongressionalMapData[congressnumber_]:= Module[{baseURL="http://cdmaps.polisci.ucla.edu/shp/",raw,ld,entries,statenames},raw=Association@First@Import[baseURL<>"districts"<>StringPadLeft[ToString[congressnumber],3,"0"]<>".zip","Data"]; ld=Association@raw["LabeledData"]; entries= <|#[[1]]-><|ToExpression[#[[2]]]->#[[3]]|>|>&/@Transpose[{ld["STATENAME"],ld["DISTRICT"], Polygon[Cases[#,_GeoPosition,All]]&/@raw["Geometry"]}]; statenames=Union[Keys@entries]//Flatten; Association@Table[Merge[Sort@Select[entries,StringMatchQ[First@Keys@#,s]&],Association],{s,statenames}] ]
Rather than having to reference by Congress number, it’s easier to reference by year:
CongressNumber[year_]:=Floor[(year-1787)/2.] CongressionalMapData[year_?(#>1700&)]:=CongressionalMapData[CongressNumber[year]]
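As a sanity check, here is the same year-to-Congress arithmetic in Python: the 1st Congress convened in 1789, and a new Congress begins every two years.

```python
def congress_number(year):
    """Number of the Congress in session during the given year:
    floor((year - 1787) / 2)."""
    return (year - 1787) // 2
```

For instance, congress_number(1789) is 1 and congress_number(2013) is 113, matching the 113th Congress that convened in 2013.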
Lastly, here’s a function for visualizing all districts in a state:
DistrictMap[statedata_]:=GeoListPlot[Table[{s},{s,statedata}],GeoLabels->(Tooltip[#1,FirstPosition[statedata,#1][[1,1]]]&),PlotLegends->None]
This makes it easy to look at individual districts for a given state and year. Looking at the map from 100 years ago, I was surprised to learn that Illinois used to have over 70% more districts:
dist1918=CongressionalMapData[1918]; 
N@Length@current["Illinois"]/Length@dist1918["Illinois"] 
This included one “at-large” representative that represented the entire state, rather than a particular district or area. In this data, such districts are numbered “0”:
GeoGraphics[dist1918["Illinois",0]] 
In general, it seems like the population has shifted away from the Midwest region:
DistrictDifferenceMap[Length/@current,Length/@dist1918] 
Importing the full set of maps took me about 40 minutes and most of the RAM on my laptop:
allmaps=Table[CongressionalMapData[cnum],{cnum,114}]; 
Here’s a complete history of reapportionment counts:
frames = Table[{DistrictWeightMap[Length /@ Values /@ allmaps[[i]]], 1789 + 2 i - 1}, {i, 114}]; ListAnimate[Labeled[#1, Style[#2, "Section"], Top] & @@@ frames]
With the full dataset, I can look at the history of districts for a particular state, which can give some insights about its development. New York is an interesting case: it started with 10 districts, and continued gaining seats with population growth until it peaked at 44 districts in the mid-20th century. Since then, it’s been losing seats due to population shifts and the 435-member cap on the House. The map has changed nearly every ten-year cycle, indicating that internal demographics have shifted as well:
nydists=Table[{i,allmaps[[CongressNumber[i],"New York"]]},{i,1793,2013,10}]; 
ListAnimate[ Labeled[DistrictMap[#2], Style[ToString[#1] <> ": " <> Capitalize@IntegerName[Length[#2]] <> " Districts", "Section"], Top] & @@@ nydists, AnimationRepetitions -> 1, AnimationRunning -> False]
New Hampshire sits on the other end of the spectrum, having gone through only minimal changes since its original map. It actually kept the same twodistrict plan for the eight cycles between 1880 and 1960. The simplest explanation is that, unlike New York, this state’s demographics have remained fairly constant (and its population growth average):
nhdists=Table[{i,allmaps[[CongressNumber[i],"New Hampshire"]]},{i,1793,2013,10}]; 
ListAnimate[ Labeled[DistrictMap[#2], Style[ToString[#1] <> ": " <> Capitalize@IntegerName[Length[#2]] <> If[Length[#2] == 1, " District", " Districts"], "Section"], Top] & @@@ nhdists, AnimationRepetitions -> 1, AnimationRunning -> False]
The maps also illustrate some notable historical events. When the American Civil War broke out, Virginia seceded from the Union. But a group of Unionists in the northwestern part of the state broke from this decision, taking three districts from the state to form West Virginia:
GeoListPlot[{Values[CongressionalMapData[1859]["Virginia"]], Values[CongressionalMapData[1863]["West Virginia"]]}] 
And of course, when the war was over, the two states remained separate:
GeoListPlot[{Values[CongressionalMapData[1869]["Virginia"]], Values[CongressionalMapData[1869]["West Virginia"]]}] 
After the war, population counts grew in many southern states because of freed slaves, giving them more national voting power:
dist1859=CongressionalMapData[1859]; dist1873=CongressionalMapData[1873]; 
DistrictDifferenceMap[Length/@dist1873,Length/@dist1859] 
In the late 20th century, some states started adjusting maps to create majority-minority districts designed to ensure appropriate representation and voting power for minority groups (as required by the Voting Rights Act of 1965). Opponents of this practice claim that it constitutes racial gerrymandering; in some cases, the Supreme Court has agreed.
For instance, after gaining three seats in 1990, Texas attempted to draw new majorityminority districts to represent both Hispanic and African American voters. In Bush v. Vera, the court ruled that two of the new districts (the 29th and 30th) and one newly manipulated district (the 18th) violated compactness principles too severely:
dist1993=CongressionalMapData[1993]; Row@Table[Labeled[GeoGraphics[{Green,dist1993["Texas",i]},ImageSize->150],Style[i,"Text",Darker@Green,Bold],Top],{i,{18,29,30}}]
Legislators were forced to redraw the maps:
dist1997=CongressionalMapData[1997]; Row@Table[Labeled[GeoGraphics[{Green,dist1997["Texas",i]},ImageSize->150],Style[i,"Text",Darker@Green,Bold],Top],{i,{18,29,30}}]
This indicates that while some level of affirmative racial gerrymandering may be acceptable, the shape of a district must still be sensible. Of course, plenty of majority-minority districts exist naturally because of concentrated minority populations. Many of these are in southern regions with large African American populations:
mm=Import["https://en.wikipedia.org/wiki/List_of_majority-minority_United_States_congressional_districts","Data"]; aalist=mm[[1,1,4,3;;27]]; GeoRegionValuePlot[Table[ current[[StringDelete[aalist[[d,3]]," "],aalist[[d,4]]]]->Quantity[aalist[[d,2]]],{d,Length@aalist}],GeoRange->{{40.,25.},{-95.,-75.}},GeoProjection->"Mercator"]
There are also a number of southwest regions with Hispanic/Latino-majority districts:
hisplist=mm[[1,1,6,3;;27]]; GeoRegionValuePlot[Table[ current[[StringDelete[hisplist[[d,3]]," "],hisplist[[d,4]]]]->Quantity[hisplist[[d,2]]],{d,Length@hisplist}],GeoRange->{{38,25},{-120,-95}},GeoProjection->"Mercator"]
Maps are sometimes adjusted to preserve “communities of interest,” such as separating rural and urban populations or keeping a major business in the same district as its workers. States with large metropolitan areas can be especially difficult to draw lines for because of the high population density in those areas. Here in Illinois, the city of Chicago makes up a full 21% of the population:
N[Entity["City", {"Chicago", "Illinois", "UnitedStates"}]["Population"]/Entity["AdministrativeDivision", {"Illinois", "UnitedStates"}]["Population"]] 
A look at the map shows that the city itself sprawls across nearly half the state’s 18 districts in order to distribute that population:
Show[GeoListPlot[List/@Most[current["Illinois"]],PlotLegends->None], GeoGraphics[{FaceForm[Directive[Opacity[1.],Black]],EdgeForm[White],Entity["City", {"Chicago", "Illinois", "UnitedStates"}]["Polygon"]}],GeoRange->Entity["City", {"Chicago", "Illinois", "UnitedStates"}]]
Looking at historic maps of the area, it’s clear this wasn’t always the case. Just after the Civil War, the state had 14 districts, with Chicago mostly enclosed in just one:
dist1865=CongressionalMapData[1865]; Length@dist1865["Illinois"] 
Show[GeoListPlot[List/@Values@dist1865["Illinois"],PlotLegends->None], GeoGraphics[{FaceForm[Directive[Opacity[1.],Black]],EdgeForm[White],Dated[Entity["City", {"Chicago", "Illinois", "UnitedStates"}],1823]["Polygon"]}],GeoRange->Entity["City", {"Chicago", "Illinois", "UnitedStates"}]]
From this perspective, it’s also clear that the current Illinois districts fall into the “jagged” category, which is often the case when mapmakers start using more complex factors to draw their maps. Since modern redistricting is often done using heavy-duty GIS software with detailed maps and high-resolution census data overlays, it can be difficult to tell what the reasoning was for districts’ shapes—or what kinds of manipulation might have taken place. But there’s growing concern that these manipulations might be damaging to the democratic process.
Throughout America’s 200-year history, countless legislative bodies (and at least one independent commission) have been accused of partisan gerrymandering—and some of these accusations have gone all the way to the Supreme Court. But to explore this issue effectively, I need to look at more than just maps.
I found comprehensive election data in PDF format from the Clerk of the House. I tried various methods for importing these; in the end I created a package that uses string patterns to sort through election information:
✕
< 
The package allows me to import election data by state and year (starting in 1998) as a Dataset:
✕
ildata=RepresentativeVotesDataset["Illinois",2014] 
To simplify analysis of elections, I also made a function to extract results for the two major parties (Republican and Democratic) in a given election cycle:
✕
PartyVotes[electiondata_]:=With[{votes=GroupBy[Select[electiondata,StringMatchQ[#["Party"],"Republican"|"Democrat"]&],"District"]},Table[<|#["Party"]->(#["Votes"])&/@Normal@votes[i,All]|>,{i,Length@votes}]]
In Illinois, it’s well known that Chicago is often in political opposition to much of the (rural) rest of the state—and with the current map, many argue that the city holds undue power in the legislature. In the 2014 election, Democrats won 12 of the state’s 18 seats (66%) with just 51% of the total popular vote:
✕
ilvotes=PartyVotes[ildata]; Total@ilvotes/Total@ildata[[All,"Votes"]]//N 
✕
Show[GeoRegionValuePlot[Thread[Most[current["Illinois"]]->(KeySort@N[#/Total[#]]&/@ilvotes)[[All,1]]],ColorFunction->(Blend[{Red,Blue},#]&), PlotRange->{0,1}], GeoGraphics[{FaceForm[Directive[Opacity[1.],Green]],EdgeForm[White],Entity["City", {"Chicago", "Illinois", "UnitedStates"}]["Polygon"]}]]
And aside from a few “purple” bi-state areas, the irregular districts in Chicago appear to tip the balance for Democrats. While no case has been brought forth in Illinois, most critics point to the earmuff-shaped fourth district as a prime example of extreme gerrymandering:
✕
GeoGraphics[{Green,Polygon@current[["Illinois",4]]}] 
Shape-based arguments have historically dominated in gerrymandering cases, and it’s easy to see why—literally. Anyone can look and get a general sense of how complex a shape is. But there are also some geometric tests for measuring the compactness of a district. In his Community post, Marco Thiel tests out a few techniques that involve computing the ratio of a region’s area to the area of a circumscribed shape:
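Marco’s code itself isn’t reproduced here, but the flavor of these tests is easy to convey. Here is a minimal Python sketch (my own illustration, not code from his post) of one widely used measure of this family, the Polsby–Popper score, which compares a district’s area to the area of a circle with the same perimeter:

```python
import math

def shoelace_area(vertices):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def perimeter(vertices):
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

def polsby_popper(vertices):
    """4*pi*A / P^2: equals 1 for a circle, approaches 0 for thin or jagged shapes."""
    p = perimeter(vertices)
    return 4 * math.pi * shoelace_area(vertices) / p**2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(polsby_popper(square))  # pi/4 ≈ 0.785
```

A circle scores exactly 1; a long, thin rectangle or an earmuff-shaped district scores much lower, which is what makes the measure useful as a compactness test.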
The range considered acceptable for each test can be subjective, but each measure gives a value between 0 and 1. Looking at the distribution of each test among the states, you can get a good sense of what’s average:
✕
Multicolumn[{CloudGet["https://wolfr.am/vF9vSryp"], CloudGet["https://wolfr.am/vF9vSHh1"], CloudGet["https://wolfr.am/vF9vT2uh"], CloudGet["https://wolfr.am/vF9vTegh"]}, ItemSize -> Full]
Here are some of the least compact districts in the country, according to Marco’s computations:
Application of these and similar geometric tests has led several courts to strike down district maps that lack compactness (like in Texas). But there’s no single way to measure compactness, and some odd shapes are due to natural boundaries and other nonpolitical factors.
Aside from that, both legislators and courts have been reluctant to make any strong statements about partisan gerrymandering because of the inherent political implications: any law or ruling that seems to favor a particular party could be highly criticized. So the fact that the Supreme Court took two cases on this topic (in addition to one on racial gerrymandering) is a pretty big deal.
The first case, Gill v. Whitford, takes a practical approach to the problem: if partisan gerrymandering is the issue, they reason, perhaps it needs a partisan-based solution. Originating in a Wisconsin state court, the plaintiffs presented a case in October 2017 based on a new measure of partisan bias proposed by Nicholas Stephanopoulos and Eric McGhee called efficiency gap. The formula is best summarized as the difference in the total number of wasted votes for each party—including votes cast for a losing candidate and surplus votes cast for a winning candidate—over the total votes cast:
✕
TraditionalForm[EG==(HoldForm@(Subscript[lost, A]+Subscript[surplus, A])-HoldForm@(Subscript[lost, B]+Subscript[surplus, B]))/("total votes")]
By assuming equal population per district and a two-party system, this formula is conveniently reduced to the difference between a party’s seat margin (percentage of seats over 50%) and twice its vote margin:
✕
TraditionalForm[EG=="seat margin" - 2*"vote margin"]
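To make the wasted-vote bookkeeping concrete, here is a small Python sketch (my own illustration; it assumes two parties and counts a winner’s surplus as votes beyond 50% of the district’s turnout):

```python
def efficiency_gap(districts):
    """districts: list of (votes_A, votes_B) pairs, one per district.
    Wasted votes are all votes for a losing candidate, plus a winning
    candidate's votes beyond the 50% needed to win. A positive result
    indicates an advantage for party A (party B wastes more votes)."""
    wasted_a = wasted_b = total = 0.0
    for a, b in districts:
        t = a + b
        total += t
        if a > b:
            wasted_a += a - t / 2   # winner's surplus votes
            wasted_b += b           # all of the loser's votes
        else:
            wasted_a += a
            wasted_b += b - t / 2
    return (wasted_b - wasted_a) / total

# Party A wins 2 of 3 seats (67%) with only about 47% of the vote:
districts = [(60, 40), (60, 40), (20, 80)]
print(efficiency_gap(districts))  # 7/30 ≈ 0.2333
```

With equal turnout per district, this matches the simplified form: the seat margin is 2/3 − 1/2 = 1/6, the vote margin is 140/300 − 1/2 = −1/30, and 1/6 − 2·(−1/30) = 7/30.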
From the data I collected, I can easily compute the seat margins and vote margins:
✕
SeatMargin[electiondata_]:=With[{pv=PartyVotes[electiondata]},N@(Counts[Flatten@Keys[TakeLargest[#,1]&/@pv]]-Length@pv/2)/Length@pv]
VoteMargin[electiondata_]:=N@#/Total[#]&@Merge[PartyVotes[electiondata],Total]-.5
For congressional districts, the efficiency gap is given in seats. Here's an implementation of the simplified efficiency gap formula with positive numbers indicating a Democratic advantage and negative indicating a Republican advantage:
✕
EfficiencyGap[electiondata_]:=Length[GroupBy[electiondata,"District"]]*(KeySort[SeatMargin[electiondata]]-2 KeySort[VoteMargin[electiondata]])
From a legal standpoint, the argument is that the “wasted” votes constitute a violation of the voters’ rights under the Equal Protection Clause. According to the authors, an advantage of two or more seats could indicate district manipulation in a given state. The paper points out a few states with large congressional efficiency gaps in recent cycles:
✕
Table[With[{data=GroupBy[RepresentativeVotesDataset[state,{1998,2016}],"Year"]},DateListPlot[Transpose[{DateRange[{1998},{2016},Quantity[2,"Years"]],Table[EfficiencyGap@data[[i]],{i,Length@data}][[All,1]]}],PlotTheme->"Scientific"]],{state,{"Michigan","North Carolina","Ohio","Pennsylvania","Texas","Virginia"}}]
Although Gill v. Whitford deals with state legislative districts, Wisconsin’s congressional districts seem to show a strong trend toward Republican dominance over the past 20 years as well:
✕
widata=GroupBy[RepresentativeVotesDataset["Wisconsin",{1998,2016}],"Year"]; 
✕
DateListPlot[Transpose[{DateRange[{1998},{2016},Quantity[2,"Years"]],Table[EfficiencyGap@widata[[i]],{i,Length@widata}][[All,1]]}],PlotTheme->"Scientific"]
You can see this effect on the maps, where many previously contentious areas now run more solidly red:
✕
widists=Table[CongressionalMapData[i]["Wisconsin"],{i,2000,2016,4}]; 
✕
wivotes=Table[PartyVotes[RepresentativeVotesDataset["Wisconsin",i]],{i,2000,2016,4}]; 
✕
Grid[{Text/@Range[2000,2016,4], Table[GeoRegionValuePlot[Thread[Values[widists[[i]]]->(KeySort@N[#/Total[#]]&/@wivotes[[i]])[[All,1]]], PlotLegends->None, ColorFunction->(Blend[{Red,Blue},#]&), ImageSize->100],{i,Length@widists}]}]
In Benisek v. Lamone (coming from Maryland), the legal argument instead hinges on the First Amendment: casting a vote is considered a form of expression (i.e. speech), and the claim is that the offending district “dilutes the votes” of Republican voters, thus reducing the value of those voters’ speech. While this case presents no particular standard for computing the extent of partisan gerrymandering, it does provide a fresh legal route for applying any standard that might be instated. Either way, the efficiency gap test shows a rather prominent Democratic trend in Maryland:
✕
mddata=GroupBy[RepresentativeVotesDataset["Maryland",{1998,2016}],"Year"]; 
✕
DateListPlot[Transpose[{DateRange[{1998},{2016},Quantity[2,"Years"]],Table[KeySort@EfficiencyGap@mddata[[i]],{i,Length@mddata}][[All,1]]}],PlotTheme->"Scientific"]
And a look at the district in question shows that its latest map is far from compact:
✕
mddists=Table[CongressionalMapData[i]["Maryland"],{i,2000,2016,8}]; 
✕
mdvotes=Table[PartyVotes[RepresentativeVotesDataset["Maryland",i]],{i,2000,2016,8}]; 
✕
Grid[{Text/@Range[2000,2016,8],GeoGraphics/@Transpose[{GeoStyling[Blend[{Red,Blue},#]]&/@(KeySort@N[#/Total[#]]&/@mdvotes[[All,6]])[[All,1]],mddists[[All,6]]}]}] 
The Supreme Court also picked up another case in January 2018, this time about racial gerrymandering. Like Bush v. Vera, the case comes from Texas, but this time it’s centered around negative racial gerrymandering. Republican lawmakers are appealing a lower court’s ruling that the state’s latest maps discriminate against racial minorities. The efficiency gap doesn’t exactly translate to this case, but one could conceive of a similar measure based on the wasted votes of racial minorities.
Suffice it to say, the gerrymandering issue is coming to a head. With these three cases combined—as well as recent decisions in North Carolina and Pennsylvania, a ballot initiative in Michigan and all kinds of academic discussions around the country—the stage is set for the Supreme Court to make changes in how redistricting is regulated. Unfortunately, they’ve opted to pass on both partisan gerrymandering cases on technical grounds, so we will likely have to wait until next session to get a major decision.
Gerrymandering is a complex subject with a deep history, and this post only scratches the surface. Exploring with the Wolfram Language helped me pull everything together easily and discover a lot of intricacies I wouldn’t have otherwise found. Now that I've collected all the data in one place, I invite you to do your own exploration. Go find out the history of your district, explore measures of fairness and partition states as you see fit—just don’t forget to go out and vote this November!
Download this post as a Wolfram Notebook.
In a sense, you can view neural network regression as a kind of intermediary solution between true regression (where you have a fixed probabilistic model with some underlying parameters you need to find) and interpolation (where your goal is mostly to draw an eye-pleasing line between your data points). Neural networks can get you something from both worlds: the flexibility of interpolation and the ability to produce predictions with error bars like when you do regression.
For those of you who already know about neural networks, I can give a very brief hint as to how this works: you build a randomized neural network with dropout layers that you train like you normally would, but after training you don’t deactivate the dropout layers and keep using them to sample the network several times while making predictions to get a measure of the errors. Don’t worry if that sentence didn’t make sense to you yet, because I will explain all of this in more detail.
To start, let’s do some basic neural network regression on the following data, which I made by taking points on a bell curve and adding random numbers to them:
✕
exampleData = {{1.8290606952826973`, 0.34220332868351117`}, {0.6221091101205225`, 0.6029615713235724`}, {1.2928624443456638`, 0.14264805848673934`}, {1.7383127604822395`, 0.09676233458358859`}, {2.701795903782372`, 0.1256597483577385`}, {1.7400006797156493`, 0.07503425036465608`}, {0.6367237544480613`, 0.8371547667282598`}, {2.482802633037993`, 0.04691691595492773`}, {0.9566109777301293`, 0.3860569423794188`}, {2.551790012296368`, 0.037340684890464014`}, {0.6626176509888584`, 0.7670620756823968`}, {2.865357628008809`, 0.1120949485036743`}, {0.024445094773154707`, 1.3288343886644758`}, {2.6538667331049197`, 0.005468132072381475`}, {1.1353110951218213`, 0.15366247144719652`}, {3.209853579579198`, 0.20621896435600656`}, {0.13992534568622972`, 0.8204487134187859`}, {2.4013110392840886`, 0.26232722849881523`}, {2.1199290467312526`, 0.09261482926621102`}, {2.210336371360782`, 0.02664895740254644`}, {0.33732886898809156`, 1.1701573388517288`}, {2.2548343241910374`, 0.3576908508717164`}, {1.4077788877461703`, 0.269393680956761`}, {3.210242875591371`, 0.21099679051999695`}, {0.7898064016052615`, 0.6198835029596128`}, {2.1835077887328893`, 0.08410415228550497`}, {0.008631687647122632`, 1.0501425654209409`}, {2.1792531502694334`, 0.11606480328877161`}, {3.231947584552822`, 0.2359904673791076`}, {0.7980615888830211`, 0.5151437742866803`}};
plot = ListPlot[exampleData, PlotStyle -> Red]
A regression neural network is basically a chain of alternating linear and nonlinear layers: the linear layers give your net a lot of free parameters to work with, while the nonlinear layers make sure that things don’t get boring. Common examples of nonlinear layers are the hyperbolic tangent, logistic sigmoid and the ramp function. For simplicity, I will stick with the Ramp nonlinearity, which simply puts kinks into straight lines (meaning that you get regressions that are piecewise linear):
✕
netRamp = NetChain[ {LinearLayer[100], Ramp, LinearLayer[100], Ramp, LinearLayer[]}, "Input" -> "Real", "Output" -> "Real" ]; trainedRamp = NetTrain[netRamp, <|"Input" -> exampleData[[All, 1]], "Output" -> exampleData[[All, 2]]|>, Method -> "ADAM", LossFunction -> MeanSquaredLossLayer[], TimeGoal -> 120, TargetDevice -> "GPU"]; Show[Plot[ trainedRamp[x], {x, -3.5, 3.5}, PlotLabel -> "Overtrained network"], plot, ImageSize -> Full, PlotRange -> All]
As you can see, the network more or less just follows the points because it doesn’t understand the difference between the trend and the noise in the data. In the range above, the mix-up between trend and noise is particularly bad. The longer you train the network and the larger your linear layer, the stronger this effect will be. Obviously this is not what you want, since you’re really interested in fitting the trend of the data. Besides: if you really want to fit noise, you could just use interpolation instead. To prevent this overfitting of the data, you regularize the network (as explained in this tutorial) by using any or all of the following: a ValidationSet, L2 regularization or a DropoutLayer. I will focus on the L2 regularization coefficient λ2 and on dropout layers (in the next section you’ll see why), so let me briefly explain how they work:
To get a feeling of how these two methods regularize the regression, I made the following parameter sweeps of the regularization coefficient λ2 and the dropout probability pdrop:
✕
log\[Lambda]List = Range[-5, -1, 1]; regularizedNets = NetTrain[ netRamp, <|"Input" -> exampleData[[All, 1]], "Output" -> exampleData[[All, 2]]|>, LossFunction -> MeanSquaredLossLayer[], Method -> {"ADAM", "L2Regularization" -> 10^#}, TimeGoal -> 20 ] & /@ log\[Lambda]List; With[{xvals = Range[-3.5, 3.5, 0.1]}, Show[ ListPlot[ TimeSeries[Transpose@Through[regularizedNets[xvals]], {xvals}, ValueDimensions -> Length[regularizedNets]], PlotLabel -> "\!\(\*SubscriptBox[\(L\), \(2\)]\)-regularized networks", Joined -> True, PlotLegends -> Map[StringForm["`1` = `2`", Subscript[\[Lambda], 2], HoldForm[10^#]] &, log\[Lambda]List] ], plot, ImageSize -> 450, PlotRange -> All ] ]
✕
pDropoutList = {0.0001, 0.001, 0.01, 0.05, 0.1, 0.5}; dropoutNets = NetChain[ {LinearLayer[300], Ramp, DropoutLayer[#], LinearLayer[]}, "Input" -> "Real", "Output" -> "Real" ] & /@ pDropoutList; trainedDropoutNets = NetTrain[ #, <|"Input" -> exampleData[[All, 1]], "Output" -> exampleData[[All, 2]]|>, LossFunction -> MeanSquaredLossLayer[], Method -> {"ADAM"(*,"L2Regularization"->10^#*)}, TimeGoal -> 20 ] & /@ dropoutNets; With[{xvals = Range[-3.5, 3.5, 0.1]}, Show[ ListPlot[ TimeSeries[Transpose@Through[trainedDropoutNets[xvals]], {xvals}, ValueDimensions -> Length[trainedDropoutNets]], PlotLabel -> "Dropout-regularized networks", Joined -> True, PlotLegends -> Map[StringForm["`1` = `2`", Subscript[p, drop], #] &, pDropoutList] ], plot, ImageSize -> 450, PlotRange -> All ] ]
To summarize:
Both regularization methods mentioned previously were originally proposed as ad hoc solutions to the overfitting problem. However, recent work has shown that there are actually very good fundamental mathematical reasons why these methods work. Even more importantly, it has been shown that you can use them to do better than just produce a regression line! For those of you who are interested, I suggest reading this blog post by Yarin Gal. His thesis “Uncertainty in Deep Learning” is also well worth a look and is the main source for what follows in the rest of this post.
As it turns out, there is a link between stochastic regression neural networks and Gaussian processes, which are freeform regression methods that let you predict values and put error bands on those predictions. To do this, we need to consider neural network regression as a proper Bayesian inference procedure. Normally, Bayesian inference is quite computationally expensive, but as it conveniently turns out, you can do an approximate inference with minimal extra effort on top of what I already did above.
The basic idea is to use dropout layers to create a noisy neural network that is trained on the data as normal. However, I’m also going to use the dropout layers when doing predictions: for every value where I need a prediction, I will sample the network multiple times to get a sense of the errors in the predictions.
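The same idea can be sketched outside the Wolfram Language. Here is a toy NumPy version (the one-hidden-layer network and its weights are made-up stand-ins for illustration, not a trained model) in which dropout stays active at prediction time and the network is sampled repeatedly:

```python
import numpy as np

rng = np.random.default_rng(42)

# A toy single-hidden-layer "trained" network; these weights are
# random stand-ins for illustration, not an actual fit to data.
w1 = rng.normal(size=(50, 1)); b1 = rng.normal(size=50)
w2 = rng.normal(size=(1, 50)); b2 = rng.normal(size=1)
p_drop = 0.1

def forward(x, train=True):
    """One forward pass; with train=True the dropout mask is resampled,
    so the network stays stochastic even at prediction time."""
    h = np.maximum(0.0, w1 @ np.atleast_1d(x) + b1)   # ramp nonlinearity
    if train:
        mask = rng.random(h.shape) >= p_drop
        h = h * mask / (1.0 - p_drop)                 # inverted dropout
    return float((w2 @ h + b2)[0])

def mc_predict(x, n_samples=200):
    """Sample the stochastic network: the mean is the prediction and the
    standard deviation measures (part of) the prediction uncertainty."""
    samples = np.array([forward(x, train=True) for _ in range(n_samples)])
    return float(samples.mean()), float(samples.std())

mean, std = mc_predict(0.5)
```

The spread of the samples is exactly the “sampling the network multiple times” step described above; the next sections add the correction terms that turn this spread into a calibrated error estimate.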
Furthermore, it’s good to keep in mind that you, as a newly converted Bayesian, are also dealing with priors. In particular, the network weights are now random variables with a prior distribution and a posterior distribution (i.e. the distributions before and after learning). This may sound rather difficult, so let me try to answer two questions you may have at this point:
Q1: Does that mean that I actually have to think hard about my prior now?
A1: No, not really, because it simply turns out that our old friend λ2, the regularization coefficient, is really just the inverse standard deviation of the prior on the network weights: if you choose a larger λ2, that means you’re only allowing small network weights.
Q2: So what about the posterior distribution of the weights? Don’t I have to integrate the predictions over the posterior weight distribution to get a posterior predictive distribution?
A2: Yes, you do, and that’s exactly what you do (at least approximately) when you sample the trained network with the dropout layers active. The sampling of the network is just a form of Monte Carlo integration over the posterior distribution.
So as you can see, being a Bayesian here really just means giving things a different name without having to change your way of doing things very much.
Let’s start with the simplest type of regression in which the noise level of the data is assumed constant across the x axis. This is also called homoscedastic regression (as opposed to heteroscedastic regression, where the noise is a function of x). It does not, however, mean that the prediction error will also be constant: the prediction error depends on the noise level but also on the uncertainty in the network weights.
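In symbols (my notation, not a formula from the original), the predictive variance at a test point splits into those two contributions:

```latex
\operatorname{Var}[y^*] \;\approx\; \underbrace{\sigma^2}_{\text{data noise}} \;+\; \underbrace{\operatorname{Var}_{\theta}\!\bigl[f_\theta(x^*)\bigr]}_{\text{weight uncertainty}}
```

In homoscedastic regression the first term is a constant, while the second varies with x* and typically grows away from the training data.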
So let’s get to it and see how this works out, shall we? First I will define my network with a dropout layer. Normally you’d put a dropout layer before every linear layer, but since the input is just a number, I’m omitting the first dropout layer:
✕
\[Lambda]2 = 0.01; pdrop = 0.1; nUnits = 300; activation = Ramp; net = NetChain[ {LinearLayer[nUnits], ElementwiseLayer[activation], DropoutLayer[pdrop], LinearLayer[]}, "Input" -> "Real", "Output" -> "Real" ]
✕
trainedNet = NetTrain[ net, <|"Input" -> exampleData[[All, 1]], "Output" -> exampleData[[All, 2]]|>, LossFunction -> MeanSquaredLossLayer[], Method -> {"ADAM", "L2Regularization" -> \[Lambda]2}, TimeGoal -> 10 ];
Next, we need to produce predictions from this model. To calibrate the model, you need to provide a prior length scale l that expresses your belief in how correlated the data is over a distance (just like in Gaussian process regression). Together with the regularization coefficient λ2, the dropout probability p and the number of training data points N, you have to add the following variance to the sample variance of the network:
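Written out as a formula (reconstructed to match the sampleNet code that follows, and Gal’s thesis):

```latex
\sigma^2_{\text{prior}} \;=\; \frac{2\,\lambda_2\,N}{l^{\,2}\,(1-p)}
```

A longer length scale l or a higher dropout probability p shrinks this added variance, while stronger regularization or more training data increases it.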
The following function takes a trained net and samples it multiple times with the dropout layers active (using NetEvaluationMode → "Train"). It then constructs a time series object of the –1, 0 and +1σ bands of the predictions:
✕
sampleNet[net : (_NetChain | _NetGraph), xvalues_List, sampleNumber_Integer?Positive, {lengthScale_, l2reg_, prob_, nExample_}] := TimeSeries[ Map[ With[{ mean = Mean[#], stdv = Sqrt[Variance[#] + (2 l2reg nExample)/(lengthScale^2 (1 - prob))] }, mean + stdv*{-1, 0, 1} ] &, Transpose@ Select[Table[ net[xvalues, NetEvaluationMode -> "Train"], {i, sampleNumber}], ListQ]], {xvalues}, ValueDimensions -> 3 ];
Now we can go ahead and plot the predictions with 1σ error bands. The prior seems to work reasonably well, though in real applications you’d need to calibrate it with a validation set (just like you would with λ2 and p).
✕
l = 2; samples = sampleNet[trainedNet, Range[-5, 5, 0.05], 200, {l, \[Lambda]2, pdrop, Length[exampleData]}]; Show[ ListPlot[ samples, Joined -> True, Filling -> {1 -> {2}, 3 -> {2}}, PlotStyle -> {Lighter[Blue], Blue, Lighter[Blue]} ], ListPlot[exampleData, PlotStyle -> Red], ImageSize -> 600, PlotRange -> All ]
As you can see, the network has a tendency to do linear extrapolation due to my choice of the ramp nonlinearity. Picking different nonlinearities will lead to different extrapolation behaviors. In terms of Gaussian process regression, the choice of your network design influences the effective covariance kernel you’re using.
If you’re curious to see how the different network parameters influence the look of the regression, skip down a few paragraphs and try the manipulates, where you can interactively train your own network on data you can edit on the fly.
In heteroscedastic regression, you let the neural net try and find the noise level for itself. This means that the regression network outputs two numbers instead of one: a mean and a standard deviation. However, since the outputs of the network are real numbers, it’s easier to work with the log-precision log τ instead of the standard deviation, where τ = 1/σ²:
✕
\[Lambda]2 = 0.01; pdrop = 0.1; nUnits = 300; activation = Ramp; regressionNet = NetGraph[ {LinearLayer[nUnits], ElementwiseLayer[activation], DropoutLayer[pdrop], LinearLayer[], LinearLayer[]}, { NetPort["Input"] -> 1 -> 2 -> 3, 3 -> 4 -> NetPort["Mean"], 3 -> 5 -> NetPort["LogPrecision"] }, "Input" -> "Real", "Mean" -> "Real", "LogPrecision" -> "Real" ]
Next, instead of using a MeanSquaredLossLayer to train the network, you minimize the negative log-likelihood of the observed data. Again, you replace σ with the log of the precision, and multiply everything by 2 to be in agreement with the convention of MeanSquaredLossLayer.
✕
FullSimplify[-2*LogLikelihood[ NormalDistribution[\[Mu], \[Sigma]], {yobs}] /. \[Sigma] -> 1/Sqrt[Exp[log\[Tau]]], Assumptions -> log\[Tau] \[Element] Reals]
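Up to notation, the simplification works out to (including the constant log 2π term):

```latex
-2\log p\!\left(y_{\text{obs}}\mid \mu,\log\tau\right) \;=\; e^{\log\tau}\,\left(y_{\text{obs}}-\mu\right)^{2} \;-\; \log\tau \;+\; \log 2\pi
```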
Discarding the constant term gives us the following loss:
✕
loss = Function[{y, mean, logPrecision}, (y - mean)^2*Exp[logPrecision] - logPrecision ]; netHetero = NetGraph[<| "reg" -> regressionNet, "negLoglikelihood" -> ThreadingLayer[loss] |>, { NetPort["x"] -> "reg", {NetPort["y"], NetPort[{"reg", "Mean"}], NetPort[{"reg", "LogPrecision"}]} -> "negLoglikelihood" -> NetPort["Loss"] }, "y" -> "Real", "Loss" -> "Real" ]
✕
trainedNetHetero = NetTrain[ netHetero, <|"x" -> exampleData[[All, 1]], "y" -> exampleData[[All, 2]]|>, LossFunction -> "Loss", Method -> {"ADAM", "L2Regularization" -> \[Lambda]2} ];
Again, the predictions are sampled multiple times. The predictive variance is now the sum of the variance of the predicted mean + mean of the predicted variance. The priors no longer influence the variance directly, but only through the network training:
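In symbols (my notation), with K samples of the predicted mean μ_k and log-precision log τ_k:

```latex
\operatorname{Var}[y^*] \;\approx\; \operatorname{Var}_{k}\!\bigl[\mu_k\bigr] \;+\; \frac{1}{K}\sum_{k=1}^{K} e^{-\log\tau_k}
```

The first term captures the weight uncertainty; the second is the average of the noise variances 1/τ_k the network itself predicts.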
✕
sampleNetHetero[net : (_NetChain | _NetGraph), xvalues_List, sampleNumber_Integer?Positive] := With[{regressionNet = NetExtract[net, "reg"]}, TimeSeries[ With[{ samples = Select[Table[ regressionNet[xvalues, NetEvaluationMode -> "Train"], {i, sampleNumber}], AssociationQ] }, With[{ mean = Mean[samples[[All, "Mean"]]], stdv = Sqrt[Variance[samples[[All, "Mean"]]] + Mean[Exp[-samples[[All, "LogPrecision"]]]]] }, Transpose[{mean - stdv, mean, mean + stdv}] ] ], {xvalues}, ValueDimensions -> 3 ] ];
Now you can plot the predictions with 1σ error bands:
✕
samples = sampleNetHetero[trainedNetHetero, Range[-5, 5, 0.05], 200]; Show[ ListPlot[ samples, Joined -> True, Filling -> {1 -> {2}, 3 -> {2}}, PlotStyle -> {Lighter[Blue], Blue, Lighter[Blue]} ], ListPlot[exampleData, PlotStyle -> Red], ImageSize -> 600, PlotRange -> All ]
Of course, it’s still necessary to do validation of this network; one network architecture might be much better suited to the data at hand than another, so there is still the need to use validation sets to decide which model you have to use and with what parameters. Attached to the end of this blog post, you’ll find a notebook with an interactive demo of the regression method I just showed. With this code, you can find out for yourself how the different model parameters influence the predictions of the network.
The code in this section shows how to implement the loss function described in the paper “Dropout Inference in Bayesian Neural Networks with Alpha-Divergences” by Li and Gal. For an interpretation of the α parameter used in this work, see e.g. figure 2 in “Black-Box α-Divergence Minimization” by Hernández-Lobato et al. (2016).
In the paper by Li and Gal, the authors propose a modified loss function ℒ for a stochastic neural network to solve a weakness of the standard loss function I used above: it tends to underfit the posterior and give overly optimistic predictions. Optimistic predictions are a problem: when you fit your data to try and get a sense of what the real world might give you, you don’t want to be thrown a curveball afterwards.
During training, the training inputs x_i (with i indexing the training examples) are fed through the network K times to sample the outputs ŷ_i^(k) and compared to the training outputs y_i. Given a particular standard loss function l (e.g. mean square error, negative log-likelihood, cross-entropy) and a regularization function for the weights θ, the modified loss function ℒ is given as:
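A reconstruction of that loss from the Li and Gal paper (l is the standard per-sample loss; the last term stands for the weight regularization):

```latex
\mathcal{L}(\theta) \;=\; -\frac{1}{\alpha}\sum_{i} \log \frac{1}{K}\sum_{k=1}^{K} \exp\!\left[-\alpha\, l\!\left(y_i,\hat{y}_i^{(k)}\right)\right] \;+\; \lambda \lVert\theta\rVert^{2}
```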
The parameter α is the divergence parameter, which is typically tuned to α = 0.5 (though you can pick other values as well, if you want). It can be thought of as a “pessimism” parameter: the higher it is, the more the network will tend to err on the side of caution and the larger the error estimates will be. Practically speaking, a higher α parameter makes the loss function more lenient to the presence of large losses among the K samples, meaning that after training the network will produce a larger spread of predictions when sampled. The literature seems to suggest that α = 0.5 is a pretty good value to start with. In the limit α→0, the LogSumExp simply becomes the sample average over the K losses.
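For readers who prefer to see the aggregation in a conventional language, here is a NumPy sketch of the α-weighted LogSumExp over the K per-sample losses, with the largest term factored out for numerical stability (my own illustration; the constant (1/α) log K term is dropped):

```python
import numpy as np

def alpha_logsumexp_loss(losses, alpha):
    """Aggregate K per-sample losses into -(1/alpha) * log sum_k exp(-alpha * l_k),
    factoring out the maximum of -alpha*l_k so the exponentials never overflow
    or all underflow. The constant (1/alpha)*log K term is omitted, since it
    does not affect training."""
    scaled = -alpha * np.asarray(losses, dtype=float)
    m = scaled.max()
    return float(-(np.log(np.exp(scaled - m).sum()) + m) / alpha)

losses = [0.1, 2.0, 0.5]
# As alpha grows, the result approaches the smallest loss in the batch,
# i.e. the aggregation becomes more lenient toward the large losses.
print(alpha_logsumexp_loss(losses, 0.5))
```

The max-factoring is what keeps the layer stable: a naive `log(sum(exp(...)))` would return −∞ for losses of order 1000, while this version stays finite.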
As can be seen, we need to sample the network several times during training. We can accomplish this with NetMapOperator. As a simple example, suppose we want to apply a dropout layer 10 times to the same input. To do this, we duplicate the input and then wrap a NetMapOperator around the dropout layer and map it over the duplicated input:
✕
input = Range[5]; NetChain[{ ReplicateLayer[10], NetMapOperator[ DropoutLayer[0.5] ] } ][input, NetEvaluationMode -> "Train"]
Next, define a net that will try to fit the data points with a normal distribution like in the previous heteroscedastic example. The output of the net is now a length-2 vector with the mean and the log-precision (we can’t have two output ports because we’re going to have to wrap the whole thing in NetMapOperator):
✕
alpha = 0.5; pdrop = 0.1; units = 300; activation = Ramp; \[Lambda]2 = 0.01; (*L2 regularization coefficient*) k = 25; (* number of samples of the network for calculating the loss*) regnet = NetInitialize@NetChain[{ LinearLayer[units], ElementwiseLayer[activation], DropoutLayer[pdrop], LinearLayer[] }, "Input" -> "Real", "Output" -> {2} ];
You will also need a network element to calculate the LogSumExp operator that aggregates the losses of the different samples of the regression network. I implemented the α-weighted LogSumExp by factoring out the largest term before feeding the vector into the exponent to make it more numerically stable. Note that I’m ignoring the log K term, since it’s a constant for the purpose of training the network.
✕
logsumexp\[Alpha][alpha_] := NetGraph[<| "timesAlpha" -> ElementwiseLayer[Function[-alpha #]], "max" -> AggregationLayer[Max, 1], "rep" -> ReplicateLayer[k], "sub" -> ThreadingLayer[Subtract], "expAlph" -> ElementwiseLayer[Exp], "sum" -> SummationLayer[], "logplusmax" -> ThreadingLayer[Function[{sum, max}, Log[sum] + max]], "invalpha" -> ElementwiseLayer[Function[-(#/alpha)]] |>, { NetPort["Input"] -> "timesAlpha", "timesAlpha" -> "max" -> "rep", {"timesAlpha", "rep"} -> "sub" -> "expAlph" -> "sum", {"sum", "max"} -> "logplusmax" -> "invalpha" }, "Input" -> {k} ]; logsumexp\[Alpha][alpha]
Define the network that will be used for training:
✕
net\[Alpha][alpha_] := NetGraph[<| "rep1" -> ReplicateLayer[k],(* replicate the inputs and outputs of the network *) "rep2" -> ReplicateLayer[k], "map" -> NetMapOperator[regnet], "mean" -> PartLayer[{All, 1}], "logprecision" -> PartLayer[{All, 2}], "loss" -> ThreadingLayer[ Function[{mean, logprecision, y}, (mean - y)^2*Exp[logprecision] - logprecision]], "logsumexp" -> logsumexp\[Alpha][alpha] |>, { NetPort["x"] -> "rep1" -> "map", "map" -> "mean", "map" -> "logprecision", NetPort["y"] -> "rep2", {"mean", "logprecision", "rep2"} -> "loss" -> "logsumexp" -> NetPort["Loss"] }, "x" -> "Real", "y" -> "Real" ]; net\[Alpha][alpha]
… and train it:
✕
trainedNet\[Alpha] = NetTrain[ net\[Alpha][alpha], <|"x" -> exampleData[[All, 1]], "y" -> exampleData[[All, 2]]|>, LossFunction -> "Loss", Method -> {"ADAM", "L2Regularization" -> \[Lambda]2}, TargetDevice -> "CPU", TimeGoal -> 60 ];
✕
sampleNet\[Alpha][net : (_NetChain | _NetGraph), xvalues_List, nSamples_Integer?Positive] := With[{regnet = NetExtract[net, {"map", "Net"}]}, TimeSeries[ Map[ With[{ mean = Mean[#[[All, 1]]], stdv = Sqrt[Variance[#[[All, 1]]] + Mean[Exp[-#[[All, 2]]]]] }, mean + stdv*{-1, 0, 1} ] &, Transpose@Select[ Table[ regnet[xvalues, NetEvaluationMode -> "Train"], {i, nSamples} ], ListQ]], {xvalues}, ValueDimensions -> 3 ] ];
✕
samples = sampleNet\[Alpha][trainedNet\[Alpha], Range[-5, 5, 0.05], 200]; Show[ ListPlot[ samples, Joined -> True, Filling -> {1 -> {2}, 3 -> {2}}, PlotStyle -> {Lighter[Blue], Blue, Lighter[Blue]} ], ListPlot[exampleData, PlotStyle -> Red], ImageSize -> 600, PlotRange -> All ]
I’ve discussed how dropout layers and the L2 regularization coefficient in neural network training can actually be seen as components of a Bayesian inference procedure that approximates Gaussian process regression. By simply training a network with dropout layers like normal and then running the network several times with NetEvaluationMode → "Train", you can get an estimate of the predictive posterior distribution, which includes not only the noise inherent in the data but also the uncertainty in the trained network weights.
If you’d like to learn more about this material or have any questions you’d like to ask, please feel free to visit my discussion on Wolfram Community.