“We’ve just got to decide: is a chemical like a city or like a number?” I spent my day yesterday—as I have for much of the past 30 years—designing new features of the Wolfram Language. And yesterday afternoon one of my meetings was a fast-paced discussion about how to extend the chemistry capabilities of the language.
At some level the problem we were discussing was quintessentially practical. But as so often turns out to be the case for things we do, it ultimately involves some deep intellectual issues. And to actually get the right answer—and to successfully design language features that will stand the test of time—we needed to plumb those depths, and talk about things that usually wouldn’t be considered outside of some kind of philosophy seminar.
Part of the issue, of course, is that we’re dealing with things that haven’t really ever come up before. Traditional computer languages don’t try to talk directly about things like chemicals; they just deal with abstract data. But in the Wolfram Language we’re trying to build in as much knowledge about everything as possible, and that means we have to deal with actual things in the world, like chemicals.
We’ve built a whole system in the Wolfram Language for handling what we call entities. An entity could be a city (like New York City), or a movie, or a planet—or a zillion other things. An entity has some kind of name (“New York City”). And it has definite properties (like population, land area, founding date, …).
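As a sketch of how this looks in practice (the property names follow the standard Knowledgebase conventions; exact values depend on the current data):

```wolfram
(* An entity is a symbolic handle: a name plus computable properties *)
nyc = Entity["City", {"NewYork", "NewYork", "UnitedStates"}];
nyc["Population"]   (* a Quantity of people *)
nyc["Area"]         (* land area as a Quantity *)
nyc["Position"]     (* a GeoPosition *)
```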
We’ve long had a notion of chemical entities—like water, or ethanol, or tungsten carbide. Each of these chemical entities has properties, like molecular mass, or structure graph, or boiling point.
And we’ve got many hundreds of thousands of chemicals where we know lots of properties. But all of these are in a sense concrete chemicals: specific compounds that we could put in a test tube and do things with.
But what we were trying to figure out yesterday is how to handle abstract chemicals—chemicals that we just abstractly construct, say by giving an abstract graph representing their chemical structures. Should these be represented by entities, like water or New York City? Or should they be considered more abstract, like lists of numbers, or, for that matter, mathematical graphs?
Well, of course, among the abstract chemicals we can construct are chemicals that we already represent by entities, like sucrose or aspirin or whatever. But here there’s an immediate distinction to make. Are we talking about individual molecules of sucrose or aspirin? Or about these things as bulk materials?
At some level it’s a confusing distinction. Because, we might think, once we know the molecular structure, we know everything—it’s just a matter of calculating it out. And some properties—like molar mass—are basically trivial to calculate from the molecular structure. But others—like melting point—are very far from trivial.
OK, but is this just a temporary problem that one shouldn’t base a long-term language design on? Or is it something more fundamental that will never change? Well, conveniently enough, I happen to have done a bunch of basic science that essentially answers this: and, yes, it’s something fundamental. It’s connected to what I call computational irreducibility. And for example, the precise value of, say, the melting point for an infinite amount of some material may actually be fundamentally uncomputable. (It’s related to the undecidability of the tiling problem; fitting in tiles is like seeing how molecules will arrange to make a solid.)
So by knowing this piece of (rather leading-edge) basic science, we know that we can meaningfully make a distinction between bulk versions of chemicals and individual molecules. Clearly there’s a close relation between, say, water molecules, and bulk water. But there’s still something fundamentally and irreducibly different about them, and about the properties we can compute for them.
Alright, so let’s talk about individual molecules. Obviously they’re made of atoms. And it seems like at least when we talk about atoms, we’re on fairly solid ground. It might be reasonable to say that any given molecule always has some definite collection of atoms in it—though maybe we’ll want to consider “parametrized molecules” when we talk about polymers and the like.
But at least it seems safe to consider types of atoms as entities. After all, each type of atom corresponds to a chemical element, and there are only a limited number of those on the periodic table. Now of course in principle one can imagine additional “chemical elements”; one could even think of a neutron star as being like a giant atomic nucleus. But again, there’s a reasonable distinction to be made: almost certainly there are only a limited number of fundamentally stable types of atoms—and most of the others have ridiculously short lifetimes.
There’s an immediate footnote, however. A “chemical element” isn’t quite as definite a thing as one might imagine. Because it’s always a mixture of different isotopes. And, say, from one tungsten mine to another, that mixture might change, giving a different effective atomic mass.
And actually this is a good reason to represent types of atoms by entities. Because then one just has to have a single entity representing tungsten that one can use in talking about molecules. And only if one wants to get properties of that type of atom that depend on qualifiers like which mine it’s from does one have to deal with such things.
In a few cases (think heavy water, for example), one will need to explicitly talk about isotopes in what is essentially a chemical context. But most of the time, it’s going to be enough just to specify a chemical element.
To specify a chemical element you just have to give its atomic number Z. And then textbooks will tell you that to specify a particular isotope you just have to say how many neutrons it contains. But that ignores the unexpected case of tantalum. Because, you see, one of the naturally occurring forms of tantalum (¹⁸⁰ᵐTa) is actually an excited state of the tantalum nucleus, which happens to be very stable. And to properly specify this, you have to give its excitation level as well as its neutron count.
In a sense, though, quantum mechanics saves one here. Because while there are an infinite number of possible excited states of a nucleus, quantum mechanics says that all of them can be characterized just by two discrete values: spin and parity.
Every isotope—and every excited state—is different, and has its own particular properties. But the world of possible isotopes is much more orderly than, say, the world of possible animals. Because quantum mechanics says that everything in the world of isotopes can be characterized just by a limited set of discrete quantum numbers.
We’ve gone from molecules to atoms to nuclei, so why not talk about particles too? Well, it’s a bigger can of worms. Yes, there are the well-known particles like electrons and protons that are pretty easy to talk about—and are readily represented by entities in the Wolfram Language. But then there’s a zoo of other particles. Some of them—just like nuclei—are pretty easy to characterize. You can basically say things like: “it’s a particular excited state of a charm-quark-anti-charm-quark system” or some such. But in particle physics one’s dealing with quantum field theory, not just quantum mechanics. And one can’t just “count elementary particles”; one also has to deal with the possibility of virtual particles and so on. And in the end the question of what kinds of particles can exist is a very complicated one—rife with computational irreducibility. (For example, what stable states there can be of the gluon field is a much more elaborate version of something like the tiling problem I mentioned in connection with melting points.)
Maybe one day we’ll have a complete theory of fundamental physics. And maybe it’ll even be simple. But exciting as that will be, it’s not going to help much here. Because computational irreducibility means that there’s essentially an irreducible distance between what’s underneath, and what phenomena emerge.
And in creating a language to describe the world, we need to talk in terms of things that can actually be observed and computed about. We need to pay attention to the basic physics—not least so we can avoid setups that will lead to confusion later. But we also need to pay attention to the actual history of science, and actual things that have been measured. Yes, there are, for example, an infinite number of possible isotopes. But for an awful lot of purposes it’s perfectly useful just to set up entities for ones that are known.
But is it the same in chemistry? In nuclear physics, we think we know all the reasonably stable isotopes that exist—so any additional and exotic ones will be very short-lived, and therefore probably not important in practical nuclear processes. But it’s a different story in chemistry. There are tens of millions of chemicals that people have studied (and, for example, put into papers or patents). And there’s really no limit on the molecules that one might want to consider, and that might be useful.
But, OK, so how can we refer to all these potential molecules? Well, in a first approximation we can specify their chemical structures, by giving graphs in which every node is an atom, and every edge is a bond.
What really is a “bond”? While it’s incredibly useful in practical chemistry, it’s at some level a mushy concept—some kind of semiclassical approximation to a full quantum mechanical story. There are some standard extra bits: double bonds, ionization states, etc. But in practice chemistry is very successfully done just by characterizing molecular structures by appropriately labeled graphs of atoms and bonds.
OK, but should chemicals be represented by entities, or by abstract graphs? Well, if it’s a chemical one’s already heard of, like carbon dioxide, an entity seems convenient. But what if it’s a new chemical that’s never been discussed before? Well, one could think about inventing a new entity to represent it.
Any self-respecting entity, though, better have a name. So what would the name be? Well, in the Wolfram Language, it could just be the graph that represents the structure. But maybe one wants something that seems more like an ordinary textual name—a string. Well, there’s always the IUPAC way of naming chemicals with names like 1,1′-{[3-(dimethylamino)propyl]imino}bis-2-propanol. Or there’s the more computer-friendly SMILES version: CC(CN(CCCN(C)C)CC(C)O)O. And whatever underlying graph one has, one can always generate one of these strings to represent it.
There’s an immediate problem, though: the string isn’t unique. In fact, however one chooses to write down the graph, it can’t always be unique. A particular chemical structure corresponds to a particular graph. But there can be many ways to draw the graph—and many different representations for it. And in fact even the (“graph isomorphism”) problem of determining whether two representations correspond to the same graph can be difficult to solve.
OK, so let’s imagine we represent a chemical structure by a graph. At first, it’s an abstract thing. There are atoms as nodes in the graph, but we don’t know how they’d be arranged in an actual molecule (and e.g. how many angstroms apart they’d be). Of course, the answer isn’t completely well defined. Are we talking about the lowest-energy configuration of the molecule? (What if there are multiple configurations of the same energy?) Is the molecule supposed to be on its own, or in water, or whatever? How was the molecule supposed to have been made? (Maybe it’s a protein that folded a particular way when it came off the ribosome.)
Well, if we just had an entity representing, say, “naturally occurring hemoglobin”, maybe we’d be better off. Because in a sense that entity could encapsulate all these details.
But if we want to talk about chemicals that have never actually been synthesized it’s a bit of a different story. And it feels as if we’d be better off just with an abstract representation of any possible chemical.
But let’s talk about some other cases, and analogies. Maybe we should just treat everything as an entity. Like every integer could be an entity. Yes, there are an infinite number of them. But at least it’s clear what names they should be given. With real numbers, things are already messier. For example, there’s no longer the same kind of uniqueness as with integers: 0.99999… is really the same as 1.00000…, but it’s written differently.
What about sequences of integers, or, for that matter, mathematical formulas? Well, every possible sequence or every possible formula could conceivably be a different entity. But this wouldn’t be particularly useful, because much of what one wants to do with sequences or formulas is to go inside them, and transform their structure. But what’s convenient about entities is that they’re each just “single things” that one doesn’t have to “go inside”.
So what’s the story with “abstract chemicals”? It’s going to be a mixture. But certainly one’s going to want to “go inside” and transform the structure. Which argues for representing the chemical by a graph.
But then there’s potentially a nasty discontinuity. We’ve got the entity of carbon dioxide, which we already know lots of properties about. And then we’ve got this graph that abstractly represents the carbon dioxide molecule.
We might worry that this would be confusing both to humans and programs. But the first thing to realize is that we can distinguish what these two things are representing. The entity represents the bulk naturally occurring version of the chemical—whose properties have potentially been measured. The graph represents an abstract theoretical chemical, whose properties would have to be computed.
But obviously there’s got to be a bridge. Given a concrete chemical entity, one of the properties will be the graph that represents the structure of the molecule. And given a graph, one will need some kind of ChemicalIdentify function, that—a bit like GeoIdentify or maybe ImageIdentify—tries to identify from the graph what chemical entity (if any) has a molecular structure that corresponds to that graph.
As I write out some of the issues, I realize how complicated all this may seem. And, yes, it is complicated. But in our meeting yesterday, it all went very quickly. Of course it helps that everyone there had seen similar issues before: this is the kind of thing that’s all over the foundations of what we do. But each case is different.
And somehow this case got a bit deeper and more philosophical than usual. “Let’s talk about naming stars”, someone said. Obviously there are nearby stars that we have explicit names for. And some other stars may have been identified in large-scale sky surveys, and given identifiers of some kind. But there are lots of stars in distant galaxies that will never have been named. So how should we represent them?
That led to talking about cities. Yes, there are definite, chartered cities that have officially been assigned names–and we probably have essentially all of these right now in the Wolfram Language, updated regularly. But what about some village that’s created for a single season by some nomadic people? How should we represent it? Well, it has a certain location, at least for a while. But is it even a definite single thing, or might it, say, devolve into two villages, or not a village at all?
One can argue almost endlessly about identity—and even existence—for many of these things. But ultimately it’s not the philosophy of such things that we’re interested in: we’re trying to build software that people will find useful. And so what matters in the end is what’s going to be useful.
Now of course that’s not a precise thing to know. But it’s like for language design in general: think of everything people might want to do, then see how to set up primitives that will let people do those things. Does one want some chemicals represented by entities? Yes, that’s useful. Does one want a way to represent arbitrary chemical structures by graphs? Yes, that’s useful.
But to see what to actually do, one has to understand quite deeply what’s really being represented in each case, and how everything is related. And that’s where the philosophy has to meet the chemistry, and the math, and the physics, and so on.
I’m happy to say that by the end of our hour-long meeting yesterday (informed by about 40 years of relevant experience I’ve had, and collectively 100+ years from people in the meeting), I think we’d come up with the essence of a really nice way to handle chemicals and chemical structures. It’s going to be a while before it’s all fully worked out and implemented in the Wolfram Language. But the ideas are going to help inform the way we compute and reason about chemistry for many years to come. And for me, figuring out things like this is an extremely satisfying way to spend my time. And I’m just glad that in my long-running effort to advance the Wolfram Language I get to do so much of it.
In 1918, the National Geodetic Survey (or NGS, at that time the United States Coast and Geodetic Survey) used its compiled geodetic data to compute the geographic center of the contiguous United States as 39°50′N 98°35′W. Their methodology—balancing a cardboard cutout of the lower 48 states at a point—was rudimentary but surprisingly effective, coming within 20 miles of modern estimates. A marker stands near the location (in the city of Lebanon, Kansas) to commemorate the historic calculation.
Of course, the NGS no longer endorses this or any other point as the geographic center. While they may seem straightforward on the surface, it turns out such calculations can be difficult to pin down due to changes in landscape, different mapping methods or even different definitions for “center”. Even the Wolfram Knowledgebase only lists a whole-number estimate—one that differs from the 1918 figure by quite a bit:
For most practical purposes, such an approximation is fine. Still, it’s intriguing that a seemingly concrete idea like the center of a region can be so shrouded in mystery. In a recent Wolfram Community post, Christopher Wolfram explored a few of these complicating factors, pointing out some pitfalls that the amateur geographer might not consider. In this post, we’re going to recreate some of his results and expand on some of his methods.
As a first estimate, the geographic center can be thought of as the geometric centroid of the bounding region. This is essentially what the NGS did in 1918—fortunately, modern technology makes it much quicker and easier. To set this up in the Wolfram Language, you can use DiscretizeGraphics to create a MeshRegion from built-in Polygon data:
Simply applying RegionCentroid to this region gives one estimate for the geographic center:
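A minimal sketch of these two steps (the polygon-access pattern is the standard one; restricting the polygon to the contiguous states is an extra step not shown here):

```wolfram
(* Build a 2D mesh region from the country's built-in polygon data *)
us = DiscretizeGraphics[
   EntityValue[Entity["Country", "UnitedStates"], "Polygon"]];

(* RegionCentroid returns {longitude, latitude}, so Reverse it
   before wrapping in GeoPosition *)
GeoPosition[Reverse[RegionCentroid[us]]]
```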
This seems quite good—in fact, it’s within 20 miles of the 1918 estimate—but it ignores a few key factors that can change the value significantly.
One major consideration when inspecting any map is which projection it uses to map the 3D surface of the Earth onto a 2D plane. Some information is lost or distorted in a mapping, so projections are usually chosen to preserve a particular quality (e.g. total area, local angles or distance from a standard point).
By default, coordinates in the Wolfram Language are given in the equirectangular projection, in which x coordinates represent longitude and y coordinates represent latitude. Since this is the opposite of the standard (latitude, longitude) format, Reverse is often needed when passing data into other functions (as in the previous example). Keeping that in mind, we can create a function that applies a certain projection to the region—using GeoGridPosition to compute the projected coordinates from the original:
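A sketch of such a helper (the name projectRegion is ours; it rebuilds the mesh from projected vertex coordinates):

```wolfram
(* Reproject a mesh region whose vertices are {lon, lat} pairs:
   convert each vertex to projected grid coordinates via
   GeoGridPosition, then rebuild the mesh with the same cells *)
projectRegion[region_MeshRegion, projection_] :=
  MeshRegion[
   First[GeoGridPosition[
     GeoPosition[Reverse /@ MeshCoordinates[region]], projection]],
   MeshCells[region, 2]]
```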
It is not known which projection was used in the 1918 calculation, but the Mercator projection is a likely suspect. Although widely criticized for its distortion of scale (for instance, Greenland appears larger than Africa even though Africa is actually 14 times bigger), the Mercator projection has long been used for navigation. In fact, a version of this is still commonly used by online mapping services; most people will find that the region’s shape looks more “correct” this way:
We also need a way of finding the centroid that takes the projected coordinates into account:
As a test, we can calculate the centroid using the Mercator projection, which turns out to be within 50 miles of the 1918 estimate:
Besides Mercator, the Wolfram Language contains over 500 named projections, accessible through GeoProjectionData:
We can narrow that list down by keeping only those that can project the entire country (no need for region-specific UTM or SPCS coordinates):
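A sketch of the listing and filtering (the name-based filter is a rough assumption for excluding zone-based systems):

```wolfram
projections = GeoProjectionData[];   (* all named projections *)

(* Drop region-specific, zone-based families such as UTM *)
global = Select[projections, StringFreeQ[#, "UTM" | "SPCS"] &];
Length[global]
```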
From there, it’s easy to look at how different projections affect the centroid calculation:
It’s encouraging that most of the points are within a 100-mile radius of the old NGS value. However, there’s still quite a spread here, stretching from East-Central Kansas up to Nebraska, and even out west to the middle of Colorado. Here’s a map that shows centroids from some popular projections, along with a few nearby cities for reference:
As Christopher points out, many of these projections have parameters that may change results. In this particular case, the "Centering" parameter can be problematic; using its default setting of Automatic, it may not consistently locate the same “center” for each projection. There is a somewhat frustrating implication here: in order to calculate the center of the region, we must first know the center of the projection.
Though using an arbitrary center (such as GeoPosition[{0,0}]) will work in many cases, it breaks down for certain projections:
A more robust approach is to recursively re-center the map until a stable center point is found. In other words, we want to run meshCentroid with a setting of "Centering"->GeoPosition[{0,0}], then run it again using the previous output as the new setting for "Centering", repeating this process until the output stabilizes at a particular point. In practice, this can be done with NestWhileList:
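A sketch of that fixed-point iteration (meshCentroid is assumed to be a helper that returns the centroid as a GeoPosition for a given projection specification; the one-mile tolerance and iteration cap are assumptions):

```wolfram
(* Re-center the projection on each new centroid until successive
   centers agree to within a mile *)
stableCentroid[region_, projection_] :=
  Last@NestWhileList[
    meshCentroid[region, {projection, "Centering" -> #}] &,
    GeoPosition[{0, 0}],
    GeoDistance[#1, #2] > Quantity[1, "Miles"] &,
    2, 10]
```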
We can see roughly what the function is doing by looking at the series of regions created while the centroid stabilizes for a certain projection:
We can also look at the series of centers as it stabilizes:
All the projections stabilize in five iterations or fewer, but some appear to do so much faster than others. This plot shows the absolute distances between successive centroid values for each projection:
Finally, let’s see the map for the centers calculated like this:
This method gives a much tighter grouping; we’re no longer stretching out to Colorado, and only two of our points are outside the 100-mile radius! Unfortunately, this is about as close as this method can get us—further analysis would depend on the selection of projections, and there are far too many combinations to explore in this post.
What if we wanted to eliminate projections from the process entirely? One obvious way is to represent the unprojected region as a 3D polygon—essentially treating the US like part of a peel or shell that can be pulled directly from the Earth. This is done using GeoPositionXYZ:
Though it’s not obvious from the image, this region is not flat but curved. We can directly compute the centroid—but the answer must be converted from 3D Cartesian coordinates back to a geoposition:
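A sketch of both steps (assuming us is the 2D mesh region built earlier; GeoPositionXYZ handles the conversion in both directions):

```wolfram
(* Lift each {lon, lat} vertex to 3D Cartesian coordinates on the
   reference ellipsoid, keeping the same mesh cells *)
coords3D = First[GeoPositionXYZ[
    GeoPosition[Reverse /@ MeshCoordinates[us]]]];
us3D = MeshRegion[coords3D, MeshCells[us, 2]];

(* The 3D centroid, converted back to {latitude, longitude, altitude} *)
GeoPosition[GeoPositionXYZ[RegionCentroid[us3D]]]
```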
The latitude and longitude appear to be in the right vicinity. But what about that third number? Since we’ve used a 3D region, the centroid has an altitude parameter—in this case negative, which implies that our centroid is deep underground. Fortunately, we can ignore this altitude value; any plotting functions will place the centroid on the surface.
Here’s how this 3D centroid compares to the previous results:
Another approach (suggested by Oscar S. Adams in a paper discussing the NGS’s methodology) would be to find the median latitude and longitude. This involves “cutting” the region in half at a particular latitude such that each side would have equal area, then doing the same for longitude and using that pair as the “center” of the region.
Though this method is independent of projection, we still need to put the data into the right format. In order for the method to be accurate, we need a format that both preserves the area of the region and represents latitude and longitude lines as straight horizontal and vertical lines. The "CylindricalEqualArea" projection fits these criteria:
This function calculates the line to cut a region in half by area:
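One way to sketch the bisection (clipping with a tall rectangle and root-finding on the cut position; the name medianLine and the bracketing choices are ours):

```wolfram
(* Find the vertical line x = c that splits the region's area in half *)
medianLine[region_] :=
  Module[{total = Area[region], bx = RegionBounds[region][[1]],
    by = RegionBounds[region][[2]], half},
   (* area of the part of the region to the left of x = c *)
   half[c_?NumericQ] :=
    Area[RegionIntersection[region,
      Rectangle[{bx[[1]] - 1, by[[1]] - 1}, {c, by[[2]] + 1}]]];
   x /. FindRoot[half[x] - total/2, {x, bx[[1]], bx[[2]]}]]
```

Giving FindRoot two starting values makes it use a derivative-free secant iteration, which suits this black-box area function.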
We apply this directly to get the median x value, then apply it to a rotated region (with latitude and longitude values swapped) to get the median y value:
These numbers are in the coordinate system of the "CylindricalEqualArea" projection. We can use GeoGridPosition, though, to convert them into a GeoPosition.
With the coordinates converted back to standard form, we can plot the new centroid along with the projection-dependent centroids:
Again, we have a value that seems reasonable, but there is one drawback: this approach depends heavily on the orientation of the region. For instance, cutting the region in half with diagonals instead of using vertical and horizontal lines will yield a different result. Ideally, the centroid calculation should only involve the shape of the region, and not its position. Of course, since Earth’s land masses won’t shift noticeably from one calculation to the next, the results should still be consistent using this method.
The final method we’ll look at is finding the central feature of the region—the point that minimizes the sum of the distances from each of a collection of given points. In other words, we want the point that is closest to all the other points.
In order to do this, we need to generate a random collection of points from the region and directly apply CentralFeature:
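A sketch (usEqualArea stands for the region after the "CylindricalEqualArea" projection, so that uniform sampling is area-faithful):

```wolfram
(* Sample uniformly from the equal-area region, then find the point
   minimizing the total distance to all the others *)
pts = RandomPoint[usEqualArea, 1000];
CentralFeature[pts]
```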
Using 1,000 sample points may seem sufficient, but it turns out the results are not consistent:
Adding more points decreases the spread, but the computation time grows quickly. To test the limits of this method, Christopher ran the computation 100 times with 15,000 points each (a computation that took several hours, though your mileage may vary). He then used CentralFeature to find the central point in these results, yielding a single value:
Here is a final map summarizing all the centroids we’ve found:
Which answer is correct? In reality, any of these results would be acceptable in geodetic computations. Consider that the 100-mile radius shown makes up less than a percent of the total area of the country, meaning any point chosen from that circle would likely be sufficient.
That said, we found the 3D centroid method to be the most satisfying. It’s the only approach we tried that doesn’t rely on projections, pseudorandom numbers or any other hidden complications. (The central feature method was a close second because of its mathematical robustness—the complexity was just a bit off-putting.) So where would we place the commemorative marker for our new centroid? The location is about eight miles northwest of Smith Center, Kansas:
As noted in Christopher’s post, these methods can easily be generalized. You can compare projections and find the center of any country, city or other geodetic region available in the Wolfram Language. Maybe you have your own definition of “center” that you’d like to explore—leave a comment and let us know what you come up with!
Hans Benker provides a brief and accessible introduction to Mathematica and shows its applications in problems of engineering mathematics, discussing the construction, operation and possibilities of the Wolfram Language in detail. He explores Mathematica usage for matrices and differential and integral calculus. The last part of the book is devoted to the advanced topics of engineering mathematics, including differential equations, transformations, optimization, probability and statistics. The calculations are all presented in detail and are illustrated by numerous examples.
Exploración de modelos matemáticos usando Mathematica (Spanish)
This book explores mathematical models that are traditionally studied in courses on differential equations, but from a unique perspective. The authors analyze models by modifying their initial parameters, transforming them into problems that would be practically impossible to solve in an analytical way. Mathematica provides an essential computational platform for solving these problems, particularly when they are graphical in nature.
Statistisk formelsamling: med bayesiansk vinkling (Norwegian)
Svein Olav Nyberg provides an undergraduate-level statistical formulary with support for Mathematica. This volume includes basic formulas for Bayesian techniques, as well as for general basic statistics. It is an essential primer for Norwegian-language students working in statistical analysis.
Computational thinking is an increasingly necessary technique for problem solving in a range of disciplines, and Mathematica and the Wolfram Language equip students with a powerful computational tool. Approaching calculus from this perspective, K. V. Titov and N. D. Gorelov’s textbook provides a helpful introduction to using the Wolfram Language in the mathematics classroom.
Kompyuternaya matematika: uchebnoe posobie (Russian)
Another textbook from K. V. Titov, Kompyuternaya matematika: uchebnoe posobie emphasizes the use of computer technologies for mathematical analyses and offers practical solutions for numerous problems in various fields of science and technology, as well as their engineering applications. Titov discusses methodological approaches to problem solving in order to promote the development and application of online resources in education and to help integrate computer mathematics in educational technology.
These titles are just a sampling of the many books that explore applications of the Wolfram Language. You can find more Wolfram technologies books, both in English and other languages, by visiting the Wolfram Books site.
As the Fourth of July approaches, many in America will celebrate 241 years since the founders of the United States of America signed the Declaration of Independence, their very own disruptive, revolutionary startup. Prior to independence, colonists would celebrate the birth of the king. However, after the Revolutionary War broke out in April of 1775, some colonists began holding mock funerals of King George III. Additionally, bonfires, celebratory cannon and musket fire and parades were common, along with public readings of the Declaration of Independence. There was also rum.
Today, we often celebrate with BBQ, fireworks and a host of other festivities. As an aspiring data nerd and a sociologist, I thought I would use the Wolfram Language to explore the Declaration of Independence using some basic natural language processing.
Using metadata, I’ll also explore a political network of colonists with particular attention paid to Paul Revere, using built-in Wolfram Language functions and network science to uncover some hidden truths about colonial Boston and its key players leading up to the signing of the Declaration of Independence.
The Wolfram Data Repository was recently announced and holds a growing collection of interesting resources for easily computable results.
As it happens, the Wolfram Data Repository includes the full text of the Declaration of Independence. Let’s explore the document using WordCloud by first grabbing it from the Data Repository.
Interesting, but this isn’t very patriotic thematically, so let’s use ColorFunction to add some color, apply DeleteStopwords to drop common words and remove the names of the signers of the document.
As we can see, the Wolfram Language has deleted the names of the signers and made words larger as a function of their frequency in the Declaration of Independence. What stands out is that the words “laws” and “people” appear the most frequently. This is not terribly surprising, but let’s look at the historical use of those words using the built-in WordFrequencyData functionality and DateListPlot for visualization. Keeping with a patriotic theme, let’s also use PlotStyle to make the plot red and blue.
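A sketch of that frequency lookup (WordFrequencyData pulls from a corpus of published English text, so it requires connectivity; the plot styling here is an illustrative choice):

```wolfram
(* historical usage of "laws" and "people" over time, in patriotic colors *)
freqs = WordFrequencyData[{"laws", "people"}, "TimeSeries"];
DateListPlot[freqs, PlotStyle -> {Red, Blue}, PlotRange -> All]
```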
What is incredibly interesting is that we can see a usage spike around 1776 in both words. The divergence between the use of the two words over time also strikes me as interesting.
According to historical texts, colonial Boston was a fascinating place in the late 18th century. David Hackett Fischer’s monograph Paul Revere’s Ride paints a comprehensive picture of the political factions that were driving the revolutionary movement. Of particular interest are the Masonic lodges and caucus groups that were politically active and central to the Revolutionary War.
Those of us raised in the United States will likely remember Paul Revere from our very first American history classes. He famously rode a horse through what is now the greater Boston area warning the colonial militia of incoming British troops, known as his “midnight ride,” notably captured in a poem by Henry Wadsworth Longfellow in 1860.
Up until Fischer’s exploration of Paul Revere’s political associations and caucus memberships, historians argued the colonial rebel movement was controlled by high-ranking political elites led by Samuel Adams, with many concluding Revere was simply a messenger. That he was, but through that messaging and other activities, he was key to joining together political groups that otherwise may not have communicated, as I will show through network analysis.
As it happens, this time last year I was at the Wolfram Summer School, which is currently in progress at Bentley University. One of the highlights of my time there was a lecture on social network analysis, led by Charlie Brummitt, that used metadata to analyze colonial rebels in Boston.
Duke University sociologist Kieran Healy has a fantastic blog post exploring this titled “Using Metadata to Find Paul Revere” that the lecture was derived from. I’m going to recreate some of his analysis with the Wolfram Language and take things a bit further with more advanced visualizations.
First, however, as a sociologist, my studies and research are often concerned with inequalities, power and marginalized groups. I would be remiss if I did not think of Abigail Adams’s correspondence with her husband John Adams on March 31, 1776, in which she instructed him to “remember the ladies” at the proceedings of the Continental Congress. I made a WordCloud of the letter here.
The data we are using is exclusively about men and membership data from male-only social and political organizations. It is worth noting that during the Revolutionary period, and for quite a while following, women were legally barred from participating in most political affairs. Women could vote in some states, but between 1777 and 1787, those rights were stripped in all states except New Jersey. It wasn't until August 18, 1920, that the 19th Amendment was ratified, securing women's right to vote unequivocally.
To that end, under English common law, women were treated as femes covert, meaning married women’s rights were absorbed by their husbands. Not only were women not allowed to vote, coverture laws dictated that a husband and wife were one person, with the former having sole political decision-making authority, as well as the ability to buy and sell property and earn wages.
Following the American Revolution, the United States was free from the tyranny of King George III; however, women were still subservient to men legally and culturally. For example, Hannah Griffitts, a poet known for her work about the Daughters of Liberty, “The Female Patriots,” expressed in a 1785 diary entry sentiments common among many colonial women:
The glorious fourth—again appears
A Day of Days—and year of years,
The sum of sad disasters,
Where all the mighty gains we see
With all their Boasted liberty,
Is only Change of Masters.
There is little doubt that without the domestic and emotional labor of women, often invisible in history, these men, the so-called Founding Fathers, would have been less successful and expedient in achieving their goals of independence from Great Britain. So today, we remember the ladies, the marginalized and the disenfranchised.
Conveniently, I uploaded a cleaned association matrix of political group membership in colonial Boston as a ResourceObject to the Data Repository. We’ll import with ResourceData to give us a nice data frame to work with.
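The import step might be sketched like this (the resource name below is a placeholder for the actual repository item):

```wolfram
(* pull the cleaned membership matrix from the Wolfram Data Repository *)
data = ResourceData["Colonial Boston Political Group Membership"];
Dimensions[data]
```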
We can see we have 254 colonists in our dataset. Let’s take a look at which colonial rebel groups Samuel Adams was a member of, as he’s known in contemporary times for a key ingredient in Fourth of July celebrations, beer.
Our True/False values indicate membership in one of seven political organizations: St. Andrews Lodge, Loyal Nine, North Caucus, the Long Room Club, the Tea Party, the Boston Committee of Correspondence and the London Enemies.
We can see Adams was a member of four of these. Let’s take a look at Revere’s memberships.
As we can see, Revere was slightly more involved, as he was a member of five groups. We can easily graph his membership in these political organizations. For those of you unfamiliar with how a network functions, nodes represent agents and the lines between them represent some sort of connection, interaction or association.
There are seven organizations in total, so let’s see how they are connected by highlighting political organizations as red nodes, with individuals attached to each node.
We can see the Tea Party and St. Andrews Lodge have many more members than Loyal Nine and others, which we will now explore further at the micro level.
What we’ve done so far is fairly macro and exploratory. Let’s drill down by looking at each individual’s connection to one another by way of shared membership in these various groups. Essentially, we are removing our political organization nodes and focusing on individual colonists. We’ll use Tooltip to help us identify each actor in the network.
We now use a social network measure called BetweennessCentrality, which quantifies how central an agent is in a network: it is the fraction of shortest paths between pairs of other agents that pass through that agent. Because a node that lies on many such paths can broker information between other agents, betweenness is key in determining a particular node's importance in the network.
We’ll first create a function that will allow us to visualize not only BetweennessCentrality, but also EigenvectorCentrality and ClosenessCentrality.
We begin with some brief code for BetweennessCentrality that uses the defined ColorData feature to show us which actors have the highest ability to transmit resources or information through the network, along with the Tooltip that was previously defined.
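One way to sketch such a helper (assuming g is the colonist–colonist graph built above; the "TemperatureMap" color scheme and the vertex sizing are illustrative choices, not the post's exact code):

```wolfram
(* color and size each vertex by a chosen centrality measure *)
centralityPlot[g_, measure_] := Module[{c = measure[g], v = VertexList[g]},
  Graph[g,
   VertexStyle -> Thread[v -> (ColorData["TemperatureMap"] /@ Rescale[c])],
   VertexSize -> Thread[v -> (0.3 Rescale[c] + 0.05)]]]

centralityPlot[g, BetweennessCentrality]
```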
Lo and behold, Paul Revere appears to have a vastly higher betweenness score than anyone else in the network. Significantly, John Adams is at the center of our radial graph, but he does not appear to have much power in the network. Let’s grab the numbers.
Revere has almost double the score of the next highest colonist, Thomas Urann. What this indicates is Revere’s essential importance in the network as a broker of information. Since he is a member of five of the seven groups, this isn’t terribly surprising, but it would have otherwise been unnoticed without this type of inquiry.
ClosenessCentrality varies from betweenness in that we are concerned with path lengths to other actors. These agents who can reach a high number of other actors through short path lengths are able to disseminate information or even exert power more efficiently than agents on the periphery of the network. Let’s run our function on the network again and look at ClosenessCentrality to see if Revere still ranks highest.
Revere appears ranked the highest, but it is not nearly as dramatic as his betweenness score and, again, John Adams has a low score. Let’s grab the measurements for further analysis.
As our heat-map coloring of nodes indicates, other colonists are not far behind Revere, though he certainly is the highest ranked. While there are other important people in the network, Revere is clearly the most efficient broker of resources, power or information.
One final measure we can examine is EigenvectorCentrality, which uses a more advanced algorithm and takes into account the centrality of all nodes and an individual actor’s nearness and embeddedness among highly central agents.
There appear to be two top contenders for the highest eigenvector score. Let's once again calculate the measurements in a table for examination.
Nathaniel Barber and Revere have nearly identical scores; however, Revere still tops the list. Let’s now take the top five closeness scores and create a network without them in it to see how the cohesiveness of the network might change.
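The removal step might be sketched like this (again assuming g is the colonist graph; TakeLargest picks out the five highest closeness scores):

```wolfram
(* drop the five most "close" colonists and compare the two networks *)
topFive = Keys[TakeLargest[
    AssociationThread[VertexList[g], ClosenessCentrality[g]], 5]];
GraphicsRow[{VertexDelete[g, topFive], g}]
```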
We see quite a dramatic change in the graph on the left with our key players removed, indicating those with the top five closeness scores are fairly essential in joining these seven political organizations together. Joseph Warren appears to be one of only a few people who can act as a bridge between disparate clusters of connections. Essentially, it would be difficult to have information spread freely through the network on the left, as opposed to the network on the right that includes Paul Revere.
As we have seen, we can use network science in history to uncover or expose misguided preconceptions about a figure’s importance in historical events, based on group membership metadata. Prior to Fischer’s analysis, many thought Revere was just a courier, and not a major figure. However, what I have been able to show is Revere’s importance in bridging disparate political groups. This further reveals that the Revolutionary movement was pluralistic in its aims. The network was ultimately tied together by disdain for the tyranny of King George III, unjust British military actions and policies that led to bloody revolt, not necessarily a top-down directive from political elites.
Beyond history, network science and natural language processing have many applications, such as uncovering otherwise hidden brokers of information, resources and power, i.e. social capital. One can easily imagine how this might be useful for computational marketing or public relations.
How will you use network science to uncover otherwise-hidden insights to revolutionize and disrupt your work or interests?
Special thanks to Wolfram|Alpha data scientist Aaron Enright for helping with this blog post and to Charlie Brummitt for providing the beginnings of this analysis.
When I first started driving in high school, I had to pay for my own gas. Since I was also saving for college, I had to be careful about my spending, so I started manually tracking how much I was paying for gas in a spreadsheet and calculating how much gas I was using. Whenever I filled my tank, I kept the receipts and wrote down how many miles I’d traveled and how many gallons I’d used. Every few weeks, I would manually enter all of this information into the spreadsheet and plot out the costs and the amount of fuel I had used. This process helped me both visualize how much money I was spending on fuel and manage my budget.
Once I got to college, however, I got a more fuel-efficient car and my schedule got a lot busier, so I didn’t have the time to track my fuel consumption like this anymore. Now I work at Wolfram Research and I’m still really busy, but the cool thing is that I can use our company technology to more easily accomplish my automotive assessments.
After completing this easy project using the Wolfram Cloud’s web form and automated reporting capabilities, I don’t have to spend much time at all to keep track of my fuel usage and other information.
To start this project, I needed a way to store the data. I’ve found that the Wolfram Data Drop is a convenient way to store and access data for many of my projects.
I created a databin to store the data with just one line of Wolfram Language code:
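That one-liner might look like this (the bin is anonymous here; the original may have given it a name):

```wolfram
bin = CreateDatabin[]
```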
Next, I needed to design a web form that I could use to log the data to the Databin. I used FormFunction to set up a basic one to record gallons of fuel used (from filling the tank each time) and trip distance (from reading the car’s onboard computer).
I also added another field for the date and time of the trip, so that I could add data retroactively (e.g. entering data from old receipts).
I used the DateString function to create an approximate time stamp for submitting data:
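A sketch of such a form (the field names and the confirmation message are illustrative, not the post's exact code; DateString[] pre-fills the date field with the current time):

```wolfram
form = FormFunction[{
    "Gallons" -> "Number",
    "Distance" -> "Number",
    "Date" -> <|"Interpreter" -> "DateTime", "Input" -> DateString[]|>},
   (DatabinAdd[bin, #]; "Thanks, logged!") &]
```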
This form works in the notebook interface, but it isn't accessible from anywhere but my Mathematica notebook. If you want to access it on the web or from a phone, you need to deploy it to the cloud.
Conveniently, you can do this with just one more line of code using CloudDeploy:
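Assuming form is the FormFunction above ("fuel-form" is a placeholder path):

```wolfram
url = CloudDeploy[form, "fuel-form", Permissions -> "Public"]
```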
If that’s all you wanted to record, you could stop there. After just a few lines of code, the form created will log distance traveled and fuel used, but there’s quite a bit more data that is available while at a gas station.
A typical car’s dashboard shows average speed and odometer readings from the onboard computer. Additionally, most newer cars will report an estimation of the average gas mileage on a per-trip basis, so I designed the following form that makes it easy to test the accuracy of those readings.
I also added a field to record the location by logging the city where I am filling up with the help of Interpreter. I used $GeoLocationCity and CityData to pre-populate this field so I don’t have to type it out each time.
Finally, if you’re saving for college like I was, you’ll want to record the total price too.
All of these data points can be helpful for tracking fuel consumption, efficiency and more.
The last thing to consider before deploying the webpage is the appearance. I set up some visual improvements with the help of AppearanceRules, PageTheme, and FormFunction’s "HTMLThemed" result style:
Now that I have a working form, I need to be able to access it when I’m at a gas station.
I almost always have my smartphone on me, so I can use URLShorten to make a simpler web address that I can type quickly:
Or I can avoid typing out a URL altogether by making a QR code with BarcodeImage, which I can read with my phone’s camera application:
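Both conveniences are one-liners (assuming url is the CloudObject returned by CloudDeploy):

```wolfram
short = URLShorten[url]
BarcodeImage[short, "QR"]
```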
Once I accessed the form on my phone, I added it as a button on my home screen, which makes returning to the form when I’m at a gas station very easy:
If you’re following along, at this point you can just start logging data by using the form; I personally have been logging this data for my car for over a year now. But what can I do with all of this data?
With the help of more than 5,000 built-in functions, including a wealth of visualization functions, the possibilities are almost limitless.
I started by querying for the data in my car’s databin with Dataset:
With a few lines of code and the built-in entity framework, I can see all of the counties where I’ve traveled over the last year or so using GeoHistogram:
I can also see the gas mileage over the course of the past year with TimeSeries:
I often wonder what I can do to improve my gas mileage. I know that there are many factors at play here: driving habits, highway/city driving, the weather—just to name a few. With the Wolfram Language, I can see the effects of some of these on my car’s gas mileage.
I can start by looking at my average speed to compare the effects of highway and city driving and compute the correlation:
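A sketch of that comparison (the column names "AverageSpeed" and "MPG" are assumptions about how the databin's entries are keyed):

```wolfram
(* scatter of average speed vs. fuel economy, plus their correlation *)
speed = Normal[data[All, "AverageSpeed"]];
mpg = Normal[data[All, "MPG"]];
ListPlot[Transpose[{speed, mpg}],
 AxesLabel -> {"avg speed (mph)", "mpg"}]
Correlation[N[speed], N[mpg]]
```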
It’s pretty clear from the plot that at higher average speeds, gas mileage is higher, but it does appear to eventually level off and somewhat decrease. This makes sense because although a higher average speed indicates less city driving (less stop-and-go traffic), it does require burning more fuel to maintain a higher speed. For example, on the interstate, the engine might be running above its optimal RPM, there will be more wind resistance, etc.
With the help of WeatherData, I can also see if there is a correlation with gas mileage and temperature. I can compute the mean temperature for each trip by taking the mean temperatures of each day between the times that I filled up:
The correlation is weaker, but there is a relationship:
I can also visualize both correlations for the average speed and temperature in 3D space by using miles per gallon as the “height”:
It’s also clear from this plot that gas mileage is positively correlated with both temperature and average speed.
Now that I have code to visualize and analyze the data, I need some way to automate this process when I’m away from my computer. For example, I can set up a template notebook that can generate reports in the cloud.
To do this, you can use CreateNotebook["Template"] or File > New > Template Notebook (File > New > Template in the cloud).
After following John Fultz’s steps in his presentation to mimic the TimeSeries plot above, I created a simple report template here:
I can test the report generation locally by using GenerateDocument (or with the Generate button in the template notebook):
From here, I can generate a report every time I submit the form by adding this code to the form’s action. But first I need to upload the template notebook to the cloud with CopyFile (alternatively, you can upload it via the web interface):
Now I can update the form to generate the report, and then use HTTPRedirect to open the report as soon as it is finished:
That is a basic report. Of course, it’s easy to add more to the template, which I’ve done here, incorporating some of the plots I created before, as well as a few more. Again, I can generate the advanced report to test the template:
Seeing that it works, I can upload the template to the cloud:
Lastly, I need to update the form to use the new template and then deploy it:
With this setup, I can always access the latest report at the URL the form redirects me to, so I find it handy to also keep it on my phone’s home screen next to the button for the form:
Now you can see how simple it is to use the Wolfram Language to collect and analyze data from your vehicle. I started with a web form and a databin to collect and store information. Then, for convenience, I worked on accessing these through my smartphone. In order to analyze the data, I created visualizations with relevant variables. Finally, I automated the process so that my data collection will generate updated reports as I add new data. Altogether, this is a vast improvement over the manual spreadsheet method that I used when I was in high school.
Now that you see how quick and easy it is to set this up, give it a try yourself! Factor in other variables or try different visualizations, and maybe you can find other correlations. There’s a lot you can do with just a little Wolfram Language code!
Wolfram Community recently surpassed 15,000 members! And our Community members continue to impress us. Here are some recent highlights from the many outstanding Community posts.
BVH Accelerated 3D Shadow Mapping, Benjamin Goodman
Shade data converted to solar map
In a tour de force of computational narrative fusing various Wolfram Language domains, Benjamin Goodman designs a shadow-mapping algorithm, the process of applying shadows to a computer graphic. Goodman optimized shadow mapping via space partitioning, storing a hierarchy of bounding volumes as a graph known as a bounding volume hierarchy (BVH).
Pairs Trading with Copulas, Jonathan Kinlay
Jonathan Kinlay, the head of quantitative trading at Systematic Strategies LLC in New York, shows how copula models can be applied in pairs trading and statistical arbitrage strategies. The approach dates from when copulas began to be widely adopted in financial engineering, risk management and credit derivatives modeling, but it remains relatively underexplored compared to more traditional techniques in this field.
The Global Terrorism Database (GTD), Marco Thiel
Marco Thiel broke a Wolfram Community record in April when he contributed four featured posts in just three days! He utilized data from the Global Terrorism Database (GTD), an open-source database including information on terrorist events around the world, starting from 1970. It includes systematic data on domestic as well as transnational and international terrorist events, amounting to more than 150,000 cases. Marco analyzes weapon types, geo distribution of attacks and casualties, and temporal and demographical behavior.
Flight Data and Trajectories of Aeroplanes, Marco Thiel
Thiel also takes advantage of the large amounts of data becoming ever more available. Often, however, these datasets are valuable but difficult to access. Thiel shows how to use air traffic data to generate visualizations of three-dimensional flight paths on the globe and to access flight positions and altitudes, call signs, types of planes, origins, destinations and much more.
Analysing “All” of the World’s News—Database of Everything, Marco Thiel
In another clever data collection/analysis project, Thiel works with “the largest, most comprehensive, and highest resolution open database of human society ever created,” according to the description provided by GDELT (Global Database of Events, Language, and Tone). Since 2015, this organization has acquired about three-quarters of a trillion emotional snapshots and more than 1.5 billion location references. Thiel performs some basic analysis and builds supporting visualizations.
How-to-Guide: External GPU on OSX—How to Use CUDA on Your Mac, Marco Thiel
Thiel discusses the neural network and machine learning framework that has become one of the key features of the latest releases of the Wolfram Language. Training neural networks can be very time-consuming, and the Wolfram Language offers an incredibly easy way to use a GPU to train networks and also do numerous other interesting computations. This post explains how to use powerful external GPU units for Wolfram Language computing on your Mac.
Creative Routines Charts, Patrick Scheibe
People are often interested in how creative or successful individuals manage their time, and when in their daily schedules they do what they are famous for. Patrick Scheibe describes how to build and personalize “creative routines” visualizations.
QR Code in Shopping Cart Handle, Patrick Scheibe
Scheibe also brought to Wolfram Community his famous article “QR Code in Shopping Cart Handle.” It explains the image processing algorithm for reading QR code labels when they are deformed by attachment to physical objects such as shopping carts and product packages.
Calculating NMR-Spectra with Wolfram Language, Hans Dolhaine
Hans Dolhaine, a chemist from Germany, writes a detailed walk-through calculating nuclear magnetic resonance spectra with the Wolfram Language. This is a useful educational tool for graduate physics and chemistry classes. Please feel free to share it in your interactions with students and educators.
Computational Introduction to Logarithms, Bill Gosper
Another excellent resource for educators is this elementary introduction to logarithms by means of computational exploration with the Wolfram Language. The Community contributor is renowned mathematician and programmer Bill Gosper. His article is highly instructive and accessible to a younger generation, and it contains beautiful animated illustrations that serve as outstanding educational material.
Using Recursion and FindInstance to Solve Sudoku and The Puzzled Ant and Particle Filter, Ali Hashmi
Finally, Ali Hashmi uses the recursion technique coupled with heuristics to solve a sudoku puzzle and also explains the connection between the puzzled ant problem and particle filters in computer vision.
If you haven’t yet signed up to be a member of Wolfram Community, don’t hesitate! You can join in on these discussions, post your own work in groups of your interest, and browse the complete list of Staff Picks.
As the next phase of Wolfram Research’s endeavor to make biology computable, we are happy to announce the recent release of neuroscience-related content.
The most central part of the human nervous system is the brain. It contains roughly 100 billion neurons that act together to process information, subdivided functionally and structurally into areas specialized for certain tasks. The brain’s anatomy, the characteristics of neurons and cognitive maps are used to represent some key aspects of the functional organization and processing abilities of our nervous system. Our new neuroscience content will give you a sneak peek into the amazing world of neuroscience with some facts about brains, neurons and cognition.
A primal part of the brain, the amygdala is a well-studied area responsible for emotional processing circuitry, with active roles in emotional state, memory, face recognition and decision making. The amygdala is located near the brainstem, close to the center of the brain, and, as its name suggests, is shaped like an almond:
Outgoing connections from the amygdala can be found with the "NeuronalOutput" property:
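A sketch of that lookup (assuming "Amygdala" is the canonical AnatomicalStructure entity name):

```wolfram
amygdala = Entity["AnatomicalStructure", "Amygdala"];
AnatomyData[amygdala, "NeuronalOutput"]
```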
Here we see a visualization of the output connectivity of the amygdala in two layers:
Just as with any other graph, we can perform additional computations on this network. Like many other biological systems, our nervous system is hardwired to receive positive and negative feedback. Feedback is one of the key aspects of the brain's information processing; it allows the efficacy of transmission to be augmented or decreased, as well as fine-tuning of the resulting outputs.
Find the loop and highlight in the above graph:
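A sketch of the loop-finding step (assuming g is the two-layer output-connectivity graph constructed above):

```wolfram
(* find one cycle in the connectivity graph and highlight its edges *)
HighlightGraph[g, First[FindCycle[g]]]
```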
Or find the specific circuit that comprises the combination amygdala-prefrontal cortex. The prefrontal cortex has the primary role in decision making, and therefore the amygdala-prefrontal cortex connectivity plays an essential role in modulating responses to emotional experiences:
We can also identify the minimum-cost flow between the amygdala and the spinal cord. The spinal cord processes signals from the brain and transmits them to other parts of the body to excite motor response:
It is also noteworthy that, in addition to the brain’s connectivity in the central nervous system, we have peripheral innervation integrated in our AnatomyData function. The motor commands from the spinal cord eventually reach the periphery.
Find nerves that innervate the left hand:
And visualize them in 3D with the AnatomyPlot3D function:
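At its simplest, that visualization is a one-liner (assuming "LeftHand" is the canonical entity name; the original post likely overlaid the individual nerve entities as well):

```wolfram
AnatomyPlot3D[Entity["AnatomicalStructure", "LeftHand"]]
```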
We have looked at macroscopic pictures of our nervous system so far. Now let’s look at the brain’s functional unit, the neuron. Of course, we cannot characterize all the billions of neurons, but key features of a few hundred types of neurons are very similar across various mammalian species; these will be considered in detail.
A variety of properties are available for the "Neuron" entity type to describe physical, electrophysiological and spatial characteristics of individual types of neurons:
We can get information on the types of neurons found in a particular brain region. For example, we can get a listing of neurons in the hippocampus, which is associated with emotional states, conversion of short-term to long-term memories and forming spatial memory:
Collecting further details, list the set of neurons whose axons arborize at the CA1 alveus area of the hippocampus:
Neurons transmit electrical signals to communicate with one another. Physical characteristics and patterns of their spikes, known as action potentials, differ across different neuron types.
We can obtain experimentally measured electrophysiological properties of hippocampus CA1 pyramidal cells:
Here we can visually recognize how spike characteristics vary across different neuron types:
A single neuron’s spike propagation can be simulated with the well-known Hodgkin and Huxley model (A. L. Hodgkin and A. F. Huxley, 1952) based on four differential equations involving voltages and currents. Also, there are biologically realistic computational models accommodating Hodgkin and Huxley’s concepts developed to simulate ensembles of spikes in a population of neurons (E. M. Izhikevich, 2004). We can better understand how neurons excite/suppress one another to transmit information by modeling the neurons’ electrical spikes and comparing their patterns of activities with experimentally measured ones:
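The Hodgkin–Huxley equations are a natural fit for NDSolve. A minimal single-compartment sketch, using the standard 1952 parameters and a constant injected current (initial gating values are the usual resting-state approximations):

```wolfram
(* rate functions for the K+ activation (n), Na+ activation (m) and
   Na+ inactivation (h) gates, voltages in mV relative to rest *)
an[v_] := 0.01 (10 - v)/(Exp[(10 - v)/10] - 1);
bn[v_] := 0.125 Exp[-v/80];
am[v_] := 0.1 (25 - v)/(Exp[(25 - v)/10] - 1);
bm[v_] := 4 Exp[-v/18];
ah[v_] := 0.07 Exp[-v/20];
bh[v_] := 1/(Exp[(30 - v)/10] + 1);
gNa = 120; gK = 36; gL = 0.3;      (* max conductances, mS/cm^2 *)
eNa = 115; eK = -12; eL = 10.6;    (* reversal potentials, mV *)
cm = 1; iExt = 10;                 (* membrane capacitance; injected current *)
sol = NDSolve[{
    cm v'[t] == iExt - gNa m[t]^3 h[t] (v[t] - eNa) -
       gK n[t]^4 (v[t] - eK) - gL (v[t] - eL),
    n'[t] == an[v[t]] (1 - n[t]) - bn[v[t]] n[t],
    m'[t] == am[v[t]] (1 - m[t]) - bm[v[t]] m[t],
    h'[t] == ah[v[t]] (1 - h[t]) - bh[v[t]] h[t],
    v[0] == 0, n[0] == 0.32, m[0] == 0.05, h[0] == 0.6},
   {v, n, m, h}, {t, 0, 50}];
Plot[Evaluate[v[t] /. sol], {t, 0, 50},
 AxesLabel -> {"t (ms)", "V (mV)"}]
```

With a sustained suprathreshold current like this, the solution produces a repetitive spike train, which can then be compared against experimentally measured firing patterns.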
After looking at microscopic features in our brain, let us finally explore the brain’s macro-scale executive function. Thanks to recent advances in imaging techniques to visualize brain activity in various cognitive states, we can map out cortical areas that are associated with specific cognitive processes. Brain areas associated with specific functions such as memory, decision making, language, emotional state, visual perception, etc. are well characterized with the appropriate activity-based fMRI analysis.
Using the EntityValue query with the AnatomicalFunctionalConcept entity type, we can find more information on hierarchically categorized brain activities:
Here we can look up the categories of functions associated with each cerebral lobe and create a simple cortical map:
We are not limited to the abstract representation of cortical maps; fMRI-based statistical maps of brain activity are also available.
Let’s look at how we perceive the visual world. A key aspect of visual perception is the subprocess (concept) of cognition as our brain categorizes our visually perceived faces, places, words, numbers, etc. with distinctive patterns of activity. The following graph illustrates how these concepts are hierarchically organized. Some areas of brain activation are highlighted (brain images are seen from the rear):
OK, let’s look further. Visually perceived words, sentences, faces, etc., in turn, affect “language” and “emotion”:
We can confirm that the amygdala (remember, the left and right amygdalae found near the center of the brain) is actively involved in emotions. If you want to learn more about these individual models, they are also available in 3D polygon data and ready to be aligned to our 3D brain model in AnatomyData for further computation.
Here is the brain activation area 3D graphic associated with emotion:
We can combine that graphic together with the brain model for visual comparison (the amygdala is highlighted in red; the right cerebral hemisphere is shown here for demonstration):
It’s fascinating to learn how our brain is organized and how it coordinates the processes in our nervous system. As we know, there is still a lot to be learned about human cognition, and exciting discoveries are being made every day. As we gain additional insights, we continue to expand our knowledgebase to attain a better and deeper understanding of the human nervous system.
Stay tuned for more neuroscience content to come!
We’re fascinated by artificial intelligence and machine learning, and Achim Zielesny’s second edition of From Curve Fitting to Machine Learning: An Illustrative Guide to Scientific Data Analysis and Computational Intelligence provides a great introduction to the increasingly necessary field of computational intelligence. This is an interactive and illustrative guide with all concepts and ideas outlined in a clear-cut manner, with graphically depicted plausibility arguments and a little elementary mathematics. Exploring topics such as two-dimensional curve fitting, multidimensional clustering and machine learning with neural networks or support vector machines, the subject-specific demonstrations are complemented with specific sections that address more fundamental questions like the relation between machine learning and human intelligence. Zielesny makes extensive use of Computational Intelligence Packages (CIP), a high-level function library developed with Mathematica’s programming language on top of Mathematica’s algorithms. Readers with programming skills may easily port or customize the provided code, so this book is particularly valuable to computer science students and scientific practitioners in industry and academia.
The Art of Programming in the Mathematica Software, third edition
Another gem for programmers and scientists who need to fine-tune and otherwise customize their Wolfram Language applications is the third edition of The Art of Programming in the Mathematica Software, by Victor Aladjev, Valery Boiko and Michael Shishakov. This text concentrates on procedural and functional programming. Experienced Wolfram Language programmers know the value of creating user tools: such tools can extend the system’s most frequently used standard tools, eliminate its shortcomings, add new features and much more. Scientists and data analysts can then conduct even the most sophisticated work efficiently using the Wolfram Language, and professional programmers can use these techniques to develop more valuable products for their clients and employers. Included is the MathToolBox package with more than 930 tools; its freeware license is attached to the book.
Introduction to Mathematica with Applications
For a more basic introduction to Mathematica, readers may turn to Marian Mureşan’s Introduction to Mathematica with Applications. First exploring the numerous features within Mathematica, the book continues with more complex material. Chapters include topics such as sorting algorithms, functions—both planar and solid—with many interesting examples and ordinary differential equations. Mureşan explores the advantages of using the Wolfram Language when dealing with the number pi and describes the power of Mathematica when working with optimal control problems. The target audience for this text includes researchers, professors and students—really anyone who needs a state-of-the-art computational tool.
Geographical Models with Mathematica
The Wolfram Language’s powerful combination of extensive map data and computational agility is on display in André Dauphiné’s Geographical Models with Mathematica. This book gives a comprehensive overview of the types of models necessary for the development of new geographical knowledge, including stochastic models, models for data analysis, geostatistics, networks, dynamic systems, cellular automata and multi-agent systems, all discussed in their theoretical context. Dauphiné then provides over 65 programs that formalize these models, written in the Wolfram Language. He also includes case studies to help the reader apply these programs in their own work.
Our tour of new Wolfram Language books moves from terra firma to the stars in Geometric Optics: Theory and Design of Astronomical Optical Systems Using Mathematica. This book by Antonio Romano and Roberto Caveliere provides readers with the mathematical background needed to design many of the optical combinations that are used in astronomical telescopes and cameras. The results presented in the work were obtained through a different approach to third-order aberration theory as well as the extensive use of Mathematica. Replete with worked-out examples and exercises, Geometric Optics is an excellent reference for advanced graduate students, researchers and practitioners in applied mathematics, engineering, astronomy and astronomical optics. The work may be used as a supplementary textbook for graduate-level courses in astronomical optics, optical design, optical engineering, programming with Mathematica or geometric optics.
Don’t forget to check out Stephen Wolfram’s An Elementary Introduction to the Wolfram Language, now in its second edition. It is available in print, as an ebook and free on the web—as well as in Wolfram Programming Lab in the Wolfram Open Cloud. There’s also now a free online hands-on course based on the book. Read Stephen Wolfram’s recent blog post about machine learning for middle schoolers to learn more about the new edition.
Derivatives of functions play a fundamental role in calculus and its applications. In particular, they can be used to study the geometry of curves, solve optimization problems and formulate differential equations that provide mathematical models in areas such as physics, chemistry, biology and finance. The function D computes derivatives of various types in the Wolfram Language and is one of the most-used functions in the system. My aim in writing this post is to introduce you to the exciting new features for D in Version 11.1, starting with a brief history of derivatives.
The idea of a derivative was first used by Pierre de Fermat (1601–1665) and other seventeenth-century mathematicians to solve problems such as finding the tangent to a curve at a point. Given a curve y=f(x), such as the one pictured below, they regarded the tangent line at a point {x,f(x)} as the limiting position of the secant drawn through that point and a nearby point {x+h,f(x+h)}, as the “infinitesimal” quantity h tends to 0.
Their technique can be illustrated as follows.
The slope of a secant line joining {x,f(x)} and {x+h,f(x+h)} is given by DifferenceQuotient.
Now suppose that the function f(x) is defined as follows.
Then the slope of a secant line joining {x,f(x)} and {x+h,f(x+h)} is given by the following.
The mathematicians of the time then proceeded to find the slope of the tangent by setting h equal to 0.
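This "set h equal to 0" step can be sketched outside the Wolfram Language as well. The post's actual f is defined in a notebook input not reproduced in the text, so the Python sketch below assumes f(x) = x^2 purely for illustration; with that choice the difference quotient simplifies to 2x + h exactly, and substituting h = 0 leaves the tangent slope 2x:

```python
from fractions import Fraction

# The post defines f in a notebook input that isn't reproduced in the text;
# as a stand-in, take f(x) = x^2, for which the algebra works out exactly.
def f(x):
    return x * x

def secant_slope(x, h):
    """Slope of the secant line through {x, f(x)} and {x+h, f(x+h)}."""
    return (f(x + h) - f(x)) / h

# For f(x) = x^2 the difference quotient simplifies to 2x + h, so
# "setting h equal to 0" leaves the tangent slope 2x.
x = Fraction(3)
for h in [Fraction(1), Fraction(1, 10), Fraction(1, 100)]:
    print(h, secant_slope(x, h))  # 7, 61/10, 601/100: approaching 6
```

Exact rational arithmetic via Fraction makes the 2x + h pattern visible without floating-point noise.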
The following animation shows the tangent lines along the curve that are obtained by using the formula for the slope derived above.
The direct replacement of the infinitesimal quantity h by 0 works well for simple examples, but it requires considerable ingenuity to compute the limiting value of the difference quotient in more difficult examples. Indeed, Isaac Barrow (1630–1677) and others used geometrical methods to compute this limiting value for a variety of curves. On the other hand, the built-in Limit function in the Wolfram Language incorporates methods based on infinite series expansions and can be used for evaluating the required limits. For example, suppose that we wish to find the derivative of Sin. We first compute the difference quotient of the function.
Next, we note that setting h equal to 0 directly leads to an Indeterminate expression, as shown below. The Quiet function is used to suppress messages that warn about the indeterminacy.
Although the direct substitution method has failed, we can use Limit to arrive at the result that the derivative of Sin[x] is Cos[x].
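Numerically, the same limit is easy to watch converge. Here is a hedged Python sketch (not the series-based machinery of Limit): shrinking h drives the difference quotient of sin toward cos(x).

```python
import math

# Difference quotient of sin: (sin(x + h) - sin(x)) / h.  Substituting
# h = 0 directly gives 0/0, but the quotient tends to cos(x) as h -> 0.
def sin_difference_quotient(x, h):
    return (math.sin(x + h) - math.sin(x)) / h

x = 1.0
for h in [1e-2, 1e-4, 1e-6]:
    print(h, sin_difference_quotient(x, h))

print(math.cos(x))  # the limiting value the quotient approaches
```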
Continuing with the historical development, around 1670, Isaac Newton and Gottfried Wilhelm Leibniz “discovered” calculus in the sense that they introduced the general notions of derivative and integral, developed convenient notations for these two operations and established that they are inverses of each other. However, an air of mystery still surrounded the use of infinitesimal quantities in the works of these pioneers. In his 1734 essay The Analyst, Bishop Berkeley called infinitesimals the “ghosts of departed quantities”, and ridiculed the mathematicians of his time by saying that they were “men accustomed rather to compute, than to think.” Meanwhile, calculus continued to provide spectacularly successful models in physics, such as the wave equation for oscillatory motion. These successes spurred mathematicians on to search for a rigorous definition of derivatives using limits, which was finally achieved by Augustin-Louis Cauchy in 1823.
The work of Cauchy and later mathematicians, particularly Karl Weierstrass (1815–1897), laid to rest the controversy about the foundations of calculus. Mathematicians could now treat derivatives in a purely algebraic way without feeling concerned about the treacherous computation of limits. To be more precise, the calculus of derivatives could now be reduced to two sets of rules—one for computing derivatives of individual functions such as Sin, and another for finding derivatives of sums, products, compositions, etc. of these functions. It is this algebraic approach to derivatives that is implemented in D and allows us to directly calculate the derivative of Sin with a single line of input, as shown here.
Starting from the derivative of a function, one can compute derivatives of higher orders to gain further insight into the physical phenomenon described by the function. For example, suppose that the position s(t) of a particle moving along a straight line at time t is defined as follows.
Then, the velocity and the acceleration of the particle are given by its first and second derivatives, respectively. The higher derivatives too can be computed easily using D; they also have special names, which can be seen in the following computation.
Let us now return to our original example and compute the first four derivatives of Sin.
There is a clear pattern in the table, namely that each derivative may be obtained by adding a multiple of 𝜋/2 to x, as shown here.
In Version 11.1, D returns exactly this formula for the nth derivative of Sin.
An immediate application of this closed form is computing higher-order derivatives with blinding speed. D itself uses this method in Version 11.1 to compute the billionth derivative of Sin in a flash.
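In plain Python terms (a sketch, not D's implementation), the closed form sin(x + n·π/2) makes the order n essentially free; reducing n modulo 4 is just a detail of this sketch that keeps the floating-point argument small:

```python
import math

# Closed form from the text: the nth derivative of sin(x) is sin(x + n*pi/2).
# The pattern repeats with period 4, so reduce n mod 4 before evaluating.
def nth_derivative_of_sin(n, x):
    return math.sin(x + (n % 4) * math.pi / 2)

x = 0.7
print(nth_derivative_of_sin(1, x))      # equals cos(x)
print(nth_derivative_of_sin(4, x))      # back to sin(x)
print(nth_derivative_of_sin(10**9, x))  # the "billionth" derivative, instantly
```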
The Wolfram Language has a rich variety of mathematical functions, from elementary functions such as Power to advanced special functions such as EllipticE. The nth derivatives of many of these functions can be computed in closed form using D in Version 11.1. The following table captures the beauty and complexity of these formulas, each of which encodes all the information required to compute higher derivatives of a given function.
Some of the entries in the table are rather simple. For example, the first entry states that all the derivatives of the exponential function are equal to the function itself, which generalizes the following result from basic calculus.
In sharp contrast to that, the nth derivative of ArcTan is given by a formidable expression involving HypergeometricPFQRegularized.
If we now give specific values to n in that formula, we obtain elementary answers from the first few derivatives.
These answers agree with the ones obtained by applying D separately for each derivative and simplifying the results.
The familiar sum, product and chain rules of calculus generalize very nicely to the case of nth derivatives. The sum rule is the easiest, and simply states that the nth derivative of a sum is the sum of the nth derivatives.
The product rule, or the so-called Leibniz rule, gives an answer that is essentially a binomial expansion, expressed as a sum wrapped in Inactive to prevent evaluation.
We can recover the product rule from a first course on derivatives simply by setting n=1 and applying Activate to evaluate the resulting inert expression.
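The same recovery can be checked numerically. Below is a Python sketch of the Leibniz rule for f = exp and g = sin, whose kth derivatives are known in closed form; setting n = 1 reproduces the ordinary product rule for e^x sin x. (This is an independent illustration, not the Wolfram Language's Inactive/Activate mechanism.)

```python
import math

# Leibniz rule: the nth derivative of f*g is
#     sum over k of C(n, k) * f^(k) * g^(n-k).
# Each factor is represented by a function giving its kth derivative at x.
def leibniz_nth_derivative(df, dg, n, x):
    return sum(math.comb(n, k) * df(k, x) * dg(n - k, x)
               for k in range(n + 1))

d_exp = lambda k, x: math.exp(x)                    # every derivative of e^x
d_sin = lambda k, x: math.sin(x + k * math.pi / 2)  # kth derivative of sin x

x = 0.3
# n = 1 is the familiar product rule: (e^x sin x)' = e^x (sin x + cos x)
print(leibniz_nth_derivative(d_exp, d_sin, 1, x))
print(math.exp(x) * (math.sin(x) + math.cos(x)))    # same value
```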
Finally, there is a form of the chain rule due to the pious Italian priest Francesco Faà di Bruno (1825–1888). This is given by a rather messy expression in terms of BellY, and states that:
Once again, it is easy to recover the chain rule for first derivatives by setting n=1 as we did earlier.
The special functions in the Wolfram Language typically occur in families, with different members of each family labeled by integers or other parameters. For example, there is one function BesselJ[n,z] for each integer n. The first four members of this family are pictured below (the sinusoidal character of Bessel functions helps in the modeling of circular membranes).
It turns out that the derivatives of BesselJ[n,z] can be expressed in terms of other Bessel functions from the same family. While earlier versions did make some use of these relationships, Version 11.1 exploits them more fully to return compact answers for examples such as the following, which generated 2^10 = 1024 instances of BesselJ in earlier releases!
The functions considered so far are differentiable in the sense that they have derivatives for all values of the variable. The absolute value function provides a standard example of a non-differentiable function, since it does not have a derivative at the origin. However, the built-in Abs function is defined for complex arguments, and as a function of a complex variable it has a derivative nowhere. Version 11.1 overcomes this limitation by introducing RealAbs, which agrees with Abs for real values, as seen in the following plot.
This function has a derivative at all values except the origin; the derivative is given by:
The introduction of RealAbs is sure to be welcomed by users who have long requested such a function for use in differential equations and other applications.
This real absolute value function is continuous and only mildly non-differentiable, but in 1872, Karl Weierstrass stunned the mathematical world by introducing a fractal function that is continuous at every point but differentiable nowhere. Version 11.1 introduces several fractal curves of this type, which are named after their discoverers. Approximations for a few of these curves are pictured here.
Albert Einstein’s 1916 paper announcing the general theory of relativity provided a great impetus to the development of calculus. In this landmark paper, he made systematic use of the tensor calculus developed by Gregorio Ricci (1853–1925) and his student Tullio Levi-Civita (1873–1941) to formulate a theory of the gravitational field, which has now received superb confirmation through the detection of gravitational waves. The KroneckerDelta tensor, named after Leopold Kronecker and represented by the Greek letter δ, plays a key role in tensor calculus.
The importance of KroneckerDelta lies in the fact that it allows us to “sift” a tensor and isolate individual terms from it with ease. In order to understand this idea, let us obtain the definition of this tensor by applying PiecewiseExpand to it.
From the above, we see that KroneckerDelta[i, j] is 1 if its components i and j are equal, and is equal to 0 otherwise. As a result, it allows us to sift through all the terms in the following sum and select, say, the third term f(3) from it.
In Version 11.1, D makes use of this property of KroneckerDelta to differentiate finite sums with symbolic upper limits with respect to an indexed variable x(j), as illustrated here.
The last result expresses the fact that only the jth term in the derivative is nonzero, since none of the other terms depend on x(j), and hence their derivatives with respect to this variable are 0. For example, if we set n=5 and j=2, then the sum reduces to the single term f′(x(2)).
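The sifting behavior that powers this result is easy to mimic directly. A minimal Python sketch, with a made-up stand-in for the symbolic f:

```python
# KroneckerDelta[i, j] is 1 when i == j and 0 otherwise, so multiplying by
# it and summing "sifts" a single term out of a sum:
#     sum over i of delta(i, j) * f(i)  ==  f(j)
def kronecker_delta(i, j):
    return 1 if i == j else 0

def sift(f, j, n):
    """Sum of kronecker_delta(i, j) * f(i) for i = 1..n."""
    return sum(kronecker_delta(i, j) * f(i) for i in range(1, n + 1))

f = lambda i: 10 * i   # hypothetical stand-in for the symbolic f
print(sift(f, 3, 5))   # selects the third term, f(3) = 30
```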
Along with the improvements for the functionality of D, Version 11.1 also includes a major documentation update for this important function. In particular, the reference page now includes many application examples of the types encountered in a typical college calculus course. These examples are based on a large collection of more than 5,000 textbook exercises that were solved by a group of talented interns using the Wolfram Language during the summer of 2016. Some of the graphics from these examples are shown here. You can click anywhere inside each of the three following graphics to view their corresponding examples in the online documentation.
D is a venerable function that has been available since Version 1.0 (1988). We hope that the enhancements for this function in Version 11.1 will make it even more appealing to users at all levels. Any comments or feedback about the new features are very welcome.
Exoplanets are currently an active area of research in astronomy. In the past few years, the number of exoplanet discoveries has exploded, mainly as the result of the Kepler mission to survey eclipsing exoplanet systems. But Kepler isn’t the only exoplanet study mission going on. For example, the TRAnsiting Planets and PlanetesImals Small Telescope (TRAPPIST) studies its own set of targets. In fact, the media recently focused on an exoplanet system orbiting an obscure star known as TRAPPIST-1. As an introduction to exoplanet systems, we’ll explore TRAPPIST-1 and its system of exoplanets using the Wolfram Language.
To familiarize yourself with the TRAPPIST-1 system, it helps to start with the host star itself, TRAPPIST-1. Imagine placing the Sun, TRAPPIST-1 and Jupiter alongside one another on a table. How would their sizes compare? The following provides a nice piece of eye candy that lets you see how small TRAPPIST-1 is compared to our Sun. It’s actually only a bit bigger than Jupiter.
Although its diameter looks to be about the same as Jupiter’s, its mass is quite different: about 80 times the mass of Jupiter.
And it has only about 8% of the Sun’s mass.
TRAPPIST-1 is thus a very low-mass star, at the very edge of the main sequence, but it still sustains the usual hydrogen fusion in its core.
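A quick arithmetic sanity check ties the two quoted figures together. The Sun-to-Jupiter mass ratio of roughly 1048 is an outside approximate value, not from the post:

```python
# The Sun is roughly 1048 Jupiter masses (approximate value, assumed here),
# so a star of ~80 Jupiter masses is indeed roughly 8% of a solar mass.
SUN_IN_JUPITER_MASSES = 1048
TRAPPIST1_IN_JUPITER_MASSES = 80

fraction_of_sun = TRAPPIST1_IN_JUPITER_MASSES / SUN_IN_JUPITER_MASSES
print(f"{fraction_of_sun:.1%}")  # about 8% of the Sun's mass
```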
The exoplanets in this system are what actually attracted all of the media attention. All of the exoplanets (blue orbits) found in the TRAPPIST-1 system so far orbit the star at distances that would be far inside the orbit of Mercury (in green), if they were in our solar system.
To study the planets in this system more quantitatively, it is useful to take a look at their orbital periods, which lie very close together. Planets in such close proximity can often perturb one another, which can result in planets being ejected from the system, unless orbital resonances ensure that the planets are never in the wrong place at the wrong time. It’s easy to look up the orbital periods of the TRAPPIST-1 planets.
Divide them all by the orbital period of the first exoplanet to look for orbital resonances, as indicated by ratios close to rational fractions.
These show near resonances with the following ratios.
TRAPPIST-1 h has an inaccurately known orbital period, so it’s not clear whether it participates in any resonances.
Similarly, nearest-neighbor orbital period ratios show resonances.
Which are close to:
An orbital resonance of 3/2 means that one of the planets orbits 3 times for every 2 of the other. Pluto and Neptune in our solar system exhibit a near 3:2 orbital resonance.
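The ratio hunt itself takes only a few lines. The periods below are rounded published values for TRAPPIST-1 b through h (treat them as approximate assumptions), and Fraction.limit_denominator picks the nearest small rational for each neighboring pair:

```python
from fractions import Fraction

# Approximate orbital periods in days for TRAPPIST-1 b through h
# (rounded values, assumed here for illustration).
periods = [1.511, 2.422, 4.050, 6.100, 9.207, 12.353, 18.767]

# Nearest-neighbor period ratios and the closest small rational fraction
for inner, outer in zip(periods, periods[1:]):
    ratio = outer / inner
    print(f"{ratio:.3f}  ~  {Fraction(ratio).limit_denominator(6)}")
```

The ratios land near 8/5, 5/3, 3/2, 3/2, 4/3 and 3/2, consistent with the near resonances discussed above.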
This can help explain how so many planets can be packed into such a tight space without experiencing disruptive perturbations.
What about the distances of the exoplanets from their host star? If you placed TRAPPIST-1 and its planets alongside Jupiter and its four Galilean moons, how would they compare? The star and Jupiter are similar in size. The exoplanets are a bit larger than the moons (which are hard to see here) and they orbit a bit farther away, but the overall scales are of similar magnitude. In the following graphic, all distances are to scale, but we magnified the actual diameters of the planets and moons to make them easier to see.
The sizes of the planets can be compared to Jupiter’s four biggest moons for additional scale comparisons.
At the time of this writing, we have curated over 3,400 confirmed exoplanets:
Most of the confirmed exoplanets have been discovered since 2014, during missions such as Kepler.
You can query for data on individual exoplanets.
In addition, there are various classifications of exoplanets that can be queried.
You can perform an analysis to see when the exoplanets were discovered.
You can also do a systematic comparison of exoplanet parameters, which we limit here to the radius and density. We are only considering the entity class of super-Earths here. The red circle marks the approximate location of the TRAPPIST-1 system in this plot.
Here is another example of systematic comparison of exoplanet parameters by discovery method, indicated by color coding. Once again, the TRAPPIST-1 system is shown, with red dots at its mean values.
In addition to data specific to exoplanets, the Wolfram Language also includes data on chemical compounds present in planetary atmospheres.
You can use this data to show, for example, how the density of various atmospheric components changes with temperature.
As a more concrete application, the Wolfram Language also provides the tools needed to explore collections of raw data. For example, here we can import irregularly sampled stellar light-curve data directly from the NASA Exoplanet Archive for the HAT-P-7 exoplanet system.
Then we can remove data points that are not computable.
This raw data can be easily visualized, as shown here.
The following subsamples the data and does some additional post-processing to both the times and magnitudes.
Plotting the data shows evidence of eclipses, appearing as a smattering of points below the dense band of data.
A Fourier-like algorithm can identify periodicities in this data.
Zoom into the fundamental frequency, at higher resolution.
We find that the fundamental peak frequency occurs at 0.453571 cycles/day, and its reciprocal gives an estimate of the corresponding orbital period in days.
With some additional processing, we can apply a minimum string length (MSL) algorithm to the raw data to look for periodicities.
We can apply the MSL algorithm to a range of potential periods to try to find a value that minimizes the distance between neighboring points when the data is phase folded.
Clearly, the minimum string length occurs at about 2.20474 days, in close agreement with the Fourier estimate above.
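The MSL idea is compact enough to sketch in full on synthetic data. This is not the blog's actual implementation; the period, cadence and noise level below are made up to mimic a folded light curve:

```python
import math
import random

# Minimum string length (MSL): phase fold the data at a trial period, sort
# by phase, and total the distances between neighboring points.  The true
# period makes the folded curve tight, minimizing the total "string length".
def string_length(times, mags, period):
    pts = sorted(((t % period) / period, m) for t, m in zip(times, mags))
    return sum(math.hypot(p2 - p1, m2 - m1)
               for (p1, m1), (p2, m2) in zip(pts, pts[1:]))

# Synthetic light curve with a known 2.2-day period (a stand-in for HAT-P-7)
random.seed(0)
TRUE_PERIOD = 2.2
times = [random.uniform(0, 50) for _ in range(400)]
mags = [math.sin(2 * math.pi * t / TRUE_PERIOD) + random.gauss(0, 0.05)
        for t in times]

trial_periods = [1.8 + 0.005 * k for k in range(161)]   # 1.8 to 2.6 days
best = min(trial_periods, key=lambda p: string_length(times, mags, p))
print(best)  # lands at (or very near) 2.2
```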
We can also validate this derived value with the value stored in the header of the original data.
This orbital period corresponds with that of exoplanet HAT-P-7b, as can be seen in the Wolfram curated data collection (complete with units and precision).
From the known orbital period, we can phase fold the original dataset, overlapping the separate eclipses, to obtain a more complete picture of the exoplanet eclipse.
Noise can be reduced by carrying out a phase-binning technique. All data points are placed into bins of width 0.0005 days, and the mean of the values in each bin is determined.
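Phase binning itself reduces to a few lines. The 0.0005-day bin width matches the post; the data points here are a toy stand-in for the folded light curve:

```python
from collections import defaultdict

# Place phase-folded points into fixed-width bins and average the
# magnitudes in each bin, suppressing point-to-point noise.
def phase_bin(points, width):
    bins = defaultdict(list)
    for phase, mag in points:
        bins[int(phase / width)].append(mag)
    return sorted((i * width + width / 2, sum(v) / len(v))
                  for i, v in bins.items())

# Toy stand-in data: (phase, magnitude) pairs
data = [(0.00012, 10.1), (0.00031, 10.3), (0.00071, 10.0), (0.00089, 10.2)]
for center, mean_mag in phase_bin(data, 0.0005):
    print(center, mean_mag)
```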
This graphic, mainly for purposes of visualization, shows the host star, HAT-P-7, with its exoplanet HAT-P-7b orbiting it. All parameters, including diameter and orbital radius, are to scale. The idea is to try to reproduce the brightness variations seen in the observed light curve. For this graphic, the GrayLevel of the exoplanet is set to GrayLevel[1], which enables you to more clearly see the exoplanet go through phases as it orbits the host star.
Now we can do an analogous thing, generating a list of frames instead of a static graphic. In this case, the GrayLevel of the exoplanet is much reduced, as compared to the animation above. For purposes of illustration and to reduce computation time, a small set of values has been chosen around the primary eclipse.
Now, to measure how the brightness of the scene changes, we can use image processing to total all of the pixel values. The scene is rasterized at a large image size so that edge artifacts, which can otherwise have measurable effects on the resulting light curve, are minimized; as a result, this code takes a minute or so to run.
Next, we rescale all of the pixel counts to fit in the same vertical range as the observed light curve.
Now compare the model data to the actual data. The red points show the model data computed at a few orbital phases around the primary eclipse.
A more detailed model light curve can be constructed if you increase the computation time. The version above was done for speed. Of course, additional secondary effects can be included, such as the possibility of gravity darkening and other effects that cause brightness variations across the face of the star. Such secondary effects are beyond the scope of this blog.
Other star systems can be far more complicated and provide their own unique challenges to understanding their dynamical behavior. The Wolfram Language provides a powerful tool that allows you to explore the subtleties of stellar light curve analysis as well as the periodicities in irregularly sampled data. It would be interesting to see some of these more complicated systems tackled in similar ways to what we’ve done in this blog.