Between October 1787 and April 1788, a series of essays was published under the pseudonym “Publius.” Altogether, 77 appeared in four New York City periodicals, and a collection containing these and eight more soon appeared in book form as *The Federalist*. Since the twentieth century, they have been known collectively as *The Federalist Papers*. The aim of these essays, in brief, was to explain the proposed Constitution and to sway the citizens of the day in favor of its ratification. The authors were Alexander Hamilton, James Madison and John Jay.

On July 11, 1804, Alexander Hamilton was mortally wounded by Aaron Burr in a duel beneath the New Jersey Palisades in Weehawken (a town better known in modern times for its tunnel to Manhattan). Hamilton died the next day. Soon after, a list he had drafted became public, claiming authorship of more than sixty essays. James Madison publicized his claims to authorship only after his term as president had come to an end, many years after Hamilton’s death. Their lists overlapped: essays 49–58 and 62–63 were claimed by both men. Three essays were claimed by each to have been collaborative works, and essays 2–5 and 64 were written by Jay (an intervening illness being the cause of the gap). Herein we refer to the 12 claimed by both men as “the disputed essays.”

Debate over this authorship, among historians and others, ensued for well over a century. In 1944 Douglass Adair published “The Authorship of the Disputed *Federalist Papers*,” wherein he proposed that Madison had been the author of all 12. It was not until 1963, however, that a statistical analysis was performed. In “Inference in an Authorship Problem,” Frederick Mosteller and David Wallace concurred that Madison had indeed been the author of all of them. An excellent account of their work, written much later, is Mosteller’s “Who Wrote the Disputed *Federalist Papers*, Hamilton or Madison?” His work on this had its beginnings also in the 1940s, but it was not until the era of “modern” computers that the statistical computations needed could realistically be carried out.

Since that time, numerous analyses have appeared, and most tend to corroborate this finding. Indeed, it has become something of a standard for testing authorship attribution methodologies. I recently had occasion to delve into it myself. Using this technology, developed in the Wolfram Language, I will show results for the disputed essays that are mostly in agreement with this consensus opinion. Not entirely so, however—there is always room for surprise. Brief background: in early 2017 I convinced Catalin Stoean, a coauthor from a different project, to work with me in developing an authorship attribution method based on the Frequency Chaos Game Representation (FCGR) and machine learning. Our paper “Text Documents Encoding through Images for Authorship Attribution” was recently published, and will be presented at SLSP 2018. The method outlined in this blog post comes from that work.

The idea that rigorous, statistical analysis of text might be brought to bear on determination of authorship goes back at least to Thomas Mendenhall’s “The Characteristic Curves of Composition” in 1887 (earlier work along these lines had been done, but it tended to be less formal in nature). The methods originally used mostly involved comparisons of various statistics, such as frequencies for sentence or word length (the latter in both character and syllable counts), frequency of usage of certain words and the like. Such measures can be used because different authors tend to show distinct characteristics when assessed over many such statistics. The difficulty encountered with the disputed essays was that, by the measures then in use, the authors were in agreement to a remarkable extent. More refined measures were needed.

Modern approaches to authorship attribution are collectively known as “stylometry.” Most approaches fall into one or more of the following categories: lexical characteristics (e.g. word frequencies, character attributes such as *n*-gram frequencies, usage of white space), syntax (e.g. structure of sentences, usage of punctuation) and semantic features (e.g. use of certain uncommon words, relative frequencies of members of synonym families).

Among advantages enjoyed by modern approaches, there is the ready availability on the internet of large corpora, and the increasing availability (and improvement) of powerful machine learning capabilities. In terms of corpora, one can find all manner of texts, newspaper and magazine articles, technical articles and more. As for machine learning, recent breakthroughs in image recognition, speech translation, virtual assistant technology and the like all showcase some of the capabilities in this realm. The past two decades have seen an explosion in the use of machine learning (dating to before that term came into vogue) in the area of authorship attribution.

A typical workflow will involve reading in a corpus, programmatically preprocessing to group by words or sentences, then gathering various statistics. These are converted into a format, such as numeric vectors, that can be used to train a machine learning classifier. One then takes text of known or unknown authorship (for purposes of validation or testing, respectively) and performs similar preprocessing. The resulting vectors are classified by the result of the training step.
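To make this workflow concrete, here is a minimal Python sketch (the pipeline in this blog is in the Wolfram Language): it turns text into a vector of marker-word frequencies and classifies with a nearest-centroid rule. The marker list and the classifier are illustrative stand-ins, not the method described below.

```python
from collections import Counter

# Hypothetical marker words; real stylometric studies use larger,
# carefully chosen sets of function words.
MARKERS = ["upon", "whilst", "while", "on", "by", "to"]

def feature_vector(text):
    """Frequencies (per 1000 words) of each marker word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    n = max(len(words), 1)
    return [1000 * counts[w] / n for w in MARKERS]

def nearest_centroid(train, query):
    """train: {author: [feature vectors]}; classify query by the closest
    per-author mean vector (squared Euclidean distance)."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    centroids = {a: [sum(c) / len(vs) for c in zip(*vs)]
                 for a, vs in train.items()}
    return min(centroids, key=lambda a: dist2(centroids[a], query))
```

Training on texts of known authorship and querying with held-out text mirrors the validation/testing distinction just described.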

We will return to this after a brief foray to describe a method for visualizing DNA sequences.

Nearly thirty years ago, H. J. Jeffrey introduced a method of visualizing long DNA sequences in “Chaos Game Representation of Gene Structure.” In brief, one labels the four corners of a square with the four DNA nucleotide bases. Given a sequence of nucleotides, one starts at the center of this square and places a dot halfway from the current spot to the corner labeled with the next nucleotide in the sequence. One continues placing dots in this manner until the end of a sequence of nucleotides is reached. This in effect makes nucleotide strings into instruction sets, akin to punched cards in mechanized looms.
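The placement rule is simple enough to sketch in a few lines of Python (a language-neutral illustration; the corner labeling shown is the conventional one):

```python
# Chaos game representation: the four corners of the unit square are
# labeled with the four nucleotide bases; each base in the sequence moves
# the current point halfway toward its corner.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(sequence):
    x, y = 0.5, 0.5                      # start at the center of the square
    points = []
    for base in sequence:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points
```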

One common computational approach is slightly different. It is convenient to select a level of pixelation, such that the final result is a rasterized image. The actual details go by the name of the Frequency Chaos Game Representation, or FCGR for short. In brief, a square image space is divided into discrete boxes. The gray level in the resulting image of each such pixelized box is based on how many points from chaos game representation (CGR) land in it.
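Continuing the Python sketch, the FCGR step simply bins the CGR points into a grid of counts; a real implementation would then rescale those counts to gray levels.

```python
def fcgr(points, k=3):
    """Frequency CGR: bin CGR points into a 2**k-by-2**k grid of counts.
    Each cell's count would set the gray level of that pixel."""
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    for x, y in points:
        row = min(int(y * n), n - 1)     # clamp the boundary y == 1.0
        col = min(int(x * n), n - 1)
        grid[row][col] += 1
    return grid
```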

Following are images thus created from nucleotide sequences of six different species (cribbed from the author’s “Linking Fourier and PCA Methods for Image Look‐Up”). This has also appeared on Wolfram Community.

It turns out that such images do not tend to vary much from others created from the same nucleotide sequence. For example, the previous images were created from the initial subsequences of length 150,000 from their respective chromosomes. Corresponding images from the final subsequences of corresponding chromosomes are shown here:

As is noted in the referenced article, dimension-reduction methods can now be used on such images, for the purpose of creating a “nearest image” lookup capability. This can be useful, say, for quick identification of the approximate biological family a given nucleotide sequence belongs to. More refined methods can then be brought to bear to obtain a full classification. (It is not known whether image lookup based on FCGR images is alone sufficient for full identification—to the best of my knowledge, it has not been attempted on large sets containing closer neighbor species than the six shown in this section). It perhaps should go without saying (but I’ll note anyway) that even without any processing, the Wolfram Language function `Nearest` will readily determine which images from the second set correspond to similar images from the first.

A key aspect of CGR is that it uses an alphabet of length four. This is responsible for a certain fractal effect, in that blocks from each quadrant tend to be approximately repeated in nested subblocks of the corresponding nested subquadrants. To adapt text to this scheme, it was convenient to use an alphabet whose size is a power of four, so that each character corresponds to multiple base 4 digits. Some experiments indicated that an alphabet of length 16 would work well. Since there are 26 characters in the English version of the Latin alphabet, as well as punctuation, numeric characters, white space and more, some amount of merging was done, with the general idea that “similar” characters could go into the same overall class. For example, we have one class comprising {c,k,q,x,z}, another of {b,d,p} and so on. This brought the modified alphabet to 16 characters. Written in base 4, the 16 class indices give all possible pairs of base 4 digits. The string of base 4 digits thus produced is then used to produce an image from text.
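Here is a hedged Python sketch of the encoding step. The 16 character classes shown are hypothetical placeholders (apart from the {c,k,q,x,z} and {b,d,p} classes mentioned above, the paper's exact merging may differ); the point is the mechanics of mapping class indices to base 4 digit pairs.

```python
# Sixteen hypothetical character classes (indices 0-15); only the
# {c,k,q,x,z} and {b,d,p} classes come from the text above.
CLASSES = [
    "aeiou", "ckqxz", "bdp", "fv", "gj", "lr", "mn", "st",
    "hwy", "0123456789", ".,;:", "!?", "'\"", "-()",
    " \t\n",                     # white space
    "",                          # catch-all for any other character
]

def to_base4_digits(text):
    """Map each character to its class index (0-15), written as a pair
    of base-4 digits; the digit string then drives the chaos game."""
    lookup = {ch: i for i, cls in enumerate(CLASSES) for ch in cls}
    digits = []
    for ch in text.lower():
        idx = lookup.get(ch, 15)         # unknown characters -> catch-all
        digits.extend([idx // 4, idx % 4])
    return digits
```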

For relatively short texts, up to a few thousand characters, say, we simply create one image. Longer texts we break into chunks of some specified size (typically in the range of 2,000–10,000 characters) and make an image for each such chunk. Using `ExampleData["Text"]` from the Wolfram Language, we show the result for the first and last chunks from *Alice in Wonderland* and *Pride and Prejudice*, respectively:
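The chunking itself is trivial; a minimal Python version (dropping any short final remainder, which is one of several reasonable conventions):

```python
def chunks(text, size=5000):
    """Split text into consecutive chunks of `size` characters,
    dropping any short final remainder."""
    return [text[i:i + size] for i in range(0, len(text) - size + 1, size)]
```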

While there is not so much for the human eye to discern between these pairs, machine learning does quite well in this area.

The paper with Stoean provides details for the methodology that proved best among the variations we tried. We use it to create one-dimensional vectors from the two-dimensional image arrays; apply a common dimension reduction via the singular value decomposition to make the sizes manageable; and feed the training data, thus vectorized, into a simple neural network. The result is a classifier that can then be applied to images from text of unknown authorship.
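The SVD-based reduction can be sketched with NumPy (a generic stand-in; the blog's own code uses the Wolfram Language's `SingularValueDecomposition`): project the vectorized images onto the top-k right singular vectors of the training matrix, and reuse the same basis for any later data.

```python
import numpy as np

def reduce_dim(train_matrix, k):
    """Project rows (flattened images) onto the top-k right singular
    vectors of the training matrix; return reduced data and the basis."""
    _, _, vt = np.linalg.svd(train_matrix, full_matrices=False)
    basis = vt[:k].T                     # shape: (features, k)
    return train_matrix @ basis, basis

# Validation or test rows are projected with the same basis:
#   reduced_test = test_matrix @ basis
```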

While there are several moving parts, so to speak, the breadth of the Wolfram Language makes this actually fairly straightforward. The main tools are indicated as follows:

1. `Import` to read in data.

2. `StringDrop`, `StringReplace` and similar string manipulation functions, used for removing initial sections (as they often contain identifying information) and to do other basic preprocessing.

3. Simple replacement rules to go from text to base 4 strings.

4. Simple code to implement FCGR, such as can be found in the Community forum.

5. Dimension reduction using `SingularValueDecomposition`. Code for this is straightforward, and one version can be found in “Linking Fourier and PCA Methods for Image Look‐Up.”

6. Machine learning functionality, at a fairly basic level (which is the limit of what I can handle). The functions I use are `NetChain` and `NetTrain`, and both work with a simple neural net.

7. Basic statistics functions such as `Total`, `Sort` and `Tally` are useful for assessing results.

Common practice in this area is to show results of a methodology on one or more sets of standard benchmarks. We used three such sets in the referenced paper. Two come from Reuters articles in the realm of corporate/industrial news. One is known as Reuters_50_50 (also called CCAT50). It has fifty authors represented, each with 50 articles for training and 50 for testing. Another is a subset of this, comprising 50 training and 50 testing articles from ten of the fifty authors. One might think that using both sets entails a certain level of redundancy, but, perhaps surprisingly, past methods that perform very well on either of these tend not to do quite so well on the other. We also used a more recent set of articles, this time in Portuguese, from Brazilian newspapers. The only change to the methodology this necessitated involved character substitutions to handle e.g. the “*c*‐with‐cedilla” character *ç*.

Results of this approach were quite strong. As best we could determine from prior literature, our scores equaled or exceeded past top scores on all three datasets. Since that time, we have applied the method to two other commonly used examples. One is a corpus comprising IMDb reviews from 62 prolific reviewers. This time we were not the top performer, but came in close behind two other methods. Each of those was actually a “hybrid” combining weighted scores from several submethods. (Anecdotally, our method seems to make different mistakes from others, at least in examples we have investigated closely. This makes it a sound candidate for adoption in hybridized approaches.) As for the other new test, well, that takes us to the next section.

We now return to *The Federalist Papers*. The first step, of course, is to convert the text to images. We show a few here, created from first and last chunks from two essays. The ones on the top are from Federalist No. 33 (Hamilton) while those on the bottom are from Federalist No. 44 (Madison). Not surprisingly, they are not different in the obvious ways that the genome‐based images were different:

Before attempting to classify the disputed essays, it is important to ascertain that the methodology is sound. This requires a validation step. We proceed as follows: we begin with those essays known to have been written by either Hamilton or Madison (we discard the three they coauthored, because they do not contain sufficient data to use). We hold back three entire essays from those written by Madison, and eight from the set by Hamilton (in approximate proportion to the relative number each penned). These withheld essays will be our first validation set. We also withhold the final chunk from each of the 54 essays that remain, to be used as a second validation set. (This two‐pronged validation appears to be more than is used elsewhere in the literature. We like to think we have been diligent.)
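As a sketch of this two-pronged split, here is a hypothetical Python helper (the 3-and-8 essay counts come from the text; the function itself is illustrative, not the blog's code):

```python
import random

def split_corpus(essays_by_author, essays_held, seed=0):
    """Two-pronged hold-out. Per author: withhold whole essays for
    validation set 1; from each remaining essay, withhold the final
    chunk for validation set 2 and train on the rest.
    Each essay is represented as a list of text chunks."""
    rng = random.Random(seed)
    train, val1, val2 = [], [], []
    for author, essays in essays_by_author.items():
        essays = list(essays)
        rng.shuffle(essays)
        k = essays_held.get(author, 0)
        for essay in essays[:k]:                       # whole withheld essays
            val1.extend((author, chunk) for chunk in essay)
        for essay in essays[k:]:
            val2.append((author, essay[-1]))           # final chunk only
            train.extend((author, chunk) for chunk in essay[:-1])
    return train, val1, val2
```

For the corpus at hand one would call this with `essays_held={"Madison": 3, "Hamilton": 8}`.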

The results for the first validation set are perfect. Every one of the 70 chunks from the withheld essays is ascribed to its correct author. For the second set, two were erroneously ascribed. For most chunks, the winning score is around four to seven times that of the loser. For the two that were mistaken, these ratios dropped considerably, in one case to a factor of three and in the other to around 1.5. Overall, even with the two misses, these are extremely good results as compared to methods reported in past literature. I will remark that all processing, from importing the essays through classifying all chunks, takes less than half a minute on my desktop machine (with the bulk of that occupied in multiple training runs of the neural network classifier).

In order to avail ourselves of the full corpus of training data, we next merge the validation chunks into the training set and retrain. When we run the classifier on chunks from the disputed essays, things are mostly in accordance with prior conclusions. Except…

The first ten essays go strongly to Madison. Indeed, every chunk therein is ascribed to him. The last two go to Hamilton, albeit far less convincingly. A typical aggregated score for one of the convincing outcomes might be approximately 35:5 favoring Madison, whereas for the last two that go to Hamilton the scores are 34:16 and 42:27, respectively. A look at the chunk level suggests a perhaps more interesting interpretation. Essay 62, the next‐to‐last, has the five-chunk score pairs shown here (first is Hamilton’s score, then Madison’s):

Three are fairly strongly in favor of Hamilton as author (one of which could be classified as overwhelmingly so). The second and fourth are quite close, suggesting that despite the ability to do solid validation, these might be too close to call (or might be written by one and edited by the other).

The results from the final disputed essay are even more stark:

The first four chunks go strongly to Hamilton. The next two go strongly to Madison. The last also favors Madison, albeit weakly. This would suggest again a collaborative effort, with Hamilton writing the first part, Madison roughly a third and perhaps both working on the final paragraphs.

The reader is reminded that this result comes from but one method. In its favor is that it performs extremely well on established benchmarks, and also in the validation step for the corpus at hand. On the counter side, many other approaches, over a span of decades, all point to a different outcome. That stated, we can mention that most (or perhaps all) prior work has not been at the level of chunks, and that granularity can give a better outcome in cases where different authors work on different sections. While these discrepancies with established consensus are of course not definitive, they might serve to prod new work on this very old topic. At the least, other methods might be deployed at the granularity of the chunk level we used (or similar, perhaps based on paragraphs), to see if parts of essays 62 and 63 then show indications of Hamilton authorship.

*To two daughters of Weehawken. My wonderful mother‐in‐law, Marie Wynne, was a library clerk during her working years. My cousin Sharon Perlman (1953–2016) was a physician and advocate for children, highly regarded by peers and patients in her field of pediatric nephrology. Her memory is a blessing.*

On June 23 we celebrate the 30th anniversary of the launch of Mathematica. Most software from 30 years ago is now long gone. But not Mathematica. In fact, it feels in many ways like even after 30 years, we’re really just getting started. Our mission has always been a big one: to make the world as computable as possible, and to add a layer of computational intelligence to everything.

Our first big application area was math (hence the name “Mathematica”). And we’ve kept pushing the frontiers of what’s possible with math. But over the past 30 years, we’ve been able to build on the framework that we defined in Mathematica 1.0 to create the whole edifice of computational capabilities that we now call the Wolfram Language—and that corresponds to Mathematica as it is today.

From when I first began to design Mathematica, my goal was to create a system that would stand the test of time, and would provide the foundation to fill out my vision for the future of computation. It’s exciting to see how well it’s all worked out. My original core concepts of language design continue to infuse everything we do. And over the years we’ve been able to just keep building and building on what’s already there, to create a taller and taller tower of carefully integrated capabilities.

It’s fun today to launch Mathematica 1.0 on an old computer, and compare it with today:

Yes, even in Version 1, there’s a recognizable Wolfram Notebook to be seen. But what about the Mathematica code (or, as we would call it today, Wolfram Language code)? Well, the code that ran in 1988 just runs today, exactly the same! And, actually, I routinely take code I wrote at any time over the past 30 years and just run it.

Of course, it’s taken a lot of long-term discipline in language design to make this work. And without the strength and clarity of the original design it would never have been possible. But it’s nice to see that all that daily effort I’ve put into leadership and consistent language design has paid off so well in long-term stability over the course of 30 years.

Back in 1988, Mathematica was a big step forward in high-level computing, and people were amazed at how much it could do. But it’s absolutely nothing compared to what Mathematica and the Wolfram Language can do today. And as one way to see this, here’s how the different major areas of functionality have “lit up” between 1988 and today:

There were 551 built-in functions in 1988; there are now more than 5100. And the expectations for each function have vastly increased too. The concept of “superfunctions” that automate a swath of algorithmic capability already existed in 1988—but their capabilities pale in comparison to our modern superfunctions.

Back in 1988 the core ideas of symbolic expressions and symbolic programming were already there, working essentially as they do today. And there were also all sorts of functions related to mathematical computation, as well as to things like basic visualization. But in subsequent years we were able to conquer area after area.

Partly it’s been the growth of raw computer power that’s made new areas possible. And partly it’s been our ability to understand what could conceivably be done. But the most important thing has been that—through the integrated design of our system—we’ve been able to progressively build on what we’ve already done to reach one new area after another, at an accelerating pace. (Here’s a plot of function count by version.)

I recently found a to-do list I wrote in 1991—and I’m happy to say that now, in 2018, essentially everything on it has been successfully completed. But in many cases it took building a whole tower of capabilities—over a large number of years—to be able to achieve what I wanted.

From the very beginning—and even from projects of mine that preceded Mathematica—I had the goal of building as much knowledge as possible into the system. At the beginning the knowledge was mostly algorithmic, and formal. But as soon as we could routinely expect network connectivity to central servers, we started building in earnest what’s now our immense knowledgebase of computable data about the real world.

Back in 1988, I could document pretty much everything about Mathematica in the 750-page book I wrote. Today if we were to print out the online documentation it would take perhaps 36,000 pages. The core concepts of the system remain as simple and clear as they ever were, though—so it’s still perfectly possible to capture them even in a small book.

Thirty years is basically half the complete history of modern digital computing. And it’s remarkable—and very satisfying—that Mathematica and the Wolfram Language have had the strength not only to persist, but to retain their whole form and structure, across all that time.

Thirty years ago Mathematica (all 2.2 megabytes of it) came in boxes available at “neighborhood software stores”, and was distributed on collections of floppy disks (or, for larger computers, on various kinds of magnetic tapes). Today one just downloads it anytime (about 4 gigabytes), accessing its knowledgebase (many terabytes) online—or one just runs the whole system directly in the Wolfram Cloud, through a web browser. (In a curious footnote to history, the web was actually invented back in 1989 on a collection of NeXT computers that had been bought to run Mathematica.)

Thirty years ago there were “workstation class computers” that ran Mathematica, but were pretty much only owned by institutions. In 1988, PCs used MS-DOS, and were limited to 640K of working memory—which wasn’t enough to run Mathematica. The Mac could run Mathematica, but it was always a tight fit (“2.5 megabytes of memory required; 4 megabytes recommended”)—and in the footer of every notebook was a memory gauge that showed you how close you were to running out of memory. Oh, yes, and there were two versions of Mathematica, depending on whether or not your machine had a “numeric coprocessor” (which let it do floating-point arithmetic in hardware rather than in software).

Back in 1988, I had got my first cellphone—which was the size of a shoe. And the idea that something like Mathematica could “run on a phone” would have seemed preposterous. But here we are today with the Wolfram Cloud app on phones, and Wolfram Player running natively on iPads (and, yes, they don’t have virtual memory, so our tradition of tight memory management from back in the old days comes in very handy).

In 1988, computers that ran Mathematica were always things you plugged into a power outlet to use. And the notion of, for example, using Mathematica on a plane was basically inconceivable (well, OK, even in 1981 when I lugged my Osborne 1 computer running CP/M onto a plane, I did find one power outlet for it at the very back of a 747). It wasn’t until 1991 that I first proudly held up at a talk a Compaq laptop that was (creakily) running Mathematica off batteries—and it wasn’t routine to run Mathematica portably for perhaps another decade.

For years I used to use `1989^1989` as my test computation when I tried Mathematica on a new machine. And in 1989 I would usually be counting the seconds waiting for the computation to be finished. (`1988^1988` was usually too slow to be useful back in 1988: it could take minutes to return.) Today, of course, the same computation is instantaneous. (Actually, a few years ago, I did the computation again on the first Raspberry Pi computer—and it again took several seconds. But that was a $25 computer. And now even it runs the computation very fast.)

The increase in computer speed over the years has had not only quantitative but also qualitative effects on what we’ve been able to do. Back in 1988 one basically did a computation and then looked at the result. We talked about being able to interact with a Mathematica computation in real time (and there was actually a demo on the NeXT computer that did a simple case of this even in 1989). But it basically took 18 years before computers were routinely fast enough that we could implement `Manipulate` and `Dynamic`—with “Mathematica in the loop”.

I considered graphics and visualization an important feature of Mathematica from the very beginning. Back then there were “paint” (bitmap) programs, and there were “draw” (vector) programs. We made the decision to use the then-new PostScript language to represent all our graphics output resolution-independently.

We had all sorts of computational geometry challenges (think of all those little shattered polygons), but even back in 1988 we were able to generate resolution-independent 3D graphics, and in preparing for the original launch of Mathematica we found the “most complicated 3D graphic we could easily generate”, and ended up with the original icosahedral “spikey”—which has evolved today into our rhombic hexecontahedron logo:

In a sign of a bygone software era, the original Spikey also graced the elegant, but whimsical, Mathematica startup screen on the Mac:

Back in 1988, there were command-line interfaces (like the Unix shell), and there were word processors (like WordPerfect). But it was a new idea to have “notebooks” (as we called them) that mixed text, input and output—as well as graphics, which more usually were generated in a separate window or even on a separate screen.

Even in Mathematica 1.0, many of the familiar features of today’s Wolfram Notebooks were already present: cells, cell groups, style mechanisms, and more. There was even the same doubled-cell-bracket evaluation indicator—though in those days longer rendering times meant there needed to be more “entertainment”, which Mathematica provided in the form of a bouncing-string-figure wait cursor that was computed in real time during the vertical retrace interrupt associated with refreshing the CRT display.

In what would now be standard good software architecture, Mathematica from the very beginning was always divided into two parts: a kernel doing computations, and a front end supporting the notebook interface. The two parts communicated through the MathLink protocol (still used today, but now called WSTP) that in a very modern way basically sent symbolic expressions back and forth.

Back in 1988—with computers like Macs straining to run Mathematica—it was common to run the front end on a local desktop machine, and then have a “remote kernel” on a heftier machine. Sometimes that machine would be connected through Ethernet, or rarely through the internet. More often one would use a dialup connection, and, yes, there was a whole mechanism in Version 1.0 to support modems and phone dialing.

When we first built the notebook front end, we thought of it as a fairly thin wrapper around the kernel—that we’d be able to “dash off” for the different user interfaces of different computer systems. We built the front end first for the Mac, then (partly in parallel) for the NeXT. Within a couple of years we’d built separate codebases for the then-new Microsoft Windows, and for X Windows.

But as we polished the notebook front end it became more and more sophisticated. And so it was a great relief in 1996 when we managed to create a merged codebase that ran on all platforms.

And for more than 15 years this was how things worked. But then along came the cloud, and mobile. And now, out of necessity, we again have multiple notebook front end codebases. Maybe in a few years we’ll be able to merge them again. But it’s funny how the same issues keep cycling around as the decades go by.

Unlike the front end, we designed the kernel from the beginning to be as robustly portable as possible. And over the years it’s been ported to an amazing range of computers—very often as the first serious piece of application software that a new kind of computer runs.

From the earliest days of Mathematica development, there was always a raw command-line interface to the kernel. And it’s still there today. And what’s amazing to me is how often—in some new and unfamiliar situation—it’s really nice to have that raw interface available. Back in 1988, it could even make graphics—as ASCII art—but that’s not exactly in so much demand today. But still, the raw kernel interface is what for example wolframscript uses to provide programmatic access to the Wolfram Language.

There’s much of the earlier history of computing that’s disappearing. And it’s not so easy in practice to still run Mathematica 1.0. But after going through a few early Macs, I finally found one that still seemed to run well enough. We loaded up Mathematica 1.0 from its distribution floppies, and yes, it launched! (I guess the distribution floppies were made the week before the actual release on June 23, 1988; I vaguely remember a scramble to get the final disks copied.)

Needless to say, when I wanted to livestream this, the Mac stopped working, showing only a strange zebra pattern on its screen. Whacking the side of the computer (a typical 1980s remedy) didn’t do anything. But just as I was about to give up, the machine suddenly came to life, and there I was, about to run Mathematica 1.0 again.

I tried all sorts of things, creating a fairly long notebook. But then I wondered: just how compatible is this? So I saved the notebook on a floppy, and put it in a floppy drive (yes, you can still get those) on a modern computer. At first, the modern operating system didn’t know what to do with the notebook file.

But then I added our old “.ma” file extension, and opened it. And… oh my gosh… it just worked! The latest version of the Wolfram Language successfully read the 1988 notebook file format, and rendered the live notebook (and also created a nice, modern “.nb” version):

There’s a bit of funny spacing around the graphics, reflecting the old way that graphics had to be handled back in 1988. But if one just selects the cells in the notebook, and presses Shift + Enter, up comes a completely modern version, now with color outputs too!

Before Mathematica, sophisticated technical computing was at best the purview of a small “priesthood” of technical computing experts. But as soon as Mathematica appeared on the scene, this all changed—and suddenly a typical working scientist or mathematician could realistically expect to do serious computation with their own hands (and then to save or publish the results in notebooks).

Over the past 30 years, we’ve worked very hard to open progressively more areas to immediate computation. Often there’s great technical sophistication inside. But our goal is to be able to let people translate high-level computational thinking as directly and automatically as possible into actual computations.

The result has been incredibly powerful. And it’s a source of great satisfaction to see how much has been invented and discovered with Mathematica over the years—and how many of the world’s most productive innovators use Mathematica and the Wolfram Language.

But amazingly, even after all these years, I think the greatest strengths of Mathematica and the Wolfram Language are only just now beginning to become broadly evident.

Part of it has to do with the emerging realization of how important it is to systematically and coherently build knowledge into a system. And, yes, the Wolfram Language has been unique in all these years in doing this. And what this now means is that we have a huge tower of computational intelligence that can be immediately applied to anything.

To be fair, for many of the past 30 years, Mathematica and the Wolfram Language were primarily deployed as desktop software. But particularly with the increasing sophistication of the general computing ecosystem, we’ve been able in the past 5–10 years to build out extremely strong deployment channels that have now allowed Mathematica and the Wolfram Language to be used in an increasing range of important enterprise settings.

Mathematica and the Wolfram Language have long been standards in research, education and fields like quantitative finance. But now they’re in a position to bring the tower of computational intelligence that they embody to any area where computation is used.

Since the very beginning of Mathematica, we’ve been involved with what’s now called artificial intelligence (and in recent times we’ve been leaders in supporting modern machine learning). We’ve also been very deeply involved with data in all forms, and with what’s now called data science.

But what’s becoming clearer only now is just how critical the breadth of Mathematica and the Wolfram Language is to allowing data science and artificial intelligence to achieve their potential. And of course it’s satisfying to see that all those capabilities that we’ve built over the past 30 years—and all the design coherence that we’ve worked so hard to maintain—are now so important in areas like these.

The concept of computation is surely the single most important intellectual development of the past century. And it’s been my goal with Mathematica and the Wolfram Language to provide the best possible vehicle to infuse high-level computation into every conceivable domain.

For pretty much every field X (from art to zoology) there either is now, or soon will be, a “computational X” that defines the future of the field by using the paradigm of computation. And it’s exciting to see how much the unique features of the Wolfram Language are allowing it to help drive this process, and become the “language of computational X”.

Traditional non-knowledge-based computer languages are fundamentally set up as a way to tell computers what to do—typically at a fairly low level. But one of the aspects of the Wolfram Language that’s only now beginning to be recognized is that it’s not just intended to be for telling computers what to do; it’s intended to be a true computational communication language, that provides a way of expressing computational thinking that’s meaningful both to computers and to humans.

In the past, it was basically just computers that were supposed to “read code”. But like a vast generalization of the idea of mathematical notation, the goal with the Wolfram Language is to have something that humans can readily read, and use to represent and understand computational ideas.

Combining this with the idea of notebooks brings us the notion of computational essays—which I think are destined to become a key communication tool for the future, uniquely made possible by the Wolfram Language, with its 30-year history.

Thirty years ago it was exciting to see so many scientists and mathematicians “discover computers” through Mathematica. Today it’s exciting to see so many new areas of “computational X” being opened up. But it’s also exciting to see that—with the level of automation we’ve achieved in the Wolfram Language—we’ve managed to bring sophisticated computation to the point where it’s accessible to essentially anyone. And it’s been particularly satisfying to see all sorts of kids—at middle-school level or even below—start to get fluent in the Wolfram Language and the high-level computational ideas it provides access to.

If one looks at the history of computing, it’s in many ways a story of successive layers of capability being added, and becoming ubiquitous. First came the early languages. Then operating systems. Later, around the time Mathematica came on the scene, user interfaces began to become ubiquitous. A little later came networking and then large-scale interconnected systems like the web and the cloud.

But now what the Wolfram Language provides is a new layer: a layer of computational intelligence—that makes it possible to take for granted a high level of built-in knowledge about computation and about the world, and an ability to automate its application.

Over the past 30 years many people have used Mathematica and the Wolfram Language, and many more have been exposed to their capabilities, through systems like Wolfram|Alpha built with them. But what’s possible now is to let the Wolfram Language provide a truly ubiquitous layer of computational intelligence across the computing world. It’s taken decades to build a tower of technology and capabilities that I believe are worthy of this—but now we are there, and it’s time to make this happen.

But the story of Mathematica and the Wolfram Language is not just a story of technology. It’s also a story of the remarkable community of individuals who’ve chosen to make Mathematica and the Wolfram Language part of their work and lives. And now, as we go forward to realize the potential for the Wolfram Language in the world of the future, we need this community to help explain and implement the paradigm that the Wolfram Language defines.

Needless to say, injecting new paradigms into the world is never easy. But doing so is ultimately what moves forward our civilization, and defines the trajectory of history. And today we’re at a remarkable moment in the ability to bring ubiquitous computational intelligence to the world.

But for me, as I look back at the 30 years since Mathematica was launched, I am thankful for everything that’s allowed me to single-mindedly pursue the path that’s brought us to the Mathematica and Wolfram Language of today. And I look forward to our collective effort to move forward from this point, and to contribute to what I think will ultimately be seen as a crucial element in the development of technology and our world.

Whew! So much has happened in a year. Consider this number: we added 230 new functions to the Wolfram Language in 2017! The Wolfram Blog traces the path of our company’s technological advancement, so let’s take a look back at 2017 for the blog’s year in review.

The year 2017 saw two Wolfram Language releases, a major release of Wolfram SystemModeler, the arrival of the new Wolfram Player for iOS in the App Store, Wolfram|Alpha pumping up its already-unmatched educational value, and a host of features and capabilities related to these releases. We’ll start with the Wolfram Language releases.

Stephen Wolfram says it’s “a minor release that’s not minor.” And if you look at the summary of new features, you’ll see why:

Stephen continues, “There’s a lot here. One might think that a .1 release, nearly 29 years after Version 1.0, wouldn’t have much new any more. But that’s not how things work with the Wolfram Language, or with our company. Instead, as we’ve built our technology stack and our procedures, rather than progressively slowing down, we’ve been continually accelerating.”

The launch of Wolfram Language 11.2 continues the tradition of significant releases. Stephen says, “We have a very deliberate strategy for our releases. Integer releases (like 11) concentrate on major complete new frameworks that we’ll be building on far into the future. ‘.1’ releases (like 11.2) are intended as snapshots of the latest output from our R&D pipeline—delivering new capabilities large and small as soon as they’re ready.”

“It’s been one of my goals with the Wolfram Language to build into it as much data as possible—and make all of that data immediately usable and computable.” To this end, Stephen and company have been working on the Wolfram Data Repository, which is now available. Over time, this resource will snowball into a massive trove of computable information. Read more about it in Stephen’s post. But, more importantly, contribute to the Repository with your own data!

Our post about Wolfram|Alpha Pro upgrades was one of the most popular of the year. And all the web traffic around Wolfram|Alpha’s development of step-by-step solutions is not surprising when you consider that this product is *the* educational tool for anyone studying (or teaching!) mathematics in high school or early college. Read the post to find out why students and forward-thinking teachers recommend Wolfram|Alpha Pro products.

John Fultz, Wolfram’s director of user interface technology, announced the release of a highly anticipated product—Wolfram Player for iOS. “The beta is over, and we are now shipping Wolfram Player in the App Store. Wolfram Player for iOS joins Wolfram CDF Player on Windows, Mac and Linux as a free platform for sharing your notebook content with the world.” Now Wolfram Notebooks are the premium data presentation tool for every major platform.

The Wolfram MathCore and R&D teams announced a major leap for SystemModeler. “As part of the 4.1, 4.2, 4.3 sequence of releases, we completely rebuilt and modernized the core computational kernel of SystemModeler. Now in SystemModeler 5, we’re able to build on this extremely strong framework to add a whole variety of new capabilities.”

Some of the headlines include:

- Support for continuous media such as fluids and gases, using the latest Modelica libraries
- Almost 200 additional Modelica components, including Media, PowerConverters and Noise libraries
- Complete visual redesign of almost 6,000 icons, for consistency and improved readability
- Support for new GUI workspaces optimized for different levels of development and presentation
- Almost 500 built-in example models for easy exploration and learning
- Modular reconfigurability, allowing different parts of models to be easily switched and modified
- Symbolic parametric simulation: the ability to create a fully computable object representing variations of model parameters
- Importing and exporting FMI 2 models for broad model interchange and system integration

Earlier last year Markus Dahl, applications engineer, announced another advancement within the SystemModeler realm—the integration of OPC Unified Architecture (OPC UA). “Wolfram SystemModeler can be utilized very effectively when combining different Modelica libraries, such as ModelPlug and OPCUA, to either create virtual prototypes of systems or test them in the real world using cheap devices like Arduinos or Raspberry Pis. The tested code for the system can then easily be exported to another system, or used directly in a HIL (hardware-in-the-loop) simulation.”

In 2017 we had some blog posts that made quite a splash by showing off Wolfram technology. From insights into the science behind movies to timely new views on history, the Wolfram Language provided some highlight moments in public conversations this year. Let’s check out a few…

The story of mathematician Katherine Johnson and two of her NASA colleagues, Dorothy Vaughan and Mary Jackson, was in the spotlight at the 2017 Academy Awards, where the film about these women—*Hidden Figures*—was nominated for three Oscars. Three Wolfram scientists took a look at the math/physics problems the women grappled with, albeit with the luxury of modern computational tools found in the Wolfram Language. Our scientists commented on the crucial nature of Johnson’s work: “Computers were in their early days at this time, so Johnson and her team’s ability to perform complicated navigational orbital mechanics problems without the use of a computer provided an important sanity check against the early computer results.”

Another Best Picture nominee in 2017 was *Arrival*, a film for which Stephen and Christopher Wolfram served as scientific advisors. Stephen wrote an often-cited blog post about the experience, “Quick, How Might the Alien Spacecraft Work?” On the set, Christopher was tasked with analyzing and writing code for a fictional nonlinear visual language. On January 31, he demonstrated the development process he went through in a livecoding event broadcast on LiveEdu.tv. This livecoding session garnered almost 60,000 views.

Wolfram celebrated the birthday of the late, great Muhammad Ali with a blog post from one of our data scientists, Jofre Espigule-Pons. Using visualizations ranging from histograms to network plots, Espigule-Pons examined Ali’s boxing career, his opponent pool and even his poetry. This tribute to the boxing icon was one of the most-loved blog posts of 2017.

For the Fourth of July holiday, Swede White, Wolfram’s media and communications specialist, used a variety of functions in the Wolfram Language to analyze the social networks of the revolutionaries who shaped our nation. (Yes, social networks existed before Facebook was a thing!) The data visualizations are enlightening. It turns out that Paul Revere was the right guy to spread the warning: although he never rode through towns shouting, “The British are coming,” he had the most social connections.

So you say there’s no *X* in *espresso*. But are you certain? Vitaliy Kaurov, academic director of the Wolfram Science and Innovation Initiatives, examines the history behind this point of contention. This blog post is truly a shining example of what computational analysis can do for fields such as linguistics and lexicology. And it became a social media hit to boot, especially in certain circles of the Reddit world where pop culture debates can be virtually endless.

Just in time for the holiday board game season, popular Wolfram blogger Jon McLoone, director of technical communication and strategy, breaks down the exact probabilities of winning Risk. There are other Risk win/loss estimators out there, but they produce just that—estimates. Jon uses the Wolfram Language to give exact odds for each battle possibility the game offers. Absolute candy for gamer math enthusiasts!

We had a great year at Wolfram Research, and we wish you a productive and rewarding 2018!

As the Fourth of July approaches, many in America will celebrate 241 years since the founders of the United States of America signed the Declaration of Independence, their very own disruptive, revolutionary startup. Prior to independence, colonists would celebrate the birth of the king. However, after the Revolutionary War broke out in April of 1775, some colonists began holding mock funerals of King George III. Additionally, bonfires, celebratory cannon and musket fire and parades were common, along with public readings of the Declaration of Independence. There was also rum.

Today, we often celebrate with BBQ, fireworks and a host of other festivities. As an aspiring data nerd and a sociologist, I thought I would use the Wolfram Language to explore the Declaration of Independence using some basic natural language processing.

Using metadata, I’ll also explore a political network of colonists with particular attention paid to Paul Revere, using built-in Wolfram Language functions and network science to uncover some hidden truths about colonial Boston and its key players leading up to the signing of the Declaration of Independence.

The Wolfram Data Repository was recently announced and holds a growing collection of interesting resources for easily computable results.

As it happens, the Wolfram Data Repository includes the full text of the Declaration of Independence. Let’s explore the document using `WordCloud` by first grabbing it from the Data Repository.

Interesting, but this isn’t very patriotic thematically, so let’s use `ColorFunction` for a more patriotic palette, apply `DeleteStopwords` to strip common words, and remove the names of the signers of the document.

As we can see, the Wolfram Language has deleted the names of the signers and made words larger as a function of their frequency in the Declaration of Independence. What stands out is that the words “laws” and “people” appear the most frequently. This is not terribly surprising, but let’s look at the historical use of those words using the built-in `WordFrequencyData` functionality and `DateListPlot` for visualization. Keeping with a patriotic theme, let’s also use `PlotStyle` to make the plot red and blue.

What is incredibly interesting is that we can see a usage spike around 1776 in both words. The divergence between the use of the two words over time also strikes me as interesting.

According to historical texts, colonial Boston was a fascinating place in the late 18th century. David Hackett Fischer’s monograph *Paul Revere’s Ride* paints a comprehensive picture of the political factions that were driving the revolutionary movement. Of particular interest are the Masonic lodges and caucus groups that were politically active and central to the Revolutionary War.

Those of us raised in the United States will likely remember Paul Revere from our very first American history classes. He famously rode a horse through what is now the greater Boston area warning the colonial militia of incoming British troops, known as his “midnight ride,” notably captured in a poem by Henry Wadsworth Longfellow in 1860.

Up until Fischer’s exploration of Paul Revere’s political associations and caucus memberships, historians argued the colonial rebel movement was controlled by high-ranking political elites led by Samuel Adams, with many concluding Revere was simply a messenger. That he was, but through that messaging and other activities, he was key to joining together political groups that otherwise may not have communicated, as I will show through network analysis.

As it happens, this time last year I was at the Wolfram Summer School, which is currently in progress at Bentley University. One of the highlights of my time there was a lecture on social network analysis, led by Charlie Brummitt, that used metadata to analyze colonial rebels in Boston.

Duke University sociologist Kieran Healy has a fantastic blog post on this topic, “Using Metadata to Find Paul Revere,” from which the lecture was derived. I’m going to recreate some of his analysis with the Wolfram Language and take things a bit further with more advanced visualizations.

First, however, as a sociologist, my studies and research are often concerned with inequalities, power and marginalized groups. I would be remiss if I did not think of Abigail Adams’s correspondence with her husband John Adams on March 31, 1776, in which she instructed him to “remember the ladies” at the proceedings of the Continental Congress. I made a `WordCloud` of the letter here.

The data we are using is exclusively about men and membership data from male-only social and political organizations. It is worth noting that during the Revolutionary period, and for quite a while following, women were legally barred from participating in most political affairs. Women could vote in some states, but between 1777 and 1787, those rights were stripped in all states except New Jersey. It wasn’t until August 18, 1920, that the 19th Amendment was ratified, securing women’s right to vote unequivocally.

Moreover, under English common law, women were treated as *femes covert*, meaning a married woman’s legal rights were absorbed by her husband’s. Not only were women not allowed to vote, coverture laws dictated that a husband and wife were one person, with the former having sole political decision-making authority, as well as the ability to buy and sell property and earn wages.

Following the American Revolution, the United States was free from the tyranny of King George III; however, women were still subservient to men legally and culturally. For example, Hannah Griffitts, a poet known for her work about the Daughters of Liberty, “The Female Patriots,” expressed in a 1785 diary entry sentiments common among many colonial women:

The glorious fourth—again appears

A Day of Days—and year of years,

The sum of sad disasters,

Where all the mighty gains we see

With all their Boasted liberty,

Is only Change of Masters.

There is little doubt that without the domestic and emotional labor of women, often invisible in history, these men, the so-called Founding Fathers, would have been less successful, and far slower, in achieving their goals of independence from Great Britain. So today, we remember the ladies, the marginalized and the disenfranchised.

Conveniently, I uploaded a cleaned association matrix of political group membership in colonial Boston as a `ResourceObject` to the Data Repository. We’ll import with `ResourceData` to give us a nice data frame to work with.

We can see we have 254 colonists in our dataset. Let’s take a look at which colonial rebel groups Samuel Adams was a member of, as he’s known in contemporary times for a key ingredient in Fourth of July celebrations, beer.

Our `True`/`False` values indicate membership in one of seven political organizations: St. Andrews Lodge, Loyal Nine, North Caucus, the Long Room Club, the Tea Party, the Boston Committee of Correspondence and the London Enemies.

We can see Adams was a member of four of these. Let’s take a look at Revere’s memberships.

As we can see, Revere was slightly more involved, as he was a member of five groups. We can easily graph his membership in these political organizations. For those of you unfamiliar with how a network functions, nodes represent agents and the lines between them represent some sort of connection, interaction or association.

There are seven organizations in total, so let’s see how they are connected by highlighting political organizations as red nodes, with individuals attached to each node.

We can see the Tea Party and St. Andrews Lodge have many more members than Loyal Nine and others, which we will now explore further at the micro level.

What we’ve done so far is fairly macro and exploratory. Let’s drill down by looking at each individual’s connection to one another by way of shared membership in these various groups. Essentially, we are removing our political organization nodes and focusing on individual colonists. We’ll use `Tooltip` to help us identify each actor in the network.
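This projection step can be sketched outside the Wolfram Language too. Here is a minimal Python illustration using a hypothetical four-person membership table (the real dataset has 254 colonists, and the names and groups below are just examples): two colonists become linked whenever they share at least one organization.

```python
from itertools import combinations

# Toy membership table (hypothetical, not the real dataset):
# map each person to the organizations they belong to.
membership = {
    "Revere": {"StAndrewsLodge", "NorthCaucus", "TeaParty"},
    "Adams":  {"NorthCaucus", "LongRoomClub"},
    "Warren": {"StAndrewsLodge", "NorthCaucus"},
    "Urann":  {"TeaParty"},
}

# Project the bipartite person-to-group data onto person-to-person
# ties: an edge exists if two people share any organization.
edges = {
    frozenset((a, b))
    for a, b in combinations(membership, 2)
    if membership[a] & membership[b]
}

print(sorted(tuple(sorted(e)) for e in edges))
```

In matrix terms this is just multiplying the membership matrix by its transpose; the set intersection above does the same thing one pair at a time.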

We now use a social network measure called `BetweennessCentrality`, which quantifies how central an agent is in a network: it is the fraction of shortest paths between pairs of other agents that pass through that agent. Because such an actor can broker information between the agents on either side, this measure is key in determining the importance of a particular node; a node that lies between many pairs of actors, with nothing else connecting them more directly, holds a powerful position.
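For intuition, the definition can be computed by brute force on a toy graph. The Python sketch below (purely illustrative, not the Wolfram implementation, which uses much faster algorithms) enumerates shortest paths directly and counts the fraction passing through each node; it assumes a connected graph.

```python
from itertools import combinations

# Toy graph: a path A-B-C plus a spur B-D, so B should score highest.
graph = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}

def shortest_paths(g, s, t):
    # Breadth-first search collecting ALL shortest paths from s to t.
    paths, frontier = [], [[s]]
    while frontier and not paths:
        nxt = []
        for p in frontier:
            for n in g[p[-1]]:
                if n == t:
                    paths.append(p + [n])
                elif n not in p:
                    nxt.append(p + [n])
        frontier = nxt
    return paths

def betweenness(g, v):
    # For each pair of other nodes, add the fraction of their
    # shortest paths that pass through v.
    score = 0.0
    for s, t in combinations([n for n in g if n != v], 2):
        paths = shortest_paths(g, s, t)
        score += sum(v in p for p in paths) / len(paths)
    return score

print({v: betweenness(graph, v) for v in graph})
```

Here B ends up with the entire score, since every route between the leaf nodes runs through it, which is exactly the "broker" position described above.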

We’ll first create a function that will allow us to visualize not only `BetweennessCentrality`, but also `EigenvectorCentrality` and `ClosenessCentrality`.

We begin with some brief code for `BetweennessCentrality` that uses `ColorData` to show us which actors have the highest ability to transmit resources or information through the network, along with the `Tooltip` defined previously.

Lo and behold, Paul Revere appears to have a vastly higher betweenness score than anyone else in the network. Interestingly, John Adams sits at the center of our radial graph, yet he does not appear to have much power in the network. Let’s grab the numbers.

Revere has almost double the score of the next highest colonist, Thomas Urann. What this indicates is Revere’s essential importance in the network as a broker of information. Since he is a member of five of the seven groups, this isn’t terribly surprising, but it would have otherwise been unnoticed without this type of inquiry.

`ClosenessCentrality` differs from betweenness in that here we are concerned with path lengths to other actors. Agents who can reach many other actors through short paths can disseminate information, or even exert power, more efficiently than agents on the periphery of the network. Let’s run our function on the network again and look at `ClosenessCentrality` to see if Revere still ranks highest.
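The textbook form of this measure is simple enough to sketch in a few lines of Python (again purely for intuition, on a toy graph, not the Wolfram implementation, whose normalization conventions may differ): closeness is the reciprocal of a node's mean shortest-path distance to everyone else.

```python
from collections import deque

# Same toy graph as before: B is adjacent to every other node.
graph = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}

def distances(g, s):
    # Breadth-first search giving hop counts from s to all nodes.
    dist, q = {s: 0}, deque([s])
    while q:
        u = q.popleft()
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def closeness(g, v):
    # (n - 1) divided by the sum of distances to all other nodes.
    d = distances(g, v)
    return (len(g) - 1) / sum(d[u] for u in g if u != v)

print(sorted(((closeness(graph, v), v) for v in graph), reverse=True))
```

B reaches everyone in one hop, so its closeness is 1.0; the leaves must travel through B, so they score lower.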

Revere again ranks highest, but his lead is not nearly as dramatic as with his betweenness score, and, again, John Adams has a low score. Let’s grab the measurements for further analysis.

As our heat-map coloring of nodes indicates, other colonists are not far behind Revere, though he certainly is the highest ranked. While there are other important people in the network, Revere is clearly the most efficient broker of resources, power or information.

One final measure we can examine is `EigenvectorCentrality`, which uses a more sophisticated algorithm: an actor’s score depends on the centrality of the actors it is connected to, so it rewards nearness and embeddedness among highly central agents.
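Under the hood, eigenvector centrality is the leading eigenvector of the network's adjacency matrix, which can be approximated with power iteration. Here is a toy Python illustration (not the Wolfram implementation): repeatedly multiply a score vector by the adjacency matrix and renormalize until it settles.

```python
# Toy star-like graph: B is adjacent to every other node.
nodes = ["A", "B", "C", "D"]
adj = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}

scores = {v: 1.0 for v in nodes}
for _ in range(100):
    # Multiply by (A + I): the self-term keeps the iteration from
    # oscillating on bipartite graphs without changing the eigenvectors.
    new = {v: scores[v] + sum(scores[u] for u in adj[v]) for v in nodes}
    top = max(new.values())
    scores = {v: s / top for v, s in new.items()}  # renormalize

print({v: round(s, 3) for v, s in scores.items()})
```

The hub B converges to score 1, and each leaf to 1/√3 ≈ 0.577: the leaves score well not because of their own degree, but because their single neighbor is maximally central.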

There appear to be two top contenders for the highest eigenvector score. Let’s once again calculate the measurements in a table for examination.

Nathaniel Barber and Revere have nearly identical scores; however, Revere still tops the list. Let’s now take the top five closeness scores and create a network without them in it to see how the cohesiveness of the network might change.

We see quite a dramatic change in the graph on the left with our key players removed, indicating those with the top five closeness scores are fairly essential in joining these seven political organizations together. Joseph Warren appears to be one of only a few people who can act as a bridge between disparate clusters of connections. Essentially, it would be difficult for information to spread freely through the network on the left, as opposed to the network on the right that includes Paul Revere.

As we have seen, we can use network science in history to uncover or expose misguided preconceptions about a figure’s importance in historical events, based on group membership metadata. Prior to Fischer’s analysis, many thought Revere was just a courier, and not a major figure. However, what I have been able to show is Revere’s importance in bridging disparate political groups. This further reveals that the Revolutionary movement was pluralistic in its aims. The network was ultimately tied together by disdain for the tyranny of King George III, unjust British military actions and policies that led to bloody revolt, not necessarily a top-down directive from political elites.

Beyond history, network science and natural language processing have many applications, such as uncovering otherwise hidden brokers of information, resources and power, i.e. social capital. One can easily imagine how this might be useful for computational marketing or public relations.

How will you use network science to uncover otherwise-hidden insights to revolutionize and disrupt your work or interests?

*Special thanks to Wolfram|Alpha data scientist Aaron Enright for helping with this blog post and to Charlie Brummitt for providing the beginnings of this analysis.*

I will touch on two aspects of Katherine Johnson’s scientific work that were mentioned in the film: orbit calculations and reentry calculations. For the orbit calculation, I will first exactly follow what Johnson did and then compare with a more modern, direct approach utilizing an array of tools made available with the Wolfram Language. Where the movie mentions the solving of differential equations using Euler’s method, I will compare this method with more modern ones in an important problem of rocketry: computing a reentry trajectory from the rocket equation and drag terms (derived using atmospheric model data obtained directly from within the Wolfram Language).
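As an aside, the gap between Euler’s method and a modern higher-order scheme is easy to see on a test equation with a known answer. Here is an illustrative sketch (in Python rather than the Wolfram Language, purely for exposition, and on y′ = −y rather than the reentry model) comparing Euler steps with classical fourth-order Runge–Kutta:

```python
import math

def euler(f, y, t, h, n):
    # First-order Euler steps: y_{k+1} = y_k + h * f(t_k, y_k)
    for _ in range(n):
        y += h * f(t, y)
        t += h
    return y

def rk4(f, y, t, h, n):
    # Classical fourth-order Runge-Kutta steps
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

f = lambda t, y: -y        # test equation y' = -y, y(0) = 1
exact = math.exp(-1.0)     # exact value of y(1)
for n in (10, 100):
    h = 1.0 / n
    print(n, abs(euler(f, 1.0, 0.0, h, n) - exact),
             abs(rk4(f, 1.0, 0.0, h, n) - exact))
```

At the same step count, the fourth-order scheme’s error is smaller by many orders of magnitude, which is why Euler’s method, though historically important, is rarely the tool of choice today.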

The movie doesn’t focus much on the math details of the types of problems Johnson and her team dealt with, but for the purposes of this blog, I hope to provide at least a flavor of the approaches one might have used in Johnson’s day compared to the present.

One of the earliest papers that Johnson coauthored, “Determination of Azimuth Angle at Burnout for Placing a Satellite over a Selected Earth Position,” deals with the problem of making sure that a satellite can be placed over a specific Earth location after a specified number of orbits, given a certain starting position (e.g. Cape Canaveral, Florida) and orbital trajectory. The approach that Johnson’s team used was to determine the azimuthal angle (the angle formed by the spacecraft’s velocity vector at the time of engine shutoff with a fixed reference direction, say north) to fire the rocket in, based on other orbital parameters. This is an important step in making sure that an astronaut is in the correct location for reentry to Earth.

In the paper, Johnson defines a number of constants and input parameters needed to solve the problem at hand. One detail to explain is the term “burnout,” which refers to the shutoff of the rocket engine. After burnout, orbital parameters are essentially “frozen,” and the spacecraft moves solely under the Earth’s gravity (as determined, of course, through Newton’s laws). In this section, I follow the paper’s unit conventions as closely as possible.

For convenience, some functions are defined to deal with angles in degrees instead of radians; this allows angles (and times converted to angles) to be handled smoothly in the calculations:
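Such helpers amount to wrapping the standard trigonometric functions with degree conversions. A hypothetical Python equivalent (the names here are illustrative, not the post’s actual definitions) would be:

```python
import math

# Degree-based trig helpers: take and return degrees, so formulas
# written in degrees can be transcribed directly.
def sind(x):
    return math.sin(math.radians(x))

def cosd(x):
    return math.cos(math.radians(x))

def arctand(x):
    return math.degrees(math.atan(x))

print(sind(30.0), cosd(60.0), arctand(1.0))  # approximately 0.5, 0.5, 45
```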

Johnson goes on to describe several other derived parameters, though it’s interesting to note that she sometimes adopted values for these rather than using the values returned by her formulas. Her adopted values were often close to the values obtained by the formulas. For simplicity, the values from the formulas are used here.

Semilatus rectum of the orbit ellipse:

Angle in orbit plane between perigee and burnout point:

Orbit eccentricity:

Orbit period:

Eccentric anomaly:

To describe the next parameter, it’s easiest to quote the original paper: “The requirement that a satellite with burnout position *φ*_{1}, *λ*_{1} pass over a selected position *φ*_{2}, *λ*_{2} after the completion of *n* orbits is equivalent to the requirement that, during the first orbit, the satellite pass over an equivalent position with latitude *φ*_{2} the same as that of the selected position but with longitude *λ*_{2e} displaced eastward from *λ*_{2} by an amount sufficient to compensate for the rotation of the Earth during the *n* complete orbits, that is, by the polar hour angle *n ω*_{E} *T*. The longitude of this equivalent position is thus given by the relation”:
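Paraphrasing the quoted requirement in symbols (taking longitude positive eastward, with *ω*_{E} the Earth’s rotation rate and *T* the orbital period; the paper’s own sign convention may differ):

```latex
\lambda_{2e} \;=\; \lambda_{2} + n\,\omega_{E}\,T
```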

Time from perigee for angle *θ*:

Part of the final solution is to determine values for intermediate parameters *δλ*_{1-2e} and *θ*_{2e}. This can be done in a couple of ways. First, I can use `ContourPlot` to obtain a graphical solution via equations 19 and 20 from the paper:

`FindRoot` can be used to find the solutions numerically:

Of course, Johnson didn’t have access to `ContourPlot` or `FindRoot`, so her paper describes an iterative technique. I translated the technique described in the paper into the Wolfram Language, and also solved for a number of other parameters via her iterative method. Because the base computations are for a spherical Earth, corrections for oblateness are included in her method:

Graphing the value of *θ*_{2e} for the various iterations shows a quick convergence:

I can convert the method into a `FindRoot` command as follows (this takes the oblateness effects into account in a fully self-consistent manner and calculates values for all nine variables involved in the equations):

Interestingly, even the iterative root-finding steps of this more complicated system converge quite quickly:

With the orbital parameters determined, it is desirable to visualize the solution. First, some critical parameters from the previous solutions need to be extracted:

Next, the latitude and longitude of the satellite as a function of azimuth angle need to be derived:

*φ*_{s} and *λ*_{s} are the latitude and longitude as functions of *θ*_{s}:

The satellite ground track can be constructed by creating a table of points:

Johnson’s paper presents a sketch of the orbital solution including markers showing the burnout, selected and equivalent positions. It’s easy to reproduce a similar plain diagram here:

For comparison, here is her original diagram:

A more visually useful version can be constructed using `GeoGraphics`, taking care to convert the geocentric coordinates into geodetic coordinates:

Today, virtually every one of us has, within immediate reach, access to computational resources far more powerful than those available to the entirety of NASA in the 1960s. Now, using only a desktop computer and the Wolfram Language, you can easily find direct numerical solutions to problems of orbital mechanics such as those posed to Katherine Johnson and her team. While perhaps less taxing of our ingenuity than older methods, the results one can get from these explorations are no less interesting or useful.

To solve for the azimuthal angle *ψ* using more modern methods, let’s set up parameters for a simple circular orbit beginning after burnout over Florida, assuming a spherically symmetric Earth (I’ll not bother trying to match the orbit of the Johnson paper precisely, and I’ll redefine certain quantities from above using the modern SI system of units). Starting from the same low-Earth orbit altitude used by Johnson, and using a little spherical trigonometry, it is straightforward to derive the initial conditions for our orbit:

The relevant physical parameters can be obtained directly from within the Wolfram Language:
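The post pulls these constants from curated Wolfram data; as a sketch, with standard values written out explicitly, the circular-orbit speed and period at the burnout altitude follow directly:

```python
import math

# Physical parameters written out explicitly (standard values, rather
# than pulled from a curated data source as in the post):
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_earth = 5.972e24     # Earth mass, kg
R_earth = 6.371e6      # mean Earth radius, m
h_burnout = 2.25e5     # assumed burnout altitude, ~225 km

# Circular-orbit speed and period at that altitude:
r = R_earth + h_burnout
v_circ = math.sqrt(G * M_earth / r)
T = 2 * math.pi * r / v_circ
```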

Next, I obtain a differential equation for the motion of our spacecraft, given the gravitational field of the Earth. There are several ways you can model the gravitational potential near the Earth. Assuming a spherically symmetric planet and utilizing a Cartesian coordinate system throughout, the potential is merely:

Alternatively, you can use a more realistic model of Earth’s gravity, where the planet’s shape is taken to be an oblate ellipsoid of revolution. The exact form of the potential from such an ellipsoid (assuming constant mass-density over ellipsoidal shells), though complicated (containing multiple elliptic integrals), is available through `EntityValue`:

For a general homogeneous triaxial ellipsoid, the potential contains piecewise functions:

Here, *κ* is the largest root of *x*^{2}/(*a*^{2}+*κ*)+*y*^{2}/(*b*^{2}+*κ*)+*z*^{2}/(*c*^{2}+*κ*)=1. In the case of an oblate ellipsoid, the previous formula can be simplified to contain only elementary functions…

… where *κ* = ((2 *z*^{2} (*a*^{2} − *c*^{2} + *x*^{2} + *y*^{2}) + (−*a*^{2} + *c*^{2} + *x*^{2} + *y*^{2})^{2} + *z*^{4})^{1/2} − *a*^{2} − *c*^{2} + *x*^{2} + *y*^{2} + *z*^{2})/2.

A simpler form that is widely used in the geographic and space science community, and that I will use here, is given by the so-called International Gravity Formula (IGF). The IGF takes into account differences from a spherically symmetric potential up to second order in spherical harmonics, and gives results numerically indistinguishable from the exact potential referenced previously. In terms of four measured geodetic parameters, the IGF potential can be defined as follows:
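The IGF coefficients themselves aren’t reproduced here; a sketch of the standard second-order zonal-harmonic (*J*_{2}) potential, which is the order the IGF keeps, looks like this in Python (WGS 84-style constants, assumed rather than taken from the post):

```python
import math

# Second-order zonal-harmonic (J2) approximation to Earth's potential,
# the same order kept by the IGF (standard constants, which may differ
# slightly from the geodetic parameters used in the post):
mu = 3.986004418e14   # GM of Earth, m^3/s^2
R  = 6.378137e6       # equatorial radius, m
J2 = 1.08263e-3       # second zonal-harmonic coefficient

def potential(x, y, z):
    r = math.sqrt(x*x + y*y + z*z)
    sin_phi = z / r   # sine of the geocentric latitude
    # Point-mass term plus the J2 oblateness correction:
    return -(mu / r) * (1.0 - J2 * (R / r)**2 * (3.0 * sin_phi**2 - 1.0) / 2.0)

# The oblateness correction is a ~0.1% effect near the surface:
U_sphere = -mu / R
U_pole = potential(0.0, 0.0, R)
```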

I could easily use even better values for the gravitational force through `GeogravityModelData`. For the starting position, the IGF potential deviates by only 0.06% from a high-order approximation:

With these functional forms for the potential, finding the orbital path amounts to taking the gradient of the potential to get the gravitational field vector and then applying Newton’s second law. Doing so, I obtain the orbital equations of motion for the two gravity models:
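For the spherically symmetric case the gradient can be taken analytically, giving the familiar inverse-square acceleration; a minimal Python sketch:

```python
import math

# Newton's second law with the gradient of the point-mass potential
# U = -mu/r gives the acceleration a = -grad U = -mu * r_vec / |r|^3
# (spherically symmetric case only):
mu = 3.986004418e14   # GM of Earth, m^3/s^2

def acceleration(pos):
    x, y, z = pos
    r = math.sqrt(x*x + y*y + z*z)
    return (-mu * x / r**3, -mu * y / r**3, -mu * z / r**3)

# Sanity check: at the Earth's surface the magnitude is the familiar g.
ax, ay, az = acceleration((6.371e6, 0.0, 0.0))
g = math.sqrt(ax*ax + ay*ay + az*az)
```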

I am now ready to use the power of `NDSolve` to compute orbital trajectories. Before doing this, however, it will be nice to display the orbital path as a curve in three-dimensional space. To give these curves context, I will plot them over a texture map of the Earth’s surface, projected onto a sphere. Here I construct the desired graphics objects:

While the orbital path computed in an inertial frame forms a closed, periodic curve, the rotation of the Earth causes the spacecraft to pass over different points on the Earth’s surface during each successive revolution. I can visualize this effect by adding an additional rotation term to the solutions I obtain from `NDSolve`. Taking the number of orbital periods to be three (similar to John Glenn’s flight) for visualization purposes, I construct the following `Manipulate` to see how the orbital path is affected by the azimuthal launch angle *ψ*, similar to the study in Johnson’s paper. I’ll plot both a path assuming a spherical Earth (in white) and another path using the IGF (in green) to get a sense of the size of the oblateness effect (note that the divergence of the two paths increases with each orbit):
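The `Manipulate` itself isn’t reproduced here, but the underlying computation can be sketched in Python: integrate the two-body equations of motion with a fixed-step RK4 integrator, then subtract the Earth’s rotation from the longitude to obtain the ground track (altitude, step size and launch azimuth are assumed values):

```python
import math

# Sketch of the ground-track computation: propagate the two-body
# equations of motion with fixed-step RK4, then rotate into the
# Earth-fixed frame to get latitude and longitude.
mu = 3.986004418e14       # GM of Earth, m^3/s^2
omega_E = 7.2921159e-5    # Earth's rotation rate, rad/s

def deriv(state):
    x, y, z, vx, vy, vz = state
    r3 = (x*x + y*y + z*z) ** 1.5
    return (vx, vy, vz, -mu*x/r3, -mu*y/r3, -mu*z/r3)

def rk4_step(state, dt):
    def add(s, k, f): return tuple(si + f*ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(add(state, k1, dt/2))
    k3 = deriv(add(state, k2, dt/2))
    k4 = deriv(add(state, k3, dt))
    return tuple(s + dt/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# Circular orbit at ~225 km altitude (assumed), launched eastward from
# the equator with azimuth psi measured from north:
r0 = 6.371e6 + 2.25e5
v0 = math.sqrt(mu / r0)
psi = math.radians(45.0)          # azimuthal launch angle (assumed)
state = (r0, 0.0, 0.0,
         0.0, v0 * math.sin(psi), v0 * math.cos(psi))

ground_track = []
dt, t = 10.0, 0.0
for _ in range(540):              # 5400 s, about one orbital period
    state = rk4_step(state, dt)
    t += dt
    x, y, z = state[:3]
    lat = math.degrees(math.asin(z / math.sqrt(x*x + y*y + z*z)))
    lon = math.degrees(math.atan2(y, x) - omega_E * t)  # Earth-fixed frame
    ground_track.append((lat, lon))
```

With azimuth 45° from the equator, the orbit inclination is 45°, so the ground track’s peak latitude is 45° as well; the westward longitude offset accumulated per orbit is the *n ω*_{E} *T* term discussed earlier.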

In the notebook attached to this blog, you can see this `Manipulate` in action, and note the speed at which each new solution is obtained. You would hope that Katherine Johnson and her colleagues at NASA would be impressed!

Now, varying the angle *ψ* at burnout time, it is straightforward to calculate the position of the spacecraft after, say, three revolutions:

The movie also mentions Euler’s method in connection with the reentry phase. After the initial problem of finding the azimuthal angle has been solved, as in the previous sections, it’s time to come back to Earth. Rockets are fired to slow down the orbiting body, and a complex sequence of events unfolds as the craft transitions from the vacuum of space to an atmospheric environment. Changing atmospheric density, rapid deceleration and frictional heating all become important factors that must be taken into account in order to return the astronaut safely to Earth. Height, speed and acceleration as functions of time all need to be computed. This set of problems can be solved with Euler’s method, as done by Katherine Johnson, or by using the differential equation-solving functionality of the Wolfram Language.

For simple differential equations, one can get a detailed step-by-step solution with a specified quadrature method. An equivalent of Newton’s famous *F* = *m a* for a time-dependent mass *m*(*t*) is the so-called ideal rocket equation (in one dimension)…

… where *m*(*t*) is the rocket mass, *v*_{e} the engine exhaust velocity and *m*′_{p}(*t*) the rate at which propellant mass is expelled:

With initial and final conditions for the mass, I get the celebrated rocket equation (Tsiolkovsky 1903):
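In modern notation, the integration step behind that result can be sketched as follows, with *m*_{0} the initial and *m*_{1} the final mass:

```latex
m(t)\,\frac{dv}{dt} = -v_{e}\,\frac{dm}{dt}
\;\;\Longrightarrow\;\;
\Delta v = -v_{e}\int_{m_{0}}^{m_{1}}\frac{dm}{m}
         = v_{e}\,\ln\!\frac{m_{0}}{m_{1}} .
```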

The details of solving this equation with concrete parameter values, using e.g. the classical Euler method, can be obtained from Wolfram|Alpha. Here are those details, together with a detailed comparison with the exact solution as well as with other numerical integration methods:

Following the movie plot, I will now implement a minimalistic ODE model of the reentry process. I start by defining parameters that mimic Glenn’s flight:

I assume that the braking process uses 1% of the thrust of the stage-one engine and runs, say, for 60 seconds. The equation of motion is:

Here, **F**_{grav} is the gravitational force, **F**_{exhaust}(*t*) the explicitly time-dependent engine force and **F**_{friction} the air resistance force, which depends on the capsule’s height and velocity:

For the height-dependent air density, I can conveniently use the `StandardAtmosphereData` function. I also account for a height-dependent area because of the parachute that opened about 8.5 km above ground:

This gives the following set of coupled nonlinear differential equations to be solved. The final `WhenEvent[...]` ends the integration when the capsule reaches the surface of the Earth. I use vector-valued position and velocity variables *X* and *V*:
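The Wolfram Language system itself isn’t shown here; as a hypothetical stand-in, the right-hand-side force terms can be sketched in Python, with a simple exponential atmosphere in place of `StandardAtmosphereData` (all parameter values are illustrative, not those of Glenn’s flight):

```python
import math

# Sketch of the force terms entering the reentry equations of motion
# (simplified model with illustrative parameter values):
mu      = 3.986004418e14   # GM of Earth, m^3/s^2
R_earth = 6.371e6          # m
m_cap   = 1350.0           # capsule mass, kg (assumed)
Cd      = 1.0              # drag coefficient (assumed)
A       = 3.0              # capsule cross-section, m^2 (assumed)
rho0, H = 1.225, 8500.0    # sea-level density (kg/m^3) and scale height (m)

def density(h):
    # Exponential atmosphere in place of StandardAtmosphereData.
    return rho0 * math.exp(-max(h, 0.0) / H)

def total_force(t, pos, vel, thrust=0.0):
    x, y, z = pos
    r = math.sqrt(x*x + y*y + z*z)
    h = r - R_earth
    # Gravity (spherical Earth):
    F_grav = tuple(-mu * m_cap * c / r**3 for c in pos)
    # Air drag, opposing the velocity:
    v = math.sqrt(sum(c*c for c in vel))
    drag = 0.5 * density(h) * Cd * A * v
    F_fric = tuple(-drag * c for c in vel)
    # Retro thrust opposing the velocity for the first 60 s (assumed):
    F_exh = tuple((-thrust * c / v) if (v > 0 and t < 60.0) else 0.0
                  for c in vel)
    return tuple(fg + ff + fe for fg, ff, fe in zip(F_grav, F_fric, F_exh))
```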

With these definitions for the weight, exhaust and air friction force terms…

… total force can be found via:

In this simple model, I neglected the Earth’s rotation, intrinsic rotations of the capsule, active flight angle changes, supersonic effects on the friction force and more. The explicit form of the differential equations in coordinate components is the following. The equations that Katherine Johnson solved would have been quite similar to these:

Supplemented by the initial position and velocity, it is straightforward to solve this system of equations numerically. Today, this is just a simple call to `NDSolve`. I don’t have to worry about the method to use, step size control, error control and more because the Wolfram Language automatically chooses values that guarantee meaningful results:

Here is a plot of the height, speed and acceleration as a function of time:

Plotting as a function of height instead of time shows that the exponential increase of air density is responsible for the high deceleration; it is not caused by the parachute, which opens at a relatively low altitude. The peak deceleration occurs at a very high altitude, where the capsule goes from a vacuum to an atmospheric environment very quickly:

And here is a plot of the vertical and tangential speed of the capsule in the reentry process:

Now I repeat the numerical solution with a fixed-step Euler method:
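As a rough stand-in for the notebook’s computation, here is a minimal fixed-step Euler integration of a one-dimensional vertical descent in Python (illustrative parameters; the full model also tracks the tangential motion):

```python
import math

# Fixed-step Euler integration of a simplified vertical reentry
# (illustrative parameters, constant gravity, exponential atmosphere):
g = 9.81                  # gravitational acceleration, m/s^2
m = 1350.0                # capsule mass, kg (assumed)
Cd, A = 1.0, 3.0          # drag coefficient and area (assumed)
rho0, H = 1.225, 8500.0   # exponential atmosphere parameters

h, v = 80e3, -2000.0      # start: 80 km altitude, descending at 2 km/s (assumed)
dt, t = 0.1, 0.0
while h > 0.0:
    rho = rho0 * math.exp(-h / H)
    drag = 0.5 * rho * Cd * A * v * abs(v) / m   # always opposes the motion
    a = -g - drag
    # Euler step: advance position and velocity using current derivatives.
    h += v * dt
    v += a * dt
    t += dt
```

By landing, the capsule has decelerated to roughly its sea-level terminal velocity; halving `dt` and comparing results gives a quick estimate of the accumulated Euler error.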

Qualitatively, the solution looks the same as the previous one:

For the step size used in the time integration, the accumulated error is on the order of a few percent. Smaller step sizes would reduce the error (see the previous Wolfram|Alpha output):

Note that the landing time predicted by the Euler method deviates only 0.11% from the previous time. (For comparison, if I were to solve the equation with two modern methods, say `"BDF"` vs. `"Adams"`, the error would be smaller by a few orders of magnitude.)

Now, the reentry process generates a lot of heat; this is where the heat shield is needed. At which height is the most heat per area *q* generated? Without a detailed derivation, I can conjecture on purely dimensional grounds:
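The dimensional argument can be made explicit: the heat flux *q* carries units of power per area, and the only simple combination of air density and speed with those units is *ρv*^{3} (the factor of ½ is conventional):

```latex
[q] = \frac{\mathrm{W}}{\mathrm{m}^{2}} = \frac{\mathrm{kg}}{\mathrm{s}^{3}},
\qquad
[\rho\,v^{3}] = \frac{\mathrm{kg}}{\mathrm{m}^{3}}\cdot\frac{\mathrm{m}^{3}}{\mathrm{s}^{3}}
             = \frac{\mathrm{kg}}{\mathrm{s}^{3}}
\;\;\Longrightarrow\;\;
q \sim \tfrac{1}{2}\,\rho\,v^{3} .
```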

Many more interesting things could be calculated (Hicks 2009), but just like the movie had to fit everything into two hours and seven minutes, I will now end my blog for the sake of time. I hope I can be pardoned for the statement that, with the Wolfram Language, the sky’s the limit.



In his book *Idea Makers*, Stephen Wolfram devotes a chapter to Leibniz. Wolfram visited the Leibniz archive in Hanover and wrote about it:

Leafing through his yellowed (but still robust enough for me to touch) pages of notes, I felt a certain connection—as I tried to imagine what he was thinking when he wrote them, and tried to relate what I saw in them to what we now know after three more centuries…. [A]s I’ve learned more, and gotten a better feeling for Leibniz as a person, I’ve realized that underneath much of what he did was a core intellectual direction that is curiously close to the modern computational one that I, for example, have followed.

Leibniz was an early visionary of computing, and built his own calculator, which Wolfram photographed when he visited the archive.

In a recent talk about AI ethics, Wolfram talked more about how Leibniz’s visions of the future are embodied in current Wolfram technologies:

Leibniz—who died 300 years ago next month—was always talking about making a universal language to, as we would say now, express [mathematics] in a computable way. He was a few centuries too early, but I think now we’re finally in a position to do this…. With the Wolfram Language we’ve managed to express a lot of kinds of things in the world—like the ones people ask Siri about. And I think we’re now within sight of what Leibniz wanted: to have a general symbolic discourse language that represents everything involved in human affairs….

If we look back even to Leibniz’s time, we can see all sorts of modern concepts that hadn’t formed yet. And when we look inside a modern machine learning or theorem proving system, it’s humbling to see how many concepts it effectively forms—that we haven’t yet absorbed in our culture.

The Wolfram Language is a form of philosophical language, what Leibniz called a *lingua generalis*, a universal language to be used for calculation. What would Leibniz have made of the tools we have today? How will these tools transform our world? In his essay on Leibniz, Wolfram mulls this over:

In Leibniz’s whole life, he basically saw less than a handful of computers, and all they did was basic arithmetic. Today there are billions of computers in the world, and they do all sorts of things. But in the future there will surely be far far more computers (made easier to create by the Principle of Computational Equivalence). And no doubt we’ll get to the point where basically everything we make will explicitly be made of computers at every level. And the result is that absolutely everything will be programmable, down to atoms. Of course, biology has in a sense already achieved a restricted version of this. But we will be able to do it completely and everywhere.

Leibniz was also a major figure in philosophy, best known for his contention that we live in the “best of all possible worlds,” and his development in his book *Monadology* of the concept of the *monad*: an elementary particle of metaphysics that has properties resulting in what we observe in the physical world.

Wolfram speculates that the concept of the monad may have motivated Leibniz’s invention of binary:

With binary, Leibniz was in a sense seeking the simplest possible underlying structure. And no doubt he was doing something similar when he talked about what he called “monads”. I have to say that I’ve never really understood monads. And usually when I think I almost have, there’s some mention of souls that just throws me completely off.

Still, I’ve always found it tantalizing that Leibniz seemed to conclude that the “best of all possible worlds” is the one “having the greatest variety of phenomena from the smallest number of principles”. And indeed, in the prehistory of my work on *A New Kind of Science*, when I first started formulating and studying one-dimensional cellular automata in 1981, I considered naming them “polymones”—but at the last minute got cold feet when I got confused again about monads.

Despite being the daughter of a physicist and having heard about elementary particles since infancy, I am a bit boggled by the concept of the monad. As I contemplate Leibniz’s strange bridge between metaphysics and such things as electrons or the mathematical definition of a point, I am reminded of lines from *Candide*, a book Voltaire wrote satirizing the notion that we live in the best of all possible worlds:

“But for what purpose was the earth formed?” asked Candide.

“To drive us mad,” replied Martin.

Yet knowledge is increasingly digitized in the twenty-first century, a process that relies on that binary language Leibniz invented. I think perhaps that if monads as such did not exist in Leibniz’s time, it may have become necessary to invent them.
