In the past, using *Mathematica* has always involved first installing software on your computer. But as of today that’s no longer true. Instead, all you have to do is point a web browser at *Mathematica* Online, then log in, and immediately you can start to use *Mathematica*—with zero configuration.

Here’s what it looks like:

It’s a notebook interface, just like on the desktop. You interactively build up a computable document, mixing text, code, graphics, and so on—with inputs you can immediately run, hierarchies of cells, and even things like Manipulate. It’s taken a lot of effort, but we’ve been able to implement almost all the major features of the standard *Mathematica* notebook interface purely in a web browser—extending CDF (Computable Document Format) to the cloud.

There are some tradeoffs of course. For example, Manipulate can’t be as zippy in the cloud as it is on the desktop, because it has to run across the network. But because its Cloud CDF interface is running directly in the web browser, it can immediately be embedded in any web page, without any plugin, like right here:

Another huge feature of *Mathematica* Online is that because your files are stored in the cloud, you can immediately access them from anywhere. You can also easily collaborate: all you have to do is set permissions on the files so your collaborators can access them. Or, for example, in a class, a professor can create notebooks in the cloud that are set so each student gets their own active copy to work with—that they can then email or share back to the professor.

And since *Mathematica* Online runs purely through a web browser, it immediately works on mobile devices too. Even better, there’s soon going to be a Wolfram Cloud app that provides a native interface to *Mathematica* Online, both on tablets like the iPad, and on phones:

There are lots of great things about *Mathematica* Online. There are also lots of great things about traditional desktop *Mathematica*. And I, for one, expect routinely to use both of them.

They fit together really well. Because from *Mathematica* Online there’s a single button that “peels off” a notebook to run on the desktop. And within desktop *Mathematica*, you can seamlessly access notebooks and other files that are stored in the cloud.

If you have desktop *Mathematica* installed on your machine, by all means use it. But get *Mathematica* Online too (which is easy to do—through Premier Service Plus for individuals, or a site license add-on). And then use the Wolfram Cloud to store your files, so you can access and compute with them from anywhere with *Mathematica* Online. And so you can also immediately share them with anyone you want.

By the way, when you run notebooks in the cloud, there are some extra web-related features you get—like being able to embed inside a notebook other web pages, or videos, or actually absolutely any HTML code.

*Mathematica* Online is initially set up to run—and store content—in our main Wolfram Cloud. But it’ll soon also be possible to get a Wolfram Private Cloud—so you operate entirely in your own infrastructure, and for example let people in your organization access *Mathematica* Online without ever using the public web.

A few weeks ago we launched the Wolfram Programming Cloud—our very first full product based on the Wolfram Language, and Wolfram Cloud technology. *Mathematica* Online is our second product based on this technology stack.

The Wolfram Programming Cloud is focused on creating deployable cloud software. *Mathematica* Online is instead focused on providing a lightweight web-based version of the traditional *Mathematica* experience. Over the next few months, we’re going to be releasing a sequence of other products based on the same technology stack, including the Wolfram Discovery Platform (providing unlimited access to the Wolfram Knowledgebase for R&D) and the Wolfram Data Science Platform (providing a complete data-source-to-reports data science workflow).

One of my goals since the beginning of *Mathematica* more than a quarter century ago has been to make the system as widely accessible as possible. And it’s exciting today to be able to take another major new step in that direction—making *Mathematica* immediately accessible to anyone with a web browser.

There’ll be many applications. From allowing remote access for existing *Mathematica* users. To supporting mobile workers. To making it easy to administer *Mathematica* for project-based users, or on public-access computers. As well as providing a smooth new workflow for group collaboration and for digital classrooms.

But for me right now it’s just so neat to be able to see all the power of *Mathematica* immediately accessible through a plain old web browser—on a computer or even a phone.

And all you need do is go to the *Mathematica* Online website…

Throughout the two weeks, students learned the Wolfram Language from a variety of faculty and guest speakers. They had the opportunity to hear Stephen Wolfram speak about the company, the Wolfram Cloud, and all of the other exciting products to come in the near future. Students also heard from guest speakers such as Vitaliy Kaurov and Christopher Wolfram, who showed off other aspects of the Wolfram Language.

Although students spent a vast amount of time hard at work on their projects, they also had many laughs throughout the program. They participated in group activities such as the human knot, the Spikey photo scavenger hunt, and the toothpick and gumball building contest, as well as weekend field trips to the Boston Museum of Science and laser tag.

Every year, I am thoroughly impressed by the projects students complete within two short weeks; of course this wouldn’t be possible without our talented faculty and awesome students!

Let’s check out some projects from this year:

Single- and Multibank Engines Using Four-Stroke Cycle, by Alec Nelson

Law of Moments for Lever with Two Weights, by Mariam Martirosyan

And here are some more to check out on the Wolfram Demonstrations Project:

- Color Schemes, by Katie Borg
- Scuderi Split Cycle Engine, by Ray Yuan
- Coulomb’s Law for Three Aligned Charges, by Eduard Mihranyan

You can check out all of the projects created at *Mathematica* Summer Camp. To find out more information about *Mathematica* Summer Camp 2015, check out our website.

In college a few years later, I spent some hours trying, and failing, to find any knight’s tour, using pencil and paper in various systematic and haphazard ways. And for no particular reason, this memory came to me while I was driving to work today, along with the realization that the problem can be reduced to finding a Hamiltonian cycle—a closed path that visits every vertex—of the graph of possible knight moves. Something that is easy to do in *Mathematica*. Here is how.

The first thing that we have to do is construct the graph of knight moves. A knight can move to eight possible squares in the open, but as few as two in the corners. But if you ignore that and think of when you were taught chess, you were probably told that a knight could move along two and across one, or vice versa. This gives the knight the property that it always moves a `EuclideanDistance` of √5. We can use this test to construct the edges of our graph from the list of all possible pairs of positions.

Sorting removes unhelpful multiple edges that just slow the computation down. Ignoring chess conventions, I choose *x* and *y* to both take values 1–8 (rather than *x* being from A to H).

Now that we have the graph, we can find a tour.

Which looks very nice visualized on a chess board.

I understand that this problem is often used as an exercise in computer science courses, and you can find some iterative solutions and specific solutions at demonstrations.wolfram.com. But at four lines, this graph theory solution isn’t too much of a challenge. Indeed, it seemed a little unsatisfying to stop here.

Giving `FindHamiltonianCycle` a second argument finds more than one tour. But scrolling through a notebook of the first 1,000 didn’t reveal anything interesting. And the properties of differently sized chess boards have been well studied. So I decided to extend the idea in a slightly more whimsical way.

Oddly, all the definitions of a knight’s tour that I found on the web assume a rectangular space. Indeed, *Mathematica* contains a built-in function `KnightTourGraph`, which does roughly the same as my `knightGraph` function, but assumes a rectangular grid. But if I bring in *Mathematica*‘s image processing, it is easy to generate the knight’s tour graph over points in non-rectangular shapes, like this:

First I `Binarize` the pixels of some source image. The knight is allowed to visit the black pixels in the result, but not the white. Then I get the coordinates of the black pixels and map them to the coordinate space of my plotting function.

Now I just solve and plot that graph as before, but this time visualize with the black pixels, instead of a chess-board grid.

It turns out that most irregular shapes do not have a closed knight’s tour, but messing around with the image size can find exceptions. So a bit of search code is needed. This one takes a single character, rasterizes it at different font sizes, and returns the size for which a tour exists. The `Monitor` part lets you watch the progress; tour finding scales poorly and can get very slow if you let the font sizes grow too big.

Let’s start by searching simple filled circles.

So out of the 42 sizes tested, only these 5 have solutions. The smaller, more crowded font sizes produce more irregular tour paths.

But larger image sizes make the shape more recognizable.

Playing around with other characters is amusing for a while. Here is the letter “o” in 34 point:

An asterisk at 52 point:

And since chess puts us firmly in the “games” space, here are some playing card suits:

And finally my rendition of Pacman:

Feel free to share any interesting discoveries in the comments section.

Download this post as a Computable Document Format (CDF) file.

]]>If you want to follow along, you can download a trial of *SystemModeler*. It’s also available with a student license, or you can buy a home-use license.

Let’s start with the simplest electrical circuit I can think of:

I named it `HelloSystemModeler` in the tradition of computer language tutorials, where most first simple programs usually start with a greeting like “Hello, world.” This circuit only contains three components: a resistor, a voltage source (like an ideal battery), and a ground. To create this circuit, I used drag-and-drop from the built-in components that come with *SystemModeler*.

This circuit is something that you would build in middle school (at least I did), with the resistor replaced with a small light bulb. You’d also learn Ohm’s law, stating that *I = V/R*, so with a potential of 10 volts and a resistance of 5 ohms, we get a current of 2 amperes. Let’s see how *SystemModeler* handles this.

Simulate the circuit using `WSMSimulate`:

I can now query the simulation object for all variables and parameters. Let’s look at the current in the resistor:

Now let’s go for something a bit more interesting and usable, say, an RC circuit. The name refers to the two components in the circuit: a resistor (R) and a capacitor (C). Together they act as a low-pass filter that will filter out the high-frequency components of a signal.

The capacitor and resistor components are defined directly from their constitutive equations: for a capacitor, we in general have the equation *I(t) = C dV(t)/dt*, and if we look at the model for the capacitor, we see exactly this equation popping up:

The code above is written in the object-oriented modeling language Modelica, which *SystemModeler* implements. Not shown here is the annotation in the code that specifies how the component should look when used in the graphical user interface.

The same way of defining components from equations works for the resistor, inductor, and so on. This brings a very powerful concept to the modeling table, namely, to be able to define components and let the computer figure out a form of the complete system that can be simulated. Less burden on the modeler and more work for the computer. If I didn’t have such an acausal modeling environment, I’d have to derive the complete relationship from input to output, and then implement that function. That would look something like this:

Not only has the real-world reassemblance completely disappeared, but it’s also hard to reuse this if I want to change the structure and add an extra component, say an inductor, to it. I would have to redo all my work, whereas in *SystemModeler* I can simply connect it where I want it, and the system will regenerate the simulation equations.

Returning to the RC circuit, in the diagram below, the filter is connected to the power sources and a load resistor.

Next I run a simulation with this model and can directly see the result of the filter. In the diagram below, the high-frequency signal is damped.

Another way to see this is to look at the frequency spectrum of these signals. First I sample the signals:

Then I can look at their spectra by using the discrete Fourier transform:

Here I can see the peak 600 Hz for both signals, and for the input signals there’s a second peak at 30,000 Hz, where the noise is added.

There are many ways to improve this filter, and one of them is to add an amplifier to the circuit. Let’s take a look first at an inverting amplifier circuit, which happens to be one of the first lab exercises I did at university.

The operational amplifier in the center of the next figure will amplify the voltage at the negative pin by a factor of *resistor1/resistor = 30000/10000 = 3*, if that value is within the limits from the two constant voltages I’ve connected to the circuit. If it’s outside the limits, the limits will be the value. The constant voltages are 25 volts in this case.

Simulating this circuit, we can see exactly how that plays out:

The voltage reaches 25 V, where the limit for the operational amplifier is, and saturates there.

There are many applications for operational amplifiers, such as amplifiers, regulators, and active filters. Let’s now look at an active low-pass filter, similar to our first RC filter, but with the operational amplifier added:

An interesting application made possible by *SystemModeler* and *Mathematica* being symbolic tools is the possibility to derive analytical expressions for the transfer functions of circuits. To do that I’ll first get a linearized state-space model:

The state-space model is a set of ordinary differential equations (ODEs) that relate the input *Vi* to the output *Vo*. The variables of the ODEs are `capacitor1.v` and `capacitor2.v`, which are the states of the state-space model.

It’s possible to directly convert the state-space model to a transfer function, displaying the relationship between input voltage *Vi* and output voltage *Vo*:

Or display its Bode plot to show the cut-off frequency:

The cut-off frequency is where the magnitude of output voltage is 3 dB lower than the input voltage. In the above plot, one can see that the frequency is around .7 rad/s by moving the slider.

So far, so good.

Now I’m going to look at an example where I don’t know how to directly apply Ohm’s law. Rectification is the process of taking an alternating current and turning it into a direct current. This is what happens in your laptop charger, which takes the AC supplied from the wall socket and converts it into DC that is usable in your computer.

To do that, a full-wave rectifier is usually used, with a so-called diode bridge:

The diodes in the bridge will “fire” so that a positive current will always flow into the positive pin on the DC side.

Next I connect this to a power source and a load:

Running a simulation, we can now see the how this full-wave rectifier converts the alternating current (blue) into a direct current (red):

As can be seen, there are a lot of ripples in this circuit. To smooth it out, a capacitor can be applied across the DC bus, like so:

Simulating with three different values for the capacitance of the capacitor, we get:

I’ll pick one of these capacitor values, say 10 mF, and save the corresponding simulation in a separate variable:

And as we can see, the behavior is smoother. How much ripple is there? Instead of looking at values in the plot and calculating it with my calculator, *Mathematica* can do the work for me by finding the maximum and minimum values in a cycle:

Now that I have them, I can go on to mark them in a plot:

Or just calculate the ripple as a percentage of the mean signal value:

OK, so that’s about a 7% ripple. We can improve this by adding an inductor on the DC bus side together with the capacitor.

Another interesting analysis task is to look at the efficiency of the conversion. For that task, I connect power sensors to the AC and DC sides.

In a separate simulation, I integrate the power consumed on both sides and look at the quotient:

As we can see, the efficiency at first is low, when the capacitor is charging. But later the efficiency is very high.

I’ll end this blog post by showing how these electrical components in turn can be connected to other domains. One such example is an electrical motor. So what I’ll do is connect our rectifier to a DC motor and some mechanical components, including a torque opposite of the motor’s that kicks in 10 seconds after the motor starts.

The load connected to the motor will spin up and find a steady state value around 10 rad/s. After 10 seconds, a torque (the component on the right in the above diagram) with the opposite direction from the DC motor’s torque starts to turn, and the load is slowed down.

With the new release of *SystemModeler* 4, even more use cases are possible, including using the digital library to construct AD/DA converters and other digital circuits or quasi-stationary circuits for fast large analog circuit simulations. You can also create models directly from equations in *Mathematica*.

*SystemModeler* can give you a leg up in your next course or project. Keep an eye on the Wolfram Blog for further posts in this series.

Download this post, as a Computable Document Format (CDF) file, and its accompanying models.

]]>Disclaimer: you may read, carry out, and modify inputs in this blog post independent of your age. Hands-on taste tests might require a certain minimal legal age (check your countries’ and states’ laws).

We start by importing two images from Wikipedia to set the theme; later we will use them on maps.

We will restrict our analysis to the lower 48 states. We get the polygon of the US and its latitude/longitude boundaries for repeated use in the following.

And we define a function that tests if a point lies within the continental US.

We start with beer. Let’s have a look at the yearly US beer production and consumption over the last few decades.

This production puts the US in second place, after China, on the world list of beer producers. (More details about the international beer economy can be found here.)

And here is a quick look at the worldwide per capita beer consumption.

The consumption of the leading 30 countries in natural units, kegs of beer:

Some countries prefer drinking wine (see here for a detailed discussion of this subject). The following graphic shows (on a logarithmic base 2 scale) the ratio of beer consumption to wine consumption. Negative logarithmic ratios mean a higher wine consumption compared to beer consumption. (See the American Association of Wine Economists’ working paper no. 79 for a detailed study of the correlation between wine and beer consumption with GDP, mean temperature, etc.)

We start with the beer breweries. To plot and analyze, we need a list of breweries. The Wolfram Knowledgebase contains data about a lot of companies, organizations, food, geographic regions, and global beer production and consumption. But breweries are not yet part of the Wolfram Knowledgebase. With some web searching, we can more or less straightforwardly find a web page with a listing of all US breweries. We then import the data about 2600+ beer breweries in the US as a structured dataset. This is an all-time high over the last 125 years. (For a complete list of historical breweries in the US, you can become a member of the American Breweriana Association and download their full database, which also covers long-closed breweries.)

Here are a few randomly selected entries from the dataset.

We see that for each brewery, we have their name, the city where they are located, their website URL, and their phone number (the BC, BP, and similar abbreviations stand for if and what you can eat with your beer, which is irrelevant for today’s blog post).

Next, we process the data, remove breweries no longer in operation, and extract brewery names, addresses, and ZIP codes.

We now have data for 2600+ breweries.

For a geographic analysis, we resolve the ZIP codes to actual lat/long coordinates using the `EntityValue` function.

Unfortunately, not all ZIP codes were resolved to actual latitudes and longitudes. These are the ones where we did not successfully find a geographic location.

Why did we not find coordinates for these ZIP codes? As frequently happens with non-programmatically curated data, there are mistakes in the data, and so we will have to clean it up. The easiest way would be to simply ignore these breweries, but we can do better. These are the actual entries of the breweries with missing coordinates.

A quick check at the USPS website shows that, for instance, the first of the above ZIP codes, 54704, is not a ZIP code that the USPS recognizes and/or delivers mail to.

So no wonder the Wolfram Knowledgebase was not able to find a coordinate for this “ZIP code”. Fortunately, we can make progress in fixing the incorrect ZIP codes programmatically. Assume the nonexistent ZIP code was just a typo. Let’s find a ZIP code in Madison, WI that has a small string distance to the ZIP code 54704.

The ZIP code 53704 is in string (and Euclidean) distance as near as possible to 54704.

And taking a quick look at the company’s website confirms that 53704 is the correct ZIP code. This observation, together with the programmatic ZIP code lookups, allows us to define a function to programmatically correct the ZIP codes in case they are just simple typos.

For instance, for Black Market Brewing in Temecula, we find that the corrected ZIP code is 92590.

So, to clean the data, we perform some string replacements to get a dataset that has ZIP codes that exist.

We now acquire coordinates again for the corrected dataset.

Now we have coordinates for all breweries.

And all ZIP codes are now associated with a geographic position. (At least when I wrote the blog post; because the used website gets regularly updated, at a later point in time new typos could have occurred and the `fixDataRules` would have to be updated appropriately.)

Now that we have coordinates, we can make a map with all the breweries indicated.

Let’s pause for a moment and think about what goes into beer. According to the *Reinheitsgebot* from November 1487, it’s just malted barley, hops, and water (plus yeast). The detailed composition of water has an important influence on a beer’s taste. The water composition in turn relates to hydrogeology. (See this paper for a detailed discussion of the relation.) Carrying out a quick web search lets us find a site showing important natural springs in the US. We import the coordinates of the springs and plot them together with the breweries.

We redraw the last map, but this time add the natural springs in blue. Without trying to quantify the correlation here between breweries and springs, a visual correlation is clearly visible.

We quickly calculate a plot of the distribution of the distances of a brewery to the nearest spring from the list `springPositions`.

And if we connect each brewery to the nearest spring, we obtain the following graphic.

We can also have a quick look at which regions of the US can use their local barley and hops, as the Wolfram Knowledgebase knows in which US states these two plants can be grown.

(For the importance of spring water for whiskey, see this paper.) Most important for a beer’s taste is the hops (see this paper and this paper for more details). The -acids of hops give the beer its bitter taste. The most commonly occurring -acid in hops is humulone. (To refresh your chemistry knowledge, see the Step-by-step derivation for where to place the dots in the below diagram.)

But let’s not be sidetracked by chemistry and instead focus in this blog post on geographic aspects relating to beer.

Historically, a relationship has existed between beer production and the church (in the form of monasteries; see “A Comprehensive History of Beer Brewing” for details). Today we don’t see a correlation (other than through population densities) between religion and beer production. Just to confirm, let’s draw a map of major churches in the US together with the breweries. At the website of the Hartford Institute, we find a listing of major churches. (Yes, it would have been fun to really draw all 110,000+ churches of the US on a map, but the blog team did not want me to spend $80–$100 to buy a US church database and support spam-encouraging companies, e.g from here or here.)

Back to the breweries. Instead of a cloud of points of individual breweries we can construct a continuous brewery probability field and plot it. This more prominently shows the hotspots of breweries in the US. To do so, we calculate a smooth kernel distribution for the brewery density in projected coordinates. We use the Sheather–Jones bandwidth estimator, which relieves us from needing to specify an explicit bandwidth. Determining the optimal bandwidth is a nontrivial calculation and will take a few minutes.

We plot the resulting distribution and map the resulting image onto a map of the US. Blue denotes a low brewery density and red a high one. Denver, Oregon, and Southern California clearly stand out as local hotspots.

The black points on top of the brewery density map are the actual brewery locations.

Using the brewery density as an elevation, we can plot the beer topography of the US. Previously unknown (beer-density) mountain ranges and peaks become visible in topographically flat areas.

The next graphic shows a map where we accumulate the brewery counts by latitude and longitude. Similar to the classic wheat belt, we see two beer belts running East to West and two beer belts running North to South.

Let’s determine the elevations of the breweries and make a histogram to see whether there is more interest in a locally grown beer at low or high elevations.

It seems that elevations between 500 and 1500 ft are most popular for places making a fresh cold barley pop (with an additional peak at around 5000 ft caused by the many breweries in the Denver region).

For further use, we summarize all relevant information about the breweries in `breweryData`.

We define some functions to find the nearest brewery and the distance to the nearest brewery.

Here are the nearest breweries from the Wolfram headquarters In Champaign, IL.

And here is a plot of the distances from Champaign to all breweries, sorted by size. After accounting for the breweries in the immediate neighborhood of Champaign, for the first nearly 1000 miles we see a nearly linear increase in the number of breweries with a slope of approximately 2.1 breweries/mile.

Now that we know where to find a freshly brewed beer, let’s switch focus and concentrate on whiskey distilleries. Again, after some web searching we find a web page with a listing of all distilleries in the continental US. Again, we read in the data, this time in unstructured form, extract the distillery and cities named, and carry out some data cleanup as we go.

This time, we have the name of the distillery, their website, and the city as available data. Here are some example distilleries.

A quick check shows that we did a proper job in cleaning the data and now have locations for all distilleries.

We now have a list of about 500 distilleries.

We retrieve the elevations of the cities with distilleries.

The average elevation of a distillery does not deviate much from the one for breweries.

We summarize all relevant information about the distilleries in `distilleryData`.

Define functions to find the nearest brewery and the distance to the nearest brewery.

We now use the function `nearestDistilleries` to locate the nearest distillery and make a map of the bearings to take to go to the nearest distillery.

Let’s come back to breweries. What’s the distribution by state? Here are the states with the most breweries.

If we normalize by state population, we get the following ranking.

And which city has the most breweries? We accumulate the ZIP codes by city. Here are the top dozen cities by brewery count.

And here is a more visual representation of the top 25 brewery cities. We show a beer glass over the top brewery cities whose size is proportional to the number of breweries.

Oregon isn’t a very large state, and it includes beer capital Portland, so let’s plan a trip to visit all breweries. To minimize driving, we calculate the shortest tour that visits all of the state’s breweries. (All distances are along geodesics, not driving distances on roads.)

A visit to all Oregon breweries will be a 1,720-mile drive.

And here is a sketch of the shortest trips that hit all breweries for each of the lower 48 states.

Let’s quickly make a website that lets you plan a short beer tour through your state (and maybe some neighboring states). The function `makeShortestTourDisplay` calculates and visualizes the shortest path. For comparison, the length of a tour with the breweries chosen in random order is also shown. The shortest path often allows us to save a factor 5…15 in driving distances.

We deploy the function `makeShortestTourDisplay` to let you easily plan your favorite beer state tours.

And if the reader has time to take a year off work, a visit to all breweries in the continental US is just a 41,000-mile trip.

The collected caps from such a trip could make beautiful artwork! Here is a graphic showing one of the possible tours. The color along the tour changes continuously with the spectrum, and we start in the Northeast.

On average, we would have to drive just 15 miles between two breweries.

Here is a distribution of the distances.

Such a trip covering all breweries would involve driving nearly 300 miles up and down.

Here is a plot of the height profile along the trip.

We compare the all-brewery trip with the all-distillery trip, which is still about 21,000 miles.

To calculate the distribution function for the average distance from a US citizen to the nearest brewery and similar facts, we build a list of coordinates and the population of all ZIP code regions. We will only consider the part of the population that is older than 21 years. We retrieve this data for the ~30,000 ZIP codes.

We exclude the ZIP codes that are in Alaska, Hawaii, and Guam and concentrate on the 48 states of the continental US.

We will take into account adults from the ~29,000 populated ZIP code areas with a non-vanishing number of adults totaling about 214 million people.

Now that we have a function to calculate the distance to the nearest brewery at hand and a list of positions and populations for all ZIP codes, let’s do some elementary statistics using this data.

Here is a plot of the distribution of distances from all ZIP codes to the nearest brewery.

More than 32 million Americans have a local brewery within their own ZIP code region.

While ~15% of the above-drinking-age population is located in the same ZIP code as a brewery, this does not imply zero distance to the next brewery. As a rough estimation, we will model the distribution within a ZIP code as the distance between two random points. In the spirit of the famous spherical cow, the shape of a ZIP code we will approximate as a disk. Thus, we need the size distribution of the ZIP code areas.

The average distance between two randomly selected points from a disk is approximately the radius of the disk itself.

Within our crude model, we take the areas of the cities and calculate the radius of the corresponding disk. We could do a much more refined Monte Carlo model using the actual polygons of the ZIP code regions, but for the qualitative results that we are interested in, this would be overkill.

Now, with a more refined treatment of the same ZIP code data, on average, for a US citizen in the lower 48 states, the nearest brewery is still only about 13.5 miles away.

And, modulo a scale factor, the distribution of distances to the nearest brewery is the same as the distribution above.

Let’s redo the same calculation for the distilleries.

The weighted average distance to the nearest distillery is about 30 miles for the above-drinking-age customers of the lower 48 states.

And for about 1 in 7 Americans the nearest distillery is closer then the nearest brewery.

We define a function that, for a given geographic position, calculates the distance to the nearest brewery and the nearest distillery.

E.g. if you are at Mt. Rushmore, the nearest brewery is just 18 miles away, while the nearest distillery is nearly 160 miles away.

For some visualizations to be made below, for a dense grid of points in the US, find the distance to the nearest brewery and the nearest distillery. It will take 20 minutes to calculate these 320,000 distances, so we have time to visit the nearest espresso machine in the meantime.

So, how far away can the nearest brewery be from an adult US citizen (within the lower 48 states)? We calculate the maximal distance to a brewery.

We find that the city furthest away from a freshly brewed beer is Ely in Nevada–about 170 miles away.

And here is the maximal distance to a distillery. From Redford, Texas it is about 335 miles to the nearest distillery.

Of the inhabitants of these two cities, the people from Ely have “only” a 188-mile distance to a distillery and the people from Redford are 54 miles from the next brewery.

After having found the external distance cities, the next natural question is for the city that has the maximal distance to either a brewery or a distillery.

Let’s have a look at the situation in the middle of Kansas. The ~100 adult citizens of Manter, Kansas are quite far away from a local alcoholic drink.

And here is a detailed look at the breweries/distilleries situation near Manter.

Now that we have the detailed distances for a dense grid of points over the continental US, let’s visualize this data. First, we make plots showing the distance, where blue indicates small distances and red dangerously large distances.

Using these distance plots properly projected into the US yields a more natural-looking image.

And here is the corresponding image for distilleries. Note the clearly visible great Distillery Ridge mountain range between Eastern US distilleries and Western US distilleries.

For completeness, here is the maximum of either the distance to the nearest brewery or the distance to the nearest distillery.

And here is the equivalent 3D image with the distance to the next brewery or distillery shown as vertical elevation. We also use a typical elevation plot coloring scheme for this graphic.

We can also zoom into the Big Dry Badlands mountain range to the East of Denver as an equal-distance-to-freshly-made-alcoholic-drink contour plot. The regions with a distance larger than 100 miles to the nearest brewery or distillery are emphasized with a purple background.

Or, more explicit graphically, we can use the beer and whiskey images from earlier to show the regions that are closer to a brewery than to a distillery and vice versa. In the first image, the grayed-out regions are the ones where the nearest distillery is at a smaller distance than the nearest brewery. The second image shows regions where the nearest brewery is at a smaller distance than the nearest distillery in gray.

There are many more bells and whistles that we could add to these types of graphics. For instance, we could add some interactive elements to the above graphic that show details when hovering over the graphic.

Earlier in this blog post, we constructed an infographic about beer production and consumption in the US over the last few decades. After having analyzed distillery locations, a natural question is what role whiskey plays among all spirits. This paper analyzes the average alcohol content of spirits consumed in the US over a 50+ year time span at the level of US states. If you have a subscription, you can easily import the main findings of the study, which is Table 1.

Here is a snippet of the data. The average alcohol content of the spirits consumed decreased substantially from 1950 to 2000, mainly due to a decrease in whiskey consumption.

Here is a graphical representation of the data from 1950 to 2000.

So far we have concentrated on beer- and whiskey-related issues on a geographic scale. Let’s finish with some stats and infographics on the kinds of beer produced in the breweries mapped above. Again, after some web searching, we find a page that lists the many types of beer, 160+ different styles to be precise. (See also the *Handbook of Brewing* and the “Brewers Association 2014 Beer Style Guidelines” for a detailed discussion of beer styles.)

We again import the data. The web page is perfectly maintained and checked, so this time we do not have to carry out any data cleanup.

How much beer one can drink depends on the alcohol content. Here is the distribution of beer styles by alcohol content. Hover over the graph to see the beer styles in the individual bins.

Beer colors are defined on a special scale called *Standard Reference Method* (SRM). Here is a translation of the SRM values to RGB colors.

How do beer colors correlate with alcohol content and bitterness? The following graphic shows the parameter ranges for the 160+ beer styles. Again, hover over the graph to see the beer style categories highlighted.

In an interactive 3D version, we can easily restrict the color values.

After visualizing breweries in the US and analyzing the alcohol content of beer types, what about the distribution of the actual brewed beers within the US? After doing some web searching again, we can find a website that lists breweries and the beers they brew.

So, let’s read in the beer data from the site for 2,600 breweries. We start with preparing a list of the relevant web pages.

Next, we prepare for processing the individual pages.

As this will take a while, we can display the breweries, their beers, and a link to the brewery website to entertain us in the meantime. Here is an example of what to display while waiting.

Now we process the data for all breweries. Time for another cup of coffee. To have some entertainment while processing the beers of 2,000+ breweries, we again use `Monitor` to display the last-analyzed brewery and their beers. We also show a clickable link to the brewery website so that the reader can choose a beer of their liking.

Here is a typical data entry. We have the brewery name, its location, and, if available, the actual beers, their classification as Lager, Bock, Doppelbock, Stout, etc., together with their alcohol content.

Here is the distribution of the number of different beers made by the breweries. To get a feeling, we will quickly import some example images.

Concretely, more than 24,400 US-made beers were listed in the just-imported web pages.

Accumulating all beers gives the following cumulative distribution of the alcohol content.

On average, a US beer has an alcohol content (by volume) of (6.72.1)%.

If we tally up by beer type, we get the following distribution of types. India Pale Ale is the winner, followed by American Pale Ale.

Now let’s put the places where a Hefeweizen is freshly brewed on a map.

And here are some healthy breakfast beers with oatmeal or coffee (in the name).

For the carnivorous beer drinkers, there are plenty of options. Here are samples of beers with various mammals and fish in their name. (Using `Select[#&@@@Flatten[Last/@Take[brewerBeerDataUS,All],2],DeleteCases[Interpreter["Species"][
StringSplit[#]],_Failure]=!={}&]`, we could get a complete list of all animal beers.)

What about the names of the individual beers? Here is the distribution of their (string) lengths. Hover over the columns to see the actual names.

Presume you plan a day trip up to 125 miles in radius (meaning not longer than about a two-hour drive in each direction). How many different beers and beer types would you encounter as a function of your starting location? Building a fast lookup for the breweries up to distance *d*, you can calculate these numbers for a dense set of points across the US and visualize the resulting data geographically. (For simplicity, we assume a spherical Earth for this calculation.)

In the best-case scenario, you can try about 80 different beer types realized through more than 2000 different individual beers within a 125-mile radius.

After so much work doing statistics on breweries, beer colors, beer names, etc., let’s have some real fun: let’s make some fun visualizations using the beers and logos of breweries.

Many of the brewery homepages show images of the beers that they make. Let’s import some of these and make a delicious beer (bottle, can, glass) collage.

We continue by making a reduced version of `brewerBeerDataUS` that contains the breweries and URLs by state.

Fortunately, many of the brewery websites have their logo on the front page, and in many cases the image has *logo* in the image filename. This means a possible automated way to get images of logos is to read in the front page of the web presences of the breweries.

We will restrict our logo searches to logos that are not too wide or too tall, because we want to use them inside graphics.

We also define a small list of special-case lookups, especially for states that have only a few breweries.

Now we are ready to carry out an automated search for brewery logos. To get some variety into the vizualizations, we try to get about six different logos per state.

After removing duplicates (from breweries that brew in more than one state), we have about 240 images at hand.

A simple collage of brewery logos does not look too interesting.

So instead, let’s make some random and also symmetrized kaleidoscopic images of brewery logos. To do so, we will map the brewery logos into the polygons of a radial-symmetric arrangement of polygons. The function `kaleidoscopePolygons` generates such sets of polygons.

The next result shows two example sets of polygons with threefold and fourfold symmetry.

And here are two random beer logo kaleidoscopes.

Here are four symmetric beer logo kaleidoscopes of different rotational symmetry orders.

Or we could add brewery stickers onto the faces of the Wolfram|Alpha Spikey, the rhombic hexecontahedron. As the faces of a rhombic hexecontahedron are quadrilaterals, the images don’t have to be distorted very much.

Let’s end with randomly selecting a brewery logo for each state and mapping it onto the polygons of the state.

The next graphic shows some randomly selected logos from states in the Northeast.

And we finish with a brewery logo mapped onto each state of the continental US.

We will now end and leave the analysis of wineries for a future blog post. For a more detailed account of the distribution of breweries throughout the US over the last few hundred years, and a variety of other beer-related geographical topics, I recommend reading the recent book *The Geography of Beer*, especially the chapter “Mapping United States Breweries 1612 to 2011″. For deciding if a bottle of beer, a glass of wine, or a shot of whiskey is right for you, follow this flowchart.

Download this post as a Computable Document Format (CDF) file.

*To comment, please visit the original post at the Wolfram|Alpha Blog »*

Even without the effect of tree branches, Mars is a difficult planetary target because it is small, and atmospheric turbulence can make shooting it a process of capturing a lot and hoping for the best during those moments when the air stabilizes enough to provide a clear view.

Before the days of photographic imaging, astronomers would sketch Mars at the eyepiece of their telescope, waiting for moments of atmospheric clarity before adding to their drawings. Unfortunately this technique sometimes resulted in features being added to the planet that did not actually exist.

Now the standard way to process planetary images like this is to take a bunch of short exposures and then normalize and stack the collection, weeding out the obviously bad frames. This process increases the signal-to-noise ratio of the final image.

I initially collected my images using software specific to my CCD camera. Here is the path to my collection of Mars exposures in the standard astronomical FITS format.

The first step was to import the collection into *Mathematica*. With `marsPath` set to a directory containing the images, I imported the entire set:

All the images were now in an easy-to-manipulate list. Wolfram Language is a powerful functional programming language, which allowed me to perform image processing by mapping pure functions onto elements of this list instead of writing looping code.

Each frame of my FITS files consisted of a sublist of three color channels, one for red, green, and blue.

I needed to convert them to combined color first, which was done by just mapping the `ColorCombine` function onto each image in the list:

As Mars occupies only a small portion of the frame, I cropped the image to 200 pixels wide, so time wouldn’t be spent processing uninteresting pixels.

Here is what the array of cropped images looked like:

As you can see, Mars was moving around a lot due to atmospheric turbulence, and because of that and the movement of the tree branches, I needed to align the images before combining them. Fortunately *Mathematica* can do this automatically. Here I specified that only a translation transformation should be applied to each image, otherwise `ImageAlign` might try to rotate the images if it got confused. The subsequent line filtered out all the cases where `ImageAlign` failed to find a successful transformation because the image was just too fuzzy.

The FITS images I collected were 16-bit images. They needed to be represented by a more fine-grained data type before doing scaling and arithmetic on the images. Otherwise, I would get banding, and fine detail would have been washed out. In order to prevent this effect, I converted them to use 64-bit real values.

This was the resulting array of aligned images:

Every time I image things like this, I need to eliminate the “clinker” images—ones that are too fuzzy to add to the stack. To sort them out, I created a table of the images I was going to use, initialized to `True`.

It was easy to spin up a quick UI to allow me to sort out the clinkers.

The `useTable` array could then be used to pull out the good images in one step.

To combine the good images into a single image with an improved signal-to-noise ratio, I took their statistical median using the `Median` function.

Of course, it is completely trivial to get the `Mean` of the image list as well, and it is sometimes good to see how the `Mean` image compares to the `Median` image. Other statistical functions, like `TrimmedMean`, could also have been used here.

The final adjustment sharpened the image and brought down the brightness while bringing up the contrast.

Not bad, considering the quality of the initial frames.

*Mathematica* provided all the tools I needed to easily import, prepare, align, combine, and adjust fuzzy planetary images taken through tree branches in my back yard to yield a decent amateur image of Mars.

Download this post as a notebook (.nb) file or CDF (.cdf) file.

Download the compressed (.zip) accompanying Flexible Image Transport System (FITS) files (**Note:** This file is over 100 MB).

This year the ICM is in Seoul, and I’m going to it today. I went to the ICM once before—in Kyoto in 1990. *Mathematica* was only two years old then, and mathematicians were just getting used to it. Plenty already used it extensively—but at the ICM there were also quite a few who said, “I do *pure* mathematics. How can *Mathematica* possibly help me?”

Twenty-four years later, the vast majority of the world’s pure mathematicians do in fact use *Mathematica* in one way or another. But there’s nevertheless a substantial core of pure mathematics that still gets done pretty much the same way it’s been done for centuries—by hand, on paper.

Ever since the 1990 ICM I’ve been wondering how one could successfully inject technology into this. And I’m excited to say that I think I’ve recently begun to figure it out. There are plenty of details that I don’t yet know. And to make what I’m imagining real will require the support and involvement of a substantial set of the world’s pure mathematicians. But if it’s done, I think the results will be spectacular—and will surely change the face of pure mathematics at least as much as *Mathematica* (and for a younger generation, Wolfram|Alpha) have changed the face of calculational mathematics, and potentially usher in a new golden age for pure mathematics.

The whole story is quite complicated. But for me one important starting point is the difference in the typical workflows for calculational mathematics and pure mathematics. Calculational mathematics tends to involve setting up calculational questions, and then working through them to get results—just like in typical interactive *Mathematica* sessions. But pure mathematics tends to involve taking mathematical objects, results or structures, coming up with statements about them, and then giving proofs to show why those statements are true.

How can we usefully insert technology into this workflow? Here’s one simple way. Think about Wolfram|Alpha. If you enter 2+2, Wolfram|Alpha—like *Mathematica*—will compute 4. But if you enter new york—or, for that matter, 2.3363636 or cos(x) log(x)—there’s no single “answer” for it to compute. And instead what it does is to generate a report that gives you a whole sequence of “interesting facts” about what you entered.

And this kind of thing fits right into the workflow for pure mathematics. You enter some mathematical object, result or structure, and then the system tries to tell you interesting things about it—just like some extremely wise mathematical colleague might. You can guide the system if you want to, by telling it what kinds of things you want to know about, or even by giving it a candidate statement that might be true. But the workflow is always the Wolfram|Alpha-like “what can you tell me about that?” rather than the *Mathematica*-like “what’s the answer to that?”

Wolfram|Alpha already does quite a lot of this kind of thing with mathematical objects. Enter a number, or a mathematical expression, or a graph, or a probability distribution, or whatever, and Wolfram|Alpha will use often-quite-sophisticated methods to try to tell you a collection of interesting things about it.

But to really be useful in pure mathematics, there’s something else that’s needed. In addition to being able to deal with concrete mathematical objects, one also has to be able to deal with abstract mathematical structures.

Countless pure mathematical papers start with things like, “Let *F* be a field with such-and-such properties.” We need to be able to enter something like this—then have our system automatically give us interesting facts and theorems about *F*, in effect creating a whole automatically generated paper that tells the story of *F*.

So what would be involved in creating a system to do this? Is it even possible? There are several different components, all quite difficult and time consuming to build. But based on my experiences with *Mathematica*, Wolfram|Alpha, and *A New Kind of Science,* I am quite certain that with the right leadership and enough effort, all of them can in fact be built.

A key part is to have a precise symbolic description of mathematical concepts and constructs. Lots of this now already exists—after more than a quarter century of work—in *Mathematica*. Because built right into the Wolfram Language are very general ways to represent geometries, or equations, or stochastic processes or quantifiers. But what’s not built in are representations of pure mathematical concepts like bijections or abstract semigroups or pullbacks.

Over the years, plenty of mathematicians have implemented specific cases. But could we systematically extend the Wolfram Language to cover the whole range of pure mathematics—and make a kind of “*Mathematica Pura”*? The answer is unquestionably yes. It’ll be fascinating to do, but it’ll take lots of difficult language design.

I’ve been doing language design now for 35 years—and it’s the hardest intellectual activity I know. It requires a curious mixture of clear thinking, aesthetics and pragmatic judgement. And it involves always seeking the deepest possible understanding, and trying to do the broadest unification—to come up in the end with the cleanest and “most obvious” primitives to represent things.

Today the main way pure mathematics is described—say in papers—is through a mixture of mathematical notation and natural language, together with a few diagrams. And in designing a precise symbolic language for pure mathematics, this has to be the starting point.

One might think that somehow mathematical notation would already have solved the whole problem. But there’s actually only a quite small set of constructs and concepts that can be represented with any degree of standardization in mathematical notation—and indeed many of these are already in the Wolfram Language.

So how should one go further? The first step is to understand what the appropriate primitives are. The whole Wolfram Language today has about 5000 built-in functions—together with many millions of built-in standardized entities. My guess is that to broadly support pure mathematics there would need to be something less than a thousand other well-designed functions that in effect define frameworks—together with maybe a few tens of thousands of new entities or their analogs.

Take something like function spaces. Maybe there’ll be a FunctionSpace function to represent a function space. Then there’ll be various operations on function spaces, like PushForward or MetrizableQ. Then there’ll be lots of named function spaces, like “CInfinity”, with various kinds of parameterizations.

Underneath, everything’s just a symbolic expression. But in the Wolfram Language there end up being three immediate ways to input things, all of which are critical to having a convenient and readable language. The first is to use short notations—like + or —as in standard mathematical notation. The second is to use carefully chosen function names—like MatrixRank or Simplex. And the third is to use free-form natural language—like trefoil knot or aleph0.

One wants to have short notations for some of the most common structural or connective elements. But one needs the right number: not too few, like in LISP, nor too many, like in APL. Then one wants to have function names made of ordinary words, arranged so that if one’s given something written in the language one can effectively just “read the words” to know at least roughly what’s going on in it.

But in the modern Wolfram Language world there’s also free-form natural language. And the crucial point is that by using this, one can leverage all the various convenient—but sloppy—notations that actual mathematicians use and find familiar. In the right context, one can enter “L2” for Lebesgue Square Integrable—and the natural language system will take care of disambiguating it and inserting the canonical symbolic underlying form.

Ultimately every named construct or concept in pure mathematics needs to have a place in our symbolic language. Most of the 13,000+ entries in *MathWorld*. Material from the 5600 or so entries in the MSC2010 classification scheme. All the many things that mathematicians in any given field would readily recognize when told their names.

But, OK, so let’s say we manage to create a precise symbolic language that captures the concepts and constructs of pure mathematics. What can we do with it?

One thing is to use it “Wolfram|Alpha style”: you give free-form input, which is then interpreted into the language, and then computations are done, and a report is generated.

But there’s something else too. If we have a sufficiently well-designed symbolic language, it’ll be useful not only to computers but also to humans. In fact, if it’s good enough, people should prefer to write out their math in this language than in their current standard mixture of natural language and mathematical notation.

When I write programs in the Wolfram Language, I pretty much think directly in the language. I’m not coming up with a description in English of what I’m trying to do and then translating it into the Wolfram Language. I’m forming my thoughts from the beginning in the Wolfram Language—and making use of its structure to help me define those thoughts.

If we can develop a sufficiently good symbolic language for pure mathematics, then it’ll provide something for pure mathematicians to think in too. And the great thing is that if you can describe what you’re thinking in a precise symbolic language, there’s never any ambiguity about what anything means: there’s a precise definition that you can just go to the documentation for the language to find.

And once pure math is represented in a precise symbolic language, it becomes in effect something on which computation can be done. Proofs can be generated or checked. Searches for theorems can be done. Connections can automatically be made. Chains of prerequisites can automatically be found.

But, OK, so let’s say we have the raw computational substrate we need for pure mathematics. How can we use this to actually implement a Wolfram|Alpha-like workflow where we enter descriptions of things, and then in effect automatically get mathematical wisdom about them?

There are two seemingly different directions one can go. The first is to imagine abstractly enumerating possible theorems about what has been entered, and then using heuristics to decide which of them are interesting. The second is to start from computable versions of the millions of theorems that have actually been published in the literature of mathematics, and then figure out how to connect these to whatever has been entered.

Each of these directions in effect reflects a slightly different view of what doing mathematics is about. And there’s quite a bit to say about each direction.

Let’s start with theorem enumeration. In the simplest case, one can imagine starting from an axiom system and then just enumerating true theorems based on that system. There are two basic ways to do this. The first is to enumerate possible statements, and then to use (implicit or explicit) theorem-proving technology to try to determine which of them are true. And the second is to enumerate possible proofs, in effect treeing out possible ways the axioms can be applied to get theorems.

It’s easy to do either of these things for something like Boolean algebra. And the result is that one gets a sequence of true theorems. But if a human looks at them, many of them seem trivial or uninteresting. So then the question is how to know which of the possible theorems should actually be considered “interesting enough” to be included in a report that’s generated.

My first assumption was that there would be no automatic approach to this—and that “interestingness” would inevitably depend on the historical development of the relevant area of mathematics. But when I was working on *A New Kind of Science*, I did a simple experiment for the case of Boolean algebra.

There are 14 theorems of Boolean algebra that are usually considered “interesting enough” to be given names in textbooks. I took all possible theorems and listed them in order of complexity (number of variables, number of operators, etc). And the surprising thing I found is that the set of named theorems corresponds almost exactly to the set of theorems that can’t be proved just from ones that precede them in the list. In other words, the theorems which have been given names are in a sense exactly the minimal statements of new information about Boolean algebra.

Boolean algebra is of course a very simple case. And in the kind of enumeration I just described, once one’s got the theorems corresponding to all the axioms, one would conclude that there aren’t any more “interesting theorems” to find—which for many mathematical theories would be quite silly. But I think this example is a good indication of how one can start to use automated heuristics to figure out which theorems are “worth reporting on”, and which are, for example, just “uninteresting embellishments”.

Of course, the general problem of ranking “what’s interesting” comes up all over Wolfram|Alpha. In mathematical examples, one’s asking what region is interesting to plot?, “what alternate forms are interesting?” and so on. When one enters a single number, one’s also asking “what closed forms are interesting enough to show?”—and to know this, one for example has to invent rankings for all sorts of mathematical objects (how complicated should one consider relative to log(343) relative to Khinchin’s Constant, and so on?).

So in principle one can imagine having a system that takes input and generates “interesting” theorems about it. Notice that while in a standard *Mathematica*-like calculational workflow, one would be taking input and “computing an answer” from it, here one’s just “finding interesting things to say about it”.

The character of the input is different too. In the calculational case, one’s typically dealing with an operation to be performed. In the Wolfram|Alpha-like pure mathematical case, one’s typically just giving a description of something. In some cases that description will be explicit. A specific number. A particular equation. A specific graph. But more often it will be implicit. It will be a set of constraints. One will say (to use the example from above), “Let *F* be a field,” and then one will give constraints that the field must satisfy.

In a sense an axiom system is a way of giving constraints too: it doesn’t say that such-and-such an operator “is Nand”; it just says that the operator must satisfy certain constraints. And even for something like standard Peano arithmetic, we know from Gödel’s Theorem that we can never ultimately resolve the constraints–we can never nail down that the thing we denote by “+” in the axioms is the particular operation of ordinary integer addition. Of course, we can still prove plenty of theorems about “+”, and those are what we choose from for our report.

So given a particular input, we can imagine representing it as a set of constraints in our precise symbolic language. Then we would generate theorems based on these constraints, and heuristically pick the “most interesting” of them.

One day I’m sure doing this will be an important part of pure mathematical work. But as of now it will seem quite alien to most pure mathematicians—because they are not used to “disembodied theorems”; they are used to theorems that occur in papers, written by actual mathematicians.

And this brings us to the second approach to the automatic generation of “mathematical wisdom”: start from the historical corpus of actual mathematical papers, and then make connections to whatever specific input is given. So one is able to say for example, “The following theorem from paper X applies in such-and-such a way to the input you have given”, and so on.

So how big is the historical corpus of mathematics? There’ve probably been about 3 million mathematical papers published altogether—or about 100 million pages, growing at a rate of about 2 million pages per year. And in all of these papers, perhaps 5 million distinct theorems have been formally stated.

So what can be done with these? First, of course, there’s simple search and retrieval. Often the words in the papers will make for better search targets than the more notational material in the actual theorems. But with the kind of linguistic-understanding technology for math that we have in Wolfram|Alpha, it should not be too difficult to build what’s needed to do good statistical retrieval on the corpus of mathematical papers.

But can one go further? One might think about tagging the source documents to improve retrieval. But my guess is that most kinds of static tagging won’t be worth the trouble; just as one’s seen for the web in general, it’ll be much easier and better to make the search system more sophisticated and content-aware than to add tags document by document.

What would unquestionably be worthwhile, however, is to put the theorems into a genuine computable form: to actually take theorems from papers and rewrite them in a precise symbolic language.

Will it be possible to do this automatically? Eventually I suspect large parts of it will. Today we can take small fragments of theorems from papers and use the linguistic understanding system built for Wolfram|Alpha to turn them into pieces of Wolfram Language code. But it should gradually be possible to extend this to larger fragments—and eventually get to the point where it takes, at most, modest human effort to convert a typical theorem to precise symbolic form.

So let’s imagine we curate all the theorems from the literature of mathematics, and get them in computable form. What would we do then? We could certainly build a Wolfram|Alpha-like system that would be quite spectacular—and very useful in practice for doing lots of pure mathematics.

But there will inevitably be some limitations—resulting in fact from features of mathematics itself. For example, it won’t necessarily be easy to tell what theorem might apply to what, or even what theorems might be equivalent. Ultimately these are classic theoretically undecidable problems—and I suspect that they will often actually be difficult in practical cases too. And at the very least, all of them involve the same kind of basic process as automated theorem proving.

And what this suggests is a kind of combination of the two basic approaches we’ve discussed—where in effect one takes the complete corpus of published mathematics, and views it as defining a giant 5-million-axiom formal system, and then follows the kind of automated theorem-enumeration procedure we discussed to find “interesting things to say”.

So, OK, let’s say we build a wonderful system along these lines. Is it actually solving a core problem in doing pure mathematics, or is it missing the point?

I think it depends on what one sees the nature of the pure mathematical enterprise as being. Is it science, or is it art? If it’s science, then being able to make more theorems faster is surely good. But if it’s art, that’s really not the point. If doing pure mathematics is like creating a painting, automation is going to be largely counterproductive—because the core of the activity is in a sense a form of human expression.

This is not unrelated to the role of proof. To some mathematicians, what matters is just the theorem: knowing what’s true. The proof is essentially backup to ensure one isn’t making a mistake. But to other mathematicians, proof is a core part of the content of the mathematics. For them, it’s the story that brings mathematical concepts to light, and communicates them.

So what happens when we generate a proof automatically? I had an interesting example about 15 years ago, when I was working on *A New Kind of Science*, and ended up finding the simplest axiom system for Boolean algebra (just the single axiom ((pq)r)(p((pr)p))==r, as it turned out). I used equational-logic automated theorem-proving (now built into FullSimplify) to prove the correctness of the axiom system. And I printed the proof that I generated in the book:

It has 343 steps, and in ordinary-size type would be perhaps 40 pages long. And to me as a human, it’s completely incomprehensible. One might have thought it would help that the theorem prover broke the proof into 81 lemmas. But try as I might, I couldn’t really find a way to turn this automated proof into something I or other people could understand. It’s nice that the proof exists, but the actual proof itself doesn’t tell me anything.

And the problem, I think, is that there’s no “conceptual story” around the elements of the proof. Even if the lemmas are chosen “structurally” as good “waypoints” in the proof, there are no cognitive connections—and no history—around these lemmas. They’re just disembodied, and apparently disconnected, facts.

So how can we do better? If we generate lots of similar proofs, then maybe we’ll start seeing similar lemmas a lot, and through being familiar they will seem more meaningful and comprehensible. And there are probably some visualizations that could help us quickly get a good picture of the overall structure of the proof. And of course, if we manage to curate all known theorems in the mathematics literature, then we can potentially connect automatically generated lemmas to those theorems.

It’s not immediately clear how often that will possible. And indeed, in existing examples of computer-assisted proofs—like for the Four Color Theorem, the Kepler Conjecture, or the simplest universal Turing machine—my impression is that their often-computer-generated lemmas rarely correspond to known theorems from the literature.

But despite all this, I know at least one example showing that with enough effort, one can generate proofs that tell stories that people can understand: the step-by-step solutions system in Wolfram|Alpha Pro. Millions of times a day students and others compute things like integrals with Wolfram|Alpha—then ask to see the steps.

It’s notable that actually computing the integral is much easier than figuring out good steps to show; in fact, it takes some fairly elaborate algorithms and heuristics to generate steps that successfully communicate to a human how the integral can be done. But the example of step-by-step in Wolfram|Alpha suggests that it’s at least conceivable that with enough effort, it would be possible to generate proofs that are readable as “stories”—perhaps even selected to be as short and simple as possible (“proofs from The Book”, as Erdős would say).

Of course, while these kinds of automated methods may eventually be good at communicating the details of something like a proof, they won’t realistically be able to communicate—or even identify—overarching ideas and motivations. Needless to say, present-day pure mathematics papers are often quite deficient in communicating these too. Because in an effort to ensure rigor and precision, many papers tend to be written in a very formal way that cannot successfully represent the underlying ideas and motivations in the mind of the author—with the result that some of the most important ideas in mathematics are transmitted through an essentially oral tradition.

It would certainly help the progress of pure mathematics if there were better ways to communicate its content. And perhaps having a precise symbolic language for pure mathematics would make it easier to express concretely some of those important points that are currently left unwritten. But one thing is for sure: having such a language would make it possible to take a theorem from anywhere, and—like with a typical Wolfram Language code fragment—immediately be able to plug it in anywhere else, and use it.

But back to the question of whether automation in pure mathematics can ultimately make sense. I consider it fairly clear that a Wolfram|Alpha-like “pure math assistant” would be useful to human mathematicians. I also consider it fairly clear that having a good, precise, symbolic language—a kind of *Mathematica Pura* that’s a well-designed follow-on to standard mathematical notation—would be immensely helpful in formulating, checking and communicating math.

But what about a computer just “going off and doing math by itself”? Obviously the computer can enumerate theorems, and even use heuristics to select ones that might be considered interesting to human mathematicians. And if we curate the literature of mathematics, we can do extensive “empirical metamathematics” and start trying to recognize theorems with particular characteristics, perhaps by applying graph-theoretic criteria on the network of theorems to see what counts as “surprising” or a “powerful” theorem. There’s also nothing particularly difficult—like in Wolfram*Tones*—about having the computer apply aesthetic criteria deduced from studying human choices.

But I think the real question is whether the computer can build up new conceptual frameworks and structures—in effect new mathematical theories. Certainly some theorems found by enumeration will be surprising and indicative of something fundamentally new. And it will surely be impressive when a computer can take a large collection of theorems—whether generated or from the literature—and discover correlations among them that indicate some new unifying principle. But I would expect that in time the computer will be able not only to identify new structures, but also name them, and start building stories about them. Of course, it is for humans to decide whether they care about where the computer is going, but the basic character of what it does will, I suspect, be largely indistinguishable from many forms of human pure mathematics.

All of this is still fairly far in the future, but there’s already a great way to discover math-like things today—that’s not practiced nearly as much as it should be: experimental mathematics. The term has slightly different meanings to different people. For me it’s about going out and studying what mathematical systems do by running experiments on them. And so, for example, if we want to find out about some class of cellular automata, or nonlinear PDEs, or number sequences, or whatever, we just enumerate possible cases and then run them and see what they do.

There’s a lot to discover like this. And certainly it’s a rich way to generate observations and hypotheses that can be explored using the traditional methodologies of pure mathematics. But the real thrust of what can be done does not fit into what pure mathematicians typically think of as math. It’s about exploring the “flora and fauna”—and principles—of the universe of possible systems, not about building up math-like structures that can be studied and explained using theorems and proofs. Which is why—to quote the title of my book—I think one should best consider this a new kind of science, rather than something connected to existing mathematics.

In discussing experimental mathematics and *A New Kind of Science*, it’s worth mentioning that in some sense it’s surprising that pure mathematics is doable at all—because if one just starts asking absolutely arbitrary questions about mathematical systems, many of them will end up being undecidable.

This is particularly obvious when one’s out in the computational universe of possible programs, but it’s also true for programs that represent typical mathematical systems. So why isn’t undecidability more of a problem for typical pure mathematics? The answer is that pure mathematics implicitly tends to select what it studies so as to avoid undecidability. In a sense this seems to be a reflection of history: pure mathematics follows what it has historically been successful in doing, and in that way ends up navigating around undecidability—and producing the millions of theorems that make up the corpus of existing pure mathematics.

OK, so those are some issues and directions. But where are we at in practice in bringing computational knowledge to pure mathematics?

There’s certainly a long history of related efforts. The works of Peano and Whitehead and Russell from a century ago. Hilbert’s program. The development of set theory and category theory. And by the 1960s, the first computer systems—such as Automath—for representing proof structures. Then from the 1970s, systems like Mizar that attempted to provide practical computer frameworks for presenting proofs. And in recent times, increasingly popular “proof assistants” based on systems like Coq and HOL.

One feature of essentially all these efforts is that they were conceived as defining a kind of “low-level language” for mathematics. Like most of today’s computer languages, they include a modest number of primitives, then imagine that essentially any actual content must be built externally, by individual users or in libraries.

But the new idea in the Wolfram Language is to have a knowledge-based language, in which as much actual knowledge as possible is carefully designed into the language itself. And I think that just like in general computing, the idea of a knowledge-based language is going to be crucial for injecting computation into pure mathematics in the most effective and broadly useful way.

So what’s involved in creating our *Mathematica Pura*—an extension to the Wolfram Language that builds in the actual structure and content of pure math? At the lowest level, the Wolfram Language deals with arbitrary symbolic expressions, which can represent absolutely anything. But then the language uses these expressions for many specific purposes. For example, it can use a symbol x to represent an algebraic variable. And given this, it has many functions for handling symbolic expressions—interpreted as mathematical or algebraic expressions—and doing various forms of math with them.

The emphasis of the math in *Mathematica* and the Wolfram Language today is on practical, calculational, math. And by now it certainly covers essentially all the math that has survived from the 19th century and before. But what about more recent math? Historically, math itself went through a transition about a century ago. Just around the time modernism swept through areas like the arts, math had its own version: consider systems that emerged purely from its own formalism, without regard for obvious connection to the outside world.

And this is the kind of math—through developments like Bourbaki and beyond—that came to dominate pure mathematics in the 20th century. And inevitably, a lot of this math is about defining abstract structures to study. In simple cases, it seems like one might represent these structures using some hierarchy of types. But the types need to be parametrized, and quite quickly one ends up with a whole algebra or calculus of types—and it’s just as well that in the Wolfram Language one can use general symbolic expressions, with arbitrary heads, rather than just simple type descriptions.

As I mentioned early in this blog post, it’s going to take all sorts of new built-in functions to capture the frameworks needed to represent modern pure mathematics—together with lots of entity-like objects. And it’ll certainly take years of careful design to make a broad system for pure mathematics that’s really clean and usable. But there’s nothing fundamentally difficult about having symbolic constructs that represent differentiability or moduli spaces or whatever. It’s just language design, like designing ways to represent 3D images or remote computation processes or unique external entity references.

So what about curating theorems from the literature? Through Wolfram|Alpha and the Wolfram Language, not to mention for example the Wolfram Functions Site and the Wolfram Connected Devices Project, we’ve now had plenty of experience at the process of curation, and in making potentially complex things computable.

But to get a concrete sense of what’s involved in curating mathematical theorems, we did a pilot project over the last couple of years through the Wolfram Foundation, supported by the Sloan Foundation. For this project we picked a very specific and well-defined area of mathematics: research on continued fractions. Continued fractions have been studied continually since antiquity, but were at their most popular between about 1780 and 1910. In all there are around 7000 books and papers about them, running to about 150,000 pages.

We chose about 2000 documents, then set about extracting theorems and other mathematical information from them. The result was about 600 theorems, 1500 basic formulas, and about 10,000 derived formulas. The formulas were directly in computable form—and were in effect immediately able to join the 300,000+ on the Wolfram Functions Site, that are all now included in Wolfram|Alpha. But with the theorems, our first step was just to treat them as entities themselves, with properties such as where they were first published, who discovered them, etc. And even at this level, we were able to insert some nice functionality into Wolfram|Alpha.

But we also started trying to actually encode the content of the theorems in computable form. It took introducing some new constructs like LebesgueMeasure, ConvergenceSet and LyapunovExponent. But there was no fundamental problem in creating precise symbolic representations of the theorems. And just from these representations, it became possible to do computations like this in Wolfram|Alpha:

An interesting feature of the continued fraction project (dubbed “eCF”) was how the process of curation actually led to the discovery of some new mathematics. For having done curation on 50+ papers about the Rogers–Ramanujan continued fraction, it became clear that there were missing cases that could now be computed. And the result was the filling of a gap left by Ramanujan for 100 years.

There’s always a tradeoff between curating knowledge and creating it afresh. And so, for example, in the Wolfram Functions Site, there was a core of relations between functions that came from reference books and the literature. But it was vastly more efficient to generate other relations than to scour the literature to find them.

But if the goal is curation, then what would it take to curate the complete literature of mathematics? In the eCF project, it took about 3 hours of mathematician time to encode each theorem in computable form. But all this work was done by hand, and in a larger-scale project, I am certain that an increasing fraction of it could be done automatically, not least using extensions of our Wolfram|Alpha natural language understanding system.

Of course, there are all sorts of practical issues. Newer papers are predominantly in TeX, so it’s not too difficult to pull out theorems with all their mathematical notation. But older papers need to be scanned, which requires math OCR, which has yet to be properly developed.

Then there are issues like whether theorems stated in papers are actually valid. And even whether theorems that were considered valid, say, 100 years ago are still considered valid today. For example, for continued fractions, there are lots of pre-1950 theorems that were successfully proved in their time, but which ignore branch cuts, and so wouldn’t be considered correct today.

And in the end of course it requires lots of actual, skilled mathematicians to guide the curation process, and to encode theorems. But in a sense this kind of mobilization of mathematicians is not completely unfamiliar; it’s something like what was needed when *Zentralblatt* was started in 1931, or *Mathematical Reviews* in 1941. (As a curious footnote, the founding editor of both these publications was Otto Neugebauer, who worked just down the hall from me at the Institute for Advanced Study in the early 1980s, but who I had no idea was involved in anything other than decoding Babylonian mathematics until I was doing research for this blog post.)

When it comes to actually constructing a system for encoding pure mathematics, there’s an interesting example: Theorema, started by Bruno Buchberger in 1995, and recently updated to version 2. Theorema is written in the Wolfram Language, and provides both a document-based environment for representing mathematical statements and proofs, and actual computation capabilities for automated theorem proving and so on.

No doubt it’ll be an element of what’s ultimately built. But the whole project is necessarily quite large—perhaps the world’s first example of “big math”. So can the project get done in the world today? A crucial part is that we now have the technical capability to design the language and build the infrastructure that’s needed. But beyond that, the project also needs a strong commitment from the world’s mathematics community—as well as lots of contributions from individual mathematicians from every possible field. And realistically it’s not a project that can be justified on commercial grounds—so the likely $100+ million that it will need will have to come from non-commercial sources.

But it’s a great and important project—that promises to be pivotal for pure mathematics. In almost every field there are golden ages when dramatic progress is made. And more often than not, such golden ages are initiated by new methodology and the arrival of new technology. And this is exactly what I think will happen in pure mathematics. If we can mobilize the effort to curate known mathematics and build the system to use and generate computational knowledge around it, then we will not only succeed in preserving and spreading the great heritage of pure mathematics, but we will also thrust pure mathematics into a period of dramatic growth.

Large projects like this rely on strong leadership. And I stand ready to do my part, and to contribute the core technology that is needed. Now to move this forward, what it takes is commitment from the worldwide mathematics community. We have the opportunity to make the second decade of the 21st century really count in the multi-millennium history of pure mathematics. Let’s actually make it happen!

To comment, please visit the original post at the Stephen Wolfram Blog »

]]>We recently posted a blog entry celebrating the anniversary of the Apollo 11 landing on the Moon. Now, just a couple weeks later, we are preparing for another first: the European Space Agency’s attempt to orbit and then land on a comet. The Rosetta spacecraft was launched in 2004 with the ultimate goal of orbiting and landing on comet 67P/Churyumov–Gerasimenko. Since the launch, Rosetta has already flown by asteroid Steins, in 2008, and asteroid 21 Lutetia, in 2010.

NASA and the European Space Agency (ESA) have a long history of sending probes to other solar system bodies that then orbit those bodies. The bodies have usually been nice, well-behaved, and spherical, making orbital calculations a fairly standard thing. But, as Rosetta recently started to approach comet 67P, we began to get our first views of this alien world. And it is far from spherical.

Credits: ESA/Rosetta/MPS for OSIRIS Team MPS/UPD/LAM/IAA/SSO/INTA/UPM/DASP/IDA

The comet’s nucleus can be vaguely described like a flat platform with a spheroidal rock attached to one end, with the entire system rotating in a complex manner. How do you orbit such an object? The exact dynamics to model are beyond the scope of this post, but we can make use of Wolfram|Alpha and the Wolfram Language, including new features in *Mathematica* 10, to do some thought experiments. Can we generate a simple model of the comet, and then try to simulate the probe’s approach and orbit of this irregular body? Keep in mind that comet nuclei are relatively small bodies with small masses. This means that there’s not a lot of gravity to hold you in orbit or to help you land. Initial estimates suggest this comet has an escape velocity of about .5 m/s. If you move any faster than this, you will fly off into space and never come back, so the approach must be very slow, on the order of 10 cm/s, or risk never being captured.

Wolfram|Alpha recently added support for physical systems that we can use to model our comet. First, we need the shape. Let’s model the comet as a massive and flat cuboid with a sphere sitting on one end. We can access the new Wolfram|Alpha functionality using `EntityValue` in the Wolfram Language.

Now we can render the resulting model.

The next step is to obtain the gravitational potential of this system. Let’s assume that 2/3 of the total mass is in the cuboid and 1/3 of the total mass is in the sphere. Let’s also assume that the densities of the sphere and the cuboid are the same.

Now, we can plot a slice of the potential using `ContourPlot`. A preliminary estimate of the comet’s total mass is also included.

We can also visualize the gravitational potential in 3D using `ContourPlot3D`. Here, we’ve chosen to look at a specific iso-potential near the surface of the model.

The force of gravity can be found by differentiating the gravitational potential.

One of the more difficult problems is that in order to orbit and land on the comet, you have to approach the comet and then slow down so that the gravity of the comet can capture you, the orbiter. This typically requires a rocket to be fired to slow the orbiter down. So here we define a new force, created by this rocket, that opposes the forward velocity of the probe.

For visualization purposes, we want to tag the times when the rocket burns and shuts off, so we create a couple of state variables. `NDSolve` can be used to handle all of the acceleration, velocity, and position changes. In addition, we can use `WhenEvent`‘s functionality to handle several possible conditions to watch out for, and take specific actions if those conditions are met.

We can start with one possible scenario where we choose an arbitrary starting point and velocity, but the velocity and burn times result in the orbiter hitting the comet instead of obtaining a capture orbit.

We can see that a single rocket burn takes place.

Red segments show where the rocket was fired to slow down.

With adjustments to the initial velocity and burn distances, we can achieve a more slowly decaying orbit that involves multiple burns.

In this case, there are nine separate rocket burns resulting in orbits that start out highly eccentric, but then settle down into closer, more circular orbits.

We can plot the velocity over the course of the orbital insertion and orbit phases. Velocities are measured in meters/second, and time is in seconds.

The ESA will have a much more difficult problem, likely requiring many separate burns over several months, and this is even before it attempts to drop a lander onto the comet in November. As the rocket approaches the comet, more information becomes available, such as refined mass estimates and the location of potentially hazardous debris from jets issuing from the icy comet. All of this must be carefully considered before refining the approach and orbital maneuvers. The next few months should be really exciting!

To experiment with the parameters of this simulation yourself, try using the deployed version created using `CloudDeploy` in the Wolfram Language.

Download this post as a Computable Document Format (CDF) file.

To comment, please visit the copy of this post at the Wolfram|Alpha Blog »

]]>The significance of the labels A, B, C, and D for these examples will soon become clear!

To begin with, let us recall the notion of a convergent infinite series using the following geometric series.

The general term of the series, starting at *n* = 0, is given by:

We now define the sum of the series from *i* = 0 up to some finite value *i* = *n*.

This finite sum is called a *partial sum* for the series.

A plot of the partial sums shows that they approach 2 as the value of *n* increases:

`Limit` confirms that the partial sums approach 2 as *n* approaches infinity.

`Sum` gives the same result for the sum from 0 to infinity.

We say that the above geometric series is *convergent* and that its *sum* is 2.

In general, an infinite series is convergent if the sequence of its partial sums approaches a definite value as the number of terms approaches infinity. In this case, the limiting value of the partial sums is called the sum of the series.

An infinite series that is not convergent is said to be *divergent*. By definition, divergent series cannot be summed using the method of partial sums that we illustrated above. However, mathematicians have devised various means of assigning finite values to such series. Such a finite value is called a *regularized sum* for the divergent series. The process of computing regularized sums is called *regularization* or *resummation*.

We are now ready to consider example A from the introduction.

The label A stands for Abel (Niels Henrik Abel,1802–1829), the famous Norwegian mathematician who provided a technique for regularizing a divergent series. During his brief life of 26 years, Abel made spectacular progress with solving some of the hardest mathematical problems. In particular, he showed that the general algebraic equation of fifth degree could not be solved in radicals, a problem that had remained unsolved for 250 years before his time.

In order to apply Abel’s method, we note that the general term of the given series is:

This can be seen by making a table of the first few values for *a*[*n*].

As the following plot shows, the partial sums have values 1 or 0 depending on whether *n* is even or odd.

Hence, `Sum` issues a message to indicate that the series is divergent.

Abel regularization can be applied to this series in two steps. During the first step, we sum the corresponding power series.

Next we take the limit of this sum as *x* approaches 1, noting that the power series converges for values of *x* arbitrarily close to, but less than, 1.

These two steps can be combined to define the *Abel sum* of the series.

We can obtain the same answer using the `Regularization` option for `Sum` as follows.

The answer 1/2 seems reasonable, since it is the average of the two possible values, 1 and 0, for the partial sums of this series. Also, the limiting procedure by which it was calculated is very intuitive because with *x* = 1, the power series coincides with the given infinite series. However, Abel was greatly concerned about the lack of rigor that existed in mathematical analysis before his time, and cautioned that:

*“Divergent series are the invention of the devil, and it is shameful to base on them any demonstration whatsoever. By using them, one may draw any conclusion he pleases and that is why these series have produced so many fallacies and so many paradoxes.” (N. H. Abel in a letter to his former teacher Berndt Holmböe, January 1826)*

We now turn to example B, which states that:

The label B stands for Borel (Émile Borel, 1871–1956), a French mathematician who worked on topics such as measure theory and probability. In particular, Borel is associated with the infinite monkey theorem, which states that a monkey hitting a random sequence of keys on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare.

In order to apply Borel’s method, we note that the general term of the given series is:

Borel regularization can be applied to this rapidly growing divergent series in two steps. During the first step, we calculate the exponential generating function of the series. The *n*! in the denominator below ensures that the infinite series converges for all values of *t*.

Next, we compute the Laplace transform of the exponential generating function at *s* = 1.

These two steps can be combined to define the *Borel sum* of the series.

The answer can be obtained directly using `Sum` as follows.

The definition of the Borel sum makes sense because it gives the same result as that obtained using the method of partial sums when it is applied to convergent series. For, in this case, one can interchange summation and integration, and then the definition of the `Gamma` function gives:

However, such an interchange is not valid for divergent series, which leads to interesting results when Borel regularization is applied to such series.

Borel summation provides a versatile technique for evaluating divergent series that occur in quantum field theory, and there is much literature on this topic.

Example C on our list states that:

The label C stands for Cesàro (Ernesto Cesàro,1859–1906), an Italian mathematician who made significant contributions to differential geometry, number theory, and mathematical physics. Cesàro was a prolific mathematician and wrote around 80 works during the period 1884–86, before being awarded a PhD in 1887!

As a first step in applying Cesàro’s method, we note that the general term of the series, starting at *n* = 0, is:

A plot of the partial sums shows that they oscillate wildly.

Cesàro’s method essentially uses successive averages of the partial sums to tame these oscillations, as seen in the following plot.

Formally, the *Cesàro sum* is defined as the limit of the successive averages of the partial sums. Computing this limit gives the expected answer -1/2 for example C.

The Cesàro sum can be obtained directly by setting `Regularization -> "Cesaro"` in `Sum`, as shown below.

Cesàro sums play an important role in the theory of Fourier series, which are trigonometric series used to represent periodic functions. The Fourier series for a continuous function may not converge, but the corresponding Cesàro sum (or Cesàro mean, as it is usually called) will always converge to the function. This beautiful result is called Fejér’s theorem.

Our final example requires us to show that the sum of the natural numbers is -1/12.

The label D stands for Dirichlet (Johann Peter Gustav Lejeune Dirichlet, 1805–1859), a German mathematician who made deep contributions to number theory and several other areas of mathematics. The breadth of Dirichlet’s contributions to the subject can be gauged by simply typing in the following wildcard search in *Mathematica* 10.

Dirichlet’s method of regularization takes its name from the notion of a Dirichlet series, which is defined by:

A special case of this series defines the Riemann `Zeta` function.

`SumConvergence` informs us that this series converges only if the real part of *s* is greater than 1.

However, the zeta function itself can be defined for other values of *s* using the process of analytic continuation from complex analysis. For example, with *s* = -1, we get:

But with *s* = -1, the series defining `Zeta` is precisely the series of natural numbers. Hence we get:

Another way to understand this answer is to introduce a parameter *ε* into the infinite sum and then apply `Series`, as shown below.

The first term 1/*ε*^{2} in the above series expansion diverges as *ε* approaches 0, while the third term *ε*^{2}/240 as well as all higher terms become infinitely small when taking this limit. If we discard these terms, then we are left with the Dirichlet sum -1/12 for the natural numbers. Thus, the Dirichlet sum is obtained by discarding both the infinitely small and the infinitely large terms in the series expansion. This is in sharp contrast to the convention of ignoring *only* infinitely small (infinitesimal) quantities that is used in college calculus, and accounts for the non-intuitive nature of the result given for the Dirichlet sum of the natural numbers.

The same argument can be used to justify the bizarre value 0 for the regularized sum of the squares of the natural numbers.

In this case, we are left with 0 after discarding terms with positive or negative powers of *ε* in the series expansion.

Dirichlet regularization is closely related to the process of zeta regularization used in modern theoretical physics. In a celebrated paper, the eminent British physicist Stephen Hawking (1977) applied this technique to the problem of computing Feynmann path integrals in a curved spacetime. Hawking’s paper described the process of zeta regularization in a systematic manner to the physics community, and has been cited widely since it was published.

Our knowledge about divergent series is based on profound theories developed by some of the finest thinkers of the last few centuries. Yet, I sympathize with readers who, like me, feel uneasy about their use in current physical theories. The mighty Abel may have been right when he called these series the invention of the devil. And it is possible that a future Einstein, working in isolation, will cast aside prevailing dogma and reformulate fundamental physics to avoid the use of divergent series. But even if such developments do take place, divergent series will already have provided us with a rich source of mathematical ideas, and will have illumined the way to a deeper understanding of our universe.

Download this post as a Computable Document Format (CDF) file.

]]>Today I’m happy to announce an update for *Mathematica* and the Wolfram Language for the Raspberry Pi that brings those new features to the Raspberry Pi. To get the new version of the Wolfram Language, simply run this command in a terminal on your Raspberry Pi:

`sudo apt-get update && sudo apt-get install wolfram-engine`

This new version will also be pre-installed in the next release of NOOBS, the easy setup system for the Raspberry Pi.

If you’ve never used the Wolfram Language on the Raspberry Pi, you should try this new fast introduction for programmers, which is a quick and easy way to learn to program in this language. The introduction covers everything from using the interactive user interface, basic evaluations, and expressions to more advanced topics such as natural language processing and cloud computations. You’ll also find a great introduction to the Wolfram Language in the Raspberry Pi Learning Resources.

This release of the Wolfram Language also includes integration with the newly released Wolfram Cloud, technology that allows you to do sophisticated computations on a remote server, using all of the knowledge from Wolfram|Alpha and the Wolfram Knowledgebase. It lets you define custom computations and deploy them as an Instant API on the cloud. The Wolfram Cloud is available with a free starter account. Paid subscriptions enable additional functionality.

Check Wolfram Community in the next couple of weeks for new examples that show you how to use the Wolfram Language with your Raspberry Pi.

]]>