Wolfram Blog (http://blog.wolfram.com): News, views, and ideas from the front lines at Wolfram Research.

Slicing Silhouettes of Jupiter: Processing JunoCam Images
http://blog.wolfram.com/2018/01/12/slicing-silhouettes-of-jupiter-processing-junocam-images/
Fri, 12 Jan 2018, by Jesse Dohmann

With the images from the Juno mission being made available to the public, I thought it might be fun to try my hand at some image processing with them. Though my background is not in image processing, the Wolfram Language has some really nice tools that lessen the learning curve, so you can focus on what you want to do vs. how to do it.

The Juno mission arose out of the effort to understand planetary formation. Jupiter, being the most influential planet in our solar system—both literally (in gravitational effect) and figuratively (in the narrative surrounding our cosmic origin)—was the top contender for study. The Juno spacecraft was launched into orbit to send high-res images of Jupiter’s apparent “surface” of gases back to Earth for study, in order to answer some of the questions we have about our corner of the universe.

The images captured by the Juno spacecraft give us a complete photographic map of Jupiter’s surface in the form of color-filtered, surface patch images. Assembling them into a complete color map of the surface requires some geometric and image processing.

Preprocessing the Images

Images from the JunoCam were taken with four different filters: red, green, blue and near-infrared. The first three of these are taken on one spacecraft rotation (about two revolutions per minute), and the near-infrared image is taken on the second rotation. The final image product stitches all the single-filter images together, creating one projected image.

NASA has put together a gallery of images captured through the JunoCam that contains all the pieces used for this procedure, including the raw, unsorted image; the red, green and blue filtered images; and the final projected image.

Let’s first import the specific red, green and blue images:

Input 1

imgBlue =
imgGreen =
imgRed = Import[

Input 2

{imgRed, imgGreen, imgBlue}

To assemble an RGB image from these bands, I use ColorCombine:

Input 3

jup = ColorCombine[{imgRed, imgGreen, imgBlue}] //
  ImageResize[#, Scaled[.25]] &

To clear up some of the fogginess in the image, we need to adjust its contrast, brightness and gamma parameters:

Input 4

jupInit = ImageAdjust[jup, {.14 (*contrast*), .3 (*brightness*), 2. (*gamma*)}]

You can see that there’s a shadowing effect that wasn’t as prominent in the initial color-combined image. To prevent the shadowing on the foreground image from disturbing any further analysis, the brightness needs to be uniform throughout the image. I first create a mask that limits the correction to the white area:

Input 5

newMask = Binarize[jupInit, {0.01, 1}]

When I apply this mask, I get:

Input 6

jupBright = BrightnessEqualize[jupInit, Masking -> newMask]

It’s much darker now, so I have to readjust the image. This time, I’m doing it interactively using a Manipulate:

Input 7

stretchImage[image_] := Block[{thumbnail},
   thumbnail = ImageResize[image, Scaled[.7]];
   With[{t = thumbnail},
    Manipulate[
     ImageAdjust[t, {c, b, g}],
     {{c, 0, "Contrast"}, -5.0, 5.0, 0.01},
     {{b, 0, "Brightness"}, -5.0, 5.0, 0.01},
     {{g, 2.0, "Gamma"}, 0.01, 5.0, 0.001},
     ControlPlacement -> {Bottom, Bottom, Bottom}]]]


Input 8


I use the parameter values I found with the Manipulate to create an adjusted image:

Input 9

jupadj = ImageAdjust[jupBright, {-.16, 3.14, 1.806}];

Any time an image is captured on camera, it’s always a little bit blurred. The Wolfram Language has a variety of deconvolution algorithms available for immediate use in computations—algorithms that reduce this unintended blur.

Most folks who do image processing, especially on astronomical images, have an intuition for how best to recover an image through deconvolution. Since I don’t, it’s better to do this interactively:

Input 10

deconvolveImage[image_] := Block[{thumbnail},
   thumbnail = ImageResize[image, Scaled[.7]];
   With[{t = thumbnail},
    Manipulate[
     ImageDeconvolve[t, GaussianMatrix[n], Method -> "RichardsonLucy"],
     {{n, 0, "Blur Correction Factor"}, 1, 3.0, 0.1},
     ControlPlacement -> Bottom]]]


Input 11


Again, I use the blur correction I found interactively to make an unblurred image:

Input 12

jupUnblur =
  ImageDeconvolve[jupadj, GaussianMatrix[1.7],
   Method -> "RichardsonLucy"];

And as a sanity check, I’ll see how these changes look side by side:

Input 13

table = Transpose@
   {{"Original", jup},
    {"Initial Correction", jupInit},
    {"Uniform Brightness", jupBright},
    {"Better Adjustment", jupadj},
    {"Deconvolved Image", jupUnblur}};
MapThread[Panel[#2, #1, ImageSize -> Medium] &, table]

Processing the Image

Now that the image has been cleaned up and prepared for use, it can be analyzed in a variety of ways—though it’s not always apparent which way is best. This was a very exploratory process for me, so I tried a lot of methods that didn’t end up working right, like watershed segmentation or image Dilation and Erosion; these are methods that are great for binarized images, but the focus here is enhancing colorized images.

With Jupiter, there is a lot of concentration on the Great Red Spot, so why not highlight this feature of interest?

To start, I need to filter the image in a way that will easily distinguish three different regions: the background, the foreground and the Great Red Spot within the foreground. In order to do this, I apply a MeanShiftFilter:

Input 14

filtered = MeanShiftFilter[jupadj, 1, .5, MaxIterations -> 10]

This is useful because the filter smooths out the jagged edges of the Great Red Spot while preserving edges, making the boundary around the Great Red Spot smoother and easier for a computer to detect.

Using Manipulate once again, I can manually place seed points that indicate the locations of the three regions of the image (you can see how much the filter above helped separate out the regions):

Input 15

Manipulate[seeds = pts;
 Row[{Image[jupadj, ImageSize -> All],
   Image[ImageForestingComponents[jupadj, pts] // Colorize,
    ImageSize -> All],
   Image[ImageForestingComponents[filtered, pts] // Colorize,
    ImageSize -> All]}],
 {{pts, RandomReal[Min[ImageDimensions[jupadj]], {3, 2}]},
  {0, 0}, ImageDimensions[jupadj], Locator,
  Appearance -> Graphics[{Green, Disk[{0, 0}]}, ImageSize -> 10],
  LocatorAutoCreate -> {2, 10}}]

The values of the seeds at these places are stored within a variable for further use:

Input 16


Using these seeds, I can do segmentation programmatically:

Input 17

Colorize[label = ImageForestingComponents[filtered, seeds, 2]]

With the regions segmented, I create a mask for the Great Red Spot:

Input 18

mask = Colorize[DeleteBorderComponents[label]]

I apply this mask to the image:

Input 19

ImageApply[{1, 0, 0} &, jupadj, Masking -> mask]

This is great, but looking at it more, I wish I had an approximate numerical boundary for the Great Red Spot region in the image. Luckily, that’s quite straightforward to do in the Wolfram Language.

Our interactive right-click menu helped me navigate the image to find necessary coordinates for creating this numerical boundary:

Coordinates tool 1

It’s a handy UI feature within our notebook front end—intuitively guiding me through finding roughly where the y coordinate within the Great Red Spot is at a minimum:

Coordinates tool 2

As well as where the x coordinate within that same region is at a minimum:

Coordinates tool 3

I also did this for the maximum values for each coordinate. Using these values, I numerically generate ranges of numbers with a step size of .1:

Input 20

x = Range[144, 275, .1];
y = Range[264, 350, .1];

I construct the major and minor axes:

Input 21

xRadius = (Max[x] - Min[x])/2;
yRadius = (Max[y] - Min[y])/2;

And I approximate the center:

Input 22

center = {Min[x] + xRadius, Min[y] + yRadius}

And finally, I create the bounding ellipse:

Input 23

bounds = Graphics[{Thick, Blue,
    Ellipsoid[center, {xRadius, yRadius}]}]

This bounding ellipse is applied to the image:

Input 24

HighlightImage[jupadj, bounds]

More Neat Analysis on Jupiter

Aside from performing image processing on external JunoCam images in order to better understand Jupiter, there are a lot of built-in properties for Jupiter (and any other planet in our solar system) already present in the language itself, readily available for computation:

Input 25

Entity["Planet", "Jupiter"]["Properties"] //
 Take[#, 30] &

Included here is a textured equirectangular projection of the surface of Jupiter: perfect for 3D reconstruction!

Input 26

surface =
 Entity["Planet", "Jupiter"][
    EntityProperty["Planet", "CylindricalEquidistantTexture"]] //
   NestList[Sharpen, #, 2] & //
  #[[-1]] &

Using this projection, I can map it to a spherical graphic primitive:

Input 27

sphere[image_] := Block[{plot},
  plot = SphericalPlot3D[1, {theta, 0, Pi}, {phi, 0, 2 Pi},
    Mesh -> None, TextureCoordinateFunction -> ({#5, 1 - #4} &),
    PlotStyle -> Directive[Texture[image]], Lighting -> "Neutral",
    Axes -> False, Boxed -> False, PlotPoints -> 30]]

Input 28

Jupiter Juno image post-processing animation


Final Thoughts…

I started out knowing next to nothing about image processing, but with very few lines of code I was able to mine and analyze data in a fairly thorough way—even with little intuition to guide me.

The Wolfram Language abstracted away a lot of the tediousness that would’ve come with processing images, and helped me focus on what I wanted to do. Because of this, I’ve found some more interesting things to try, just with this set of data—like assembling the raw images using ImageAssemble, or trying to highlight features of interest by color instead of numerically—and feel much more confident in my ability to extract the kind of information I’m looking for.

If you’d like to work with the code you read here today, you can download this post as a Wolfram Notebook.

Goodbye, 2017! It Was a Marvelous Year for Wolfram Research
http://blog.wolfram.com/2018/01/04/goodbye-2017-it-was-a-marvelous-year-for-wolfram-research/
Thu, 04 Jan 2018, by Michael Gammon

Whew! So much has happened in a year. Consider this number: we added 230 new functions to the Wolfram Language in 2017! The Wolfram Blog traces the path of our company’s technological advancement, so let’s take a look back at 2017 for the blog’s year in review.

Announcing New Products and Features

The year 2017 saw two Wolfram Language releases, a major release of Wolfram SystemModeler, the new Wolfram Player for iOS hitting the App Store, Wolfram|Alpha pumping up its already-unmatched educational value, and a host of features and capabilities related to these releases. We’ll start with the Wolfram Language releases.

“The R&D Pipeline Continues: Launching Version 11.1”

Stephen Wolfram says it’s “a minor release that’s not minor.” And if you look at the summary of new features, you’ll see why:

Release features

Stephen continues, “There’s a lot here. One might think that a .1 release, nearly 29 years after Version 1.0, wouldn’t have much new any more. But that’s not how things work with the Wolfram Language, or with our company. Instead, as we’ve built our technology stack and our procedures, rather than progressively slowing down, we’ve been continually accelerating.”

“It’s Another Impressive Release! Launching Version 11.2 Today”

11.2 word cloud

The launch of Wolfram Language 11.2 continues the tradition of significant releases. Stephen says, “We have a very deliberate strategy for our releases. Integer releases (like 11) concentrate on major complete new frameworks that we’ll be building on far into the future. ‘.1’ releases (like 11.2) are intended as snapshots of the latest output from our R&D pipeline—delivering new capabilities large and small as soon as they’re ready.”

“Launching the Wolfram Data Repository: Data Publishing That Really Works”

Wolfram Data Repository

“It’s been one of my goals with the Wolfram Language to build into it as much data as possible—and make all of that data immediately usable and computable.” To this end, Stephen and company have been working on the Wolfram Data Repository, which is now available. Over time, this resource will snowball into a massive trove of computable information. Read more about it in Stephen’s post. But, more importantly, contribute to the Repository with your own data!

“A New Level of Step-by-Step Solutions in Wolfram|Alpha”

Step-by-step solutions

Our post about Wolfram|Alpha Pro upgrades was one of the most popular of the year. And all the web traffic around Wolfram|Alpha’s development of step-by-step solutions is not surprising when you consider that this product is the educational tool for anyone studying (or teaching!) mathematics in high school or early college. Read the post to find out why students and forward-thinking teachers recommend Wolfram|Alpha Pro products.

“Notebooks in Your Pocket—WolframPlayer for iOS Is Now Shipping”

Wolfram Player

John Fultz, Wolfram’s director of user interface technology, announced the release of a highly anticipated product—Wolfram Player for iOS. “The beta is over, and we are now shipping Wolfram Player in the App Store. Wolfram Player for iOS joins Wolfram CDF Player on Windows, Mac and Linux as a free platform for sharing your notebook content with the world.” Now Wolfram Notebooks are the premium data presentation tool for every major platform.

“Announcing SystemModeler 5: Symbolic Parametric Simulation, Modular Reconfigurability and 200 New Built-in Components”

Wolfram SystemModeler features

The Wolfram MathCore and R&D teams announced a major leap for SystemModeler. “As part of the 4.1, 4.2, 4.3 sequence of releases, we completely rebuilt and modernized the core computational kernel of SystemModeler. Now in SystemModeler 5, we’re able to build on this extremely strong framework to add a whole variety of new capabilities.”

Some of the headlines include:

  • Support for continuous media such as fluids and gases, using the latest Modelica libraries
  • Almost 200 additional Modelica components, including Media, PowerConverters and Noise libraries
  • Complete visual redesign of almost 6,000 icons, for consistency and improved readability
  • Support for new GUI workspaces optimized for different levels of development and presentation
  • Almost 500 built-in example models for easy exploration and learning
  • Modular reconfigurability, allowing different parts of models to be easily switched and modified
  • Symbolic parametric simulation: the ability to create a fully computable object representing variations of model parameters
  • Importing and exporting FMI 2 models for broad model interchange and system integration

“Communication in Industry 4.0 with Wolfram SystemModeler and OPC UA”

SystemModeler OPC UA and Industry 4.0

Earlier last year Markus Dahl, applications engineer, announced another advancement within the SystemModeler realm—the integration of OPC Unified Architecture (OPC UA). “Wolfram SystemModeler can be utilized very effectively when combining different Modelica libraries, such as ModelPlug and OPCUA, to either create virtual prototypes of systems or test them in the real world using cheap devices like Arduinos or Raspberry Pis. The tested code for the system can then easily be exported to another system, or used directly in a HIL (hardware-in-the-loop) simulation.”

Case-Use Blogs That Hit Big

In 2017 we had some blog posts that made quite a splash by showing off Wolfram technology. From insights into the science behind movies to timely new views on history, the Wolfram Language provided some highlight moments in public conversations this year. Let’s check out a few…

“Hidden Figures: Modern Approaches to Orbit and Reentry Calculations”

Hidden Figures input

The story of mathematician Katherine Johnson and two of her NASA colleagues, Dorothy Vaughan and Mary Jackson, was in the spotlight at the 2017 Academy Awards, where the film about these women—Hidden Figures—was nominated for three Oscars. Three Wolfram scientists took a look at the math/physics problems the women grappled with, albeit with the luxury of modern computational tools found in the Wolfram Language. Our scientists commented on the crucial nature of Johnson’s work: “Computers were in their early days at this time, so Johnson and her team’s ability to perform complicated navigational orbital mechanics problems without the use of a computer provided an important sanity check against the early computer results.”

“Analyzing and Translating an Alien Language: Arrival, Logograms and the Wolfram Language”

Black and white logogram

Another Best Picture nominee in 2017 was Arrival, a film for which Stephen and Christopher Wolfram served as scientific advisors. Stephen wrote an often-cited blog post about the experience, Quick, How Might the Alien Spacecraft Work?. On the set, Christopher was tasked with analyzing and writing code for a fictional nonlinear visual language. On January 31, he demonstrated the development process he went through in a livecoding event broadcast on LiveEdu.tv. This livecoding session garnered almost 60,000 views.

“Exploring a Boxing Legend’s Career with the Wolfram Language: Ali at 75”

Ali career graphs

Wolfram celebrated the birthday of the late, great Muhammad Ali with a blog post from one of our data scientists, Jofre Espigule-Pons. Using charts and graphs, from histograms to network plots, Espigule-Pons examined Ali’s boxing career, his opponent pool and even his poetry. This tribute to the boxing icon was one of the most-loved blog posts of 2017.

“Analyzing Social Networks of Colonial Boston Revolutionaries with the Wolfram Language”

Revolutionary social networks

For the Fourth of July holiday, Swede White, Wolfram’s media and communications specialist, used a variety of functions in the Wolfram Language to analyze the social networks of the revolutionaries who shaped our nation. (Yes, social networks existed before Facebook was a thing!) The data visualizations are enlightening. It turns out that Paul Revere was the right guy to spread the warning: although he never rode through towns shouting, “The British are coming,” he had the most social connections.

“Finding X in Espresso: Adventures in Computational Lexicology”

Espresso poster

So you say there’s no X in espresso. But are you certain? Vitaliy Kaurov, academic director of the Wolfram Science and Innovation Initiatives, examines the history behind this point of contention. This blog post is truly a shining example of what computational analysis can do for fields such as linguistics and lexicology. And it became a social media hit to boot, especially in certain circles of the Reddit world where pop culture debates can be virtually endless.

“How to Win at Risk: Exact Probabilities”

Risk battle graph

Just in time for the holiday board game season, popular Wolfram blogger Jon McLoone, director of technical communication and strategy, breaks down the exact probabilities of winning Risk. There are other Risk win/loss estimators out there, but they are just that—estimations. Jon uses the Wolfram Language to give exact odds for each battle possibility the game offers. Absolute candy for gamer math enthusiasts!

We had a great year at Wolfram Research, and we wish you a productive and rewarding 2018!

Spikey Bird: Creating a Flappy Bird Mod in the Wolfram Language
http://blog.wolfram.com/2017/12/28/spikey-bird-creating-a-flappy-bird-mod-in-the-wolfram-language/
Thu, 28 Dec 2017, by Kevin Daily

An earlier version of this post appeared on Wolfram Community, where the creation of a game interface earned the author a staff pick from the forum moderators. Be sure to head over to Wolfram Community and check out other innovative uses of the Wolfram Language!

If you like video games and you’re interested in designing them, you should know that the Wolfram Language is great at making dynamic interfaces. I’ve taken a simple game, reproduced it and modded it with ease. Yes, it’s true—interactive games are yet another avenue for creative people to use the versatile Wolfram Language to fulfill their electronic visions.

The game I’m using for this demonstration is Flappy Bird, a well-known mobile game with a simple yet captivating interactive element that has helped many people kill a lot of time. The goal of the game is to navigate a series of pipes, where each successful pass adds a point to your score. The challenge is that the character, the bird, is not so easy to control. Gravity is constantly pulling it down. You “flap” to boost yourself upward by repeatedly tapping the screen, but you must accurately time your flaps to navigate the narrow gaps between pipes.

So follow along and see what kind of graphical gaming mayhem is possible in just a few short lines of code!

Creating Spikey

I’m going to make Spikey Bird by implementing the gameplay features of Flappy Bird. Our character is going to be Spikey:

Input 1

spikey = RemoveBackground[
   Rasterize[Style["\[MathematicaIcon]", FontSize -> 200],
    ImageResolution -> 700]];
spikey = ImageResize[spikey, ImageDimensions[spikey]/40]

Considering the season, let’s make Spikey festive:

Input 3

santaHat = ImageCrop[RemoveBackground[\!\(\*GraphicsBox[
TagBox[RasterBox[CompressedData["CUSTOMIMAGEDATA"], {{0,
         2338}, {2200, 0}}, {0, 255},ColorFunction->RGBColor],
        "Byte", ColorSpace -> "RGB", Interleaving -> True],
       ImageSize->{29., Automatic},ImageSizeRaw->{2200, 2338},
       PlotRange->{{0, 2200}, {0, 2338}}]\)]];
spikey = ImageCompose[
  ImagePad[spikey, {{0, 0}, {0, 7}}, RGBColor[1, 1, 1, 0]],
  ImageResize[santaHat, ImageDimensions[santaHat]/46]]

Game Plan

The gameplay is going to make heavy use of Dynamic within a Graphics expression. Going into all the details of how Dynamic works is beyond the scope of this blog, but you can learn more about it by watching this great video and reading the extensive documentation.

I’m going to purposely put the most dynamic expressions into an infinite loop. Each update to the dynamic expression is going to take only a fraction of a second—the faster, the better.

The rest of this post is organized around the game’s main mechanic elements:

  • Gravity
  • Player input (controls)
  • Creating objects with automatic movement
  • Hit detection
  • Scoring
  • Sprite animations, sounds and other aesthetics

Designing Gravity

Gravity acts to accelerate an object downward. An acceleration is just a change in velocity per time step, and velocity is just a change in position per time step. One method for modeling gravity is to have velocity update before the position updates:

Input 5

SetAttributes[updateSpikeyPosition, HoldFirst];
updateSpikeyPosition[{pos_, vel_}, gravity_, ups_] := (
  vel += gravity/ups/ups;
  pos += vel/ups)

The rate at which Dynamic can update—the “updates per second,” or UPS for short—is analogous to frames per second in modern video games. This sets a timescale that slows down the apparent movement: the acceleration term is divided by UPS twice (distance per second per second), and the velocity term once (distance per second).
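Written out, the two lines of updateSpikeyPosition correspond to a pair of difference equations (a sketch, with $g$ the gravity parameter and the velocity updated before the position):

```latex
v_{t+1} = v_t + \frac{g}{\mathrm{UPS}^2},
\qquad
x_{t+1} = x_t + \frac{v_{t+1}}{\mathrm{UPS}}
```

Using the freshly updated velocity in the position step is the semi-implicit Euler method, a common and stable choice for simple game physics.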

The use of HoldFirst allows external variables, provided in the first argument, to update their definitions from within the function:

Input 7

posExample = velExample = 10;
data = Table[
   updateSpikeyPosition[{posExample, velExample}, -1, 1], {i, 25}];
ListPlot[data, AxesLabel -> {"Time Step", "Height"},
 PlotMarkers -> {"\[MathematicaIcon]", 20}, PlotStyle -> Red]

This example has the UPS set to 1. For the given parameters, it took 20 time steps to have the position return to 0. In practice, I set the UPS to be about 30, but this depends on the speed of your CPU.
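As a check on that 20-step figure, the update rule unrolls to a closed form. With the UPS set to 1, the velocity after $n$ steps is $v_n = v_0 + n g$, so the position is

```latex
x_n = x_0 + \sum_{k=1}^{n} v_k = x_0 + n\,v_0 + \frac{n(n+1)}{2}\,g
```

Plugging in the example’s values $x_0 = v_0 = 10$ and $g = -1$ gives $x_{20} = 10 + 200 - 210 = 0$, in agreement with the plot.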

Player Input

A “flap” is modeled by instantly changing the velocity opposite to gravity. Let’s set this action to the keyboard’s Ctrl key. Because this key controls Spikey’s movement, I put it inside Spikey’s movement code. If I put it somewhere else, like within some other Dynamic expression, you might perceive a slight lag when pressing the key.

It would be cheating if repeated flapping was applied by holding down the key. Thus, an instance of Switch is used to track that the key was pressed only once.

Lastly, I don’t want Spikey to leave the game world, so I include upper and lower bounds to its movement. Changes to updateSpikeyPosition are highlighted:

Input 10

SetAttributes[updateSpikeyPosition, HoldFirst];
updateSpikeyPosition[{pos_, vel_, previousKeyState_}, keyState_,
  boost_, gravity_, ups_] := (
  Switch[{previousKeyState, keyState},
   {False, True}, vel = boost; pos += vel/ups; previousKeyState = True,
   {True, False}, previousKeyState = False];
  vel += gravity/ups/ups;
  pos += vel/ups;
  Which[
   pos < 0, vel = pos = 0,
   pos > 14, pos = 14; vel = 0];
  pos)

I can create a set of controls that can modify gameplay even while you play. I separate the controls into different groups, such as the controls that affect Spikey:

Input 12

SetAttributes[playerEffects, HoldAll];
playerEffects[hPos_, kick_] := Panel[
  Grid[{
    {"Flap Speed", LabeledSlider[Dynamic[kick], {-5, 5}]},
    {"Hor. Position", LabeledSlider[Dynamic[hPos], {0, 20}]}},
   Alignment -> Left]]

And a set of controls that affects the whole game environment:

Input 14

SetAttributes[environmentEffects, HoldAll];
environmentEffects[ups_, gravity_, worldEdge_, imageSize_] := Panel[
  Grid[{
    {"UPS Factor", LabeledSlider[Dynamic[ups], {1, 60}]},
    {"Gravity", LabeledSlider[Dynamic[gravity], {-50, 50}]},
    {"Right Edge", LabeledSlider[Dynamic[worldEdge], {10, 20}]},
    {"Image Size", LabeledSlider[Dynamic[imageSize], {100, 500}]}},
   Alignment -> Left]]

Spikey is not a graphics primitive, so I use Inset in order to insert it into the Graphics expression. The key here is to use at least the four-argument syntax for Inset so you can specify the size in the fourth argument. Otherwise, the inset object’s size does not scale with the image size of the graphics.
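For instance, here is a minimal standalone sketch of the four-argument form (the rasterized stand-in image, position and width are toy values, not from the original post):

```wl
img = Rasterize[Style["\[MathematicaIcon]", FontSize -> 60]];
Graphics[
 Inset[
  img,     (* object to inset *)
  {5, 7},  (* position in plot coordinates *)
  Center,  (* alignment point of the inset *)
  2],      (* width in plot-range units *)
 PlotRange -> {{0, 10}, {0, 14}}, Frame -> True]
```

Because the width is given in plot-range units, resizing the enclosing Graphics resizes the inset proportionally; omit the fourth argument and the inset keeps its own pixel size instead.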

Putting the pieces together in an instance of DynamicModule yields our first basic interface and keeps the variables locally scoped. You can play around with different combinations of gameplay factors, even while the disk is in motion:

Input 16

DynamicModule[{vPos = 10, hPos = 3, vel = 0, ups = 30, gravity = -30,
  kick = 2, previousKeyState = False, worldEdge = 10, imageSize = 200},
 Grid[{
   {Graphics[
     Inset[spikey, {Dynamic[hPos],
       Dynamic[updateSpikeyPosition[{vPos, vel, previousKeyState},
         CurrentValue["ControlKey"], kick, gravity, ups]]},
      Center, 1.2],
     Frame -> True,
     PlotRange -> {{0, Dynamic[worldEdge]}, {0, 14}},
     ImageSize -> Dynamic[imageSize]]},
   {playerEffects[hPos, kick]},
   {environmentEffects[ups, gravity, worldEdge, imageSize]}
   }, Alignment -> {Left, Top}], SaveDefinitions -> True]

Output 16 animation

Obstacle Movement

Obstacles in Spikey Bird consist of pipes, i.e. rectangles. I’m going to use vertices instead of the Rectangle primitive in anticipation of how I’ll implement hit detection:

Input 17

pipeVertices[hPos_, vPos_, pWidth_, pGap_] := {
  {{hPos, 0}, {hPos, vPos}, {pWidth + hPos, vPos}, {pWidth + hPos, 0}},
  {{hPos, vPos + pGap}, {hPos, 14}, {pWidth + hPos,
    14}, {pWidth + hPos, vPos + pGap}}}

The function allows for flexible creation of a pair of obstacles with a fixed size and gap:

Input 18

Manipulate[
 Graphics[Polygon[pipeVertices[hPos, vPos, pWidth, pGap]],
  PlotRange -> {{0, 10}, {0, 14}}, Frame -> True],
 {{hPos, 5, "Horizontal Position"}, 0, 10},
 {{vPos, 5, "Vertical Position"}, 0, 14 - pGap},
 {{pWidth, 1, "Pipe Width"}, 1, 3},
 {{pGap, 1, "Gap Width"}, 1, 4}, SaveDefinitions -> True]

I use pipeVertices for each update in order to also allow the pipe gap and other factors to dynamically change even while the game is active. The horizontal and vertical positions are extracted from the previous instance of the blocks using the Part function, but only the horizontal position is updated.

I won’t create new objects every time they leave the plot range. Instead, I reuse the object but reset its position if it goes offscreen:

Input 19

SetAttributes[updateBlockPairPosition, HoldFirst];
updateBlockPairPosition[{vertices_}, speed_, ups_, pipeWidth_,
  pipeGap_, worldEdge_] := (
  vertices =
   pipeVertices[vertices[[1, 1, 1]] + speed/ups,
    vertices[[-1, -1, -1]] - pipeGap, pipeWidth, pipeGap];
  Which[
   Max[vertices[[All, All, 1]]] < 0 && speed < 0,
   vertices =
    pipeVertices[worldEdge, RandomReal[{2, 11}], pipeWidth, pipeGap],
   Min[vertices[[All, All, 1]]] > worldEdge && speed > 0,
   vertices =
    pipeVertices[-pipeWidth, RandomReal[{2, 11}], pipeWidth, pipeGap]];
  vertices)

Let’s add more gameplay controls for the obstacles:

Input 21

SetAttributes[obstacleEffects, HoldAll];
obstacleEffects[scrollSpeed_, pipeWidth_, pipeGap_] := Panel[
  Grid[{
    {"Scroll Speed", LabeledSlider[Dynamic[scrollSpeed], {-5, 5}]},
    {"Pipe Width", LabeledSlider[Dynamic[pipeWidth], {0.5, 4}]},
    {"Pipe Gap", LabeledSlider[Dynamic[pipeGap], {1, 6}]}},
   Alignment -> Left]]

Let’s also include some fun buttons. One resets the blocks, which is needed if you change the size of the playing area. One inverts gravity. And one pauses the game by temporarily setting all movement variables to zero:

Input 23

SetAttributes[buttonEffects, HoldAll];
buttonEffects[velocity_, kick_, gravity_, scrollSpeed_, obstacle1_,
  obstacle2_, pipeWidth_, pipeGap_, worldEdge_] :=
 DynamicModule[{previousMovement, pauseToggle = False,
   gravityToggle = False},
  Grid[{
    {Button["Pause",
      If[pauseToggle,
       pauseToggle = False; {gravity, scrollSpeed, kick, velocity} =
        previousMovement,
       pauseToggle = True;
       previousMovement = {gravity, scrollSpeed, kick, velocity};
       gravity = scrollSpeed = kick = velocity = 0]],
     Button["Reset Block Spacing",
      If[scrollSpeed < 0,
       obstacle1 =
        pipeVertices[worldEdge, RandomReal[{2, 11}], pipeWidth,
         pipeGap];
       obstacle2 =
        pipeVertices[3/2 worldEdge, RandomReal[{2, 11}], pipeWidth,
         pipeGap],
       obstacle1 =
        pipeVertices[-pipeWidth, RandomReal[{2, 11}], pipeWidth,
         pipeGap];
       obstacle2 =
        pipeVertices[-worldEdge/2 - pipeWidth, RandomReal[{2, 11}],
         pipeWidth, pipeGap];]],
     Button["Invert Gravity",
      gravity = -gravity; velocity = 0; kick = -kick]}
    }, Alignment -> Left]]
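The pause button works by saving the four movement variables and zeroing them, then restoring them on the next press. Here is that state-saving pattern as a Python sketch, with a plain dict standing in for the DynamicModule variables (names are illustrative):

```python
def toggle_pause(state):
    """Toggle pause on a dict of movement variables.

    On pause, save gravity/scroll_speed/kick/velocity and zero them;
    on resume, restore the saved values.
    """
    keys = ("gravity", "scroll_speed", "kick", "velocity")
    if state.get("paused"):
        for k, v in zip(keys, state["saved"]):
            state[k] = v                 # restore saved movement
        state["paused"] = False
    else:
        state["saved"] = [state[k] for k in keys]
        for k in keys:
            state[k] = 0                 # freeze all movement
        state["paused"] = True
    return state
```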

Now I have something that looks like a simplified version of Flappy Bird. First I create two pairs of pipes that are roughly equally spaced along the horizontal direction, but off the right side of the visible plot range. There's no hit detection yet, so you can play with the parameters to adjust the difficulty without frustration.

Input 25

DynamicModule[{vPos = 10, hPos = 3, vel = 0, ups = 30, gravity = -50,
  kick = 2, scrollSpeed = -1.6, pipeWidth = 2, pipeGap = 3.25,
  previousKeyState = False, obstacle1, obstacle2, worldEdge = 12,
  imageSize = 200},
 obstacle1 =
  pipeVertices[worldEdge, RandomReal[{2, 11}], pipeWidth, pipeGap];
 obstacle2 =
  pipeVertices[3/2 worldEdge, RandomReal[{2, 11}], pipeWidth, pipeGap];
 Grid[{{Framed[
     Graphics[{
       Inset[
        spikey, {Dynamic[hPos],
         Dynamic[updateSpikeyPosition[{vPos, vel, previousKeyState},
           CurrentValue["ControlKey"], kick, gravity, ups]]}, Center,
        1.2],
       Polygon[
        Dynamic[updateBlockPairPosition[{obstacle1}, scrollSpeed, ups,
           pipeWidth, pipeGap, worldEdge]]],
       Polygon[
        Dynamic[updateBlockPairPosition[{obstacle2}, scrollSpeed, ups,
           pipeWidth, pipeGap, worldEdge]]]},
      Frame -> False, PlotRange -> {{0, Dynamic[worldEdge]}, {0, 14}},
       ImageSize -> Dynamic[imageSize]
      ], FrameMargins -> -1],
    Grid[{
      {buttonEffects[vel, kick, gravity, scrollSpeed, obstacle1,
        obstacle2, pipeWidth, pipeGap, worldEdge]},
      {environmentEffects[ups, gravity, worldEdge, imageSize]},
      {playerEffects[hPos, kick]},
      {obstacleEffects[scrollSpeed, pipeWidth, pipeGap]}
      }, Spacings -> {1, 1}, Alignment -> {Left, Top}]}}],
 SaveDefinitions -> True]

Output 25 animation

Hit Detection

There are many ways to implement hit detection. My approach is to place a perimeter of points around each object and track how close Spikey gets to each point. This lets us use more interesting polygons later with minimal effort.

Let’s add points to our existing polygons. I’ll take a walk around the polygon, so let’s add the starting point to the end of our perimeter:

Input 26

originalPolyPoints = pipeVertices[5, 5, 2, 2][[1]];
cyclic = Append[originalPolyPoints, First[originalPolyPoints]]

Now accumulate the distances between the vertices in the walk around the polygon, including zero at the start:

Input 28

accumulatedDistance =
 Prepend[Accumulate[Norm /@ Differences[cyclic]], 0.]
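For readers more comfortable in Python, the same cumulative arc length can be computed with itertools.accumulate (a translation of the idea, not the original code):

```python
import math
from itertools import accumulate

def accumulated_distance(cyclic_pts):
    """Cumulative arc length along a closed polygon walk, starting at 0."""
    dists = [math.dist(a, b) for a, b in zip(cyclic_pts, cyclic_pts[1:])]
    return [0.0] + list(accumulate(dists))
```

For a closed unit-square walk [(0,0), (1,0), (1,1), (0,1), (0,0)], this gives [0, 1, 2, 3, 4].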

Linearly interpolate the x and y coordinates. This creates a function for our walk around the perimeter:

Input 29

intX = Interpolation[
   Transpose[{accumulatedDistance, cyclic[[All, 1]]}],
   InterpolationOrder -> 1];
intY = Interpolation[
   Transpose[{accumulatedDistance, cyclic[[All, 2]]}],
   InterpolationOrder -> 1];

Our walk consists of a number of steps with equal spacing. The spacing between the chosen points is small enough that Spikey can’t squeeze through two adjacent points. Before I start the interpolated walk, I join the original vertex distances to the equally spaced ones:

Input 31

spacing = 0.8;
steps = Union[Range[0, Last[accumulatedDistance], spacing],
   Most[accumulatedDistance]];
newPts = Transpose[{intX[steps], intY[steps]}]

I put all of these steps into a function:

Input 34

generatePerimeterPoints[originalPoints_, spacing_] :=
 Module[{cyclic, accumulatedDistance, intX, intY, steps},
  cyclic = Append[originalPoints, First[originalPoints]];
  accumulatedDistance =
   Prepend[Accumulate[Norm /@ Differences[cyclic]], 0.];
  intX = Interpolation[
    Transpose[{accumulatedDistance, cyclic[[All, 1]]}],
    InterpolationOrder -> 1];
  intY = Interpolation[
    Transpose[{accumulatedDistance, cyclic[[All, 2]]}],
    InterpolationOrder -> 1];
  steps =
   Union[Range[0, Last[accumulatedDistance], spacing],
Most[accumulatedDistance]]; Transpose[{intX[steps], intY[steps]}]]
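As a cross-check of the algorithm, here is a pure-Python version of the same resampling: close the polygon, accumulate arc lengths, take the union of equally spaced arc lengths and the original vertex arc lengths, and linearly interpolate. Names are mine; this is a sketch, not the post's implementation:

```python
import math
from itertools import accumulate

def perimeter_points(points, spacing):
    """Resample a polygon outline: original vertices plus points placed
    every `spacing` units of arc length along the closed walk."""
    cyclic = points + [points[0]]
    dists = [0.0] + list(accumulate(
        math.dist(a, b) for a, b in zip(cyclic, cyclic[1:])))
    total = dists[-1]
    # union of equally spaced arc lengths and the original vertex lengths
    steps = sorted({round(s * spacing, 9) for s in
                    range(int(total / spacing) + 1)} | set(dists[:-1]))
    out = []
    for s in steps:
        # locate the segment that contains arc length s
        i = max(j for j in range(len(dists) - 1) if dists[j] <= s)
        seg = dists[i + 1] - dists[i]
        t = 0.0 if seg == 0 else (s - dists[i]) / seg
        (x0, y0), (x1, y1) = cyclic[i], cyclic[i + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```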

Spikey hits a pipe if it gets within a minimal distance to any of the points:

Input 35

hit[position_, obstaclePoints_, dist_] :=
 Min[Norm /@ Transpose[Transpose[obstaclePoints] - position]] < dist ||
   position[[2]] > 13.5 || position[[2]] < 0.5
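The same test reads naturally in Python: a hit is a minimum distance below the threshold, or leaving the vertical play area (the 0.5 and 13.5 bounds follow the post):

```python
import math

def hit(position, obstacle_points, dist, y_min=0.5, y_max=13.5):
    """True when `position` comes within `dist` of any perimeter point,
    or leaves the playable vertical band."""
    x, y = position
    if y > y_max or y < y_min:
        return True
    return min(math.dist(position, p) for p in obstacle_points) < dist
```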

Here's a visualization of this effect in action. A detected hit colors the circle red:

Input 36

DynamicModule[{pos = {1, 1}},
 Grid[{{Graphics[{Point[newPts],
      Dynamic[If[hit[pos, newPts, 0.5], Red, Green]], Thick,
      Circle[Dynamic[pos], 0.5]},
     ImageSize -> Small, PlotRange -> {{0, 10}, {-0.1, 6}},
     Frame -> True, Axes -> False, AspectRatio -> Automatic],
    Slider2D[Dynamic[pos], {{0.5, 0.4}, {9, 5.5}}]}}],
 SaveDefinitions -> True]

Output 36 animation


Let's reward the player with a point. Every time a pair of blocks passes the player, I add a point to the score. I also set a flag to False to indicate that I should not score a second time until the block resets:

Input 37

SetAttributes[scoreFunction, HoldFirst];
scoreFunction[{score_, scoreFlag_}, hPos_, worldEdge_, speed_,
  obstacle_] := (
  If[scoreFlag &&
    (Max[obstacle[[All, All, 1]]] < hPos && speed < 0 ||
      Min[obstacle[[All, All, 1]]] > hPos && speed > 0),
   score++; scoreFlag = False];
  If[Max[obstacle[[All, All, 1]]] < 0 && speed < 0 ||
    Min[obstacle[[All, All, 1]]] > worldEdge && speed > 0,
   scoreFlag = True];)
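The one-shot flag logic can be summarized in Python as follows (a dict stands in for the held score and flag variables; names are illustrative):

```python
def update_score(state, h_pos, world_edge, speed, obstacle_xs):
    """One-shot scoring: award a point when the whole obstacle has passed
    the player, then re-arm the flag only after the obstacle wraps around.

    state: {'score': int, 'can_score': bool}
    obstacle_xs: the obstacle's x coordinates.
    """
    passed = (speed < 0 and max(obstacle_xs) < h_pos) or \
             (speed > 0 and min(obstacle_xs) > h_pos)
    if state["can_score"] and passed:
        state["score"] += 1
        state["can_score"] = False       # don't score again this pass
    offscreen = (speed < 0 and max(obstacle_xs) < 0) or \
                (speed > 0 and min(obstacle_xs) > world_edge)
    if offscreen:
        state["can_score"] = True        # re-arm after the wrap
    return state
```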

Now let's add the scoring feature and hit detection to our game. The game objects get extra vertices by mapping generatePerimeterPoints over the pipe vertices, but updateBlockPairPosition only outputs the original vertices, which makes it easier to apply textures.

Scoring is added by including scoreFunction. Changes to updateBlockPairPosition are highlighted:

Input 39

SetAttributes[updateBlockPairPosition, HoldFirst];
updateBlockPairPosition[{vertices_, score_, scoreFlag_}, hPos_,
  speed_, ups_, pipeWidth_, pipeGap_, worldEdge_] :=
 Module[{originalVertices},
  originalVertices =
   pipeVertices[vertices[[1, 1, 1]] + speed/ups,
    vertices[[-1, -1, -1]] - pipeGap, pipeWidth, pipeGap];
  vertices = generatePerimeterPoints[#, 0.5] & /@ originalVertices;
  scoreFunction[{score, scoreFlag}, hPos, worldEdge, speed, vertices];
  Which[
   Max[vertices[[All, All, 1]]] < 0 && speed < 0,
   originalVertices =
    pipeVertices[worldEdge, RandomReal[{2, 11}], pipeWidth, pipeGap];
   vertices = generatePerimeterPoints[#, 0.5] & /@ originalVertices,
   Min[vertices[[All, All, 1]]] > worldEdge && speed > 0,
   originalVertices =
    pipeVertices[-pipeWidth, RandomReal[{2, 11}], pipeWidth, pipeGap];
   vertices = generatePerimeterPoints[#, 0.5] & /@ originalVertices];
  originalVertices]

Let's add a button to reset the game with the current settings so I don't have to reevaluate the function every time:

Input 41

SetAttributes[resetButton, HoldFirst];
resetButton[{score_, velocity_, kick_, gravity_, scrollSpeed_,
   obstacle1_, obstacle2_}, startingValues_, pipeWidth_, pipeGap_,
  worldEdge_] := Button["Reset Game",
  {gravity, velocity, kick, scrollSpeed} = startingValues;
  score = 0;
  If[scrollSpeed < 0,
   obstacle1 =
    pipeVertices[worldEdge, RandomReal[{2, 11}], pipeWidth, pipeGap];
   obstacle2 =
    pipeVertices[3/2 worldEdge, RandomReal[{2, 11}], pipeWidth,
     pipeGap],
   obstacle1 =
    pipeVertices[-pipeWidth, RandomReal[{2, 11}], pipeWidth,
     pipeGap];
   obstacle2 =
    pipeVertices[-worldEdge/2 - pipeWidth, RandomReal[{2, 11}],
     pipeWidth, pipeGap];]]

Let's add a button to toggle the hit detection so it's only on when I'm ready for it:

Input 43

SetAttributes[hitEffects, HoldAll];
hitEffects[hitToggle_] :=
 Button["Hit Detection",
  If[hitToggle, hitToggle = False, hitToggle = True],
  Appearance -> Dynamic[If[hitToggle, "Pressed", Automatic]]]

Putting It All Together

In addition to the hit detection and scoring, some finishing touches include styling the objects and the background. Changes to the interface are highlighted:

Input 45

DynamicModule[{vPos = 10, hPos = 3, vel = 0, ups = 30, gravity = -50,
  kick = 2, scrollSpeed = -1.6, pipeWidth = 2, pipeGap = 3.25,
  previousKeyState = False, obstacle1, obstacle2, worldEdge = 14.5,
  imageSize = 500, hitToggle = True, score = 0, canScore1 = True,
  canScore2 = True, startingValues},
 obstacle1 =
  generatePerimeterPoints[#, 0.5] & /@
   pipeVertices[worldEdge, RandomReal[{2, 11}], pipeWidth, pipeGap];
 obstacle2 =
  generatePerimeterPoints[#, 0.5] & /@
   pipeVertices[3/2 worldEdge, RandomReal[{2, 11}], pipeWidth,
    pipeGap];
 startingValues = {gravity, vel, kick, scrollSpeed};
 Grid[{{Framed[
     Graphics[{
       Inset[spikey, {Dynamic[hPos],
         Dynamic[
          If[hitToggle &&
            hit[{hPos, vPos},
             Flatten[Join[obstacle1, obstacle2], 1], 0.5],
           If[gravity != 0 || vel != 0 || kick != 0 ||
             scrollSpeed != 0,
            startingValues = {gravity, vel, kick, scrollSpeed}];
           gravity = vel = kick = scrollSpeed = 0];
          updateSpikeyPosition[{vPos, vel, previousKeyState},
           CurrentValue["ControlKey"], kick, gravity, ups]
          ]}, Center, 1.2],
       {Texture[{{{1, 0, 0}, {1, 1, 1}, {1, 1, 1}}}],
        Polygon[
         Dynamic[updateBlockPairPosition[{obstacle1, score,
            canScore1}, hPos, scrollSpeed, ups, pipeWidth, pipeGap,
           worldEdge]],
         VertexTextureCoordinates ->
          3 {{0, 0}, {1, 1}, {2, 0}, {1, -1}}]},
       {Texture[{{{1, 0, 0}, {1, 1, 1}, {1, 1, 1}}}],
        Polygon[
         Dynamic[updateBlockPairPosition[{obstacle2, score,
            canScore2}, hPos, scrollSpeed, ups, pipeWidth, pipeGap,
           worldEdge]],
         VertexTextureCoordinates ->
          3 {{0, 0}, {1, 1}, {2, 0}, {1, -1}}]},
       Text[Style[Dynamic[score], 30], Scaled[{0.5, 0.8}], Center]},
      Frame -> False, PlotRange -> {{0, Dynamic[worldEdge]}, {0, 14}},
       ImageSize -> Dynamic[imageSize], Background -> Darker[Green]
      ], FrameMargins -> -1],
    Grid[{
      {hitEffects[hitToggle]},
      {buttonEffects[vel, kick, gravity, scrollSpeed, obstacle1,
        obstacle2, pipeWidth, pipeGap, worldEdge]},
      {resetButton[{score, vel, kick, gravity, scrollSpeed, obstacle1,
         obstacle2}, startingValues, pipeWidth, pipeGap, worldEdge]},
      {environmentEffects[ups, gravity, worldEdge, imageSize]},
      {playerEffects[hPos, kick]},
      {obstacleEffects[scrollSpeed, pipeWidth, pipeGap]}},
     Alignment -> Left]}
   }, Spacings -> {1, 1}, Alignment -> {Left, Top}],
 SaveDefinitions -> True]

Output 45 animation

Final Thoughts

With just a couple hundred unique lines of code, I was able to implement Spikey Bird in the Wolfram Language. What's more, all of the gameplay parameters remain adjustable while you play, like a set of not-so-hidden developer tools. Strip out those tools and the code would be considerably shorter!

Here are some suggestions you can try on your own:

  • Play the game using only the "Invert Gravity" button.
  • Modify the blocks to fall from the top of the screen to the bottom (similar to classic "flight" games).
  • Change the obstacles to randomly generated polygons instead of rectangles, and add more of them!
  • Add sound effects in appropriate places using EmitSound.

Bonus Suggestion 1: Using Sprites Instead of Primitives

I'm not an artist, but the Wolfram Language makes it simple to include bitmap graphics. For example, I can capture images from the internet and import them. As I did with Spikey, it's often sufficient to just use ImageCrop and RemoveBackground, then use Inset to place the result in a graphics expression. It's important to keep the image size small, though, or the image can take noticeable time to render on each update. The polygons are still needed for hit detection, but the images sit on top of them:

Input 46

(* Raster data elided. This input defines three sprite images created
   with the Wolfram Language: pipe (44×361), city (320×131) and
   ground (400×158). *)

You can even restyle your images using ImageRestyle:

Input 47

(* Raster data elided. artStyles is a list of four style reference
   images (150×150, 480×300, 400×276 and 400×250) used as sources for
   ImageRestyle. *)

Input 48

cityStyles =
  Prepend[SetAlphaChannel[ImageRestyle[city, #],
      AlphaChannel[city]] & /@ artStyles, city];

Input 49

groundStyles =
  Prepend[ImageRestyle[ground, #] & /@ artStyles, ground];

Input 50

pipeStyles =
  Prepend[SetAlphaChannel[ImageRestyle[pipe, #],
      AlphaChannel[pipe]] & /@ artStyles, pipe];

Input 51

domColors = {RGBColor[
    0.44313725490196076`, 0.7725490196078432, 0.8117647058823529],
   GrayLevel[0.9], RGBColor[
    0.8257116158191437, 0.7624872802598549, 0.6834260771904932],
   RGBColor[
    0.9275118744483869, 0.8953480709038453, 0.7543210863558273],
   RGBColor[
    0.2386547585427879, 0.4876022587659778, 0.7371873300590905]};

Input 52

styleControl =
  Thread[Range[5] ->
    Prepend[artStyles, Image[ConstantArray[1, {150, 150}]]]];

Here's an example of what the game could look like with a little more effort:

Input 53

Manipulate[
 Graphics[{
   Inset[cityStyles[[art]], {-5, 2}, Scaled[{0, 0}], 10],
   Inset[spikey, {-2, 9}, Center, 1.2],
   Inset[ImageReflect[pipeStyles[[art]], Top], {x0, height},
    Scaled[{0, 1}], pWidth],
   Inset[pipeStyles[[art]], {x0, height + pGap}, Scaled[{0, 0}],
    pWidth],
   If[draw,
    {FaceForm[None], EdgeForm[Red],
     Polygon[pipeVertices[x0, height, pWidth, pGap]]}, {}],
   Inset[groundStyles[[art]], {-5, 2.3}, Scaled[{0, 1}], 10]},
  PlotRange -> {{-5, 5}, {0, 14}}, Background -> domColors[[art]],
  PlotRangeClipping -> True],
 {{x0, 1.08, "Hor. Position"}, -5, 5},
 {{pWidth, 1.745, "Pipe Width"}, 1, 4},
 {{height, 7.62, "Gap Position"}, 0, 12},
 {{pGap, 2.248, "Gap Width"}, 1, 3},
 {{draw, False, "Draw Hit Box"}, {True, False}},
 {{art, 1, "Art Styles"}, styleControl}, SaveDefinitions -> True]

Bonus Suggestion 2: Animation

Animation can add another polished element to your game, but it will use the CPU continuously:

Input 54

(* Raster data elided. animationFrames is a list of four small sprite
   frames (38×29, 39×29, 41×30 and 39×29) of the flapping character. *)
ListAnimate[animationFrames, AnimationRate -> 2]

The ground is unique in that the image never changes. To create the illusion of continuous movement, you can reset its position at a certain point, exactly like I do with the obstacles:

Input 56

SetAttributes[updateGroundPosition, HoldFirst];
updateGroundPosition[{groundPos_}, groundResetValue_, vel_,
  ups_] := (
  groundPos += vel/ups;
  If[groundPos < -groundResetValue, groundPos = 0];
  groundPos)
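The wrap-around is just a threshold check on the accumulated offset. A minimal Python sketch, assuming the same parameter meanings:

```python
def update_ground(ground_pos, reset_value, vel, ups):
    """Scroll the ground image and snap it back once it has moved one
    tile width, creating the illusion of endless motion."""
    ground_pos += vel / ups
    if ground_pos < -reset_value:
        ground_pos = 0.0
    return ground_pos
```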

For continuously updating sprites, like the bird, the Clock function is often sufficient to cycle through the animation frames at a fixed rate:

Input 58

DynamicModule[{vPos = 10, hPos = 3, vel = 0, ups = 30, gravity = -50,
  kick = 2, previousKeyState = False, worldEdge = 10, imageSize = 250,
  groundPos = 0, groundResetValue = 0.71},
 Framed[Graphics[{
    Inset[city, {0, -0.6}, Scaled[{0, 0}], 10],
    Inset[Dynamic[animationFrames[[Clock[{1, 4, 1}, 0.5]]]],
     {Dynamic[hPos],
      Dynamic[
       updateSpikeyPosition[{vPos, vel, previousKeyState},
        CurrentValue["ControlKey"], kick, gravity, ups]
       ]}, Center, 1.2],
    Inset[
     ground, {Dynamic[
       updateGroundPosition[{groundPos}, groundResetValue, -0.7,
        ups]], -0.3}, Scaled[{0, 1}], 12]},
   PlotRange -> {{0, worldEdge}, {-2, 12}}, ImageSize -> imageSize,
   Background ->
    RGBColor[0.44313725490196076`, 0.7725490196078432,
     0.8117647058823529]
   ], FrameMargins -> -1], SaveDefinitions -> True]

Output 58 animation

The worst case is noncontinuous movement, such as movement triggered by pressing a button. I still want the dynamic expression to update as fast as possible so that the position updates normally, but I don't want the animation frames to cycle at that same rate. I tend to use Mod to introduce a delay into the update of the frame index:

Input 59

DynamicModule[{animationCounter = 0, animationDelay = 10,
  frameCounter = 1},
 Graphics[
  Inset[Dynamic[animationCounter++;
    If[Mod[animationCounter, animationDelay] == 0,
     animationCounter = 0; frameCounter++];
    If[frameCounter == 5, frameCounter = 1];
    animationFrames[[frameCounter]]]]
  ], SaveDefinitions -> True]
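The counter arithmetic is easy to test in isolation. Here is the same Mod-based delay as a Python sketch (the defaults match the post's delay of 10 and four animation frames):

```python
def next_frame(animation_counter, frame_counter,
               animation_delay=10, n_frames=4):
    """Advance the animation only every `animation_delay` ticks,
    cycling frame_counter through 1..n_frames."""
    animation_counter += 1
    if animation_counter % animation_delay == 0:
        animation_counter = 0
        frame_counter += 1          # advance one sprite frame
    if frame_counter == n_frames + 1:
        frame_counter = 1           # wrap back to the first frame
    return animation_counter, frame_counter
```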

If you'd like to play around with the code you read about today, you can download this post as a Wolfram Notebook.

Computational Gastronomy: Using the Wolfram Language to Prepare a Sumptuous Holiday Feast
http://blog.wolfram.com/2017/12/22/computational-gastronomy-using-the-wolfram-language-to-prepare-a-sumptuous-holiday-feast/
Fri, 22 Dec 2017, Micah Lindley

Plated meal restyled

In recent years there’s been a growing interest in the intersection of food and technology. However, many of the new technologies used in the kitchen are cooking tools and devices such as immersion circulators, silicone steam baskets and pressure ovens. Here at Wolfram, our approach has been a bit different, with a focus on providing tools that can query for, organize, visualize and compute data about food, cooking and nutrition.

Last Christmas I went home to Tucson, Arizona, to spend time with my family over the holidays. Because I studied the culinary arts and food science, I was quickly enlisted to cook Christmas dinner. There were going to be a lot of us at my parents’ house, so I was aware this would be no small task. But I curate food and nutrition data for Wolfram|Alpha, so I knew the Wolfram technology stack had some excellent resources for pulling off this big meal without a hitch.

Building a Christmas Dinner Survey

Our family has diverse tastes, and we had at least one vegan in the bunch last year. I wanted to make sure that everyone felt included and was served a meal they could really enjoy. So with the help of one of my Wolfram colleagues, I created a Christmas dinner survey using the Wolfram Cloud and Wolfram Data Drop.

Setting Up a Databin

The Wolfram Data Drop is a great way to collect and store data from IoT (Internet of Things) devices. Whether you’re collecting weather data from your desk or mileage data from your car, Data Drop provides a great platform to collect data and store it safely in the Wolfram Cloud, and it allows you to use tools in the Wolfram Language to analyze the data at your leisure. The Wolfram Data Drop can also be used to collect data from people through web forms built in the Wolfram Language.

The first step in using the Wolfram Data Drop is to create a new databin where the data will be stored. The code shown here demonstrates how easy this is and includes options for naming the databin, specifying the administrator and setting permissions.

foodSurveyDatabin code

foodSurveyDatabin = CreateDatabin[
   "Name" -> "Christmas Dinner Survey 2017",
   "Administrator" -> "micahl@wolfram.com",
   Permissions -> "Public"
   ]

Creating a Web Form

Making web forms in the Wolfram Language is one of my personal favorite features introduced in the last few years. While creating web forms may not generate headlines in the way functions like Classify or ImageRestyle might, the practical applications of web forms in a variety of contexts are nearly limitless—and they are a great introduction to writing code and computational thinking because the web forms are easy to design, build, test and tweak, even for people new to the Wolfram Language and programming. Using web forms is a great way to collect data because the forms are easy to fill out. We can also deploy them to the web so we can access them from desktop or mobile platforms. They also store the interpreted data in a uniform structure, making the later steps of data analysis and visualization much, much easier.

Using FormFunction, DatabinAdd, and CloudDeploy, we can build a custom survey that collects and interprets data. This is then stored using WDF (Wolfram Data Framework), which utilizes the Wolfram Knowledgebase and knows about Entities (such as cities and foods), as well as quantities, images and networks.

Data Drop screenshot

Christmas dinner survey screen shot

(Here’s a downloadable notebook with the code to build this web form.)

Sending Out Emails Programmatically

After I built the survey, I needed an easy way to invite my family members to respond to it. Using tools available in the Wolfram Language, I was able to quickly design a form email that I could deploy in just a few lines of code that would also address each family member individually. I constructed an Association with the names and email addresses of my family members and wrote a brief email asking the recipient to fill out the Christmas dinner survey. Then I used SendMail to slot in the name, email address, email body and survey hyperlink, and sent the emails en masse to everyone invited to Christmas dinner.

emailList codeemailBody code

emailBody =
  "We're looking forward to having you over for Christmas dinner! As \
we'll be hosting a decent-sized crowd, I've created a brief survey \
where you can fill out any dietary preferences, restrictions, \
allergies, or recommendations. Here's the link: ";
hyperlink = Hyperlink[
   "Christmas Dinner Survey 2017",
   surveyURL]; (* surveyURL: the URL of the deployed survey form *)
SendMail[
    "To" -> #EmailAddress,
    "Subject" -> "Christmas Dinner Survey 2017",
    "Body" -> StringTemplate["Dear ``,\n\n``"][#Name, emailBody],
    "Signature" -> "Thanks,\nMicah"
    ] & /@ emailList
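The mail merge itself is just string templating over a recipient list. Here is a Python sketch of the body-building step (it only constructs the messages; actually sending them is left out, and the recipient fields are illustrative):

```python
def personalized_emails(recipients, body, signature="Thanks,\nMicah"):
    """Build one personalized message per recipient, analogous to
    mapping SendMail over emailList."""
    template = "Dear {name},\n\n{body}\n\n{signature}"
    return [
        {"to": r["email"],
         "subject": "Christmas Dinner Survey 2017",
         "body": template.format(name=r["name"], body=body,
                                 signature=signature)}
        for r in recipients
    ]
```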

A European Holiday Food Map

Gathering food preferences from my family was just the beginning. I always want to wow the people I cook for, so I also needed some inspiration for enhancing the dishes I’d serve. Using tools in the Wolfram Language, it’s easy to build visual aids to assist with culinary experimentation, which often helps get my creative juices flowing. I’ve personally put loads of food images into Wolfram|Alpha, and I know that the Wolfram Language has access to the treasure trove of content in Wolfram|Alpha. So I thought I’d play around with this resource.

Data visualizations come in many forms these days, and the Wolfram Language provides numerous built-in functions to create them, such as WordCloud and ImageCollage. But I thought I’d take a holiday-food visualization one step further…

I was thinking about how particular holiday dishes and preparations are associated with the nations where they originated. Certain ingredients may be revered or taboo, depending on the local culture and religion. Geography also plays an important role due to the environmental needs of source ingredients. For example, rye and oats are the grains of choice in Northern Europe because wheat doesn’t grow well in that colder climate. I decided using a map with images of traditional holiday dishes could lead to the “aha” moment I was looking for.

To get started, I curated image URLs of holiday dishes from European countries and gathered them in an Association. Associations are a great way to store data because they are very fast and have labeled key/value pairs, which are easy to understand and query. Next, I used Put to create a package file that stores Wolfram Language expressions as a separate file, allowing for better organization of data and code. Then, using CloudObject and CopyFile, I uploaded the package file to the Wolfram Cloud, using SetPermissions to set the access to "Public", which allows anyone to access the data. Finally, I used a simple CloudGet on the CloudObject to download the data from the Wolfram Cloud directly into a notebook.

The next steps in the process take the image data from the Wolfram Cloud and visualize the images using geography data built into the Wolfram Language. Using functions such as GeoGraphics and GeoStyling, I was able to construct a map of images of traditional holiday foods displayed over their home nations. The Tooltip function provides a clean and nifty way to display the name of each dish without cluttering the map with textual labels or unsightly keys. ImageAssemble tiles the images into a collage, so the dish in question is easier to see when displayed on its country of origin. And EdgeForm defines the borders of each country, making the image easier to recognize as a map of Europe.

To collect the images I searched Creative Commons. From there, I simply grabbed the image file name, author attribution, dish name and country and placed them in a List of Associations.

Creating this map requires a bit of curation. I assembled the images I needed in this package file you can download. Just make sure you place it in the same file directory as the target notebook.

euroHolidayFoodsCloudObject code

euroHolidayFoodsCloudObject = CloudObject[];
CopyFile["EuropeanChristmasDishImages.m", euroHolidayFoodsCloudObject];
SetPermissions[euroHolidayFoodsCloudObject, "Public"];
euroHolidayFoods = CloudGet[euroHolidayFoodsCloudObject]

(This output is only a sample set with the first three entries. The package file I mentioned earlier has the complete set.)

The map is a slick bit of code that uses GeoGraphics to tile each image over its source country. Tooltip allows you to hover the cursor over each country to see a pop-up of the associated food.

GeoGraphics Europe map with holiday foods

GeoGraphics[
 Module[{image = Import[#ImageURL]},
    image =
     image /. i_Image /; (ImageDimensions[i][[1]] > 1000) :>
       ImageResize[i, 500];
    Tooltip[
     {GeoStyling[{"Image",
        ImageAssemble[ConstantArray[image, {3, 3}]]},
       DataRange -> {{0, 1}, {0, 1}}],
      EdgeForm[Gray],
      Polygon[#Country]},
     Framed[Row@{#DishName, "   ", image}, FrameStyle -> Gray]]
    ] & /@ euroHolidayFoods,
 GeoRange -> Entity["GeographicRegion", "Europe"],
 ImageSize -> Large,
 GeoRangePadding -> {{Quantity[-50, "Kilometers"],
    Quantity[-600, "Kilometers"]}, {Quantity[-250,
     "Kilometers"], -Quantity[575, "Kilometers"]}}]

Europe holiday foods map

Computational Solutions to Culinary Conundrums

My Christmas dinner project was well on its way. I had dish requests. I had some ideas for sprucing them up with some unusual ingredients and preparation methods from “the old countries.” But would I have everything I needed to make it all happen?

Running out of ingredients while cooking a meal for a large group is one of the most discouraging and frustrating experiences for home cooks. Cooking during the holidays can be especially challenging because many grocery stores and markets are closed or have limited hours. Plus, when cooking a large meal, it can be difficult to find time to duck out and buy the missing ingredients, since so many tasks must be attended to throughout meal preparation. Fortunately, the Wolfram Language has you covered.

My brother Caleb requested carrot purée as a side dish for Christmas dinner. I was eager to cater to my brother’s request, but I also wanted to add a creative twist to the dish—however, I needed a bit of assistance. Despite years of studying food, I couldn’t remember all the different colors carrots can be. But the Wolfram Language knows about thousands of foods, including carrots. In a couple of lines of code, it was easy to access data including an image of the food and information about interior and exterior colors. I found data about price look-up codes (commonly referred to as PLUs), allowing for a quick confirmation that the gnarled root vegetable in my hand was a carrot and not a parsnip.

Carrot comparisons screen shot

Vegetable PLUs screen shot

Before I ran to the grocery store, I wanted a backup plan in case they didn’t have any non-orange carrots. Fortunately, a simple query in the Wolfram Language can provide carefully curated ingredient substitutions. With some extra code, the ingredients are visualized in a grid, which helps to digest the data more easily. The list is also ordered so the first ingredient (or set of ingredients) listed is considered the best substitution and the second ingredient listed is the second-best substitute. As someone who taught cooking classes for several years, I know that one of the most common meal-killers rears its ugly head when people are missing that one crucial ingredient. But with the help of the Wolfram Language, it’s like having a professional chef with you in the kitchen to suggest substitutions and much more, such as offering proportion calculations, providing nutrition information, giving you descriptions of new ingredients… the list goes on!

visualizeReplacements carrot output

(Here’s a downloadable notebook with the code to build the visualizeReplacements function.)

Keep Wolfram in Your Kitchen

Our big family Christmas dinner last year was a tremendous hit. Whenever I cook at my parents’ house I always encounter a few curve balls, so I appreciated the help I got from the Wolfram tech stack. My family wanted me to do it again this year, but I decided to stay in Illinois, eat some takeout Chinese food and watch kung fu movies instead. However, if you have an ambitious holiday cooking project ahead of you, I encourage you to experiment in the kitchen with the Wolfram Language.

A great place to start exploring computational gastronomy is the Wolfram|Alpha Examples page for food and nutrition. There you can see the wide variety of Wolfram|Alpha query fields on cooking, ingredients and other food-related data, as well as food-themed Wolfram|Alpha blog posts. If the Wolfram|Alpha app isn’t on your smartphone, it should be… especially when you’re in the thick of meal prep and could use some data or number crunching! Happy computational cooking!

Appendix: Creating the Lead Image

My dad studied art in college, so growing up, my parents’ house was always full of art made by my dad and his friends from college. In addition, I always try to take pictures of the dishes I’ve prepared when I cook a nice meal for friends and family. Since this blog is about cooking and family, I thought, “Why not combine these images using my favorite new Wolfram Language function, ImageRestyle?” In one quick line of code, any image can be rendered in the style of another image or list of images. So I took a picture of a dish I had prepared and blended it with a list containing a painting by my dad’s friend Mike and the original photo of the dish (including the original keeps the colors in the final image brighter). Voilà: an image that looks like a painting of food, without a drop of paint on my clothes.

Here’s another dish I photographed and then restyled using the same technique and art piece:

Lead image restyle

Creating Mathematical Gems in the Wolfram Language http://blog.wolfram.com/2017/12/14/creating-mathematical-gems-in-the-wolfram-language/ http://blog.wolfram.com/2017/12/14/creating-mathematical-gems-in-the-wolfram-language/#comments Thu, 14 Dec 2017 15:00:10 +0000 Michael Gammon http://blog.internal.wolfram.com/?p=39801 The Wolfram Community group dedicated to visual arts abounds with technically and aesthetically stunning contributions. Many of these posts come from prolific contributor Clayton Shonkwiler, who has racked up over 75 “staff pick” accolades. Recently I got the chance to interview him and learn more about the role of the Wolfram Language in his art and creative process. But first, I asked Wolfram Community’s staff lead, Vitaliy Kaurov, what makes Shonkwiler a standout among mathematical artists.

Stereo Vision Rise Up“Stereo Vision” and “Rise Up”

“Clay, I think, pays special attention to expressing a math concept behind the art,” Kaurov says. “It is there, like a hidden gem, but a layman will not recognize it behind the beautiful visual. So Clay’s art is a thing within a thing, and there is more to it than meets the eye. That mystery is intriguing once you know it is there. But it’s not easy to express something as abstract and complex as math in something as compact and striking as a piece of art perceivable in a few moments. This gap is bridged with help from the Wolfram Language, because it’s a very expressive, versatile medium inspiring the creative process.”

Shonkwiler is a mathematics professor at Colorado State University and an avid visual artist, specializing in Wolfram Language–generated GIF animations and static images based on nontrivial math. “I am interested in geometric models of physical systems. Currently I’m mostly focused on geometric approaches to studying random walks with topological constraints, which are used to model polymers,” he says.

In describing how he generates ideas, he says, “There are some exceptions, but there are two main starting points. Often I get it into my head that I should be able to make an animation from some interesting piece of mathematics. For example, in recent months I’ve made animations related to the Hopf fibration.”

Stay Upright
“Stay Upright”

"Stay Upright" code

(* reconstructed; the Manipulate/Graphics/InfiniteLine wrappers lost in
   transcription are assumed from the animation's appearance *)
DynamicModule[{n = 60, a = \[Pi]/4,
  viewpoint = {1, 1.5, 2.5}, \[Theta] = 1.19, r = 2.77, plane,
  cols = RGBColor /@ {"#f43530", "#e0e5da", "#00aabb", "#46454b"}},
 plane = NullSpace[{viewpoint}];
 Manipulate[
  Graphics[
   {Table[{Blend[cols[[;; -2]], r/\[Pi]],
       InfiniteLine[
        RotationMatrix[\[Theta]].plane.# & /@ {{Cot[r] Csc[a], 0,
           Cot[a]}, {0, Cot[r] Sec[a], -Tan[a]}}]}, {r, \[Pi]/(2 n) +
       s, \[Pi], 2 \[Pi]/n}]}, Background -> cols[[-1]],
   PlotRange -> r, ImageSize -> 540], {s, 0., 2 \[Pi]/n}]]

Like many artists, Shonkwiler draws inspiration from existing art and attempts to recreate it or improve upon it using his own process. He says, “Whether or not I actually succeed in reproducing a piece, I usually get enough of a feel for the concept to then go off in some new direction with it.”

As to the artists who inspire him, Shonkwiler says, “There’s an entire community of geometric GIF artists on social media that I find tremendously inspiring, including Charlie Deck, davidope, Saskia Freeke and especially Dave Whyte. I should also mention David Mrugala, Alberto Vacca Lepri, Justin Van Genderen and Pierre Voisin, who mostly work in still images rather than animations.” If you want to see other “math art” that has inspired Shonkwiler, check out Frank Farris, Kerry Mitchell, Henry Segerman, Craig Kaplan and Felicia Tabing.

Another artistic element in Shonkwiler’s pieces is found in the title he creates for each one. You’ll find clever descriptors, allusions to ancient literature and wordplay with mathematical concepts. He says he usually creates the title after the piece is completely done. “I post my GIFs in a bunch of places online, but Wolfram Community is usually first because I always include a description and the source code in those posts, and I like to be able to point to the source code when I post to other places. So what often happens is I’ll upload a GIF to Wolfram Community, then spend several minutes staring at the post preview, trying to come up with a title.” Although he takes title creation seriously, Shonkwiler says, “Coming up with titles is tremendously frustrating because I’m done with the piece and ready to post it and move on, but I need a title before I can do that.”


"Interlock" code

Stereo[{x1_, y1_, x2_, y2_}] := {x1/(1 - y2), y1/(1 - y2),
   x2/(1 - y2)};

(* reconstructed; the Manipulate/Graphics3D/Tube wrappers lost in
   transcription are assumed (the note below confirms a Manipulate) *)
With[{n = 30, m = 22, viewpoint = 5 {1, 0, 0},
  cols = RGBColor /@ {"#2292CA", "#EEEEEE", "#222831"}},
 Manipulate[
  Graphics3D[
   {Table[
      Tube[Table[
        Stereo[
         RotationTransform[-s, {{1, 0, 0, 0}, {0, 0, 0, 1}}][
          1/Sqrt[2] {Cos[\[Theta]], Sin[\[Theta]], Cos[\[Theta] + t],
            Sin[\[Theta] + t]}]], {\[Theta], 0., 2 \[Pi],
         2 \[Pi]/n}]], {t, 0., 2 \[Pi], 2 \[Pi]/m}]},
   ViewPoint -> viewpoint, Boxed -> False, Background -> cols[[-1]],
   ImageSize -> 500, PlotRange -> 10, ViewAngle -> \[Pi]/50,
   Lighting -> {{"Point", cols[[1]], {0, -1, 0}}, {"Point",
      cols[[2]], {0, 1, 0}}, {"Ambient", RGBColor["#ff463e"],
      viewpoint}}], {s, 0, \[Pi]}]]

(This code plots the curves with fewer points so as to increase the responsiveness of the Manipulate.)

Other Wolfram Community members have complimented Shonkwiler on the layers of color he gives his geometric animations. Likewise, his use of shading often enhances the shapes within his art. But interestingly, his work usually begins monochromatically. “Usually I start in black and white when I’m working on the geometric form and trying to make the animation work properly. That stuff is usually pretty nailed down before I start thinking about colors. I’m terrible at looking at a bunch of color swatches and envisioning how they will look in an actual composition, so usually I have to try a lot of different color combinations before I find one I like.”

Shonkwiler says that the Wolfram Language makes testing out color schemes a quick process. “If you look at the code for most of my animations, you’ll find a variable called cols so that I can easily change colors just by changing that one variable.”

Magic Carpet Square Up“Magic Carpet” and “Square Up”

I asked Shonkwiler if he conceives the visual outcome before he starts his work, or if he plays with the math and code until he finds something he decides to keep. He said it could go either way, or it might be a combination. “‘Magic Carpet’ started as a modification of ‘Square Up,’ which was colored according to the z coordinate from the very earliest versions, so that’s definitely a case where I had something in my head that turned out to require some extra fiddling to implement. But often I’m just playing around until I find something that grabs me in some way, so it’s very much an exploration.”

Renewable Resources Inner Light“Renewable Resources” and “Inner Light”

Shonkwiler actually has a lot of pieces that are related to each other mathematically. Regarding the two above, “They’re both visualizations of the same Möbius transformation. A Möbius transformation of a sphere is just a map from the sphere to itself that preserves the angles everywhere. They’re important in complex analysis, algebraic geometry, hyperbolic geometry and various other places, which means there are lots of interesting ways to think about them. They come up in my research in the guise of automorphisms of the projective line and as isometries of the hyperbolic plane, so they’re often on my mind.”

“To make ‘Inner Light,’ I took a bunch of concentric circles in the plane and just started scaling the plane by more and more, so that each individual circle is getting bigger and bigger. Then I inverse-stereographically project up to the sphere, where the circles become circles of latitude and I make a tube around each one. ‘Renewable Resource’ is basically the same thing, except I just have individual points on each circle and I’m only showing half of the sphere in the final image rather than the whole sphere.”
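The inverse stereographic projection he describes can be written compactly. This is a standard formulation (my own sketch, not Shonkwiler's code), projecting a plane point onto the unit sphere from the north pole:

```mathematica
(* inverse stereographic projection: plane point {x, y} -> unit sphere *)
invStereo[{x_, y_}] :=
  {2 x, 2 y, x^2 + y^2 - 1}/(1 + x^2 + y^2);

invStereo[{0, 0}]  (* the origin maps to the south pole {0, 0, -1} *)
```

As the plane is scaled up, every point's image drifts toward the north pole {0, 0, 1}, which is why concentric circles become circles of latitude climbing the sphere.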

When I asked Shonkwiler about his philosophy on the relationship between math and aesthetics, he said, “Part of becoming a mathematician is developing a very particular kind of aesthetic sense that tells you whether an argument or a theory is beautiful or ugly, but this has probably been overemphasized to the point of cliché.”

However, Shonkwiler continued to mull the question. “I do think that when you make a visualization of a piece of interesting mathematics, it is often the case that it is visually compelling on some deep level, even if not exactly beautiful in a traditional sense. That might just be confirmation bias on my part, so there’s definitely an empirical question as to whether that’s really true and, if it is, you could probably have a metaphysical or epistemological debate about why that might be. But in any case, I think it’s an interesting challenge to find those visually compelling fragments of mathematics and then to try to present them in a way that also incorporates some more traditional aesthetic considerations. That’s something I feel like I’ve gotten marginally better at over the years, but I’m definitely still learning.”

Here is Shonkwiler with one of his GIFs at MediaLive x Ello: International GIF Competition in Boulder, Colorado:

Clay at GIF competition

Check out Clayton Shonkwiler’s Wolfram Community contributions. To explore his work further, visit his blog and his website. Of course, if you have Wolfram Language–based art, post it on Wolfram Community to strike up a conversation with other art and Wolfram Language enthusiasts. It’s easy and free to sign up for a Community account.

Tracking a Descent to Savagery with the Wolfram Language: Plotting Sentiment Analysis in Lord of the Flies http://blog.wolfram.com/2017/12/07/tracking-a-descent-to-savagery-with-the-wolfram-language-plotting-sentiment-analysis-in-lord-of-the-flies/ http://blog.wolfram.com/2017/12/07/tracking-a-descent-to-savagery-with-the-wolfram-language-plotting-sentiment-analysis-in-lord-of-the-flies/#comments Thu, 07 Dec 2017 15:36:30 +0000 Jon McLoone http://blog.internal.wolfram.com/?p=39864 Computation is no longer the preserve of science and engineering, so I thought I would share a simple computational literary analysis that I did with my daughter.

Shell Lord of the Flies

Hannah’s favorite book is Lord of the Flies by William Golding, and as part of a project she was doing, she wanted to find some quantitative information to support her critique.

Spoiler alert: for those who don’t know it, the book tells the story of a group of schoolboys shipwrecked on an island. Written as a reaction to The Coral Island, an optimistic and uplifting book with a similar initial premise, Lord of the Flies instead relates the boys’ descent into savagery once they are separated from societal influence.

The principal data that Hannah asked for was a timeline of the appearance of the characters. This is a pretty straightforward bit of counting. For a given character name, I can search the text for the positions it appears in, and while I am at it, label the data with Legended so that it looks nicer when plotted.

nameData[name_, label_] :=
  Legended[
   First /@ StringPosition[$lotf, name, IgnoreCase -> True]/
    StringLength[$lotf], label];
nameData[name_] := nameData[name, name];

The variable $lotf contains the text of the book (there is some discussion later about how to get that). By dividing the string position by the length of the book, I am rescaling the range to 0–1 to make alignment with later work easier. Now I simply create a Histogram of the data. I used a SmoothHistogram, as it looks nicer. The smoothing parameter of 0.06 is rather subjective, but gave this rather pleasingly smooth overview without squashing all the details.
characters = SmoothHistogram[{
    nameData["Ralph"],
    nameData["Jack"],
    nameData["Beast", "The Beast"]}, 0.06,
   PlotStyle -> Thickness[0.01],
   PlotRange -> {{0, 1}, All}, PlotLabel -> "Character appearances",
   Ticks -> None]

Already we can see some of the narrative arc of the book. The protagonist, Ralph, makes an early appearance, closely followed by the antagonist, Jack. The nonexistent Beast appears as a minor character early in the book as the boys explore the island, before becoming a major feature in the middle of the book, preceding Jack’s rise as he exploits fear of the Beast to take power. Ralph becomes significant again toward the end as the conflict between him and Jack reaches its peak.

But most of Hannah’s critique was about meaning, not plot, so we started talking about the tone of the book. To quantify this, we can use a simple machine learning classifier on the sentences and then do basic statistics on the result.
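As a minimal illustration of that idea (the example sentences here are my own, not from the book):

```mathematica
(* the built-in classifier assigns a sentiment class to each sentence *)
Classify["Sentiment", {"What a lovely island!",
  "Everything was ruin and death."}]

(* it can also report class probabilities for a single sentence *)
Classify["Sentiment", "What a lovely island!", "Probabilities"]
```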

By breaking the text into sentences and then using the built-in sentiment analyzer, we can hunt out the sentence most likely to be a positive one.

MaximalBy[TextSentences[$lotf],
 Classify["Sentiment", #, {"Probability", "Positive"}] &]

The classifier returns only "Positive", "Negative" and "Neutral" classes, so if we map those to numbers we can take a moving average to produce an average sentiment vector with a window of 500 sentences.

sentimentData =
  Legended[
   MeanFilter[
     ReplaceAll[
      Classify["Sentiment", TextSentences[$lotf]],
      {"Positive" -> 1, "Negative" -> -1,
       "Neutral" | Indeterminate -> 0}], 500]*20,
   "Sentiment"];

Putting that together with the character occurrences allows some interesting insights.

Lord of the Flies high-res

We can see that there is an early negative tone as the boys are shipwrecked, which quickly becomes positive as they explore the island and their newfound freedom. The tone becomes more neutral as concerns rise about the Beast, and turns negative as Jack rises to power. There is a brief period of positivity as Ralph returns to prominence before the book dives into bleak territory as everything goes bad (especially for Piggy).

Digital humanities is a growing field, and I think the relative ease with which the Wolfram Language can be applied to text analysis and other kinds of data science should help computation make useful contributions to many fields that were once considered entirely subjective.

Appendix: Notes on Data Preparation

Because I wanted to avoid needing to OCR my hard copy of the book, I used a digital copy. However, the data was corrupted by some page headers, artifacts of navigation hyperlinks and an index page. So, here is some rather dirty string pattern work to strip out those extraneous words and numbers to produce a clean string containing only narrative:

$lotf = StringReplace[
   StringDelete[
    StringReplace[
     StringDrop[
      StringDelete[
       Import["http://<redacted>.txt"],
       {"Page " ~~ Shortest[___] ~~
         " of 290\n\nGo Back\n\nFull Screen\n\nClose\n\nQuit",
        "Home Page\n\nTitle Page\n\nContents\n\n!!\n\n\"\"\n\n!\n\n\""}],
      425],
     "\n" -> " "],
    {"Home Page  " ~~ Shortest[___] ~~ "  Title Page  Contents",
     "!!  \"\"  !  \"    ", "\f"}],
   "  " -> " "];

Appendix 2: Text Labels

This is the code for labeling the plot with key moments in the book:

sentenceLabel[{sentence_String, label_}] :=
  Block[{pos =
     Once[First[FirstPosition[TextSentences[$lotf], sentence]]]},
   Callout[{pos/Length[TextSentences[$lotf]], sentimentData[[1, pos]]},
    label, Appearance -> "Balloon"]]

eventLabels = ListPlot[sentenceLabel /@ {
     {"As they watched, a flash of fire appeared at the root of one wisp, and then the smoke thickened.", "The fire"},
     {"Only the beast lay still, a few yards from the sea.", "Simon dies"},
     {"Samneric were savages like the rest; Piggy was dead, and the conch smashed to powder.", "Piggy dies"},
     {"\[OpenCurlyDoubleQuote]I\[CloseCurlyQuote]m chief then.\[CloseCurlyDoubleQuote]", "Ralph leads"},
     {"We aren\[CloseCurlyQuote]t enough to keep the fire burning.\[CloseCurlyDoubleQuote]", "Ralph usurped"}}];

Appendix 3: Conch Shell Word Cloud Image

It doesn’t provide much insight, but the conch word cloud at the top of the article was generated with this code:

img = Import["http://pngimg.com/uploads/conch/conch_PNG18242.png"];
ImageMultiply[
 Rasterize[
  WordCloud[DeleteStopwords[$lotf], AlphaChannel[img], MaxItems -> 50,
   WordSelectionFunction -> (StringLength[#] >= 4 &)],
  ImageSize -> ImageDimensions[img]],
 ImageAdjust[img, {-0.7, 1.3}]]

Reference—conch shell: Source image from pngimg.com Creative Commons 4.0 BY-NC.

Finding X in Espresso: Adventures in Computational Lexicology http://blog.wolfram.com/2017/11/30/finding-x-in-espresso-adventures-in-computational-lexicology/ http://blog.wolfram.com/2017/11/30/finding-x-in-espresso-adventures-in-computational-lexicology/#comments Thu, 30 Nov 2017 19:08:03 +0000 Vitaliy Kaurov http://blog.internal.wolfram.com/?p=39579 When Does a Word Become a Word?

“A shot of expresso, please.” “You mean ‘espresso,’ don’t you?” A baffled customer, a smug barista—media is abuzz with one version or another of this story. But the real question is not whether “expresso” is a correct spelling, but rather how spellings evolve and enter dictionaries. Lexicographers do not directly decide that; the data does. Long and frequent usage may qualify a word for endorsement. Moreover, I believe the emergent proliferation of computational approaches can help to form an even deeper insight into the language. The tale of expresso is a thriller from a computational perspective.

X in expresso data analysis poster

In the past I had taken the incorrectness of expresso for granted. And how could I not, with the thriving pop culture of “no X in espresso” posters, t-shirts and even proclamations from music stars such as “Weird Al” Yankovic. That changed when a statement in a recent note by Merriam-Webster’s online dictionary caught my eye: “… expresso shows enough use in English to be entered in the dictionary and is not disqualified by the lack of an x in its Italian etymon.” Can this assertion be quantified? I hope this computational treatise will convince you that it can. But to set the backdrop right, let’s first look into the history.

Expresso in video segmentNo X in espresso poster

History of Industry and Language

In the 19th century’s steam age, many engineers applied steam technology to accelerate the coffee-brewing process and increase customer turnover, as coffee was a booming business in Europe. The original espresso machine is usually attributed to Angelo Moriondo of Turin, who obtained a patent in 1884 for “new steam machinery for the economic and instantaneous confection of coffee beverage.” But despite further engineering improvements (see the Smithsonian), for decades espresso remained only a local Italian delight. And for words to jump between languages, industries need to jump the borders—this is how industrial evolution triggers language evolution. The first Italian to take the espresso business truly international was Achille Gaggia, a coffee bartender from Milan.

Expresso timeline

In 1938 Gaggia patented a new method using the celebrated lever-driven piston mechanism allowing new record-brewing pressures, quick espresso shots and, as a side effect, even crema foam, a future signature of an excellent espresso. This allowed the Gaggia company (founded in 1948) to commercialize the espresso machines as a consumer product for use in bars. There was about a decade span between the original 1938 patent and its 1949 industrial implementation.

Original espresso maker

Around 1950, espresso machines began crossing Italian borders to the United Kingdom, America and Africa. This is when the first large spike happens in the use of the word espresso in the English language. The spike and following rapid growth are evident from the historic WordFrequencyData of published English corpora plotted across the 20th century:

history[w_] :=   WordFrequencyData[w, "TimeSeries", {1900, 2000}, IgnoreCase -> True]

The function above gets TimeSeries data for the frequencies of words w in a fixed time range from 1900–2000 that, of course, can be extended if needed. The data can be promptly visualized with DateListPlot:

DateListPlot[history[{"espresso", "expresso"}], PlotRange -> All,   PlotTheme -> "Wide"]

The much less frequent expresso also gains popularity slowly but steadily. Its simultaneous growth is more obvious with a log-scaled vertical frequency axis. To be able to easily switch between log and regular scales and also improve the visual comprehension of multiple plots, I will define a function:

vkWordFreqPlot[list_, plot_] :=
  plot[MovingAverage[#, 3] & /@
    WordFrequencyData[list, "TimeSeries", {1900, 2000},
     IgnoreCase -> True], PlotTheme -> "Detailed", AspectRatio -> 1/3,
   Filling -> Bottom, PlotRange -> All, InterpolationOrder -> 2,
   PlotLegends -> Placed[Automatic, {Left, Top}]];

The plot below also compares the espresso/expresso pair to a typical pair acknowledged by dictionaries, unfocused/unfocussed, stemming from American/British usage:

vkWordFreqPlot[{"espresso", "expresso", "unfocused",
  "unfocussed"}, DateListLogPlot]

The overall temporal behavior of frequencies for these two pairs is quite similar, as it is for many other words of alternative orthography acknowledged by dictionaries. So why is espresso/expresso so controversial? A good historical account is given by Slate Magazine, which, as does Merriam-Webster, supports the official endorsement of expresso. And while both articles give a clear etymological reasoning, the important argument for expresso is its persistent frequent usage (even in such distinguished publications as The New York Times). As it stands as of the date of this blog, the following lexicographic vote has been cast in support of expresso by some selected trusted sources I scanned through. Aye: Merriam-Webster online, Harper Collins online, Random House online. Nay: Cambridge Dictionary online, Oxford Learner’s Dictionaries online, Oxford Dictionaries online (“The spelling expresso is not used in the original Italian and is strictly incorrect, although it is common”; see also the relevant blog), Garner’s Modern American Usage, 3rd edition (“Writers frequently use the erroneous form [expresso]”).

In times of dividing lines, data helps us to refocus on the whole picture and dominant patterns. To stress diversity of alternative spellings, consider the pair amok/amuck:

vkWordFreqPlot[{"amok", "amuck"}, DateListPlot]

Of a rather macabre origin, amok came to English around the mid-1600s from the Malay amuk, meaning “murderous frenzy,” referring to a psychiatric disorder of a manic urge to murder. The pair amok/amuck has interesting characteristics. Both spellings can be found in dictionaries. The WordFrequencyData above shows the rich dynamics of oscillating popularity, followed by the competitive rival amuck becoming the underdog. The difference in orthography does not have a typical British/American origin, which should affect how alternative spellings are sampled for statistical analysis further below. And finally, the Levenshtein EditDistance is not equal to 1…

EditDistance["amok", "amuck"]

… in contrast to many typical cases such as:

EditDistance @@@ {{"color", "colour"}, {"realize",     "realise"}, {"aesthetic", "esthetic"}}

This will also affect the sampling of data. My goal is to extract from a dictionary a data sample large enough to describe the diversity of alternatively spelled words that are also structurally close to the espresso/expresso pair. If the basic statistics of this sample assimilate the espresso/expresso pair well, then it quantifies and confirms Merriam-Webster’s assertion that “expresso shows enough use in English to be entered in the dictionary.” But it also goes a step further, because now all pairs from the dictionary sample can be considered as precedents for legitimizing expresso.

Dictionary as Data

Alternative spellings come in pairs and should not be considered separately, because there is statistical information in their relation to each other. For instance, the word frequency of expresso should not be compared with the frequency of an arbitrary word in a dictionary. Rather, we should consider an alternative spelling pair as a single data point with coordinates {f+, f−} denoting the higher and lower word frequencies of the more and less popular spellings, respectively, and always in that order. I will use the weighted average of a word frequency over all years and all data corpora. It is a better overall metric than a word frequency at a specific date, and it avoids the confusion of a frequency changing its state between higher f+ and lower f− at different time moments (as we saw for amok/amuck). The weighted average is the default value of WordFrequencyData when no date is specified as an argument.
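For instance, a small sketch of the convention just described (the returned values depend on the current corpus data, so no output is shown):

```mathematica
(* with no date argument, WordFrequencyData returns the weighted average
   frequency over all years and corpora; the pair forms one data point {f+, f-} *)
WordFrequencyData[{"espresso", "expresso"}, IgnoreCase -> True]
```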

The starting point is a dictionary that is represented in the Wolfram Language by WordList and contains 84,923 definitions:

Length[words = WordList["KnownWords"]]

There are many types of dictionaries with quite varied sizes. There is no dictionary in the world that contains all words. And, in fact, all dictionaries are outdated as soon as they are published due to continuous language evolution. My assumption is that the exact size or date of a dictionary is unimportant as long as it is “modern and large enough” to produce a quality sample of spelling variants. The curated built-in data of the Wolfram Language, such as WordList, does a great job at this.

We notice right away that language is often prone to quite simple laws and patterns. For instance, it is widely assumed that lengths of words in an English dictionary…

Histogram[StringLength[words], Automatic, "PDF",   PlotTheme -> "Detailed", PlotRange -> All]

… follow quite well one of the simplest statistical distributions, the PoissonDistribution. The Wolfram Language machine learning function FindDistribution picks up on that easily:


Show[%%, DiscretePlot[PDF[%, k], {k, 0, 33}, Joined -> True]]
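For readers following along without a Wolfram kernel, the essence of that fit is simple: the maximum-likelihood estimate of a Poisson rate is the sample mean. A rough Python sketch, with a toy word list standing in for WordList:

```python
import math

# Toy stand-in for the full dictionary word list.
words = ["cat", "espresso", "expresso", "histogram"]
lengths = [len(w) for w in words]

# The maximum-likelihood Poisson rate is just the sample mean.
lam = sum(lengths) / len(lengths)

def poisson_pmf(k, rate=lam):
    """PMF of a Poisson distribution with the fitted rate at integer k."""
    return math.exp(-rate) * rate ** k / math.factorial(k)
```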

My goal is to search for such patterns and laws in the sample of alternative spellings. But first they need to be extracted from the dictionary.

Extracting Spelling Variants

For ease of data processing and analysis, I will make a set of simplifications. First of all, only the following basic parts of speech are considered to bring data closer to the espresso/expresso case:

royalTypes = {"Noun", "Adjective", "Verb", "Adverb"};

This reduces the dictionary to 84,487 words:

royals = DeleteDuplicates[    Flatten[WordList[{"KnownWords", #}] & /@ royalTypes]]; Length[royals]

Deletion of duplicates is necessary, because the same word can be used as several parts of speech. Further, the words containing any characters beyond the lowercase English alphabet are excluded:

outlaws = Complement[Union[Flatten[Characters[words]]], Alphabet[]]

This also removes all proper names, and drops the number of words to 63,712:

laws = Select[royals, ! StringContainsQ[#, outlaws] &]; Length[laws]
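The same character filter is easy to express in any language; a Python sketch of the idea (the helper name is mine, not the blog's):

```python
import string

def lowercase_only(word_list):
    """Keep only words built entirely from a-z; this drops hyphens,
    apostrophes, accented characters and capitalized proper names."""
    allowed = set(string.ascii_lowercase)
    return [w for w in word_list if set(w) <= allowed]
```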

Every word is paired with the list of its definitions, and every list of definitions is sorted alphabetically to ensure exact matches in determining alternative spellings:

Define[w_] := w -> Sort[WordDefinition[w]]; defs = Define /@ laws;

Next, words are grouped by their definitions; single-word groups are removed, and definitions themselves are removed too. The resulting dataset contains 8,138 groups:

samedefs =   Replace[GatherBy[defs, Last], {_ -> _} :> Nothing, 1][[All, All, 1]]
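In Python, the same group-by-definition step might be sketched as follows, with toy definitions standing in for WordDefinition:

```python
from collections import defaultdict

def group_by_definition(defs):
    """defs maps word -> tuple of its sorted definitions. Returns
    groups of two or more words sharing exactly the same definitions;
    the definitions themselves are dropped."""
    groups = defaultdict(list)
    for word, definition in defs.items():
        groups[definition].append(word)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```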


Different groups of words with the same definition have a variable number of words n ≥ 2…

Framed[TableForm[Transpose[groups = Sort[Tally[Length /@ samedefs]]],    TableHeadings -> {groupsHead = {"words, n", "groups, m"}, None},    TableSpacing -> {1, 2}]]

… where m is the number of groups. They follow a remarkable power law: very roughly, in order of magnitude, m ~ 200000 n^-5.

Show[ListLogLogPlot[groups, PlotTheme -> "Business",    FrameLabel -> groupsHead],  Plot[Evaluate[Fit[Log[groups], {1, x}, x]], {x, Log[2], Log[14]},    PlotStyle -> Red]]
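The fit itself is ordinary least squares on the log-log data. A self-contained Python sketch of the same idea:

```python
import math

def fit_power_law(pairs):
    """Least-squares line through (log n, log m); returns (C, p)
    for the power-law model m = C * n**p."""
    xs = [math.log(n) for n, _ in pairs]
    ys = [math.log(m) for _, m in pairs]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
          / sum((x - xbar) ** 2 for x in xs)
    return math.exp(ybar - slope * xbar), slope
```

On data generated exactly from m = 200000 n^-5, this recovers C ≈ 200000 and p ≈ -5.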

Close synonyms are often grouped together:

Select[samedefs, Length[#] == 10 &]

This happens because WordDefinition is usually quite concise:

WordDefinition /@ {"abjure", "forswear", "recant"}

To separate synonyms from alternative spellings, I could use heuristics based on orthographic rules formulated for classes such as British versus American English. But that would be too complex and unnecessary. It is much easier to consider only word pairs that differ by a small Levenshtein EditDistance. It is highly improbable for synonyms to differ by just a few letters, especially a single one. So while this excludes not only synonyms but also alternative spellings such as amok/amuck, it does help to select words closer to espresso/expresso and hopefully make the data sample more uniform. The computations can be easily generalized to a larger Levenshtein EditDistance, but it would be important and interesting to first check the most basic case:

EditOne[l_] :=    l[[#]] & /@ Union[Sort /@ Position[Outer[EditDistance, l, l], 1]]; samedefspair = Flatten[EditOne /@ samedefs, 1]

This reduces the sample size to 2,882 pairs:
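The one-edit test at the heart of this step is easy to write directly; a Python sketch (my own helper, not the blog's code):

```python
def one_edit_apart(a, b):
    """True iff the Levenshtein edit distance between a and b is
    exactly 1 (one substitution, insertion or deletion)."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a          # make a the shorter word
    i = j = edits = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
        else:
            edits += 1
            if edits > 1:
                return False
            if len(a) == len(b):
                i += 1       # substitution
            j += 1           # skip the extra char in the longer word
    # any trailing character in the longer word counts as one edit
    return edits + (len(b) - j) == 1
```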


Mutations of Spellings

Alternative spellings are different orthographic states of the same word that have different probabilities of occurrence in the corpora. They can inter-mutate depending on the context or environment they are embedded in. Analysis of such mutations seems intriguing. The mutations can be extracted with the help of the SequenceAlignment function. It is based on algorithms from bioinformatics that identify regions of similarity in DNA, RNA or protein sequences, and that often wander into other fields such as linguistics, natural language processing and even business and marketing research. A mutation can be between two characters, or between a character and a “hole” due to character removal or insertion:

SequenceAlignment @@@ {{"color", "colour"}, {"mesmerise",     "mesmerize"}}

In the extracted mutations’ data, the “hole” is replaced by a dash (-) for visual distinction:

mutation =   Cases[SequenceAlignment @@@ samedefspair, _List, {2}] /. "" -> "-"
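Python's standard library can approximate the alignment step with difflib, which implements a similar longest-matching-block algorithm; a sketch (using "-" for the hole, as above; the function name is mine):

```python
from difflib import SequenceMatcher

def mutations(w1, w2):
    """List the (from, to) character changes between two spellings,
    using '-' for an inserted or deleted character (a 'hole')."""
    muts = []
    for op, i1, i2, j1, j2 in SequenceMatcher(None, w1, w2).get_opcodes():
        if op == "replace":
            muts.extend(zip(w1[i1:i2], w2[j1:j2]))
        elif op == "delete":
            muts.extend((c, "-") for c in w1[i1:i2])
        elif op == "insert":
            muts.extend(("-", c) for c in w2[j1:j2])
    return muts
```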

The most probable letters to participate in a mutation between alternative spellings can be visualized with Tally. The most popular letters are s and z thanks to the British/American endings -ise/-ize, surpassed only by the popularity of the “hole.” This probably stems from the fact that dropping letters often makes orthography and phonetics easier.

vertex = Association[Rule @@@ SortBy[Tally[Flatten[mutation]], Last]]; optChart = {ColorFunction -> "Rainbow", BaseStyle -> 15,     PlotTheme -> "Web"}; inChar = PieChart[vertex, optChart, ChartLabels -> Callout[Automatic],     SectorOrigin -> -Pi/9]; BarChart[Reverse[vertex], optChart, ChartLabels -> Automatic,  Epilog -> Inset[inChar, Scaled[{.6, .5}], Automatic, Scaled[1.1]]]

Querying Word Frequencies

The next step is to get the WordFrequencyData for all 2 × 2882 = 5764 words of alternative spellings stored in the variable samedefspair. WordFrequencyData is a very large dataset, and it is stored on Wolfram servers. To query frequencies for a few thousand words efficiently, I wrote some special code that can be found in the notebook attached at the end of this blog. The resulting data is an Association containing alternative spellings, with ordered pairs of words as keys and ordered pairs of frequencies as values. The higher-frequency entry is always first:


The size of the data is slightly less than the original queried set because for some words, frequencies are unknown:

{Length[data], Length[samedefspair] - Length[data]}

Basic Analysis

Having obtained the data, I am now ready to check how well the frequencies of espresso/expresso fall within this data:

esex = Values[   WordFrequencyData[{"espresso", "expresso"}, IgnoreCase -> True]]

As a start, I will examine if there are any correlations between lower and higher frequencies. Pearson’s Correlation coefficient, a measure of the strength of the linear relationship between two variables, gives a high value for lower versus higher frequencies:

Correlation @@ Transpose[Values[data]]

But plotting frequency values at their natural scale hints that a log scale could be more appropriate:

ListPlot[Values[data], AspectRatio -> Automatic,   PlotTheme -> "Business", PlotRange -> All]

And indeed for log-values of frequencies, the Correlation strength is significantly higher:

Correlation @@ Transpose[Log[Values[data]]]

Fitting the log-log of data reveals a nice linear fit…

lmf = LinearModelFit[Log[Values[data]], x, x]; lmf["BestFit"]

… with sensible statistics of parameters:


In the frequency space, this shows a simple and quite remarkable power law that sheds light on the nature of correlations between the frequencies of less and more popular spellings of the same word:

Reduce[Log[SubMinus[f]] == lmf["BestFit"] /.    x -> Log[SubPlus[f]], SubMinus[f], Reals]

Log-log space gives a clear visualization of the data. Obviously, due to the {greater, smaller} sorting of the coordinates {f+, f-}, no data point can lie above the limiting orange line Log[f-]==Log[f+]. The purple line is the linear fit of the power law. The red circle is the median of the data, and the red dot is the value of the espresso/expresso frequency pair:
ListLogLogPlot[data, PlotRange -> All, AspectRatio -> Automatic,   PlotTheme -> "Detailed",  		ImageSize -> 800, Epilog -> {{Purple, Thickness[.004], Opacity[.4],     	Line[Transpose[{{-30, 0}, Normal[lmf] /. x -> {-30, 0}}]]},    		{Orange, Thickness[.004], Opacity[.4],      Line[{-30 {1, 1}, -10 {1, 1}}]},    		{Red, Opacity[.5], PointSize[.02], Point[Log[esex]]},    		{Red, Opacity[.5], Thickness[.01],      Circle[Median[Log[Values[data]]], .2]}}]

A simple, useful transformation of the coordinate system will help our understanding of the data. Away from log-frequency vs. log-frequency space we go. The distance from a data point to the orange line Log[f-]==Log[f+] measures how many times larger the higher frequency is than the lower one. It is given by a linear transformation, a rotation of the coordinate system by 45 degrees. Because this distance is a difference of logs, it relates to the ratio of frequencies:


This random variable is well fit by the famous and versatile WeibullDistribution, which is used universally: in weather forecasting to describe wind speed distributions; in survival analysis; in reliability, industrial and electrical engineering; in extreme value theory; in forecasting technological change; and much more, now including word frequencies:

dist = FindDistribution[   trans = (#1 - #2)/Sqrt[2] & @@@ Log[Values[data]]]
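In plain terms, the transformation used here is a 45-degree rotation of the (log f+, log f-) plane. A small Python sketch of both rotated coordinates (names are mine, for illustration):

```python
def rotate45(log_f_plus, log_f_minus):
    """Rotate the (log f+, log f-) plane by 45 degrees. The first
    coordinate is the distance to the line log f- == log f+ and
    encodes the log of the frequency ratio; the second encodes the
    overall popularity (log of the frequency product, scaled)."""
    root2 = 2 ** 0.5
    u = (log_f_plus - log_f_minus) / root2
    v = (log_f_plus + log_f_minus) / root2
    return u, v
```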

One of the most fascinating facts is “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,” which is the title of a 1960 paper by the physicist Eugene Wigner. One of its notions is that mathematical concepts often apply uncannily and universally far beyond the context in which they were originally conceived. We might have glimpsed that in our data.

Using statistical tools, we can figure out that in the original space the frequency ratio obeys a distribution with a nice analytic formula:

Assuming[SubPlus[f]/SubMinus[f] > 1,   PDF[TransformedDistribution[E^(Sqrt[2] u),     u \[Distributed] WeibullDistribution[a, b]], SubPlus[f]/SubMinus[f]]]
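For reference, the same change of variables can be carried out by hand. Writing r = f+/f- = e^(√2 u) with u distributed as WeibullDistribution[a, b], a sketch of the derivation:

```latex
f_U(u) = \frac{a}{b}\left(\frac{u}{b}\right)^{a-1} e^{-(u/b)^a}, \quad u > 0,
\qquad
r = e^{\sqrt{2}\,u} \;\Longrightarrow\; u = \frac{\ln r}{\sqrt{2}},\quad
\frac{du}{dr} = \frac{1}{\sqrt{2}\,r},
\\[4pt]
f_R(r) = f_U\!\left(\frac{\ln r}{\sqrt{2}}\right)\frac{1}{\sqrt{2}\,r}
       = \frac{a}{\sqrt{2}\,b\,r}
         \left(\frac{\ln r}{\sqrt{2}\,b}\right)^{a-1}
         \exp\!\left[-\left(\frac{\ln r}{\sqrt{2}\,b}\right)^{a}\right],
\qquad r > 1.
```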

It remains to note that the other corresponding transformed coordinate relates to the frequency product…

TraditionalForm[PowerExpand[Log[(SubPlus[f] SubMinus[f])^2^(-1/2)]]]

… and is the position of a data point along the orange line Log[f-]==Log[f+]. It reflects how popular, on average, a specific word pair is relative to the other pairs. One can see that the espresso/expresso value lands well above the median, meaning the frequency of its usage is higher than that of half of the data points.

Nearest can find the closest pairs to espresso/expresso as measured by EuclideanDistance in the frequency space. A look at the 50 nearest pairs shows just how typical the frequencies of espresso/expresso (shown below by a red dot) are. Many nearest neighbors, such as energize/energise and zombie/zombi, belong to the basic everyday vocabulary of most frequent usage:

neighb = Nearest[data, esex, 50]; ListPlot[Association @@ Thread[neighb -> data /@ neighb],  	PlotRange -> All, AspectRatio -> Automatic, PlotTheme -> "Detailed",  	Epilog -> {{Red, Opacity[.5], PointSize[.03], Point[esex]}}]

The temporal behavior of frequencies for a few nearest neighbors shows significant diversity, and is often reminiscent of the behavior of the espresso/expresso pair plotted at the beginning of this article:

Multicolumn[vkWordFreqPlot[#, DateListPlot] & /@ neighb[[;; 10]], 2]

Networks of Mutation

Frequencies allow us to define a direction of mutation, which can be visualized by a DirectedEdge always pointing from the lower to the higher frequency. A Tally of the edges defines weights (or unnormalized probabilities) of particular mutations.

muteWeigh =    Tally[Cases[SequenceAlignment @@@ Keys[data], _List, {2}] /.      "" -> "-"]; edge = Association[Rule @@@ Transpose[{       DirectedEdge @@ Reverse[#] & /@ muteWeigh[[All, 1]],        N[Rescale[muteWeigh[[All, 2]]]]}]];
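The tally-and-rescale step can be sketched in Python as follows (assuming the mutation pairs are already oriented from the less to the more popular spelling; normalization by the maximum count stands in for Rescale):

```python
from collections import Counter

def mutation_edges(directed_mutations):
    """directed_mutations: list of (from_char, to_char) pairs, each
    pointing from the less to the more popular spelling. Returns a
    dict of edge -> weight, rescaled so the largest weight is 1."""
    tally = Counter(directed_mutations)
    top = max(tally.values())
    return {edge: count / top for edge, count in tally.items()}
```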

For clarity of visualization, all edges with weights less than 1% of the maximum value are dropped. The most popular mutation is s→z, with the maximum weight of 1. It is interesting to note that reverse mutations occur too; for instance, z→s has weight 0.0347938, but such mutations are much less frequent:

cutEdge = ReverseSort[ Select[edge, # > .01 &]]

PieChart[cutEdge, optChart, ChartLabels -> Callout[Automatic]]

Thus a letter can participate in several types of mutations, and in this sense mutations form a network. The size of a vertex correlates with the probability that a letter participates in any mutation (see the variable vertex above):

vs = Thread[Keys[vertex] -> 2 N[.5 + Rescale[Values[vertex]]]];

The larger the edge weight, the brighter the edge:

es = Thread[    Keys[cutEdge] -> (Directive[Thickness[.003], Opacity[#]] & /@        N[Values[cutEdge]^.3])];

The letters r and g participate mostly in the deletion mutation. Letters with no edges participate in very rare mutations.

graphHighWeight =   Graph[Keys[vertex], Keys[cutEdge], PerformanceGoal -> "Quality",   VertexLabels -> Placed[Automatic, Center], VertexLabelStyle -> 15,    VertexSize -> vs, EdgeStyle -> es]

Among a few interesting substructures, one of the most obvious is the high clustering of vowels. A Subgraph of vowels can be easily extracted…

vowels = {"a", "e", "i", "o", "u"}; Subgraph[graphHighWeight, vowels, GraphStyle -> "SmallNetwork"]

… and checked for completeness, which yields False due to many missing edges from and to u:


Nevertheless, as you might remember, the low-weight edges were dropped for a better visual of high-weight edges. Are there any interesting observations related to low-weight edges? As a matter of fact, yes, there are. Let’s quickly rebuild a full subgraph for only vowels. Vertex sizes are still based on the tally of letters in mutations:

vowelsVertex =   Association @@    Cases[Normal[vertex], Alternatives @@ (# -> _ & /@ vowels)]

vsVow = Thread[    Keys[vowelsVertex] -> .2 N[.5 + Rescale[Values[vowelsVertex]]]];

All mutations of vowels in the dictionary can be extracted with the help of MemberQ:

vowelsMute =    Select[muteWeigh, And @@ (MemberQ[vowels, #] & /@ First[#]) &]; vowelsEdge = Association[Rule @@@    Transpose[     MapAt[DirectedEdge @@ Reverse[#] & /@ # &, Transpose[vowelsMute],       1]]]

In order to visualize exactly the number of vowel mutations in the dictionary, the edge style is kept uniform and edge labels are used for nomenclature:

vowelGraph = Graph[Keys[vowelsVertex], Keys[vowelsEdge],   EdgeWeight -> vowelsMute[[All, 2]], PerformanceGoal -> "Quality",    VertexLabels -> Placed[Automatic, Center], VertexLabelStyle -> 20,    VertexSize -> vsVow, EdgeLabels -> "EdgeWeight",    EdgeLabelStyle -> Directive[15, Bold]]

And now when we consider all (even small-weight) mutations, the graph is complete:


But this completeness is quite “weak” in the sense that there are many edges with a really small weight, in particular two edges with weight 1:

Select[vowelsMute, Last[#] == 1 &]

This means that there is only one alternative word pair for the e→u mutation, and likewise for the i→o mutation. With the help of a lookup function…

lookupMute[l_] := With[{keys = Keys[data]}, keys[[Position[       SequenceAlignment @@@ keys /. "" -> "-",        Alternatives @@ l, {2}][[All, 1]]]]]

… these pairs can be found as:

lookupMute[{{"o", "i"}, {"u", "e"}}]

Thus, thanks to these unique and quite exotic words, our dictionaries have the e→u and i→o mutations. Let’s check WordDefinition for these terms:

TableForm[WordDefinition /@ #,     TableHeadings -> {#, None}] &@{"corticofugal", "yarmulke"}

The word yarmulke is quite a curious case. First of all, it has three alternative spellings:

Nearest[WordData[], "yarmulke", {All, 1}]

Additionally, the Merriam-Webster Dictionary suggests a rich etymology: “Yiddish yarmlke, from Polish jarmułka & Ukrainian yarmulka skullcap, of Turkic origin; akin to Turkish yağmurluk rainwear.” The Turkic class of languages is quite wide:

EntityList[EntityClass["Language", "Turkic"]]

Together with the other mentioned languages, Turkic languages mark a large geographic area as the potential origin and evolution of the word yarmulke:

locs = DeleteDuplicates[Flatten[EntityValue[     {EntityClass["Language", "Turkic"],       EntityClass["Language", "Yiddish"], Entity["Language", "Polish"],       Entity["Language", "Ukrainian"]},      EntityProperty["Language", "PrimaryOrigin"]]]]

GeoGraphics[GeoMarker[locs, "Scale" -> Scaled[.03]],   GeoRange -> "World", GeoBackground -> "Coastlines",   GeoProjection -> "WinkelTripel"]

This evolution has Yiddish as an important stage before entering English, while Yiddish itself has a complex cultural history. English usage of yarmulke spikes around 1940–1945; hence World War II and the subsequent Cold War era were especially important stages in this linguistic migration, probably correlated with world migration and the changes in Jewish communities during those times.

vkWordFreqPlot[{"yarmulke", "yarmelke", "yarmulka"}, DateListLogPlot]

These complex processes brought many more Yiddish words to English (my personal favorites are golem and glitch), but only a single one resulted in the introduction of the e→u mutation into the whole English dictionary (at least within our dataset). So while there are currently no s→x mutations in English (as in espresso/expresso), this is not a negative indicator, because there are mutations that are unique to a single word or just a few words. And actually, there are many more such mutations with a small weight than with a large weight:

ListLogLogPlot[Sort[Tally[muteWeigh[[All, 2]]]],   PlotTheme -> "Detailed",  PlotRange -> All,   FrameLabel -> {"mutation weight", "number of weights"},   Epilog -> Text[Style["s" \[DirectedEdge] "z", 15], Log@{600, 1.2}],   Filling -> Bottom]

So while the s→z mutation happens in 777 words, it is the only mutation with that weight:

MaximalBy[muteWeigh, Last]

On the other hand, there are 61 unique mutations that happen only once, in a single word, as can be seen from the plot above. So in this sense, the heaviest s→z mutation is an outlier, and if expresso enters a dictionary, then the espresso/expresso pair will join the majority of unique mutations with weight 1. These are the mutation networks for the first four small weights:

vkWeight[n_] := Select[muteWeigh, Last[#] == n &][[All, 1]]
vkMutationNetwork[n_] := Graph[DirectedEdge @@ Reverse[#] & /@ vkWeight[n],   VertexLabels -> Placed[Automatic, Center], VertexLabelStyle -> 15,   VertexSize -> Scaled[.07], AspectRatio -> 1,    PerformanceGoal -> "Quality",   PlotLabel -> "Mutation Weight = " <> ToString[n]]
Grid[Partition[vkMutationNetwork /@ Range[4], 2], Spacings -> {1, 1},   Frame -> All]

As the edge weight gets larger, networks become simpler—degenerating completely for very large weights. Let’s examine a particular set of mutations with a small weight—for instance, weight 2:

DirectedEdge @@ Reverse[#] & /@   Select[muteWeigh, Last[#] == 2 &][[All, 1]]

This means there are only two unique alternative spellings (four words) for each mutation out of the whole dictionary:

Multicolumn[  Row /@ Replace[    SequenceAlignment @@@ (weight2 = lookupMute[vkWeight[2]]) /.      "" -> "-", {x_, y_} :> Superscript[x, Style[y, 13, Red]], {2}], 4]

Red marks a less popular letter, printed as a superscript of the more popular one. While the majority of these pairs are truly alternative spellings with a sometimes curiously dynamic history of usage…

vkWordFreqPlot[{"fjord", "fiord"}, DateListPlot]

… some occasional pairs, like distrust/mistrust, indicate blurred lines between alternative spellings and very close synonyms with close orthographic forms—here the prefixes mis- and dis-. Such rare situations can be considered as a source of noise in our data if someone does not want to accept them as true alternative spellings. My personal opinion is that the lines are blurred indeed, as the prefixes mis- and dis- themselves can be considered alternative spellings of the same semantic notion.

These small-weight mutations (white dots in the graph below) are distributed very evenly among the rest of the data (black dots), which reflects their typicality. This can be visualized by constructing a density distribution with SmoothDensityHistogram, which uses SmoothKernelDistribution behind the scenes:

SmoothDensityHistogram[Log[Values[data]],  Mesh -> 50, ColorFunction -> "DarkRainbow", MeshStyle -> Opacity[.2],  PlotPoints -> 200, PlotRange -> {{-23, -11}, {-24, -12}}, Epilog -> {    {Black, Opacity[.4], PointSize[.002], Point[Log[Values[data]]]},    {White, Opacity[.7], PointSize[.01],      Point[Log[weight2 /. Normal[data]]]},    {Red, Opacity[1], PointSize[.02], Point[Log[esex]]},    {Red, Opacity[1], Thickness[.01],      Circle[Median[Log[Values[data]]], .2]}}]

Some of these exclusive, rare alternative spellings are used even more (or less) frequently than espresso/expresso, as shown above for the example of weight 2; the same can be shown for other weights. Color and contour lines provide a visual guide to the density of the data points.


The following factors affirm that expresso should be allowed as a valid alternative spelling:

  • Espresso/expresso falls close to the median usage frequencies of 2,693 official alternative spellings with Levenshtein EditDistance equal to 1
  • The frequency of espresso/expresso usage as a whole pair is above the median, so it is more likely to be found in published corpora than half of the examined dataset
  • Many nearest neighbors of espresso/expresso in the frequency space belong to a basic vocabulary of the most frequent everyday usage
  • The history of espresso/expresso usage in English corpora shows simultaneous growth for both spellings, and its temporal pattern is reminiscent of many other official alternative spellings
  • The uniqueness of the sx mutation in the espresso/expresso pair is typical, as numerous other rare and unique mutations are officially endorsed by dictionaries

So all in all, it is ultimately up to you how to interpret this analysis or spell the name of the delightful Italian drink. But if you are a wisenheimer type, you might consider being a tinge more open-minded. The origin of words, as with the origin of species, has its dark corners, and due to inevitable and unpredictable language evolution, one day your remote descendants might frown on the choice of s in espresso.

Download this post as a Computable Document Format (CDF) file. New to CDF? Get your copy for free with this one-time download. If you would like to change parameters to make your own data exploration, download the full notebook.

How to Win at Risk: Exact Probabilities http://blog.wolfram.com/2017/11/20/how-to-win-at-risk-exact-probabilities/ Mon, 20 Nov 2017 17:54:43 +0000 Jon McLoone

The classic board game Risk involves conquering the world by winning battles that are played out using dice. There are lots of places on the web where you can find out the odds of winning a battle given the number of armies that each player has. However, all the ones that I have seen do this by Monte Carlo simulation, and so are innately approximate. The Wolfram Language makes it so easy to work out the exact values that I couldn’t resist calculating them once and for all.

Risk battle odds flow chart

Here are the basic battle rules: the attacker can choose up to three dice (but must have at least one more army than dice), and the defender can choose up to two (but must have at least two armies to use two). To have the best chance of winning, you always use the most dice possible, so I will ignore the other cases. Both players throw simultaneously, and then the highest die from each side is paired, and (if both threw at least two dice) the next highest are paired. In each pair, the higher die kills an army, and in the event of a tie, the attacker is the loser. This process is repeated until one side runs out of armies.

So my goal is to create a function pBattle[a,d] that returns the probability that the battle ends ultimately as a win for the attacker, given that the attacker started with a armies and the defender started with d armies.

I start by coding the basic game rules. The main case is when both sides have enough armies to fight with at least two dice. There are three possible outcomes for a single round of the battle: the attacker wins twice, the attacker loses twice, or both sides lose one army. The probability of winning the battle is therefore the sum, over these outcomes, of the probability of the outcome multiplied by the probability of winning after the killed armies are removed.

pBattle[a_, d_] /; (a >= 3 && d >= 2) := Once[    pWin2[a, d] pBattle[a, d - 2] +      pWin1Lose1[a, d] pBattle[a - 1, d - 1] +      pLose2[a, d] pBattle[a - 2, d]    ];

We also have to cover the case that either side has run low on armies and there is only one game piece at stake.

pBattle[a_, d_] /; (a > 1 && d >= 1) := Once[    pWin1[a, d] pBattle[a, d - 1] + pLose1[a, d] pBattle[a - 1, d]    ];

This sets up a recursive definition that defines all our battle probabilities in terms of the probabilities of subsequent stages of the battle. Once prevents us from working those values out repeatedly. We just need to terminate this recursion with the end-of-battle rules. If the attacker has only one army, he has lost (since he must have more armies than dice), so our win probability is zero. If the opponent has run out of armies, then the attacker has won.

pBattle[1, _] = 0; pBattle[_, 0] = 1;
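For readers who want to experiment outside a notebook, here is a rough Python sketch of the same recursion. It brute-forces the dice outcomes with exact Fractions instead of using Probability with an OrderDistribution; the function names are my own invention, not from the post:

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def round_outcome_probs(a_dice, d_dice):
    """Enumerate every throw of fair dice for one round: the sorted
    highest dice are paired, the higher die kills an army, and ties
    go to the defender. Returns {(attacker_loss, defender_loss): p}."""
    outcomes = {}
    p = Fraction(1, 6 ** (a_dice + d_dice))
    for throw in product(range(1, 7), repeat=a_dice + d_dice):
        att = sorted(throw[:a_dice], reverse=True)
        dfn = sorted(throw[a_dice:], reverse=True)
        a_loss = d_loss = 0
        for x, y in zip(att, dfn):
            if x > y:
                d_loss += 1
            else:
                a_loss += 1
        key = (a_loss, d_loss)
        outcomes[key] = outcomes.get(key, Fraction(0)) + p
    return outcomes

@lru_cache(maxsize=None)
def p_battle(a, d):
    """Exact probability that the attacker eventually wins, starting
    with a attacking and d defending armies; both sides always roll
    as many dice as the rules allow."""
    if a <= 1:
        return Fraction(0)   # attacker cannot roll: battle lost
    if d == 0:
        return Fraction(1)   # defender wiped out: battle won
    rolls = round_outcome_probs(min(a - 1, 3), min(d, 2))
    return sum(p * p_battle(a - al, d - dl)
               for (al, dl), p in rolls.items())
```

The result of `p_battle(18, 6)` is an exact rational that should agree with pBattle[18, 6] computed below.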

Now we have to work out the probabilities of our five individual attack outcomes: pWin2, pWin1Lose1, pLose2, pWin1 and pLose1.

When using two or three dice, we can describe the distribution as an OrderDistribution of a DiscreteUniformDistribution because we always want to pair the highest throws together.

diceDistribution[n : 3 | 2] :=    OrderDistribution[{DiscreteUniformDistribution[{1, 6}], n}, {n - 1,      n}];

For example, here is one outcome of that distribution; the second number will always be the largest, due to the OrderDistribution part.


The one-die case is just a uniform distribution; our player has to use the value whether it is good or not. However, for programming convenience, I am going to describe a distribution of two numbers, but we will never look at the first.

diceDistribution[1] := DiscreteUniformDistribution[{{1, 6}, {1, 6}}];

So now the probability of winning twice is the probability that both attacker dice are greater than both of the defender’s. The defender must be using two dice, but the attacker could be using two or three.

pWin2[a_, d_] /; a >= 3 && d >= 2 := Once[    Probability[     a1 > d1 &&       a2 > d2, {{a1, a2} \[Distributed]        diceDistribution[Min[a - 1, 3]], {d1, d2} \[Distributed]        diceDistribution[2]}]    ];

The lose-twice probability has a similar definition.

pLose2[a_, d_] := Once[    Probability[     a1 <= d1 &&       a2 <= d2, {{a1, a2} \[Distributed]        diceDistribution[Min[a - 1, 3]], {d1, d2} \[Distributed]        diceDistribution[2]}]    ];

And the draw probability is what’s left.

pWin1Lose1[a_, d_] := Once[1 - pWin2[a, d] - pLose2[a, d]]

The one-army battle could be because the attacker is low on armies or because the defender is. Either way, we look only at the last value of our distributions.

pWin1[a_, d_] /; a === 2 || d === 1 := Once[    Probability[     a2 > d2, {{a1, a2} \[Distributed]        diceDistribution[Min[a - 1, 3]], {d1, d2} \[Distributed]        diceDistribution[Min[d, 2]]}]    ];

And pLose1 is just the remaining case.

pLose1[a_, d_] := 1 - pWin1[a, d];

And we are done. All that is left is to use the function. Here is the exact (assuming fair dice, and no cheating!) probability of winning if the attacker starts with 18 armies and the defender has only six.

pBattle[18, 6]

We can approximate this to 100 decimal places.

N[%, 100]

We can quickly enumerate the probabilities for lots of different starting positions.

table = Text@   Grid[Prepend[     Table[Prepend[Table[pBattle[a, d], {d, 1, 4}],        StringForm["Attack with " <> ToString[a]]], {a, 2, 16}],     Prepend[      Table[StringForm["Defend with " <> ToString[n]], {n, 1, 4}],       ""]], Frame -> All, FrameStyle -> LightGray]

Risk odds table 1

Here are the corresponding numeric values to only 20 decimal places.

N[table, 20]

Risk odds table 2

You can download tables of more permutations here, with exact numbers, and here, approximated to 20 digits.

Of course, this level of accuracy is rather pointless. If you look at the 23 vs. 1 battle, the probability of losing is about half the probability that you will actually die during the first throw of the dice, and certainly far less than the chances of your opponent throwing the board in the air and refusing to play ever again.

Appendix: Code for Generating the Outcomes Graph

vf[{x_, y_}, name_, {w_, h_}] := {Black, Circle[{x, y}, w], Black,     Text[If[StringQ[name], Style[name, 12],       Style[Row[name, "\[ThinSpace]vs\[ThinSpace]"], 9]], {x, y}]}; edge[e_, th_] :=    Property[e, EdgeStyle -> {Arrowheads[th/15], Thickness[th/40]}]; Graph[Flatten[Table[If[a >= 3 && d >= 2,      {       edge[{a, d} -> {a, d - 2}, pWin2[a, d]],       edge[{a, d} -> {a - 1, d - 1}, pWin1Lose1[a, d]],       edge[{a, d} -> {a - 2, d}, pLose2[a, d]]              },      {       edge[{a, d} -> {a, d - 1}, pWin1[a, d]],       edge[{a, d} -> {a - 1, d}, pLose1[a, d]]              }], {a, 2, 6}, {d, 1, 4}]] /. {{a_, 0} -> "Win", {1, d_} ->      "Lose"}, ImageSize -> Full, VertexShapeFunction -> vf,   VertexSize -> 1]

Download this post as a Computable Document Format (CDF) file. New to CDF? Get your copy for free with this one-time download.

What Is a Computational Essay? http://blog.wolfram.com/2017/11/14/what-is-a-computational-essay/ Tue, 14 Nov 2017 18:54:55 +0000 Stephen Wolfram

A Powerful Way to Express Ideas

People are used to producing prose—and sometimes pictures—to express themselves. But in the modern age of computation, something new has become possible that I’d like to call the computational essay.

I’ve been working on building the technology to support computational essays for several decades, but it’s only very recently that I’ve realized just how central computational essays can be to both the way people learn, and the way they communicate facts and ideas. Professionals of the future will routinely deliver results and reports as computational essays. Educators will routinely explain concepts using computational essays. Students will routinely produce computational essays as homework for their classes.

Here’s a very simple example of a computational essay:

Simple computational essay example

There are basically three kinds of things here. First, ordinary text (here in English). Second, computer input. And third, computer output. And the crucial point is that these all work together to express what’s being communicated.

The ordinary text gives context and motivation. The computer input gives a precise specification of what’s being talked about. And then the computer output delivers facts and results, often in graphical form. It’s a powerful form of exposition that combines computational thinking on the part of the human author with computational knowledge and computational processing from the computer.

But what really makes this work is the Wolfram Language—and the succinct representation of high-level ideas that it provides, defining a unique bridge between human computational thinking and actual computation and knowledge delivered by a computer.

In a typical computational essay, each piece of Wolfram Language input will usually be quite short (often not more than a line or two). But the point is that such input can communicate a high-level computational thought, in a form that can readily be understood both by the computer and by a human reading the essay.

It’s essential to all this that the Wolfram Language has so much built-in knowledge—both about the world and about how to compute things in it. Because that’s what allows it to immediately talk not just about abstract computations, but also about real things that exist and happen in the world—and ultimately to provide a true computational communication language that bridges the capabilities of humans and computers.

An Example

Let’s use a computational essay to explain computational essays.

Let’s say we want to talk about the structure of a human language, like English. English is basically made up of words. Let’s get a list of the common ones.

Generate a list of common words in English:

WordList[]

How long is a typical word? Well, we can take the list of common words, and make a histogram that shows their distribution of lengths.

Make a histogram of word lengths:

Histogram[StringLength[WordList[]]]

Do the same for French:

Histogram[StringLength[WordList[Language -> "French"]]]

Notice that the word lengths tend to be longer in French. We could investigate whether this is why documents tend to be longer in French than in English, or how this relates to quantities like entropy for text. (Of course, because this is a computational essay, the reader can rerun the computations in it themselves, say by trying Russian instead of French.)
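For readers who want to try the same word-length experiment outside a notebook, here is a Python sketch; the sample list is purely illustrative, standing in for the output of WordList:

```python
from collections import Counter

def length_distribution(words):
    """Map each word length to the number of words with that length."""
    return Counter(len(w) for w in words)

# Tiny illustrative sample, standing in for a real word list.
english = ["the", "of", "and", "language", "histogram", "word"]
dist = length_distribution(english)
```

Feeding in a full word list (say, one word per line from a dictionary file) gives the same histogram data the essay plots.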

But as something different, let’s compare languages by comparing their translations for, say, the word “computer”.

Find the translations for “computer” in the 10 most common languages:

Take[WordTranslation["computer", All], 10]

Find the first translation in each case:

First /@ Take[WordTranslation["computer", All], 10]

Arrange common languages in “feature space” based on their translations for “computer”:

FeatureSpacePlot[First /@ Take[WordTranslation["computer", All], 40]]

From this plot, we can start to investigate all sorts of structural and historical relationships between languages. But from the point of view of a computational essay, what’s important here is that we’re sharing the exposition between ordinary text, computer input, and output.

The text is saying what the basic point is. Then the input is giving a precise definition of what we want. And the output is showing what’s true about it. But take a look at the input. Even just by looking at the names of the Wolfram Language functions in it, one can get a pretty good idea what it’s talking about. And while the function names are based on English, one can use “code captions” to understand it in another language, say Japanese:


FeatureSpacePlot[First /@ Take[WordTranslation["computer", All], 40]]

But let’s say one doesn’t know about FeatureSpacePlot. What is it? If it was just a word or phrase in English, we might be able to look in a dictionary, but there wouldn’t be a precise answer. But a function in the Wolfram Language is always precisely defined. And to know what it does we can start by just looking at its documentation. But much more than that, we can just run it ourselves to explicitly see what it does.

FeatureSpacePlot page

And that’s a crucial part of what’s great about computational essays. If you read an ordinary essay, and you don’t understand something, then in the end you really just have to ask the author to find out what they meant. In a computational essay, though, there’s Wolfram Language input that precisely and unambiguously specifies everything—and if you want to know what it means, you can just run it and explore any detail of it on your computer, automatically and without recourse to anything like a discussion with the author.


How does one actually create a computational essay? With the technology stack we have, it’s very easy—mainly thanks to the concept of notebooks that we introduced with the first version of Mathematica all the way back in 1988. A notebook is a structured document that mixes cells of text together with cells of Wolfram Language input and output, including graphics, images, sounds, and interactive content:

A typical notebook

In modern times one great (and very hard to achieve!) thing is that full Wolfram Notebooks run seamlessly across desktop, cloud and mobile. You can author a notebook in the native Wolfram Desktop application (Mac, Windows, Linux)—or on the web through any web browser, or on mobile through the Wolfram Cloud app. Then you can share or publish it through the Wolfram Cloud, and get access to it on the web or on mobile, or download it to desktop or, now, iOS devices.

Notebook environments

Sometimes you want the reader of a notebook just to look at it, perhaps opening and closing groups of cells. Sometimes you also want them to be able to operate the interactive elements. And sometimes you want them to be able to edit and run the code, or maybe modify the whole notebook. And the crucial point is that all these things are easy to do with the cloud-desktop-mobile system we’ve built.

A New Form of Student Work

Computational essays are great for students to read, but they’re also great for students to write. Most of the current modalities for student work are remarkably old. Write an essay. Give a math derivation. These have been around for millennia. Not that there’s anything wrong with them. But now there’s something new: write a computational essay. And it’s wonderfully educational.

A computational essay is in effect an intellectual story told through a collaboration between a human author and a computer. The computer acts like a kind of intellectual exoskeleton, letting you immediately marshal vast computational power and knowledge. But it’s also an enforcer of understanding. Because to guide the computer through the story you’re trying to tell, you have to understand it yourself.

When students write ordinary essays, they’re typically writing about content that in some sense “already exists” (“discuss this passage”; “explain this piece of history”; …). But in doing computation (at least with the Wolfram Language) it’s so easy to discover new things that computational essays will end up with an essentially inexhaustible supply of new content, that’s never been seen before. Students will be exploring and discovering as well as understanding and explaining.

When you write a computational essay, the code in your computational essay has to produce results that fit with the story you’re telling. It’s not like you’re doing a mathematical derivation, and then some teacher tells you you’ve got the wrong answer. You can immediately see what your code does, and whether it fits with the story you’re telling. If it doesn’t, well then maybe your code is wrong—or maybe your story is wrong.

What should the actual procedure be for students producing computational essays? At this year’s Wolfram Summer School we did the experiment of asking all our students to write a computational essay about anything they knew about. We ended up with 72 interesting essays—exploring a very wide range of topics.

In a more typical educational setting, the “prompt” for a computational essay could be something like “What is the typical length of a word in English?” or “Explore word lengths in English”.

There’s also another workflow I’ve tried. As the “classroom” component of a class, do livecoding (or a live experiment). Create or discover something, with each student following along by doing their own computations. At the end of the class, each student will have a notebook they made. Then have their “homework” be to turn that notebook into a computational essay that explains what was done.

And in my experience, this ends up being a very good exercise—that really tests and cements the understanding students have. But there’s also something else: when students have created a computational essay, they have something they can keep—and directly use—forever.

And this is one of the great general features of computational essays. When students write them, they’re in effect creating a custom library of computational tools for themselves—that they’ll be in a position to immediately use at any time in the future. It’s far too common for students to write notes in a class, then never refer to them again. Yes, they might run across some situation where the notes would be helpful. But it’s often hard to motivate going back and reading the notes—not least because that’s only the beginning; there’s still the matter of implementing whatever’s in the notes.

But the point is that with a computational essay, once you’ve found what you want, the code to implement it is right there—immediately ready to be applied to whatever has come up.

Any Subject You Want

What can computational essays be about? Almost anything! I’ve often said that for any field of study X (from archaeology to zoology), there either is now, or soon will be, a “computational X”. And any “computational X” can immediately be explored and explained using computational essays.

But even when there isn’t a clear “computational X” yet, computational essays can still be a powerful way to organize and present material. In some sense, the very fact that a sequence of computations is typically needed to “tell the story” in an essay helps define a clear backbone for the whole essay. In effect, the structured nature of the computational presentation helps suggest structure for the narrative—making it easier for students (and others) to write essays that are easy to read and understand.

But what about actual subject matter? Well, imagine you’re studying history—say the history of the English Civil War. Well, conveniently, the Wolfram Language has a lot of knowledge about history (as about so many other things) built in. So you can present the English Civil War through a kind of dialog with it. For example, you can ask it for the geography of battles:

GeoListPlot[Entity["MilitaryConflict", "EnglishCivilWar"]["Battles"]]

You could ask for a timeline of the beginning of the war (you don’t need to say “first 15 battles”, because if one cares, one can just read that from the Wolfram Language code):

TimelinePlot[Take[Entity["MilitaryConflict", "EnglishCivilWar"]["Battles"], 15]]

You could start looking at how armies moved, or who won and who lost at different points. At first, you can write a computational essay in which the computations are basically just generating custom infographics to illustrate your narrative. But then you can go further—and start really doing “computational history”. You can start to compute various statistical measures of the progress of the war. You can find ways to quantitatively compare it to other wars, and so on.

Can you make a “computational essay” about art? Absolutely. Maybe about art history. Pick 10 random paintings by van Gogh:

van Gogh paintings output

EntityValue[RandomSample[Entity["Person", "VincentVanGogh::9vq62"]["NotableArtworks"], 10], "Image"]

Then look at what colors they use (a surprisingly narrow selection):



Or maybe one could write a computational essay about actually creating art, or music.

What about science? You could rediscover Kepler’s laws by looking at properties of planets:

EntityClass["Planet", All][{"DistanceFromSun", "OrbitPeriod"}]



Maybe you could go on and check it for exoplanets. Or you could start solving the equations of motion for planets.
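The Kepler rediscovery described above can also be mimicked without the Wolfram Language’s Entity data. Here is a Python sketch using rounded textbook orbital values (these numbers are illustrative inputs of my own, not the essay’s):

```python
# Rounded textbook orbital data: semimajor axis a in AU, period T in years.
# Illustrative values, not the Wolfram Language's Entity data.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
}

# Kepler's third law says T^2 / a^3 is the same constant for every planet
# (exactly 1 in these units, since Earth defines both the AU and the year).
ratios = {name: T**2 / a**3 for name, (a, T) in planets.items()}
```

Plotting period against distance on log-log axes shows the same T² ∝ a³ relationship as a straight line of slope 3/2.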

You could look at biology. Here’s the beginning of the reference sequence for the human mitochondrion:

GenomeData[{"Mitochondrion", {1, 150}}]

You can start off breaking it into possible codons:

StringPartition[%, 3]
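The codon-splitting step is easy to sketch in Python (the short sequence literal below is just an illustration, not pulled from GenomeData):

```python
def codons(seq):
    """Split a sequence string into successive 3-letter codons,
    dropping any trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

codons("GATCACAGG")  # → ["GAT", "CAC", "AGG"]
```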

There’s an immense amount of data about all kinds of things built into the Wolfram Language. But there’s also the Wolfram Data Repository, which contains all sorts of specific datasets. Like here’s a map of state fairgrounds in the US:

GeoListPlot[ResourceData["U.S. State Fairgrounds"][All, "GeoPosition"]]

And here’s a word cloud of the constitutions of countries that have been enacted since 2010:

WordCloud[StringJoin[Normal[ResourceData["World Constitutions"][Select[#YearEnacted > DateObject[{2010}] &], "Text"]]]]

Quite often one’s interested in dealing not with public data, but with some kind of local data. One convenient source of this is the Wolfram Data Drop. In an educational setting, particular databins (or cloud objects in general) can be set so that they can be read (and/or added to) by some particular group. Here’s a databin that I accumulate for myself, showing my heart rate through the day. Here it is for today:



Of course, it’s easy to make a histogram too:



What about math? A key issue in math is to understand why things are true. The traditional approach to this is to give proofs. But computational essays provide an alternative. The nature of the steps in them is different—but the objective is the same: to show what’s true and why.

As a very simple example, let’s look at primes. Here are the first 50:

Table[Prime[n], {n, 50}]

Let’s find the remainder mod 6 for all these primes:

Mod[Table[Prime[n], {n, 50}], 6]

But why do only 1 and 5 occur (well, after the trivial cases of the primes 2 and 3)? We can see this by computation. Any number can be written as 6n+k for some n and some k from 0 to 5:

Table[6 n + k, {k, 0, 5}]

But if we factor numbers written in this form, we’ll see that 6n+1 and 6n+5 are the only ones that aren’t automatically divisible by 2 or 3:

Factor[Table[6 n + k, {k, 0, 5}]]
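The same divisibility fact can be checked mechanically. Here is a Python sketch (my own illustration, independent of the notebook code) verifying that every prime above 3 leaves remainder 1 or 5 mod 6:

```python
def primes_up_to(n):
    """Return all primes <= n via a simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

# 6n, 6n+2, 6n+4 are even and 6n+3 is divisible by 3,
# so primes above 3 can only be of the form 6n+1 or 6n+5.
residues = {p % 6 for p in primes_up_to(1000) if p > 3}
```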

What about computer science? One could for example write a computational essay about implementing Euclid’s algorithm, studying its running time, and so on.

Define a function to give all steps in Euclid’s algorithm:

gcdlist[a_, b_] :=
 NestWhileList[{Last[#], Apply[Mod, #]} &, {a, b}, Last[#] != 0 &, 1]

Find the distribution of running lengths for the algorithm for numbers up to 200:

Histogram[Flatten[Table[Length[gcdlist[i, j]], {i, 200}, {j, 200}]]]
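For comparison, here is a Python transcription of what gcdlist computes (an illustrative sketch, not the essay’s own code): the list of (a, b) pairs that Euclid’s algorithm visits.

```python
def gcd_steps(a, b):
    """All (a, b) pairs visited by Euclid's algorithm,
    ending with (gcd, 0)."""
    steps = [(a, b)]
    while b != 0:
        a, b = b, a % b
        steps.append((a, b))
    return steps

gcd_steps(1071, 462)  # → [(1071, 462), (462, 147), (147, 21), (21, 0)]
```

Tallying len(gcd_steps(i, j)) over all pairs up to 200 gives the running-length distribution the histogram above summarizes.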

Or in modern times, one could explore machine learning, starting, say, by making a feature space plot of part of the MNIST handwritten digits dataset:

FeatureSpacePlot[RandomSample[Keys[ResourceData["MNIST"]], 50]]

If you wanted to get deeper into software engineering, you could write a computational essay about the HTTP protocol. This gets an HTTP response from a site:



And this shows the tree structure of the elements on the webpage at that URL:

TreeForm[Import["http://www.wolframalpha.com", {"HTML", "XMLObject"}],
  VertexLabeling -> False, AspectRatio -> 1/2]

Or—in a completely different direction—you could talk about anatomy:

AnatomyPlot3D[Entity["AnatomicalStructure", "LeftFoot"]]

What Makes a Good Computational Essay?

As far as I’m concerned, for a computational essay to be good, it has to be as easy to understand as possible. The format helps quite a lot, of course. Because a computational essay is full of outputs (often graphical) that are easy to skim, and that immediately give some impression of what the essay is trying to say. It also helps that computational essays are structured documents, that deliver information in well-encapsulated pieces.

Ultimately it’s up to the author of a computational essay to make it clear. But another thing that helps is that a computational essay must have a “computational narrative”—a sequence of pieces of code that the computer can execute to do what’s being discussed in the essay. And while one might be able to write an ordinary essay that doesn’t make much sense but still sounds good, one can’t ultimately do something like that in a computational essay. Because in the end the code is the code, and actually has to run and do things.

So what can go wrong? Well, like English prose, Wolfram Language code can be unnecessarily complicated, and hard to understand. In a good computational essay, both the ordinary text, and the code, should be as simple and clean as possible. I try to enforce this for myself by saying that each piece of input should be at most one or perhaps two lines long—and that the caption for the input should always be just one line long. If I’m trying to do something where the core of it (perhaps excluding things like display options) takes more than a line of code, then I break it up, explaining each line separately.

Another important principle as far as I’m concerned is: be explicit. Don’t have some variable that, say, implicitly stores a list of words. Actually show at least part of the list, so people can explicitly see what it’s like. And when the output is complicated, find some tabulation or visualization that makes the features you’re interested in obvious. Don’t let the “key result” be hidden in something that’s tucked away in the corner; make sure the way you set things up makes it front and center.

Use the structured nature of notebooks. Break up computational essays with section headings, again helping to make them easy to skim. I follow the style of having a “caption line” before each input. Don’t worry if this somewhat repeats what a paragraph of text has said; consider the caption something that someone who’s just “looking at the pictures” might read to understand what a picture is of, before they actually dive into the full textual narrative.

The technology of Wolfram Notebooks makes it straightforward to put in interactive elements, like Manipulate, into computational essays. And sometimes this is very helpful, and perhaps even essential. But interactive elements shouldn’t be overused. Because whenever there’s an element that requires interaction, this reduces the ability to skim the essay.

Sometimes there’s a fair amount of data—or code—that’s needed to set up a particular computational essay. The cloud is very useful for handling this. Just deploy the data (or code) to the Wolfram Cloud, and set appropriate permissions so it can automatically be read whenever the code in your essay is executed.

Notebooks also allow “reverse closing” of cells—allowing an output cell to be immediately visible, even though the input cell that generated it is initially closed. This kind of hiding of code should generally be avoided in the body of a computational essay, but it’s sometimes useful at the beginning or end of an essay, either to give an indication of what’s coming, or to include something more advanced where you don’t want to go through in detail how it’s made.

OK, so if a computational essay is done, say, as homework, how can it be assessed? A first, straightforward question is: does the code run? And this can be determined pretty much automatically. Then after that, the assessment process is very much like it would be for an ordinary essay. Of course, it’s nice and easy to add cells into a notebook to give comments on what’s there. And those cells can contain runnable code—that for example can take results in the essay and process or check them.

Are there principles of good computational essays? Here are a few candidates:

0. Understand what you’re talking about (!)

1. Find the most straightforward and direct way to represent your subject matter

2. Keep the core of each piece of Wolfram Language input to a line or two

3. Use explicit visualization or other information presentation as much as possible

4. Try to make each input+caption independently understandable

5. Break different topics or directions into different subsections

Learning the Language

At the core of computational essays is the idea of expressing computational thoughts using the Wolfram Language. But to do that, one has to know the language. Now, unlike human languages, the Wolfram Language is explicitly designed (and, yes, that’s what I’ve been doing for the past 30+ years) to follow definite principles and to be as easy to learn as possible. But there’s still learning to be done.

One feature of the Wolfram Language is that—like with human languages—it’s typically easier to read than to write. And that means that a good way for people to learn what they need to be able to write computational essays is for them first to read a bunch of essays. Perhaps then they can start to modify those essays. Or they can start creating “notes essays”, based on code generated in livecoding or other classroom sessions.

As people get more fluent in writing the Wolfram Language, something interesting happens: they start actually expressing themselves in the language, and using Wolfram Language input to carry significant parts of the narrative in a computational essay.

When I was writing An Elementary Introduction to the Wolfram Language (which itself is written in large part as a sequence of computational essays) I had an interesting experience. Early in the book, it was decently easy to explain computational exercises in English (“Make a table of the first 10 squares”). But a little later in the book, it became a frustrating process.

It was easy to express what I wanted in the Wolfram Language. But to express it in English was long and awkward (and had a tendency of sounding like legalese). And that’s the whole point of using the Wolfram Language, and the reason I’ve spent 30+ years building it: because it provides a better, crisper way to express computational thoughts.

It’s sometimes said of human languages that the language you use determines how you think. It’s not clear how true this is of human languages. But it’s absolutely true of computer languages. And one of the most powerful things about the Wolfram Language is that it helps one formulate clear computational thinking.

Traditional computer languages are about writing code that describes the details of what a computer should do. The point of the Wolfram Language is to provide something much higher level—that can immediately talk about things in the world, and that can allow people as directly as possible to use it as a medium of computational thinking. And in a sense that’s what makes a good computational essay possible.

The Long Path to Computational Essays

Now that we have full-fledged computational essays, I realize I’ve been on a path towards them for nearly 40 years. At first I was taking interactive computer output and Scotch-taping descriptions into it:

Interactive computer output sketch

By 1981, when I built SMP, I was routinely writing documents that interspersed code and explanations:

Code interspersed with explanations

But it was only in 1986, when I started documenting what became Mathematica and the Wolfram Language, that I started seriously developing a style close to what I now favor for computational essays:

Wolfram Language Version 1 documentation

And with the release of Mathematica 1.0 in 1988 came another critical element: the invention of Wolfram Notebooks. Notebooks arrived in a form at least superficially very similar to the way they are today (and already in many ways more sophisticated than the imitations that started appearing 25+ years later!): collections of cells arranged into groups, and capable of containing text, executable code, graphics, etc.

Early Mac notebooks

At first notebooks were only possible on Mac and NeXT computers. A few years later they were extended to Microsoft Windows and X Windows (and later, Linux). But immediately people started using notebooks both to provide reports about what they'd done, and to create rich expository and educational material. Within a couple of years, there started to be courses based on notebooks, and books printed from notebooks, with interactive versions available on CD-ROM at the back:

Notebook publication example

So in a sense the raw material for computational essays already existed by the beginning of the 1990s. But to really make computational essays come into their own required the development of the cloud—as well as the whole broad range of computational knowledge that’s now part of the Wolfram Language.

By 1990 it was perfectly possible to create a notebook with a narrative, and people did it, particularly about topics like mathematics. But if there was real-world data involved, things got messy. One had to make sure that whatever was needed was appropriately available from a distribution CD-ROM or whatever. We created a Player for notebooks very early on, which was sometimes distributed with them.

But in the last few years, particularly with the development of the Wolfram Cloud, things have gotten much more streamlined. Because now you can seamlessly store things in the cloud and use them anywhere. And you can work directly with notebooks in the cloud, just using a web browser. In addition, thanks to lots of user-assistance innovations (including natural language input), it’s become even easier to write in the Wolfram Language—and there’s ever more that can be achieved by doing so.

And the important thing that I think has now definitively happened is that producing a good computational essay has become lightweight enough that it makes sense to do it as something routine—either professionally in writing reports, or as a student doing homework.

Ancient Educational History

The idea of students producing computational essays is something new for modern times, made possible by a whole stack of current technology. But there’s a curious resonance with something from the distant past. You see, if you’d learned a subject like math in the US a couple of hundred years ago, a big thing you’d have done is to create a so-called ciphering book—in which over the course of several years you carefully wrote out the solutions to a range of problems, mixing explanations with calculations. And the idea then was that you kept your ciphering book for the rest of your life, referring to it whenever you needed to solve problems like the ones it included.

Well, now, with computational essays you can do very much the same thing. The problems you can address are vastly more sophisticated and wide-ranging than you could reach with hand calculation. But like with ciphering books, you can write computational essays so they'll be useful to you in the future—though now you won't have to redo calculations by hand; instead you'll just edit your computational essay notebook and immediately rerun the Wolfram Language inputs in it.

I actually only learned about ciphering books quite recently. For about 20 years I’d had essentially as an artwork a curious handwritten notebook (created in 1818, it says, by a certain George Lehman, apparently of Orwigsburg, Pennsylvania), with pages like this:

Ciphering book

I now know this is a ciphering book—and that this page describes how to find the “height of a perpendicular object… by having the length of the shadow given”. And of course I can't resist a modern computational essay analog, which, needless to say, can be a bit more elaborate.

Find the current position of the Sun as azimuth, altitude:



Find the length of a shadow for an object of unit height:



Given a 10-ft shadow, find the height of the object that made it:


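The trigonometry behind both questions is a one-liner: an object of height h under a Sun at altitude a casts a shadow of length h/tan(a), and inverting this recovers the height from the shadow. Here's a minimal sketch in Python (outside the Wolfram Language, with the altitude supplied as a hypothetical input rather than looked up from astronomical data):

```python
import math

def shadow_length(altitude_deg, height=1.0):
    # shadow of an object of the given height, with the Sun at the given altitude
    return height / math.tan(math.radians(altitude_deg))

def object_height(shadow, altitude_deg):
    # invert the relation: recover the height from a measured shadow
    return shadow * math.tan(math.radians(altitude_deg))

# With the Sun 30 degrees above the horizon (a made-up altitude),
# a 10-ft shadow corresponds to an object about 5.77 ft tall:
print(object_height(10, 30))
```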

The Path Ahead

I like writing textual essays (such as blog posts!). But I like writing computational essays more. Because at least for many of the things I want to communicate, I find them a purer and more efficient way to do it. I could spend lots of words trying to express an idea—or I can just give a little piece of Wolfram Language input that expresses the idea very directly and shows how it works by generating (often very visual) output with it.

When I wrote my big book A New Kind of Science (from 1991 to 2002), neither our technology nor the world was quite ready for computational essays in the form in which they’re now possible. My research for the book filled thousands of Wolfram Notebooks. But when it actually came to putting together the book, I just showed the results from those notebooks—including a little of the code from them in notes at the back of the book.

But now the story of the book can be told in computational essays—that I’ve been starting to produce. (Just for fun, I’ve been livestreaming some of the work I’m doing to create these.)  And what’s very satisfying is just how clearly and crisply the ideas in the book can be communicated in computational essays.

There is so much potential in computational essays. And indeed we’re now starting the project of collecting “topic explorations” that use computational essays to explore a vast range of topics in unprecedentedly clear and direct ways. It’ll be something like our Wolfram Demonstrations Project (that now has 11,000+ Wolfram Language–powered Demonstrations). Here’s a typical example I wrote:

The Central Limit Theorem

Computational essays open up all sorts of new types of communication. Research papers that directly present computational experiments and explorations. Reports that describe things that have been found, but allow other cases to be immediately explored. And, of course, computational essays define a way for students (and others) to very directly and usefully showcase what they’ve learned.

There’s something satisfying about both writing—and reading—computational essays. It’s as if in communicating ideas we’re finally able to go beyond pure human effort—and actually leverage the power of computation. And for me, having built the Wolfram Language to be a computational communication language, it’s wonderful to see how it can be used to communicate so effectively in computational essays.

It’s so nice when I get something sent to me as a well-formed computational essay. Because I immediately know that I’m going to get a straight story that I can actually understand. There aren’t going to be all sorts of missing sources and hidden assumptions; there’s just going to be Wolfram Language input that stands alone, and that I can take out and study or run for myself.

The modern world of the web has brought us a few new formats for communication—like blogs, and social media, and things like Wikipedia. But all of these still follow the basic concept of text + pictures that’s existed since the beginning of the age of literacy. With computational essays we finally have something new—and it’s going to be exciting to see all the things it makes possible.

Limits without Limits in Version 11.2 http://blog.wolfram.com/2017/11/09/limits-without-limits-in-version-11-2/ http://blog.wolfram.com/2017/11/09/limits-without-limits-in-version-11-2/#comments Thu, 09 Nov 2017 17:13:23 +0000 Devendra Kapadia http://blog.internal.wolfram.com/?p=39140 Limits lead image

Here are 10 terms in a sequence:

Table[(2/(2 n + 1)) ((2 n)!!/(2 n - 1)!!)^2, {n, 10}]

And here’s what their numerical values are:


But what is the limit of the sequence? What would one get if one continued the sequence forever?

In Mathematica and the Wolfram Language, there’s a function to compute that:

DiscreteLimit[(2/(2 n + 1)) ((2 n)!!/(2 n - 1)!!)^2, n -> \[Infinity]]
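As a sanity check outside the Wolfram Language: this sequence is twice the partial product of the classic Wallis product ∏ 4k²/(4k²−1) = π/2, so DiscreteLimit should return π. A quick numerical sketch in Python:

```python
import math

def term(n):
    # (2n)!!/(2n-1)!! accumulated as a running ratio
    ratio = 1.0
    for k in range(1, n + 1):
        ratio *= 2 * k / (2 * k - 1)
    return 2 / (2 * n + 1) * ratio ** 2

# The terms creep up toward pi from below:
print([term(n) for n in (1, 10, 100, 10000)])
print(math.pi)
```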

Limits are a central concept in many areas, including number theory, geometry and computational complexity. They’re also at the heart of calculus, not least since they’re used to define the very notions of derivatives and integrals.

Mathematica and the Wolfram Language have always had capabilities for computing limits; in Version 11.2, they’ve been dramatically expanded. We’ve leveraged many areas of the Wolfram Language to achieve this, and we’ve invented some completely new algorithms too. And to make sure we’ve covered what people want, we’ve sampled over a million limits from Wolfram|Alpha.

Let’s talk about a limit that Hardy and Ramanujan worked out in 1918. But let’s build up to that. First, consider the sequence a(n) that is defined as follows:

a[n_] := (-1)^n/n

Here is a table of the first ten values for the sequence.

Table[a[n], {n, 1, 10}]

The following plot indicates that the sequence converges to 0 as n approaches Infinity.

DiscretePlot[a[n], {n, 1, 40}]

The DiscreteLimit function, which was introduced in Version 11.2, confirms that the limit of this sequence is indeed 0.

DiscreteLimit[a[n], n -> \[Infinity]]

Many sequences that arise in practice (for example, in signal communication) are periodic in the sense that their values repeat themselves at regular intervals. The length of any such interval is called the period of the sequence. As an example, consider the following sequence that is defined using Mod.

a[n_] := Mod[n, 6]

A plot of the sequence shows that the sequence is periodic with period 6.

DiscretePlot[a[n], {n, 0, 20}]

In contrast to our first example, this sequence does not converge, since it oscillates between 0 and 5. Hence, DiscreteLimit returns Indeterminate in this case.

DiscreteLimit[a[n], n -> \[Infinity]]

The new Version 11.2 functions DiscreteMinLimit and DiscreteMaxLimit can be used to compute the lower and upper limits of oscillation, respectively, in such cases. Thus, we have:

DiscreteMinLimit[a[n], n -> \[Infinity]]

DiscreteMaxLimit[a[n], n -> \[Infinity]]
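Since the sequence is periodic, its lower and upper limits of oscillation are simply the minimum and maximum over one full period of the tail, which is easy to check by hand, say in Python:

```python
def a(n):
    return n % 6  # the analog of Mod[n, 6]

# One full period far out in the sequence captures the whole oscillation:
period = [a(n) for n in range(1000, 1006)]
print(min(period), max(period))  # lower limit 0, upper limit 5
```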

DiscreteMinLimit and DiscreteMaxLimit are often referred to as “lim inf” and “lim sup,” respectively, in the mathematical literature. The traditional underbar and overbar notations for these limits are available, as shown here.

 \!\(\*UnderscriptBox[\(\[MinLimit]\), \(n\* UnderscriptBox["\[Rule]",  TemplateBox[{}, "Integers"]]\[Infinity]\)]\) a[n]

 \!\(\*UnderscriptBox[\(\[MaxLimit]\), \(n\* UnderscriptBox["\[Rule]",  TemplateBox[{}, "Integers"]]\[Infinity]\)]\) a[n]

Our next example is an oscillatory sequence that is built from the trigonometric functions Sin and Cos, and is defined as follows.

a[n_] := Sin[2 n]^2/(2 + Cos[n])

Although Sin and Cos are periodic when viewed as functions over the real numbers, this integer sequence behaves in a bizarre manner and is very far from being a periodic sequence, as confirmed by the following plot.

DiscretePlot[a[n], {n, 1, 100}]

Hence, the limit of this sequence does not exist.

DiscreteLimit[a[n], n -> \[Infinity]]

However, it turns out that for such “densely aperiodic sequences,” the extreme values can be computed by regarding them as real functions. DiscreteMinLimit uses this method to return the answer 0 for the example, as expected.

DiscreteMinLimit[a[n], n -> \[Infinity]]

Using the same method, DiscreteMaxLimit returns a rather messy-looking result in terms of Root objects for this example.

DiscreteMaxLimit[a[n], n -> \[Infinity]]

The numerical value of this result is close to 0.8, as one might have guessed from the graph.


Discrete limits also occur in a natural way when we try to compute the value of infinitely nested radicals. For example, consider the problem of evaluating the following nested radical.

Nested radical

The successive terms in the expansion of the radical can be generated by using RSolveValue, since the sequence satisfies a nonlinear recurrence. For example, the third term in the expansion is obtained as follows.

RSolveValue[{r[n + 1] == Sqrt[2 + r[n]], r[1] == Sqrt[2]}, r[3], n]

The value of the infinitely nested radical appears to be 2, as seen from the following plot that is generated using RecurrenceTable.

ListPlot[RecurrenceTable[{r[n + 1] == Sqrt[2 + r[n]],      r[1] == Sqrt[2]}, r[n], {n, 2, 35}]]

Using Version 11.2, we can confirm that the limiting value is indeed 2 by requesting the value r(∞) in RSolveValue, as shown here.

RSolveValue[{r[n + 1] == Sqrt[2 + r[n]], r[1] == Sqrt[2]}, r[\[Infinity]], n]
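The same fixed-point iteration is easy to replicate outside the Wolfram Language: the truncated radicals satisfy r ↦ √(2 + r), whose attracting fixed point solves r² = 2 + r, i.e. r = 2. A short Python sketch:

```python
import math

def nested_radical(depth):
    # sqrt(2 + sqrt(2 + ...)) truncated at the given depth
    r = math.sqrt(2)
    for _ in range(depth - 1):
        r = math.sqrt(2 + r)
    return r

print([nested_radical(d) for d in (1, 5, 20)])  # rapidly approaching 2
```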

The study of limits belongs to the branch of mathematics called asymptotic analysis. Asymptotic analysis provides methods for obtaining approximate solutions of problems near a specific value such as 0 or Infinity. It turns out that, in practice, the efficiency of asymptotic approximations often increases precisely in the regime where the corresponding exact computation becomes difficult! A striking example of this phenomenon is seen in the study of integer partitions, which are known to grow extremely fast as the size of the number increases. For example, the number 6 can be partitioned in 11 distinct ways using IntegerPartitions, as shown here.

IntegerPartitions[6] // TableForm


The number of distinct partitions can be found directly using PartitionsP as follows.


As noted earlier, the number of partitions grows rapidly with the size of the integer. For example, there are nearly 4 trillion partitions of the number 200.



In 1918, Hardy and Ramanujan provided an asymptotic approximation for this number, which is given by the following formula.

asymp[n_] := E^(\[Pi] Sqrt[(2 n)/3])/(4 n Sqrt[3])

The answer given by this estimate for the number 200 is remarkably close to 4 trillion.

asymp[200] // N

With a much larger integer, we get an even better approximation for the number of partitions almost instantaneously, as seen in the following example.

PartitionsP[2000000000] // N // Timing

N[asymp[2000000000], 20] // Timing
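The exact count and the 1918 estimate are also easy to compare outside the Wolfram Language. Here's an illustrative Python sketch, using a simple dynamic program in place of PartitionsP (fine for small n, though hopeless at n = 2000000000):

```python
import math

def partitions(n):
    # classic coin-style dynamic program: p[k] counts the partitions of k
    p = [1] + [0] * n
    for part in range(1, n + 1):
        for k in range(part, n + 1):
            p[k] += p[k - part]
    return p[n]

def asymp(n):
    # the Hardy-Ramanujan asymptotic estimate
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

print(partitions(6))                 # 11, matching IntegerPartitions above
print(partitions(200))               # 3972999029388 -- "nearly 4 trillion"
print(asymp(200) / partitions(200))  # ratio already close to 1
```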

Finally, we can confirm that the asymptotic estimate approaches the number of partitions as n tends to Infinity using DiscreteLimit, which is aware of the Hardy–Ramanujan formula discussed above.

DiscreteLimit[PartitionsP[n]/asymp[n], n -> \[Infinity]]

Formally, we say that exact and approximate formulas for the number of partitions are asymptotically equivalent as n approaches Infinity.

Asymptotic notions also play an important role in the study of function limits. For instance, the small-angle approximation in trigonometry asserts that “sin(x) is nearly equal to x for small values of x.” This may be rephrased as “sin(x) is asymptotically equivalent to x as x approaches 0.” A formal statement of this result can be given using Limit, which computes function limits, as follows.

Limit[Sin[x]/x, x -> 0]

This plot provides visual confirmation that the limit is indeed 1.

Plot[Sin[x]/x, {x, -20, 20}, PlotRange -> All]

The above limit can also be calculated using L’Hôpital’s rule by computing the derivatives, cos(x) and 1, of the numerator and denominator respectively, as shown here.

Limit[Cos[x]/1, x -> 0]

L’Hôpital’s rule gives a powerful method for evaluating many limits that occur in practice. However, it may require a large number of steps before arriving at the answer. For example, consider the following limit.

Limit[x^6/E^x, x -> \[Infinity]]

That limit requires six repeated applications of L’Hôpital’s rule to arrive at the answer 0, since all the intermediate computations give indeterminate results.

Table[Limit[D[x^6, {x, n}], x -> \[Infinity]]/Limit[D[E^x, {x, n}], x -> \[Infinity]], {n, 0, 10}]
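Numerically, of course, no calculus is needed to see where this is heading: the exponential eventually swamps any fixed power of x. A quick illustrative check in Python:

```python
import math

# x^6/e^x is sizable at moderate x but collapses as x grows:
for x in (10, 50, 100):
    print(x, x ** 6 * math.exp(-x))
```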

Thus, we see that L’Hôpital’s rule has limited utility as a practical algorithm for finding function limits, since it is impossible to decide when the algorithm should stop! Hence, the built-in Limit function uses a combination of series expansions and modern algorithms that works well on inputs involving exponentials and logarithms, the so-called “exp-log” class. In fact, Limit has received a substantial update in Version 11.2 and now handles a wide variety of difficult examples, such as the following, in a rather comprehensive manner (the last two examples work only in the latest release).

Limit[Gamma[x + 1/2]/(Gamma[x] Sqrt[x]), x -> \[Infinity]]

Limit[(Log[x] (-Log[Log[x]] + Log[Log[x] + Log[Log[x]]]))/Log[Log[x] + Log[Log[Log[x]]]], x -> \[Infinity]]

Limit[E^E^E^PolyGamma[PolyGamma[PolyGamma[x]]]/x, x -> \[Infinity]]

Limit[E^(E^x + x^2) (-Erf[E^-E^x - x] - Erf[x]), x -> \[Infinity]]

As in the cases of sequences, the limits of periodic and oscillatory functions will often not exist. One can then use MaxLimit and MinLimit, which, like their discrete counterparts, give tight bounds for the oscillation of the function near a given value, as in this classic example.

f[x_] := Sin[1/x]

Plot[f[x], {x, -1, 1}]

The graph indicates the function oscillates rapidly between –1 and 1 near 0. These bounds are confirmed by MaxLimit and MinLimit, while Limit itself returns Indeterminate.

{Limit[f[x], x -> 0], MinLimit[f[x], x -> 0], MaxLimit[f[x], x -> 0]}
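A brute-force sampling outside the Wolfram Language tells the same story: evaluating sin(1/x) at points crowding in toward 0 produces values that sweep out essentially all of [−1, 1], so no single limiting value can exist. A Python sketch:

```python
import math

# Sample sin(1/x) at x = 1, 1/2, 1/3, ... marching toward 0:
samples = [math.sin(1 / x) for x in (1 / k for k in range(1, 100001))]
print(min(samples), max(samples))  # hugging -1 and 1
```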

In the previous example, the limit fails to exist because the function oscillates wildly around the origin. Discontinuous functions provide other types of examples where the limit at a point may fail to exist. We will now consider an example of such a function with a jump discontinuity at the origin and other values. The function is defined in terms of SquareWave and FresnelS, as follows.

g[x_] := (SquareWave[x] FresnelS[x])/x^3

This plot shows the jump discontinuities, which are caused by the presence of SquareWave in the definition of the function.

Plot[{g[x], -Pi/6, Pi/6}, {x, -2, 2},   ExclusionsStyle -> Directive[Red, Dashed]]

We see that the limiting values of the function at 0, for instance, depend on the direction from which we approach the origin. The limiting value from the right (“from above”) can be calculated using the Direction option.

Limit[g[x], x -> 0, Direction -> "FromAbove"]

Similarly, the limit from the left can be calculated as follows.

Limit[g[x], x -> 0, Direction -> "FromBelow"]

The limit, if it exists, is the “two-sided” limit for the function that, in this case, does not exist.

Limit[g[x], x -> 0, Direction -> "TwoSided"]

By default, Limit computes two-sided limits in Version 11.2. This is a change from earlier versions, where it computed the limit from above by default. Hence, we get an Indeterminate result from Limit, with no setting for the Direction option.

Limit[g[x], x -> 0]

Directional limits acquire even more significance in the multivariate case, since there are many possible directions for approaching a given point in higher dimensions. For example, consider the bivariate function f(x,y) that is defined as follows.

f[x_, y_] := (x y)/(x^2 + y^2)

The limiting value of this function at the origin is 0 if we approach it along the x axis, which is given by y=0, since the function has the constant value 0 along this line.

f[x, 0]

Similarly, the limiting value of the function at the origin is 0 if we approach it along the y axis, which is given by x=0.

f[0, y]

However, the limit is 1/2 if we approach the origin along the line y=x, as seen here.

f[x, y] /. {y -> x}

More generally, the limiting value changes as we approach the origin along different lines y=m x.

f[x, y] /. {y -> m x} // Simplify
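As a quick numerical illustration of this path dependence (in Python, outside the Wolfram Language): approaching the origin along y = m x, the function stays frozen at m/(1 + m²) no matter how close to the origin we get.

```python
def f(x, y):
    return x * y / (x ** 2 + y ** 2)

t = 1e-9  # essentially at the origin
for m in (0, 1, 2, -1):
    print(m, f(t, m * t))  # equals m/(1 + m**2), independent of t
```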

The directional dependence of the limiting value implies that the true multivariate limit does not exist. In Version 11.2, Limit handles multivariate examples with ease, and quickly returns the expected answer Indeterminate for the limit of this function at the origin.

Limit[f[x, y], {x, y} -> {0, 0}]

A plot of the surface z=f(x,y) confirms the behavior of the function near the origin.

Plot3D[f[x, y], {x, -4, 4}, {y, -4, 4}]

This example indicates that, in general, multivariate limits do not exist. In other cases, such as the following, the limit exists but the computation is subtle.

f[x_, y_] := (x^2 + y^2)/(3^(Abs[x] + Abs[y]) - 1)

This plot indicates that the limit of this function at {0,0} exists and is 0, since the function values appear to approach 0 from all directions.

Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}, PlotRange -> All]

The answer can be confirmed by applying Limit to the function directly.

Limit[f[x, y], {x, y} -> {0, 0}]
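As an illustrative cross-check in Python, shrinking toward the origin along several different directions sends the values to 0 in every case, consistent with the limit existing:

```python
def f(x, y):
    return (x ** 2 + y ** 2) / (3 ** (abs(x) + abs(y)) - 1)

# Shrink toward the origin along the x axis, the diagonal, and y = -2x:
for r in (1e-1, 1e-3, 1e-5):
    print(r, f(r, 0), f(r, r), f(r, -2 * r))  # every column shrinks toward 0
```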

A rich source of multivariate limit examples is provided by the steady stream of inputs that is received by Wolfram|Alpha each day. We acquired around 100,000 anonymized queries to Wolfram|Alpha from earlier years, which were then evaluated using Version 11.2. Here is a fairly complicated example from this vast collection that Limit handles with ease in the latest version.

f[x_, y_] := Cos[Abs[x] Abs[y]] - 1

Plot3D[f[x, y], {x, -3, 3}, {y, -3, 3}]

Limit[f[x, y], {x, y} -> {0, 0}]

It is a sheer joy to browse through the examples from Wolfram|Alpha, so we decided to share 1,000 nontrivial examples from the collection with you. Sample images are shown below, and the five notebooks containing the examples can be downloaded here.

Downloadable notebooks

Version 11.2 evaluates 90% of the entire collection in the benchmark, which is remarkable since the functionality for multivariate limits is new in this release.

Limits pie chart

Version 11.2 also evaluates a higher fraction (96%) of an even larger collection of 1,000,000 univariate limits from Wolfram|Alpha when compared with Version 11.1 (94%). The small percentage difference between the two versions can be explained by noting that most Wolfram|Alpha queries for univariate limits relate to a first or second course in college calculus and are easily computed by Limit in either version.

Limit has been one of the most dependable functions in the Wolfram Language ever since it was first introduced in Version 1 (1988). The improvements for this function, along with DiscreteLimit and other new functions in Version 11.2, have facilitated our journey through the world of limits. I hope that you have enjoyed this brief tour, and welcome any comments or suggestions about the new features.

Download this post as a Computable Document Format (CDF) file. New to CDF? Get your copy for free with this one-time download.
