Wolfram Blog
http://blog.wolfram.com
News, views, and ideas from the front lines at Wolfram Research.
Fri, 29 Apr 2016 14:46:38 +0000

Celebrate Math Awareness with This Wolfram|Alpha Promo
http://blog.wolfram.com/2016/04/25/celebrate-math-awareness-with-this-wolframalpha-promo/
Mon, 25 Apr 2016 18:18:46 +0000, Hy Nguyen

April and Mathematics Awareness Month will soon be coming to an end, and so will these special offers on Mathematica and Wolfram|Alpha. As I mentioned in my last post, this year’s Mathematics Awareness Month explores “the Future of Prediction” via mathematics and statistics. Ever since the earliest recognition of mathematics, people have used it to make accurate predictions not only in math but also in related fields.

Math Awareness Month

For almost seven years now, millions of people have used the computational powers of Wolfram|Alpha to check their own predictions and explore fascinating topics in math, science, history, linguistics, culture, the arts, and more. Take Brian A. Carr from Jackson, Wyoming, for example. He is a classical scholar and career firefighter with Jackson Hole Fire/EMS. Carr uses Wolfram|Alpha for a variety of interests, “from helping to explain geometrical and mathematical concepts related to my study of early Greek mathematics to investigating the physics of room-and-contents fires.”

Wolfram|Alpha “was a lifesaver in my graduate work,” says Samantha Howard, who is now a physicist in the United States Air Force. Howard, who received her master’s degree in applied physics from the Air Force Institute of Technology, first discovered Wolfram|Alpha during her undergraduate years at the USAF Academy.

“I love Wolfram|Alpha because it helps me solve real-world problems while I’m on the road,” says Andre Koppel, a financial analyst. In addition to finding the natural language interface invaluable for solving complex math questions in his work, Koppel uses Wolfram|Alpha for everyday things like measuring food and drink quantities at restaurants or checking the tides to see if it’s safe to go into coastal waters.

Carr, Howard, and Koppel are only a few representatives of our huge community of users who are working with Wolfram|Alpha every day to explore new horizons. In honor of Mathematics Awareness Month and to encourage mathematics curiosity and exploration, we are offering 20% off subscriptions to Wolfram|Alpha Pro through April 30, 2016. Visit our website, use the promo code MATHMONTH20OFF, and start exploring!

Analyzing Shakespeare’s Texts on the 400th Anniversary of His Death
http://blog.wolfram.com/2016/04/21/analyzing-shakespeares-texts-on-the-400th-anniversary-of-his-death/
Thu, 21 Apr 2016 18:53:27 +0000, Jofre Espigule-Pons

Putting some color in Shakespeare’s tragedies with the Wolfram Language

After four hundred years, Shakespeare’s works are still very much present in our culture. He mastered the English language as no one had before, and he deeply understood human emotions.

Have you ever explored Shakespeare’s texts from the perspective of a data scientist? Wolfram technologies can provide you with new insights into the semantics and statistical analysis of Shakespeare’s plays and the social networks of their characters.

William Shakespeare (April 26, 1564 (baptized)–April 23, 1616) is considered by many to be the greatest writer of the English language. He wrote 154 sonnets, 38 plays (divided into three main groups: comedy, history, and tragedy), and 4 long narrative poems.

Shakespeare's works

I will start by creating a nice WordCloud from one of his famous tragedies, Romeo and Juliet. You can achieve this with just a couple lines of Wolfram Language code.

First, you need to get the text. One possibility is to import the public-domain HTML versions of the complete works of William Shakespeare from this MIT site:

Importing text for Romeo and Juliet

Then make a word cloud from the text, deleting common stopwords like “and” and “the”:

Romeo and Juliet word cloud

As you can see, DeleteStopwords does not delete all the Elizabethan stopwords like “thou,” “thee,” “thy,” “hath,” etc. But I can delete them manually with StringDelete. And with some minor extra effort, you can improve the word cloud’s style a great deal:

Improving the style of a word cloud
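The stopword step can be sketched outside the Wolfram Language as well. The following is a minimal Python illustration, not the code from the post: the two stopword sets are hypothetical and deliberately tiny, standing in for what DeleteStopwords and the manual StringDelete step remove. It produces the word counts that a word cloud would then scale its words by.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stopword lists for illustration only.
MODERN_STOPWORDS = {"and", "the", "a", "to", "of", "is", "in", "that", "i", "my", "not"}
ELIZABETHAN_STOPWORDS = {"thou", "thee", "thy", "hath"}

def word_frequencies(text, extra_stopwords=ELIZABETHAN_STOPWORDS):
    """Count words after lowercasing and dropping both stopword sets."""
    words = re.findall(r"[a-z']+", text.lower())
    stop = MODERN_STOPWORDS | set(extra_stopwords)
    return Counter(w for w in words if w not in stop)

sample = "O Romeo, Romeo! wherefore art thou Romeo? Deny thy father and refuse thy name."
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

Passing an empty set as `extra_stopwords` shows the effect described above: “thou” and “thy” then survive the filtering.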

Now let’s analyze a tragedy more deeply. Wolfram|Alpha already offers a lot of computed data about Shakespeare’s plays. For example, if you type “Othello” as Wolfram|Alpha input, you will get the following result:

Information on Othello in Wolfram|Alpha

If you want to visualize the interactions among the characters of this tragedy via a social network, you can achieve this with ease using the Wolfram Language. As I did earlier with the word cloud, I need to first import the texts. In this case I want to work with all the acts and scenes from Othello separately:

Separating the acts and scenes in Othello

Since I want to import and save the scenes for later use in the same notebook’s folder, I can do the following:

Saving the scenes for later use in the same notebook's folder

In order to create the Graph, I first need all the character names, which will be displayed as vertices. I can gather the names by noticing that each dialog line is preceded by a character name in bold, which in HTML is written like this: <b>Name</b>. Thus it is straightforward to get an ordered list containing all character names (“speakers”) from each dialog line using StringCases:

Using StringCases to get a list of character names from each dialog line

Then, using Union and Flatten, I can obtain the names of all the characters in the tragedy of Othello:

Using Union and Flatten to obtain the character names in Othello
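For readers who want to see the idea without the notebook, here is a rough Python analogue of the two steps above (StringCases, then Union and Flatten). It is a sketch under assumptions: the `scenes` strings are made-up HTML fragments, not the imported MIT pages, and the regular expression simply captures whatever sits between `<b>` and `</b>`.

```python
import re

# Hypothetical HTML fragments standing in for the imported scenes.
scenes = [
    "<b>RODERIGO</b><p>Tush!</p><b>IAGO</b><p>'Sblood</p><b>RODERIGO</b><p>Thou told'st me</p>",
    "<b>OTHELLO</b><p>'Tis better as it is.</p><b>IAGO</b><p>Nay, but he prated</p>",
]

def speakers_in(scene_html):
    """Return speaker names in order of appearance, one per dialog line."""
    return re.findall(r"<b>(.*?)</b>", scene_html)

# One ordered speaker list per scene (the StringCases step).
speakers_per_scene = [speakers_in(s) for s in scenes]

# Flatten and deduplicate to get the vertex names (the Union/Flatten step).
characters = sorted({name for scene in speakers_per_scene for name in scene})
print(characters)
```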

Once I have the vertices, I need to create the edges of the graph. In this case, an edge between two vertices will represent the connection between two characters whose lines are separated by fewer than two lines within the dialog (similar to the Demonstration by Seth Chandler that analyzes the networks in Shakespeare’s plays). For that purpose, I will use SequenceCases to create all the edges, i.e. pairs of speakers whose lines are fewer than two lines apart:

Using SequenceCases to create all the edges

Before creating the graph, I need to delete the edges that are duplicated or are equivalent, like OTHELLO↔IAGO and IAGO↔OTHELLO, and the edges connecting to themselves, i.e. IAGO↔IAGO:

Deleting duplicate edges or equivalents
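The edge construction and cleanup can be sketched in Python as follows. This is an illustration, not the SequenceCases code from the post. It assumes that “fewer than two lines apart” means the second speaker appears one or two positions after the first; `max_offset` is a hypothetical parameter that encodes this reading. Storing each edge as a frozenset makes OTHELLO↔IAGO and IAGO↔OTHELLO the same edge, and the `a != b` check drops self-loops.

```python
def interaction_edges(speakers, max_offset=2):
    """Build undirected edges between speakers at most max_offset lines apart."""
    edges = set()
    for i, a in enumerate(speakers):
        for b in speakers[i + 1 : i + 1 + max_offset]:
            if a != b:                      # skip self-loops such as IAGO<->IAGO
                edges.add(frozenset((a, b)))  # unordered, so duplicates collapse
    return edges

# Hypothetical ordered list of speakers, one entry per dialog line.
lines = ["OTHELLO", "IAGO", "OTHELLO", "CASSIO", "IAGO", "IAGO"]
edges = interaction_edges(lines)
print(sorted(tuple(sorted(e)) for e in edges))
```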

Finally, you can specify the size of the vertices with the VertexSize option. For example, I want the vertices’ sizes to be proportional to the number of lines per character. I can get the number of lines per character with Counts:

Lines per character using Counts

After this, I can use a logarithmic function to rescale the vertices to a reasonable size. I will also improve the design with VertexStyle and VertexLabels.
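As a rough idea of what such a rescaling can look like, here is a small Python sketch. The exact function used in the notebook is not shown in the post, so the formula `base_size * log(1 + n)` and the constant `base_size` are assumptions chosen only to illustrate logarithmic scaling of line counts.

```python
import math
from collections import Counter

def vertex_sizes(speakers, base_size=0.2):
    """Map each character to a size that grows with the log of their line count."""
    counts = Counter(speakers)              # lines per character
    return {name: base_size * math.log(1 + n) for name, n in counts.items()}

# Hypothetical line counts: 9 lines for IAGO, 3 for OTHELLO, 1 for BIANCA.
lines = ["IAGO"] * 9 + ["OTHELLO"] * 3 + ["BIANCA"]
sizes = vertex_sizes(lines)
print(sizes)
```

With a logarithm, a character with nine times as many lines gets a vertex only a few times larger, which keeps minor characters visible.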

Since the code is getting more cumbersome, I will omit it and show only the result (for those interested in the details of the code, you can find them in the attached notebook). Also note that in the final result I’m excluding the vertex “All” since it is not a real character in the dialog:

Interactions among characters in Othello

So far, so good. Having the social network from a Shakespeare play written more than four hundred years ago is quite cool, but I’m still not 100% satisfied. I would like to visualize when these interactions occur within the dialog itself. One way to achieve this is by representing each main speaker with a different-colored bar:

Representing each main character with a different-colored bar

Note: linesColor is a list of colors representing the lines in one scene, and linesLength is the list of the lines’ StringLength with a rescaling function. Both involve some text manipulation, like I did earlier to obtain the character names from the HTML version of the play. If you wish, you can see their construction in the attached notebook:

Play progress grid construction

Additionally, I can mark when a particular character says a particular word—for example, the word “love” (note: the variable words is the list of words per line in the scene, created with the new function TextWords; see the attached notebook for details):

Marking when a particular character says a particular word

Now I can combine all of this with the social network graph and have a colorful and compact infographic about a Shakespeare tragedy:

Othello social network graph

Dialog lines with the word love

There are so many other interesting things that I would like to explore about Shakespeare’s works and life. But I will finish with a map representing the locations at which his plays occur. I hope you got a glimpse of what is possible to achieve with the Wolfram Language. The only limit is our imagination:

Mapping the locations at which Shakespeare's plays occurred

For a few places, the Interpreter fails to find a GeoPosition, so I used Cases to obtain all the successfully interpreted locations:

Mapping the locations at which Shakespeare's plays occurred

Finally, I’m using GeoDisk to depict geopositions by disks with a radius proportional to the number of times each location appears in Shakespeare’s plays:

Map of locations where Shakespeare's plays occur

Many fellow Wolfram users expressed keen interest in and came up with astonishing approaches to Shakespeare’s corpus analysis on Wolfram Community. We hope this blog will inspire you to join that collaborative effort exploring the mysteries of Shakespeare data.

Download this post as a Computable Document Format (CDF) file.

The 26.2 Blog
http://blog.wolfram.com/2016/04/15/the-26-2-blog/
Fri, 15 Apr 2016 19:43:22 +0000, Eila Stiegler

It’s four months into the new year. Spring is here. Well, so they say. And if the temperatures do not convince you, the influx of runners on our roads definitely should. I have always loved running. Despite the fact that during each mile I complain about various combinations of the weather, the mileage, and my general state of mind, I met up with 37,000 other runners for the Chicago Marathon on October 11, 2015. As it turns out, this single event makes for a great example to explore what the Wolfram Language can do with larger datasets. The data we are using below is available on the Chicago Marathon results website.

This marathon is one of the six Abbott World Marathon Majors: the Tokyo, Boston, Virgin Money London, BMW Berlin, Bank of America Chicago, and TCS New York City marathons. If you are looking for things to add to your bucket list, I believe these are great candidates. Given the international appeal, let’s have a look at the runners’ nationalities and their travel paths. Our GeoGraphics functionality easily enables us to do so. Clearly many people traveled very far to participate:

GeoGraphics shows where runners have traveled from

The vast majority, of course, came from the US:

Runners from the United States

Let’s create a heat map to see the distribution of all US runners. As expected, most of them are from Chicago and the Midwest:

Heat map of distribution of US runners

What did the race look like in Chicago? Recreating the map in the Wolfram Language, taking every runner’s running times, and utilizing my coworker’s mad programming skills, we can produce the following movie:

As you can see, the green dot is the winning runner. I am red, and the median is shown in blue. This movie made me realize that while the fastest runner was already approaching the most northern point of the course, I was still trying to meet up with my running partner! The purple bars indicate the density of runners at any given time along the race course. You might wonder what the gold curve is. That would be the center of gravity given the distribution of the runners.

The dataset also includes age division and placement within age group, gender and placement within gender group, all split times, and overall placement. The split times were taken every 5 km, at the half-marathon distance, and, of course, at the finish line. The following image illustrates the interpolated split times for all participants after deducting the starting time of the winning runner:

Interpolated split times for all participants after deducting the starting time of the winning runner

The graphic reflects several things about this race: runners were grouped into two waves, A and B, depending on their expected finishing time. This is illustrated by the split around 2,500 seconds at the starting line. Within each wave, runners were then grouped into corrals. Again, faster runners started in earlier corrals. Thus the later runners got started, the slower they were overall. The resulting slower split times are expressed in a much faster rise of the corresponding lines. It also took 4,503 seconds, a little over 75 minutes, for all runners to get started. In contrast, the last person crossed the finish line 19,949 seconds after the winner of this race. I was neither…

Let’s take a more detailed look at everyone’s start and finish in absolute time. We’re letting the first runner start at 0 seconds by subtracting his time from all participants’. The red dots indicate the mean of the finish time for runners with the exact same starting time:

Everyone's start and finish in absolute time

Again, the two waves are clearly visible. The smaller breaks within each wave indicate the corral changes. But what caught my eye was the handful of people preceding the first wave. Because the dataset provides us with the names of the participants, I was able to drill down and find out whose data I was looking at: it is the “Athletes with Disabilities” (AWD), as the group is named by the Chicago Marathon administration. Checking back with the schedule of events, I was able to confirm that this group started eight minutes ahead of the first wave.

Let’s investigate a bit more and see what we can learn about this group. Of course, the very first person to cross the starting line is part of this group. Everyone else started very closely around him. We can query for the AWD subgroup by looking for everyone who started within a generous 200 seconds of the first person. We find that there were 49 members in this group:

Deeper look into the Athletes with Disabilities subgroup
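The query itself is simple enough to sketch in a few lines. The Python below is an illustration, not the Dataset query from the post; the runner records and the field names `name` and `start` are made up, with start times given in seconds after the first starter.

```python
# Hypothetical records: start times in seconds relative to the first starter.
runners = [
    {"name": "runner_a", "start": 0},
    {"name": "runner_b", "start": 45},
    {"name": "runner_c", "start": 180},
    {"name": "runner_d", "start": 480},   # first wave, eight minutes later
    {"name": "runner_e", "start": 495},
]

def early_starters(runners, window_s=200):
    """Select everyone who started within window_s seconds of the first starter."""
    first = min(r["start"] for r in runners)
    return [r for r in runners if r["start"] - first <= window_s]

awd = early_starters(runners)
print([r["name"] for r in awd])
```

Because the first wave started eight minutes (480 seconds) later, a 200-second window separates the two groups with room to spare.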

Here is the plot of their start and finish times. It is equivalent to a zoom on the 0-second start line in the above plot:

AWD start and finish times

Due to their physical disabilities, many of these runners were joined by one or two guides who helped them navigate the course. With our Nearest functionality, we can try to identify such groups. We just need to gather everyone’s time stamps, convert them to UnixTime, and define our Nearest function:

Using Nearest function to identify groups

Let’s find the group of nearest people for all 49 runners by limiting the variations of their time stamps to 10 seconds over the course of the race:

Finding the group of nearest people for all 49 runners
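To make the grouping idea concrete, here is a small Python sketch. It does not reproduce the Nearest call from the post: it assumes that two runners belong together when their time stamps differ by at most 10 seconds at every checkpoint, compares all pairs directly (fine for 49 runners, too slow for the full field), and merges linked runners with a simple union-find. All names and time stamps are hypothetical.

```python
def running_groups(timestamps, tolerance_s=10):
    """Group runners whose checkpoint times all lie within tolerance_s seconds.

    timestamps maps a runner name to a list of checkpoint times in seconds.
    """
    names = list(timestamps)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            close = all(abs(x - y) <= tolerance_s
                        for x, y in zip(timestamps[a], timestamps[b]))
            if close:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    # Only groups of two or more count as running groups.
    return [sorted(g) for g in groups.values() if len(g) > 1]

stamps = {
    "runner1": [0, 1500, 3010],
    "guide1": [2, 1504, 3012],
    "runner2": [5, 1400, 2800],
    "runner3": [100, 1700, 3400],
}
print(running_groups(stamps))
```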

Out of the 49 runners, we find that 35 ran in 15 groups of 2 or more people:

Out of the 49 runners, 35 ran in 15 groups of 2 or more people

These are the groups we could identify:

Identified groups


I tip my hat to everyone who participated in this race. But I am in awe of people running a marathon with a physical disability. I would like to give them, as well as their guides, a special shoutout!

Did I run with someone? As mentioned above, I sure did. I am lucky to have my next-door neighbor Michael as my running partner. Cursing and whining during a long run is a lot easier if you have someone on your side. Otherwise you just look crazy while mumbling to yourself. Let’s build the Nearest function:

Using Nearest function

Then we can apply it to the entire dataset. Any result of length greater than 1 indicates a running group. We find that 2,784 runners ran in 1,394 groups:

Applying Nearest to the entire dataset

There were 1,329 groups of 2, 62 groups of 3, and 3 groups of 4. The latter were:

Identifying groups

By the way, you will not find my and Michael’s names in any of these groups. Why? Because there was nothing in this world that could keep Michael from his tenth attempt to finish the marathon in under four hours—whereas halfway through the race I had to give in to that nagging voice telling me to take a break and walk. Just taking the first half of the race into account, here we are:

Finding Eila and Michael at the halfway point

We finished only three minutes apart, but that can be a whole lot of time during a marathon. Michael came in just under four hours; I barely missed that time.

Now let’s take a look at how the race progressed split by split. The following histograms show how participants’ split times compared to the mean time at each split distance:

Histograms showing participants' split times

Interestingly, for each split the curve shows a little bump just before the 0 marker, which indicates the mean split time. To find out which runners these might be, we have to consider who the participants are. The vast majority are recreational marathon runners. We hope to stay injury free and maybe achieve a personal record, but our goal is to have a great experience and a rush of endorphins. We are not there to win and collect prize money. But, as Michael did above, one thing that people might attempt is to break the elusive four-hour mark. To beat four hours, a runner (let’s call her “Molly”) has to average 341.517 s/km, or 9 minutes and 9 seconds per mile:

Average mile time to beat four hours
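The arithmetic behind these two numbers is easy to check. The Python sketch below assumes the rounded marathon distance of 26.2 miles, which is the value that reproduces 341.517 s/km; the constant names are my own. (Using the official 42.195 km instead gives a slightly faster required pace per kilometer.)

```python
MARATHON_MILES = 26.2      # rounded distance; assumption that matches the quoted figure
KM_PER_MILE = 1.609344

def required_pace(target_s, distance_miles=MARATHON_MILES):
    """Return the average pace (s/km, s/mile) needed to finish in target_s seconds."""
    s_per_mile = target_s / distance_miles
    s_per_km = s_per_mile / KM_PER_MILE
    return s_per_km, s_per_mile

s_per_km, s_per_mile = required_pace(4 * 3600)
minutes, seconds = divmod(int(s_per_mile), 60)
print(round(s_per_km, 3), f"{minutes}:{seconds:02d} per mile")
```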

To make sure Molly comes in under four hours, let’s assume she runs at a pace five seconds faster per kilometer, 336.517 s/km. By not allowing any change of pace, we are basically turning Molly into a robot. But let’s see where her split times (indicated in red) fall compared to the mean at each of the kilometer markers. Indeed, Molly’s split times match the “hump,” and thus are a representation of all runners trying to finish the marathon in less than four hours:

"Molly's" split times

As can be seen in the above histograms, with each split we plot more bins representing fewer runners, while the variations from the mean steadily increase. Here is another look at the same fact, just from a different angle. Again taking the differences of the runners’ split times to the mean, and then sorting them from smallest to largest, we can see how the differences between the fastest and slowest runners steadily increase over the course of the race:

Difference between fastest and slowest runners

Again, the group of people trying to finish in under four hours is nicely visible in the small hump to the left of the y axis. How many people did make it in under four hours? We could not make this number up: it was exactly 11,111 people, or 29.7% of all participants:

Number of people finishing under four hours

As mentioned above, I could not keep pace with Michael after about halfway through the race. But let’s look at “keeping pace” and how consistently people ran their race. The dataset provides all the information we need to look at everyone’s average pace and absolute variations from it at each split. Adding up those variations per person gives us the following picture:

Variation of pace of runners

The maximum of accumulated variations from the average pace is around 10 minutes. I averaged 9 minutes and 16 seconds per mile:

Eila's time compared to the accumulated variations from the average pace

My variations from that average added up to almost three minutes:

Variations from the average
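One way to read “accumulated variations from the average pace” is sketched below in Python. This is an interpretation, not the notebook’s code: it assumes the pace of each segment between splits is compared with the runner’s overall average pace, and the absolute differences are summed. The split distances and times are invented.

```python
def accumulated_pace_variation(distances_km, split_times_s):
    """Return (average pace, summed absolute deviation of segment paces), in s/km.

    distances_km and split_times_s are cumulative values at each split.
    """
    average_pace = split_times_s[-1] / distances_km[-1]
    total = 0.0
    prev_d, prev_t = 0.0, 0.0
    for d, t in zip(distances_km, split_times_s):
        segment_pace = (t - prev_t) / (d - prev_d)
        total += abs(segment_pace - average_pace)
        prev_d, prev_t = d, t
    return average_pace, total

# Hypothetical runner: steady for 10 km, then slower, then faster.
average, variation = accumulated_pace_variation([5, 10, 15, 20], [1500, 3000, 4600, 6000])
print(average, variation)
```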

In the charts below, we are looking at the distribution of those variations versus a runner’s finishing time. Since a slower runner takes more time between splits and thus automatically accumulates more minutes and more variations, we additionally normalized the pace variation by the corresponding finishing time:

Distribution of those variations versus a runner's finishing time

Of course, these pace variations cause people to pass each other. Let’s have a quick look at how often this happened. We counted an amazing 276,121,258 occurrences of runners’ position changes. Below is an illustration. Inside the attached notebook, please hover over the data points to see the number of takeovers at a given distance:

How often people passed each other
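A count like this can be obtained without comparing every pair of runners directly. The Python sketch below is one possible approach, not necessarily the one used for the post: it assumes a takeover is a pair of runners whose order at one checkpoint is reversed at the next, which makes the problem an inversion count that merge sort solves in O(n log n). The checkpoint times are hypothetical clock times in seconds.

```python
def count_inversions(values):
    """Return (number of pairs i < j with values[i] > values[j], sorted copy)."""
    if len(values) < 2:
        return 0, list(values)
    mid = len(values) // 2
    left_count, left = count_inversions(values[:mid])
    right_count, right = count_inversions(values[mid:])
    merged, count = [], left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            count += len(left) - i   # right[j] jumped ahead of all remaining left items
    merged.extend(left[i:])
    merged.extend(right[j:])
    return count, merged

def takeovers_between(times_at_a, times_at_b):
    """Count runner pairs whose order at checkpoint A is reversed at checkpoint B."""
    order_at_a = sorted(range(len(times_at_a)), key=lambda r: times_at_a[r])
    later_times = [times_at_b[r] for r in order_at_a]
    return count_inversions(later_times)[0]

times_5k = [1000, 1100, 1200, 1300]
times_10k = [2100, 2050, 2600, 2500]
print(takeovers_between(times_5k, times_10k))
```

Summing this count over all consecutive checkpoints gives a lower bound on the number of position changes, since runners can pass each other more than once between two timing mats.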

To explain the numerous peaks, we should have another look at the race. Every mile or two, aid stations were providing runners with fluids, medical assistance, and other necessities. These aid stations were about two city blocks long, giving runners plenty of opportunities to move through and to avoid crowds. Consider the aid stations on the map:

Aid stations on Chicago Marathon route

Also consider their locations along the course by using our new GeoDistanceList function:

Using GeoDistanceList to find aid station locations
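The idea of measuring positions along the course can be illustrated with a short Python sketch. It is not a reimplementation of GeoDistanceList: it assumes a spherical Earth, uses the haversine formula between consecutive points, and accumulates the leg lengths. The coordinates are placeholders on the equator, not points from the Chicago course.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation

def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distances_along(path):
    """Cumulative distance in km from the first point to each point of the path."""
    total, marks = 0.0, [0.0]
    for p, q in zip(path, path[1:]):
        total += haversine_km(p, q)
        marks.append(total)
    return marks

# Placeholder path: three points one degree of longitude apart on the equator.
course = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
marks = distances_along(course)
print(marks)
```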

We can nicely match the peaks with the locations of the aid stations. At each of these points, a huge number of runners change their paces, resulting in the jump in takeovers. While taking in fluids, one runner might choose to walk while another just slows down but continues to run. A third runner might not utilize the station at all and run through it. Turns out I am not very gifted when it comes to drinking while running, so I walk whenever necessary.

Interestingly, a Histogram3D of time versus distance versus the number of takeovers looks like the city of Chicago itself:

Histogram3D of time versus distance versus the number of takeovers

Running a marathon does not just take a good number of months of training, battles with injuries, and bouts of laziness (as well as a good sense of the craziness of this endeavor). It also takes a financial commitment. Race registration and travel costs can add up to an intimidating sum of money. This made me wonder if there is a correlation between travel distance and finishing time, i.e. can I assume that the farther you have to travel and the more money you have to spend on the event, the better you are as a runner? The following plot shows the finishing time versus travel distance to the US. Upon hovering inside the notebook, you can see the runners’ countries, their finishing times, and their overall placement in the race:

Finishing time versus travel distance

Clearly my assumption is incorrect. We do see a small number of runners from Kenya and Ethiopia who traveled thousands of miles and came in first. But we also see runners who traveled all the way from India, New Zealand, Indonesia, Swaziland, and Singapore who finished in more than six or seven hours. The means for these countries are all around six hours.

Let’s see if another assumption can be proven wrong, e.g. if the travel expense is not as prohibitive as thought, does the number of runners from a country decrease with increasing travel distance? And could it be true that the more runners a country has in the race, the higher its GDP per capita is? In the notebook, hover over each data point in the charts below to see the country, number of runners from that country, and travel distance or GDP per capita:

Country, number of runners from that country, and travel distance or GDP per capita

The data is not as obvious as one might think. More than 28,000 participants came from the US, whereas only a single person came from countries such as Réunion and Mauritius. We do have a number of countries with less wealth and only single-runner representation. But the single-runner representation also holds true for Qatar and Luxembourg—both known for their financial muscle.

I’ll admit that the country of origin might not be as much of a statement about the size of one’s wallet or someone’s performance as I might have thought. What about age?

Age distribution of runners

Marathons seem to appeal mainly to people in their mid-twenties to mid-forties. And, of course, the higher your age, the better your chances of winning your division. But what is interesting to see is that this is not actually a sport favoring the younger athletes. The fastest times were achieved by the 40–44 age division. So I might still have my Olympic years ahead of me!

Age distribution and times

To add a note of obscurity: have you ever considered if your name is any indication of your performance? Or if there are other runners by your name in this exact race? There are many shared first and last names. If you were a “Cabada” or a “Zac” in this race, you did awfully well:

Mean ranking versus mean running time

You may have guessed the most common first name: there were 641 Michaels. The leading last name was, also not very surprising, “Smith” with a count of 157. Of course, these numbers decrease considerably when we look at shared full names:

Mean ranking versus mean running time per shared name

And the most common full names and their counts are:

Most common full names

The combination of my family watching on the sidelines, including my mother visiting from Germany, the outstanding work of all the volunteers, and the huge crowds of spectators and the entertainment they provided, all made for a memorable race. Plus the weather, which is usually a liability in Illinois, was just impeccable. Both Michael and I had a blast, which I think is visible using ImageCollage:

ImageCollage with photos from Chicago Marathon

But as it turns out, not just the event itself was fun. This was a great dataset for me to play around with and learn a lot more about the capabilities of the Wolfram Language. I am not a developer, but I greatly enjoyed this opportunity to combine my professional and personal lives. If you are interested in more scientific approaches to the topic of marathon running, you might find this article and this article intriguing.

But most importantly, registration is now open for the 2016 event!

Download this post as a Computable Document Format (CDF) file.

Newest Wolfram Technologies Books Cover Range of STEM Topics
http://blog.wolfram.com/2016/04/07/newest-wolfram-technologies-books-cover-range-of-stem-topics/
Thu, 07 Apr 2016 17:28:11 +0000, Wolfram Blog Team

Authors who choose to incorporate Wolfram technologies into their books are practitioners in a variety of STEM fields. Their work is an invaluable resource of information about the application of Mathematica, the Wolfram Language, and other Wolfram technologies for hobbyists, STEM professionals, and students.

Handbook of Mathematics, sixth edition; Advanced Calculus Using Mathematica: Notebook Edition; Handbook of Linear Partial Differential Equations for Engineers and Scientists, second edition

Handbook of Mathematics, sixth edition

This guidebook to mathematics by I. N. Bronshtein, K. A. Semendyayev, G. Musiol, and H. Muhlig contains a fundamental working knowledge of mathematics, which is needed as an everyday guide for working scientists and engineers, as well as for students. This newer edition emphasizes those fields of mathematics that have become more important for the formulation and modeling of technical and natural processes, namely numerical mathematics, probability theory and statistics, and information processing. Besides many enhancements and new paragraphs, new sections on geometric and coordinate transformations, quaternions and applications, and Lie groups and Lie algebras have also been included.

Advanced Calculus Using Mathematica: Notebook Edition

Keith Stroyan’s latest work is a complete text on calculus of several variables written in Mathematica notebooks. The eText has large, movable figures and interactive programs to illustrate things like “zooming in” to see “local linearity.” In addition to lots of traditional-style exercises, the eText also has sections on computing with Mathematica. Solutions to many exercises are in closed cells of the eText.

Handbook of Linear Partial Differential Equations for Engineers and Scientists, second edition

Including nearly 4,000 linear partial differential equations (PDEs) and a database of test problems for numerical and approximate analytical methods for solving linear PDEs and systems of coupled PDEs, Andrei D. Polyanin and Vladimir E. Nazaikinskii have created a comprehensive second edition of their handbook. The book also covers solutions to numerous problems relevant to heat and mass transfer, wave theory, hydrodynamics, aerodynamics, elasticity, acoustics, electrodynamics, diffraction theory, quantum mechanics, chemical engineering sciences, electrical engineering, and other fields.

Mathematical Science of the Developmental Process (Japanese), Single Variable Calculus with Early Transcendentals, Mathematica Data Analysis

Mathematical Science of the Developmental Process (Japanese)

This book by Takashi Miura uses Mathematica to introduce and explore the developmental process, in which a simple, spherical, fertilized egg becomes a complex adult structure. The process is very difficult to understand, and the mechanism behind it has yet to be elucidated. Since no fundamental equation of this process has been established, we need prototyping processes, which means quick formulation of simple phenomenological models and verification by simulation and analysis.

Single Variable Calculus with Early Transcendentals

A comprehensive, mathematically rigorous exposition, this text from Paul Sisson and Tibor Szarvas blends precision and depth with a conversational tone to include the reader in developing the ideas and intuition of calculus. A consistent focus on historical context, theoretical discovery, and extensive exercise sets provide insight into the many applications and inherent beauty of the subject.

Mathematica Data Analysis

If you are not a programmer but you need to analyze data, Sergiy Suchok’s new book will show you how to use Mathematica to take just a few strings of intelligible code to solve huge tasks, from statistical issues to pattern recognition. If you’re a programmer, you will learn how to use the library of algorithms implemented in Mathematica in your programs, as well as how to write algorithm testing procedures. Along with intuitive queries for data processing and using functions for time series analysis, we will highlight the nuances and features of Mathematica, allowing you to build effective analysis systems.

Mathematica: A Problem-Centered Approach, second edition; Estructuras Discretas con Mathematica (Spanish)

Mathematica: A Problem-Centered Approach, second edition

The second edition of Roozbeh Hazrat’s textbook introduces the vast array of features and powerful mathematical functions of Mathematica using a multitude of clearly presented examples and worked-out problems. Based on a computer algebra course taught to undergraduate students of mathematics, science, engineering, and finance, the book also includes chapters on calculus and solving equations, as well as graphics, thus covering all the basic topics in Mathematica. With its strong focus on programming and problem solving, and an emphasis on using numerical problems that do not require any particular background in mathematics, this book is also ideal for self-study and as an introduction for researchers who wish to use Mathematica as a computational tool.

Estructuras Discretas con Mathematica (Spanish)

This book by Enrique Vilchez Quesada provides a theoretical and practical overview for students studying discrete structures within the curriculum of computer engineering and computer science. The major contribution of this work, compared to other classical textbooks on this subject, consists of providing practical solutions to real-world problems in the context of computer science by creating different examples and solutions (programs, in most cases) using the renowned commercial software Mathematica.

Looking for more Wolfram technologies books? Don’t forget to visit Wolfram Books to browse by both topic and language!

http://blog.wolfram.com/2016/04/07/newest-wolfram-technologies-books-cover-range-of-stem-topics/feed/ 1
New in the Wolfram Language: GreenFunction and Applications in Electricity, ODEs, and PDEs http://blog.wolfram.com/2016/03/31/new-in-the-wolfram-language-greenfunction-and-applications-in-electricity-odes-and-pdes/ http://blog.wolfram.com/2016/03/31/new-in-the-wolfram-language-greenfunction-and-applications-in-electricity-odes-and-pdes/#comments Thu, 31 Mar 2016 15:34:41 +0000 Devendra Kapadia http://blog.internal.wolfram.com/?p=30304 Green's Windmill
Picture of Green’s Windmill by Kev747 at the English language Wikipedia.

In 1828, an English corn miller named George Green published a paper in which he developed mathematical methods for solving problems in electricity and magnetism. Green had received very little formal education, yet his paper introduced several profound concepts that are now taught in courses on advanced calculus, physics, and engineering. My aim in writing this post is to give a brief biography of this great genius and provide an introduction to GreenFunction, which implements one of his pioneering ideas in Version 10.4 of the Wolfram Language.

George Green was born on July 14, 1793, the only son of a Nottingham baker. His father noticed young George’s keen interest in mathematics, and sent him to a local school run by Robert Goodacre, a well-known science popularizer. George studied at Goodacre Academy between the ages of eight and nine, and then went to work in his father’s bakery. Later he ran a corn mill built by his father in Sneinton, near Nottingham. He is said to have hated his work at the bakery and the corn mill, and regarded it as annoying and tedious. In spite of his onerous duties, George appears to have continued studying mathematics in his spare time, retreating to the top floor of the 16-meter-high mill, shown above, for this purpose. In 1828, he published the results of his rigorous self-study in “An Essay on the Application of Mathematical Analysis to the Theories of Electricity and Magnetism,” one of the most influential mathematical papers of all time.

Green’s paper of 1828 introduced the potential function, which is well known to students of physics. He also proved a form of Green’s theorem from advanced calculus in this paper. Finally, he introduced the notion of a Green’s function that, in one form or another, is familiar to students of engineering, and is the theme for this post. By sheer chance, Sir Edward Bromhead, a founder of the Analytical Society, purchased and read a copy of Green’s paper. With his encouragement, Green entered Gonville and Caius College in Cambridge University at the age of forty, and eventually became a fellow of the college. He continued to publish papers until his untimely death in 1841, possibly due to lung complications arising from his work at the corn mill. Sadly, recognition for his mathematical work had to wait until 1993, when a plaque was dedicated to his memory in Westminster Abbey. Today, the Green’s Mill and Science Centre in Nottingham carries on the work of promoting George Green’s reputation as one of the greatest scientists of his age.

I will now give an introduction to GreenFunction using concrete examples from electrical circuits, ordinary differential equations, and partial differential equations.

The basic principle underlying a Green’s function is that, in order to understand the response of a system to arbitrary external forces, it is sufficient to understand the system’s response to an impulsive force of the DiracDelta type.

As an illustration of the above principle, consider a circuit that is composed of a resistor R and an inductor L, and is driven by a time-dependent voltage v[t], as shown below:

Circuit that is composed of a resistor R and an inductor L, and is driven by a time-dependent voltage v[t]

The current i[t] in the circuit can then be computed by solving the differential equation:

L i'(t) + R i(t) == v(t)

Let’s assume that the voltage source is a battery supplying a unit voltage. Next, suppose that you close the switch S for a fleeting moment at time t = s and then quickly throw it open again. The current induced in the circuit by this impulsive action can be computed by applying GreenFunction to the left-hand side of the above differential equation:

Applying GreenFunction to the left-hand side of the differential equation

The initial value of the current is assumed to be zero, since the switch was open until time t = s:

Initial value of the current is assumed to be zero

Here is the result given by GreenFunction for this example:

The result given by GreenFunction

The following plot for s = 1 shows that the current is 0 for all times t < 1, then rises instantaneously to its peak value at t = 1, and finally decreases to 0 with the passage of time:

Plotting the results

The behavior of the circuit in the above situation is usually called its impulse response, since it represents the response of the circuit to an impulsive voltage.
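For reference, the impulse response of this circuit can also be worked out by hand. The following is a sketch of the standard derivation, assuming constant R > 0 and L > 0; the output of GreenFunction may differ in notation. For t > s the impulse is over, so the current obeys the homogeneous equation and decays exponentially, while integrating the equation across t = s shows that the current jumps from 0 to 1/L:

```latex
L\,\partial_t G(t,s) + R\,G(t,s) = \delta(t-s), \qquad G(t,s) = 0 \ \text{for } t < s
\quad\Longrightarrow\quad
G(t,s) = \frac{1}{L}\, e^{-R\,(t-s)/L}\, \theta(t-s)
```

Here θ is the unit step (HeavisideTheta). This agrees with the behavior described for the plot: zero before t = s, an instantaneous rise to the peak value 1/L at t = s, and decay to 0 afterward.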

Next, suppose that you close the switch at time t = 0 and leave it closed at all later times. Thus the voltage steps up from its initial value 0 to a constant value 1, and can be modeled using the HeavisideTheta function:

Using the HeavisideTheta function

The step voltage can be visualized as follows:

Visualizing the step voltage

You can now compute the current in the circuit by performing the following integral involving the voltage and the Green’s function:

Computing the current in the circuit by performing the integral

The integral computed above is essentially a weighted sum of the Green’s function with the voltage source at all times s prior to a given time t, and is called a convolution integral.

As the plot below shows, the current for the step voltage source gradually increases from its value 0 at t = 0 to a steady-state value:

Plot showing the current for the step voltage source

The behavior of the circuit in the above situation is usually called its step response, since it represents the response of the circuit to a step voltage.
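The step response can also be checked numerically outside the Wolfram Language. The following Python sketch (standard library only) approximates the convolution integral with a midpoint rule and compares it with the closed form (1 - e^(-R t/L))/R. The function names and the values R = 2 and L = 0.5 are illustrative assumptions, not taken from the post.

```python
import math

def green_rl(t, s, R, L):
    """Impulse response of L i'(t) + R i(t) = v(t), with zero current before t = s."""
    if t < s:
        return 0.0
    return math.exp(-R * (t - s) / L) / L

def current(v, t, R, L, n=2000):
    """Approximate i(t) = integral from 0 to t of G(t, s) v(s) ds (midpoint rule)."""
    if t <= 0:
        return 0.0
    h = t / n
    return h * sum(green_rl(t, (k + 0.5) * h, R, L) * v((k + 0.5) * h)
                   for k in range(n))

def step_voltage(s):
    """Unit step voltage: the switch is closed at time 0 and stays closed."""
    return 1.0 if s >= 0 else 0.0

R, L = 2.0, 0.5                                   # assumed component values
i_num = current(step_voltage, 1.5, R, L)          # numerical convolution
i_exact = (1.0 - math.exp(-R * 1.5 / L)) / R      # closed-form step response
```

The two values should agree closely, and for large t both approach the steady-state current 1/R, which is the plateau described above.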

Finally, suppose that the voltage source supplies an alternating voltage—for example:

Voltage source supplying an alternating voltage

You can once again compute the current in the circuit by performing a convolution integral of the voltage with the Green’s function, as shown below:

Computing the current in the circuit

You can also obtain the result using DSolveValue as follows:

Obtaining the result using DSolveValue

As the plot below shows, the current settles down to a steady alternating pattern for large values of the time:

Plotting the results

To summarize, the Green’s function encodes all the information that is required to study the response of the circuit to any external voltage. This magical property makes it an indispensable tool for studying a wide variety of physical systems.

The two-step procedure for solving the differential equation associated with a circuit, which I discussed above, can be applied to any linear ordinary differential equation (ODE) with a forcing term on its right-hand side and homogeneous (zero) initial or boundary conditions. For example, suppose you wish to solve the following second-order differential equation:

Solving a second-order differential equation

Assume that the forcing term is given by:

Forcing term

Also, suppose that you are given homogeneous boundary conditions on the interval [0,1]:

Homogeneous boundary conditions on the interval [0,1]

As a first step in solving the problem, you compute the Green’s function for the corresponding differential operator (left-hand side) of the equation:

Computing the Green's function for the corresponding differential operator

The following plot shows the Green’s function for different values of y lying between 0 and 1. Each instance of the function satisfies the zero boundary conditions at both ends of the interval:

Plot showing Green's function for different values of y lying between 0 and 1

You can now compute the solution of the original differential equation with the given forcing term using a convolution integral on the interval [0,1], as shown below:

Computing the solution of the original differential equation with the given forcing term using a convolution integral on the interval [0,1]

Here is a plot of the solution, which shows that it satisfies the homogeneous boundary conditions for different values of the parameter a:

Plotting of the solution
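The same two-step procedure can be illustrated numerically on a simpler boundary value problem than the one in the post. The Python sketch below assumes the operator -u''(x) on [0,1] with u(0) = u(1) = 0, whose Green's function is the textbook expression x(1 - y) for x ≤ y and y(1 - x) otherwise, and a constant forcing term f = 1. These choices and all names are illustrative assumptions; the post's own operator and forcing term are different.

```python
def green_dirichlet(x, y):
    """Green's function of -u''(x) on [0, 1] with u(0) = u(1) = 0."""
    return x * (1.0 - y) if x <= y else y * (1.0 - x)

def solve_bvp(f, x, n=1000):
    """Approximate u(x) = integral from 0 to 1 of G(x, y) f(y) dy (midpoint rule)."""
    h = 1.0 / n
    return h * sum(green_dirichlet(x, (j + 0.5) * h) * f((j + 0.5) * h)
                   for j in range(n))

def unit_forcing(y):
    """Constant forcing term f(y) = 1."""
    return 1.0

u_num = solve_bvp(unit_forcing, 0.3)
u_exact = 0.3 * (1.0 - 0.3) / 2.0   # exact solution of -u'' = 1 is x (1 - x) / 2
```

Because the Green's function vanishes at both ends of the interval, the computed solution satisfies the homogeneous boundary conditions automatically.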

Green’s functions also play an important role in the study of partial differential equations (PDEs). For example, consider the wave equation that describes the propagation of signals with finite speed, and that I discussed in an earlier post. In order to compute the Green’s function for this equation in one spatial dimension, use the wave operator (left-hand side of the wave equation), which is given by:

In order to compute the Green's function for this equation in one spatial dimension, use the wave operator

Here, x denotes the spatial coordinate that ranges over (-∞,∞), t denotes the time that always ranges over [0,∞), and u[x,t] gives the displacement of the wave at any position and time.

You can now find the Green’s function for the wave operator as follows:

Finding Green's function for the wave operator

The following plot of the Green’s function shows that it becomes 0 outside a certain triangular region in the x-t plane, for any choice of y and s (I have chosen both these values to be 0). This behavior is consistent with the fact that the wave propagates with a finite speed, and hence signals sent at any time can only influence a limited region of space at any later time:

3D plot
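For the standard one-dimensional wave operator with wave speed c (an assumed form, since the operator in the post is shown only as an image), the causal Green's function has a well-known closed form, sketched here for reference:

```latex
\left(\partial_t^2 - c^2\,\partial_x^2\right) G(x,t;y,s) = \delta(x-y)\,\delta(t-s)
\quad\Longrightarrow\quad
G(x,t;y,s) = \frac{1}{2c}\,\theta\!\bigl(c\,(t-s) - |x-y|\bigr)
```

The step function θ is nonzero only where |x - y| < c (t - s), which is the triangular region in the x-t plane: a signal sent from position y at time s can only have reached points within a distance c (t - s) by time t.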

The Green’s function obtained above can be used to solve the wave equation with any forcing term, assuming that the initial displacement and velocity of the wave are both zero. For example, suppose that the forcing term is given by:

The Green's function obtained can be used to solve the wave equation with any forcing term

You can solve the wave equation with this forcing term by evaluating the convolution integral Wave equation

Solving the wave equation

The following plot shows the standing wave generated by the solution:

Plot showing the standing wave generated by the solution

Finally, I note that the same solution can be obtained by using DSolveValue with homogeneous initial conditions, as shown below:

Solution obtained by using DSolveValue with homogeneous initial conditions

Green’s functions of the above type are called fundamental solutions and play an important role in the modern theory of linear partial differential equations. In fact, they provided the motivation for the theory of distributions that was developed by Laurent Schwartz in the late 1940s.

The ideas put forward by George Green in his paper of 1828 are stunning in their depth and simplicity, and reveal a first-rate mind that was far ahead of the times during which he lived. I have found it very inspiring to study the life and work of this great mathematician while implementing GreenFunction for Version 10.4 of the Wolfram Language.

Download this post as a Computable Document Format (CDF) file.

http://blog.wolfram.com/2016/03/31/new-in-the-wolfram-language-greenfunction-and-applications-in-electricity-odes-and-pdes/feed/ 8
Ready? Review. Register: The 2016 Wolfram Technology Conference Is on the Way! http://blog.wolfram.com/2016/03/25/ready-review-register-the-2016-wolfram-technology-conference-is-on-the-way/ http://blog.wolfram.com/2016/03/25/ready-review-register-the-2016-wolfram-technology-conference-is-on-the-way/#comments Fri, 25 Mar 2016 14:42:37 +0000 Wolfram Blog Team http://blog.internal.wolfram.com/?p=30277 Mark your calendars now for the 2016 Wolfram Technology Conference! Join us October 18–21 at Wolfram headquarters in Champaign, Illinois, where we’ll be getting things off to an exciting start with a keynote address by Wolfram founder and CEO Stephen Wolfram on Tuesday, October 18 at 5pm.

Our conference gives developers and colleagues a rare opportunity for face-to-face discussion of the latest developments and features for cloud computing, interactive deployment, mobile devices, and more. Arrive early for pre-conference training opportunities, and come ready to participate in hands-on workshops, nonstop networking opportunities, and the Wolfram Language One-Liner Competition, just to name a few activities.

We are also looking for users to share their own stories and interests! Submit your presentation proposal by July 15 for full consideration. Last year’s lineup included everything from political data science to winning hackathon solutions to programming in the Wolfram Cloud… and literally almost everything in between. Review a sampling of the 2015 talks below, or visit our website for more.

Commanding the Wolfram Cloud—Todd Gayley

Computational Politics: The Wolfram Data Drop Meets Election 2016—Evan Ott

Genealogy with the Wolfram Language—Robert Nachbar

Infusing STEM Education with Coding and Discovery: An Open Platform for Publishing Modern Curriculum—Kyle Keane

Valuation Navigator and the Emergence of Real Estate Valuation 3.0—Shashi Rivankar

Ready to see it all firsthand this year? Our conference spans the broadest, most diverse group of tech folks we’ve ever seen—from enthusiastic high schoolers to commercial executives, professors to retirees, and experts in education, commercial law, physics, optics, math, engineering, and so very much more! Register today for this year’s conference to reserve your spot!

http://blog.wolfram.com/2016/03/25/ready-review-register-the-2016-wolfram-technology-conference-is-on-the-way/feed/ 0
Wolfram Community Highlights: LEGO, SCOTUS, Minecraft, and More! http://blog.wolfram.com/2016/03/18/wolfram-community-highlights-lego-scotus-minecraft-and-more/ http://blog.wolfram.com/2016/03/18/wolfram-community-highlights-lego-scotus-minecraft-and-more/#comments Fri, 18 Mar 2016 14:43:29 +0000 Emily Suess http://blog.internal.wolfram.com/?p=30241 Wolfram Community members continue to amaze us. Take a look at a few of the fun and clever ideas shared by our members in the first part of 2016.

How to LEGO-fy Your Plots and 3D Models, by Sander Huisman

LEGO-fied 3D models

This marvel by Sander Huisman, a postdoc from École Normale Supérieure de Lyon, attracted more than 6,000 views in one day and was trending on Reddit, Hacker News, and other social media channels. Huisman’s code iteratively covers layers with bricks of increasingly smaller sizes, alternating in the horizontal x and y directions. Read the full post to see how to turn your own plots, 3D scans, and models into brick-shaped masterpieces.

Supreme Court Ideological Data, by Alan Joyce

Visualizations of Supreme Court decisions

Wolfram’s own Alan Joyce was inspired by a recent New York Times article to use the Wolfram Language to explore Supreme Court ideological data and Martin–Quinn scores. While he leaves you to draw your own political conclusions, his visualizations will help you see the Supreme Court’s decisions in a new way. Get started on your own analysis and join the conversation by grabbing the cleaned-up dataset at the end of his Community post.

Implementing Minecraft in the Wolfram Language, by Boris Faleichik

Implementing Minecraft in the Wolfram Language

Fans of Minecraft are going to love this one. With some amazingly compact code, Boris Faleichik, a professor from Belarusian State University and past Wolfram One-Liner Competition winner, shows how the Wolfram Language handles Minecraft’s classic game functionality. Have an idea for an improvement? Visit the post on Community and leave a comment!

Find Your Species Name on Darwin’s Birthday, by Jofre Espigule

Find what species shares your name

To celebrate Darwin’s February 12 birthday, Brainterstellar cofounder Jofre Espigule wrote an app to help you find out if there’s a species that shares your name. It works using the Wolfram Language’s built-in species data. Read the full post to see how Espigule split each scientific name into two words, used the Nearest function to find the species name closest to a given name, and deployed his app to the Wolfram Cloud.

Using Mathematica to See the World in a Different Light, by Marco Thiel

Using Mathematica to See the World in a Different Light

Marco Thiel from the University of Aberdeen celebrated the United Nations’ Year of Light global initiative with an article on how the Wolfram Language, its wealth of data, and connected devices can be used to keep the Year of Light alive at your home. Part 1 explores how spectra enable us to “see the world in a different light.”

Internet of Things (IoT): Controlling an RGB LED with the Wolfram Cloud, by Armeen Mahdian

Creating IoT applications

Thirteen-year-old Armeen Mahdian’s first post on Wolfram Community caught our attention too. He shared how the Wolfram Cloud can be used in conjunction with an embedded Linux device to create IoT applications. Read his full post to see how he used a BeagleBone Black (BBB) and its IO ports to control an RGB LED using the cloud. Don’t miss Mahdian’s other post on PWM pins.

Cops and Robbers (and Zombies and Humans), by Brian Weinstein

Cops and Robbers (and Zombies and Humans)

Brian Weinstein, data analyst and grad student at Columbia, uses the Wolfram Language to create mathematical pursuit-evasion games. In these games, the goal is to determine how many pursuers are required to capture a given number of evaders. The GIFs he created show two fun versions—Cops and Robbers and Zombies and Humans.

Visit Wolfram Community to join in on these and other interesting discussions and browse the complete list of Staff Picks. Or share and test your own code, ideas, and apps with Community’s more than 11,000 members.

http://blog.wolfram.com/2016/03/18/wolfram-community-highlights-lego-scotus-minecraft-and-more/feed/ 0
Pi Day Discounts on Mathematica and Wolfram|Alpha http://blog.wolfram.com/2016/03/14/pi-day-discounts-on-mathematica-and-wolframalpha/ http://blog.wolfram.com/2016/03/14/pi-day-discounts-on-mathematica-and-wolframalpha/#comments Mon, 14 Mar 2016 14:31:21 +0000 Hy Nguyen http://blog.internal.wolfram.com/?p=30203 Pi Day is celebrated on March 14 (3.14) every year to properly recognize the constant pi (π ≈ 3.14159), the ratio of the circumference of a circle to its diameter. At Wolfram, π plays an important part in every one of our products, allowing users to do everything from getting the basic area of a circle to rendering a π symbol filled with the digits of π. On Pi Day last year (aka the Pi Day of the Century), the folks at SXSW got a very special treat from us in the name of π. This year, we decided to bring the celebration to you by offering exclusive discounts on Mathematica. Get 15% off Mathematica Home Edition and 25% or more off Mathematica Student Edition in select territories, including North and South America, Australia, and parts of Asia and Africa. Regardless of where you are, you can still celebrate with us by finding your Pi Day.

Pi Day Savings

This offer extends into this April, Mathematics Awareness Month, which we’re also kicking off today, Monday, March 14. First founded in 1986 as Mathematics Awareness Week, Mathematics Awareness Month aims to increase the public understanding of and appreciation for mathematics and its applications. In honor of this year’s theme, “The Future of Prediction,” we will be offering 20% off subscriptions to Wolfram|Alpha Pro starting today and ending April 30, 2016. With Pro you’ll be able to freely explore the realm of mathematics, get your “What’s next for math?” questions answered, and see how mathematics can make accurate predictions possible in any related field. Visit our website and use the promo code MATHMONTH20OFF to take advantage of this special discount.

http://blog.wolfram.com/2016/03/14/pi-day-discounts-on-mathematica-and-wolframalpha/feed/ 5
Bring Balance to Your Work with Wolfram SystemModeler http://blog.wolfram.com/2016/03/07/balancing-rotating-machinery-with-wolfram-systemmodeler/ http://blog.wolfram.com/2016/03/07/balancing-rotating-machinery-with-wolfram-systemmodeler/#comments Mon, 07 Mar 2016 16:58:12 +0000 Håkan Wettergren http://blog.internal.wolfram.com/?p=30146 One of the most common causes for vibrations in mechanical systems is imbalance in the rotating parts of a machine. Much effort has therefore gone into developing methods and devices for balancing rotating machines.

Balance is a requirement for many types of rotating machinery, such as electric motors, pumps, fans, turbines, generators, centrifugal compressors, and propellers. Many people are familiar with balancing from their car wheels. If these systems are not properly balanced, the vibration causes not only reduced efficiency and component fatigue but also disturbances to the surroundings, such as vibration and noise. The most common methods for balancing rotating machinery are the influence coefficient method and the modal balancing method. Car wheel balancing, for instance, is a special case of the influence coefficient method.

In this post, Wolfram SystemModeler is used for modeling the rotor, and the Wolfram Language for the evaluation of the results. The workflow shows how powerful it is to combine these two tools.

A disc with mass m is mounted on a shaft with stiffness k. The rotor rotates with the angular velocity Ω. The disc has an imbalance u, measured in kg·m.

Wolfram SystemModeler modeling a rotor

The deflection of the shaft from its rest position will be $\delta = \frac{u\,\Omega^2}{k - m\,\Omega^2}$.

A resonance occurs at $\Omega = \sqrt{k/m}$. To eliminate the vibration, all you have to do is mount an equal imbalance opposite the existing one. In reality, however, it is not that simple. There may be more than one disc, and it is often not possible to put the balancing weight at an arbitrary position. You only have certain axial positions, called balancing planes, to work with. In a generator or gas turbine, for instance, you cannot open up the system and mount weights inside a closed compartment; you most likely need to put the balancing weight close to the bearings. A more realistic example is shown in the first film. It consists of a shaft that carries a flywheel and a gear, along with two smaller discs for balancing. Both the discs and flywheel have an imbalance. Resonance occurs at close to 25 seconds into the film. (Note that the deflections have been scaled 10x.)
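As a numerical illustration of the single-disc case, the Python sketch below evaluates the standard undamped steady-state deflection u Ω²/(k - m Ω²) at three speeds. The mass, stiffness, and imbalance values are hypothetical and chosen only for illustration; they are not the values used in the SystemModeler model.

```python
import math

def shaft_deflection(u, k, m, omega):
    """Undamped steady-state deflection u*omega^2 / (k - m*omega^2), in meters."""
    return u * omega**2 / (k - m * omega**2)

# Hypothetical values, chosen only for illustration
m = 10.0      # disc mass, kg
k = 2.5e5     # shaft stiffness, N/m
u = 1.0e-3    # imbalance, kg*m

omega_res = math.sqrt(k / m)                            # resonance speed, rad/s
low = shaft_deflection(u, k, m, 0.1 * omega_res)        # well below resonance
near = shaft_deflection(u, k, m, 0.99 * omega_res)      # just below resonance
high = shaft_deflection(u, k, m, 10.0 * omega_res)      # well above resonance
```

The deflection is small at low speed, grows sharply near the resonance, and far above the resonance it changes sign and approaches -u/m. In a real rotor, damping keeps the deflection at resonance finite.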

Influence coefficient method, basic equations

In film one, it is not possible to place an equal imbalance opposite the existing one in order to balance the rotor, for two reasons. The first is that the imbalance is not known. The second is that it is not possible to put extra masses on the disc and flywheel. This is a very common situation in real applications. Most parts of a rotor are typically not reachable after mounting. Instead, the balancing engineer needs to work with balancing planes that are normally close to the bearings.

In the example, both the discs and flywheel have an imbalance. This is illustrated with a mass on the outer diameter and denoted as v1 and v2 in the figure below. Neither their sizes nor their positions are known. There are also two balancing planes, u1 and u2. With two balancing planes, it is possible to correct both a static and a dynamic imbalance:

Illustration with a mass on the outer diameter, denoted as v1 and v2

Now, the influence coefficient method is rather straightforward. The vibrations are measured in two different locations, v1 and v2. In this example we measure directly on the discs, but it is more realistic to measure on the bearings. The aim here is to show how the deflections of the discs can be reduced. The vibrations can be measured with displacements, velocities, or acceleration. For the basic principle, it doesn’t matter which one of the measuring methods is used, but in reality the accuracy of the results is dependent on the measuring method. For higher frequencies, measuring acceleration is preferred; for lower frequencies, velocities or even displacements may be a better choice.

The imbalances u1 and u2 are known weights mounted on a measured position (radius and angle).

Both u and v are consequently complex variables, which means that they describe both amplitude and phase. Assuming that the system is linear, the vibration of the rotor can then be described as:

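$v_1 = r_{11} u_1 + r_{12} u_2 + v_{10}$

$v_2 = r_{21} u_1 + r_{22} u_2 + v_{20}$

where v10 and v20 are the initial vibrations. In matrix form:

$\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + \begin{pmatrix} v_{10} \\ v_{20} \end{pmatrix}$

or, more compactly,

$v = R u + v_0,$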

where R is the receptance of the system. The aim of the balancing is, of course, to eliminate or at least minimize the vibrations v1 and v2.

The procedure is as follows.

1) Run the system without the balancing weights, i.e. u1 = u2 = 0. The measurement will give v10 and v20.

2) Apply a test weight at one of the balancing planes, for instance $u_{1t}$ at the first plane. The size and direction don’t matter. The measurement now gives v11 and v21:

$v_{11} = r_{11} u_{1t} + v_{10}$
$v_{21} = r_{21} u_{1t} + v_{20},$

which gives

$r_{11} = \frac{v_{11} - v_{10}}{u_{1t}}$ and $r_{21} = \frac{v_{21} - v_{20}}{u_{1t}}$

3) Remove $u_{1t}$ and apply a new test weight $u_{2t}$ at the second balancing plane. In the same way as above:

$r_{12} = \frac{v_{12} - v_{10}}{u_{2t}}$ and $r_{22} = \frac{v_{22} - v_{20}}{u_{2t}}$

4) We now know R and v0. We want the vibration v = 0. This can, at least in theory, easily be achieved by choosing

$u = -R^{-1} v_0$

Apply these balancing weights, and the vibration will be zero.
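The four steps above can be sketched on a synthetic linear system. In the Python example below (standard library only), a hidden receptance matrix and hidden initial vibrations play the role of the rotor, and the procedure recovers the balancing weights from three "measurements." The numerical values and all function names are made up for illustration; this is not the Wolfram Language code used in the post.

```python
def solve2(R, b):
    """Solve the 2x2 complex system R x = b using Cramer's rule."""
    (r11, r12), (r21, r22) = R
    det = r11 * r22 - r12 * r21
    return ((r22 * b[0] - r12 * b[1]) / det,
            (r11 * b[1] - r21 * b[0]) / det)

# Hidden "true" system (hypothetical values); the procedure never reads these directly.
R_TRUE = ((2.0e-3 + 1.0e-3j, 5.0e-4 - 2.0e-4j),
          (4.0e-4 + 3.0e-4j, 1.5e-3 - 1.0e-3j))
V0_TRUE = (3.0e-4 + 1.0e-4j, -2.0e-4 + 4.0e-4j)

def measure(u1, u2):
    """Simulated measurement of the complex vibrations (v1, v2) for weights (u1, u2)."""
    v1 = R_TRUE[0][0] * u1 + R_TRUE[0][1] * u2 + V0_TRUE[0]
    v2 = R_TRUE[1][0] * u1 + R_TRUE[1][1] * u2 + V0_TRUE[1]
    return v1, v2

# Step 1: run without balancing weights
v10, v20 = measure(0, 0)

# Step 2: test weight at plane 1 only
u1t = 0.1
v11, v21 = measure(u1t, 0)
r11, r21 = (v11 - v10) / u1t, (v21 - v20) / u1t

# Step 3: test weight at plane 2 only
u2t = 0.1
v12, v22 = measure(0, u2t)
r12, r22 = (v12 - v10) / u2t, (v22 - v20) / u2t

# Step 4: choose u = -R^-1 v0 and check the remaining vibration
R_est = ((r11, r12), (r21, r22))
u1, u2 = solve2(R_est, (-v10, -v20))
residual = measure(u1, u2)
```

In this idealized linear model the residual vibration vanishes up to rounding error. With a real rotor, measurement noise and nonlinearities leave a nonzero residual.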

Wolfram SystemModeler model

The SystemModeler model is built up with standard components and an Euler–Bernoulli beam. The Euler–Bernoulli beam theory does not take into account shear deformation and rotational inertia effects, making it suitable for describing the behavior of long beams. This is typical when the length of the shaft is three or more times the size of the diameter. For shorter beams, the Timoshenko beam theory is more accurate. The difference in this case is one or a few percent on eigenfrequencies, and less for deflections. In this application, we use 16 beam elements. We need to have a component for external damping close to the mass, since we have used “pinned” support. In reality, external damping will include the bearings:

SystemModeler model is built up with standard components and an Euler-Bernoulli beam

The deflection (v1 and v2) of the flywheel (disc 1) and a gear (disc 2) during a startup from 0 to 40 Hz can be seen in the plot below. A resonance close to 25 Hz (= 1500 rpm) is noted. The high vibration could also be seen in film one:

The deflection (v1 and v2) of the flywheel (disc 1) and a gear (disc 2) during a startup from 0 to 40 Hz

The aim now is to reduce this vibration as much as possible with the influence coefficient method, and to do this we combine the Wolfram Language and SystemModeler.

Wolfram Language code

Initialize the link between the Wolfram Language and SystemModeler. Set up the working directory and choose the correct model:

Setting up a working directory

Run three different simulations for 40 seconds: the first one without any added test imbalance; the second with 2·0.05 kg m^2 at imbalance plane #1; and the third with 2·0.05 kg m^2 at imbalance plane #2. In the last two cases, the imbalance is applied at 0°:

Three different simulations for 40 seconds

Evaluate the phase and deflections for these simulations:

Evaluating the phase and deflections for these simulations

The receptance of the system can now be calculated, and after that, the optimum balancing weights and positions can be calculated:

Calculating the receptance for the system and the optimum balancing weights and positions

Finally, run the model with optimal balancing weights:

Running the model with optimal balancing weights


The rotor used in this example is nonsymmetric in the axial direction, i.e. the masses and the shaft diameters are different. If the system is symmetric, the deflection at 30 Hz after balancing will be less than 0.1% of the deflection before balancing. It is easy to check this by changing the values in the model. Given the axial asymmetry in this example, does it matter at which speed the balancing is performed? Balancing is normally performed close to the running speed. With the Wolfram Language, it’s very easy to study this. Simply plot the balancing weights and phases during the run-up:

Plotting the balancing weights and phases

As can be seen, it is hard to find balance weights when the rotor rotates close to its resonance speed. It would also be wise to wait a while after a speed is reached before measuring; in reality, you need to wait until the temperature and other conditions have stabilized. In this post, however, we will skip the waiting time. For the optimally balanced rotor, the reduction of the vibration amplitude for disc 1 will be:

vibrationReductionDisc1[t_] := deflection13[t]/deflection10[t];

And for disc 2, it will be:

vibrationReductionDisc2[t_] := deflection23[t]/deflection20[t];

Plot the result:

Plotting the result

From the above figure it can be noted that for this simple case, the vibration will be reduced to around 2%–4% of the original vibration during the run-up. The initial noise before five seconds can be ignored. The system has not yet stabilized, and the total deflection is very low.

Below, the difference can be seen more clearly. The two curves with highest vibration occur before the balancing:

Plot highlighting the differences

The main reason for balancing at the shaft’s operational running speed is that the system normally is more complex than this. For instance, systems with nonlinear supports, more than two bearings, or bent shafts all need to have a stable rotor and (when applicable) oil temperature. The conditions that give the best balancing result are those at running speed. If the speed varies, a compromise is needed. Exactly how to optimize depends on the application, but with Mathematica a statistical evaluation will be straightforward no matter what approach is chosen.

The behavior after the balancing is shown in the following video. (Note that the deflections have been scaled 10x.)


SystemModeler is a powerful tool for studying advanced problems in rotating machinery. Combined with the Wolfram Language, it gives tremendous opportunities to work with and analyze your models and results. I have shown how rotating machinery can be balanced, and how the balancing speed affects the results. The model can easily be extended to encompass everything from nonlinearities to stochastic sensor noise.

To learn more about what affects the balancing results, I recommend playing around with a model like this one. For instance, will the vibration reduce even further if the “balanced rotor” is balanced once again? If we had used the balancing weights from, say, 5 Hz, what would the vibration at 40 Hz become? How does signal noise affect the results? There are many more or less intelligent balancing methods besides the influence coefficient method and modal balancing method. Try one of those and learn how it works.

Download this post as a Computable Document Format (CDF) file.

http://blog.wolfram.com/2016/03/07/balancing-rotating-machinery-with-wolfram-systemmodeler/feed/ 0
Profiling the Eyes: ϕaithful or ROTen? Or Both? http://blog.wolfram.com/2016/03/02/profiling-the-eyes-phiaithful-or-roten-or-both/ http://blog.wolfram.com/2016/03/02/profiling-the-eyes-phiaithful-or-roten-or-both/#comments Wed, 02 Mar 2016 15:26:12 +0000 Michael Trott http://blog.internal.wolfram.com/?p=29945 An investigation of the golden ratio’s appearance in the position of human faces in paintings and photographs.

There is a vast amount of literature on the appearance of the golden ratio in nature, in physiology and psychology, and in human artifacts (see this page on the golden ratio; these articles on the golden ratio in art, in nature, and in the human body; and this paper on the structure of the creative process in science and art). In the past thirty years, there has been increasing skepticism about the prevalence of the golden ratio in these domains. Earlier studies have been revisited or redone. See, for example, Foutakis, Markowsky on Greek temples, Foster et al., Holland, Benjafield, and Svobodova et al. for human physiology.

In my last blog, I analyzed the aspect ratios of more than one million old and new paintings. Based on psychological experiments from the second half of the nineteenth century, especially by Fechner in the 1870s, one would expect many paintings to have a height-to-width ratio equal to the golden ratio or its inverse. But the large sets of paintings analyzed did not confirm such a conjecture.

While we did not find the expected prevalence of the golden ratio in external measurements of paintings, maybe looking “inside” will show signs of the golden ratio (or its inverse)?

In today’s blog, we will analyze collections of paintings, photographs, and magazine covers that feature human faces. We will also analyze where human faces appear in a few selected movies.

The literature on art history and the aesthetics of photography puts forward a theory of dividing the canvas into thirds, horizontally and vertically. And when human faces are portrayed, two concrete rules for the position of the eyeline are often mentioned:

  • the rule of thirds: the eyeline should be 2/3 (≈0.67) from the bottom
  • the golden ratio rule: the eyeline should be at 1/(golden ratio) (≈0.62) from the bottom

The rule of thirds is often abbreviated as ROT. In 1998 Frascari and Ghirardini—in the spirit of Adolf Zeising, the father of the so-called golden numberism—coined the term “ϕaithful” (making clever use of the Greek symbol ϕ that is used to denote the golden ratio) to label the unrestricted belief in the primacy of the golden ratio. Some consider the rule of thirds an approximation of the golden ratio rule; “ROT on steroids” and similar phrases are used. Various photograph-related websites contain a lot of discussion about the relation of these two rules. For early uses of the rule of thirds, see Nafisi. For the more modern use starting in the eighteenth century, see this history of the rule of thirds. For a recent human-judgment-based evaluation of the rule of thirds in paintings and photographs, see Amirshahi et al.

So because we cannot determine which rule is more common by first-principle mathematical computations, let’s again look at some data. At what height, measured from the bottom, are the eyes in paintings showing human faces?

Eyeline heights in older paintings—more ROTen than ϕaithful

Let’s start with paintings. As with the previous blog, we will use a few different data sources. We will look at four painting collections: Wikimedia, the Smithsonian, Britain’s Your Paintings, and Saatchi.

If we want to analyze the positions of faces within a painting, we must first locate the faces. The function FindFaces comes in handy. While typically used for photographs, it works pretty well on (representational) paintings too. Here are a few randomly selected paintings of people from Wikimedia. First, the images are imported and the faces located and highlighted by a yellow, translucent rectangle. We see potentially different amounts of horizontal space around a face, but the vertical extent is pretty uniform, from the chin to the hairline.

Code for analyzing the positions of faces in paintings
Ols Maria Portert van Karel I Lodewijk van de Palts Catherine Brass Yates (Mrs. Richard Yates)
Italian Girl by the Well Prince Eugène, vice-roi d'Italie Dodo und ihr Bruder

A more detailed look reveals that the eyeline is approximately at 60% of the height of the selected face area. (Note that this is approximately 1/ϕ.) To demonstrate the correctness of the 60%-of-the-face-height rule for some randomly selected images from Wikipedia, we show the resulting eyeline in red and the two lines 5% above and below it.
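The 60% rule can be written down in a few lines. The following is a minimal Python sketch, not the Wolfram Language code used for this post. It assumes a face rectangle given as ((x_min, y_min), (x_max, y_max)) in pixels with y measured upward from the bottom edge of the image; the function names are hypothetical.

```python
def eyeline_height(face_box, image_height, eye_fraction=0.6):
    """Relative eyeline height (0 = bottom, 1 = top) from a face rectangle.

    face_box is ((x_min, y_min), (x_max, y_max)) in pixels, with y measured
    upward from the bottom edge of the image (an assumed convention).
    """
    (_, y_min), (_, y_max) = face_box
    face_height = y_max - y_min
    # The eyeline sits at eye_fraction of the face height above the chin edge.
    eye_y = y_min + eye_fraction * face_height
    return eye_y / image_height


def eyeline_band(face_box, image_height, tolerance=0.05):
    """Lower and upper eyeline estimates at 60% minus/plus tolerance of the face height."""
    low = eyeline_height(face_box, image_height, 0.6 - tolerance)
    high = eyeline_height(face_box, image_height, 0.6 + tolerance)
    return low, high
```

For a face rectangle spanning y = 500 to y = 700 in an image 1,000 pixels high, the estimated eyeline lies at 620 pixels, that is, at a relative height of 0.62.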

Eyeline at 60% of height of the face shown on Barack Obama, Mao Zedong, Carl Friedrich Gauss, Hillary Clinton, Gong Li, Magdalena Neuner

Independent of gender and haircut, the 60% height seems to be a good approximation for the eyeline. Of course, not all faces that we encounter in paintings and photographs are perfectly upright. For tilted heads, the two eyes will not lie on a horizontal line. But on average, the 60% rule works well.

Tilting heads and eyeline

Overall we see that the eyeline can be located within a few percent of the vertical height of the face rectangle. The error of the resulting estimation of the eyeline height in a painting/photograph in most collections should be at most about 2% for a typical ratio of face height to painting/photograph height. Plus or minus 2% should be small enough that, for a large enough painting/photograph collection, we can discriminate the golden ratio height 1/ϕ from the rule of thirds 2/3. On the range [0,1], the distance between 1/ϕ and 2/3 is about 5%. (We leave the use of a specialized eye detection method to determine the vertical height of the eyes for a later blog.)
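The arithmetic behind this estimate is easy to check. Here is a small Python sketch (not taken from the post's notebook) that computes the gap between the two rules and propagates an assumed within-face error to the whole image. The default error of 0.05 and the face-height fraction of 0.4 in the usage note are illustrative assumptions.

```python
import math

PHI = (1 + math.sqrt(5)) / 2   # golden ratio
GOLDEN_HEIGHT = 1 / PHI        # about 0.618
THIRDS_HEIGHT = 2 / 3          # about 0.667

# Distance between the two candidate eyeline heights on the range [0, 1].
gap = THIRDS_HEIGHT - GOLDEN_HEIGHT


def eyeline_error(face_height_fraction, within_face_error=0.05):
    """Eyeline error relative to the image height.

    within_face_error is the uncertainty of the eyeline inside the face
    rectangle (as a fraction of the face height); face_height_fraction is
    the face height divided by the image height. Both are assumptions here.
    """
    return within_face_error * face_height_fraction
```

With these assumptions, a face that spans 40% of the image height gives eyeline_error(0.4) = 0.02, which is well below the gap of roughly 0.049.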

We start with images of paintings from Wikimedia.

Using the 0.6 factor for the eyeline heights, we get the following distribution of the faces identified. About 12,000 faces were found in 8,000 images. The blue curve shows the probability density of the position of the eyelines of all faces, and the red curve the faces whose bounding rectangles occupy more than 1/12 of the total area of the painting. (While somewhat arbitrary, here and in the following, we will use 1/12 as the relative face rectangle area, above which a face will be considered to be a larger part of the whole image.) We see a clear single maximum at 2/3 from the bottom, as predicted by the ROT. (The two black vertical lines are at 2/3 and 1/ϕ.)
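The split into "all faces" and "larger faces" is a simple area test. The following Python sketch illustrates it under the assumption that face rectangles are given as ((x_min, y_min), (x_max, y_max)) and images as (width, height) in pixels; the function names are hypothetical and this is not the code used for the plots.

```python
LARGE_FACE_THRESHOLD = 1 / 12  # relative area above which a face counts as large


def relative_face_area(face_box, image_size):
    """Area of the face rectangle divided by the area of the whole image."""
    (x_min, y_min), (x_max, y_max) = face_box
    width, height = image_size
    return ((x_max - x_min) * (y_max - y_min)) / (width * height)


def is_large_face(face_box, image_size, threshold=LARGE_FACE_THRESHOLD):
    """True if the face rectangle occupies more than threshold of the image."""
    return relative_face_area(face_box, image_size) > threshold
```

A 300×400 face rectangle in a 1,000×1,000 image has a relative area of 0.12 and would go into the red curve; a 200×200 rectangle (relative area 0.04) would only contribute to the blue one.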

Located eyeline across 12,000 faces in 8,000 images from Wikimedia

Because we determine the faces from potentially cropped images rather than ruler-based measurements on the actual paintings, we get some potential errors in our data. As analyzed in the last blog, these effects seem to average out and introduce final errors well under 1% for over 10,000 paintings.

Here are two heat maps: one for all faces, and the other for larger faces only. We place face-enclosing rectangles over each other, and the color indicates the fraction of all faces at a given position. One sees that human faces appear as frequently in the left half as in the right half. To allow comparisons of the face positions of paintings with different aspect ratios, the widths and heights of all paintings were rescaled to fit into a square. The centers of the faces fall nicely into the [1/ϕ,2/3] range. (The Wolfram Language code to generate the PDF and heat map plots is given below.)
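The overlay idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Wolfram Language code referred to above: face rectangles are ((x_min, y_min), (x_max, y_max)) with y measured from the bottom, each comes with its image size (width, height), and the grid resolution is a free parameter.

```python
def face_heat_map(faces, grid=10):
    """Fraction of faces covering each cell of a grid x grid unit square.

    faces is a list of (face_box, (width, height)) pairs. Row 0 of the
    result is the bottom row of the image.
    """
    counts = [[0] * grid for _ in range(grid)]
    for ((x_min, y_min), (x_max, y_max)), (width, height) in faces:
        # Rescale every image to the unit square so aspect ratios are comparable.
        u0, u1 = x_min / width, x_max / width
        v0, v1 = y_min / height, y_max / height
        for row in range(grid):
            for col in range(grid):
                cx = (col + 0.5) / grid  # cell center
                cy = (row + 0.5) / grid
                if u0 <= cx <= u1 and v0 <= cy <= v1:
                    counts[row][col] += 1
    total = len(faces)
    if total == 0:
        return counts
    return [[count / total for count in row] for row in counts]
```

Each cell then holds the fraction of all faces whose rectangle covers the cell center, which is the quantity the color encodes.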

Heat maps: one for all faces, one for larger faces only

Here is a short animation showing how the peak of the face distributions forms as more and more paintings are laid over each other.

Repeating the Wikimedia analysis with 4,000 portrait paintings from the portrait collection of the Smithsonian yields a similar result. This time, because we selected portrait paintings from the very beginning, the blue curve already shows a more localized peak.

Located eyeline in 4,000 portrait paintings from the Smithsonian

The British Your Paintings website has a much larger collection of paintings. We find 58,000 paintings with a total of 76,000 faces.

Located eyeline of 76,000 faces in 58,000 paintings in the British Your Paintings

The mean and standard deviation for all eyeline heights are 0.64±0.19, and the median is 0.69.

In the eyeline position/relative face size plane, we obtain the following distribution showing that larger faces are, on average, positioned lower. Even for very small relative face sizes, the most common eyeline height is between 1/ϕ and 2/3.

Eyeline position/relative face size plane

The last image also begs for a plot of the PDF of the relative size of the faces in a painting. The mean area of a face rectangle is 3.9% of the whole painting area, with a standard deviation of 5.5%.

Relative size of the faces in a painting

Here is the corresponding cumulative distribution of all eyeline positions of faces larger than a given relative size. The two planes in the yz plane are at 1/ϕ and 2/3.

Cumulative distribution of all eyeline positions of faces larger than a given relative size

Did the fraction of paintings obeying the ROT or ϕ change over time? Looking at the data, the answer is no. For instance, here is the distribution of the eyeline heights for all nineteenth- and twentieth-century paintings from our dataset. (There are some claims that even Stone Age paintings already took the ROT into account.)

Eyeline heights for all nineteenth- and twentieth-century paintings

As paintings often contain more than one person, we repeat the analysis with the paintings that just have a single face. Now we see a broader maximum that spans the range from 1/ϕ to 2/3.

Eyeline heights in paintings that have a single face

Looking at the binned rather than the smoothed data in the range of the global maximum, we see two well-resolved maxima: one according to the ROT and one according to the golden ratio.

Binned data for eyeline heights

Now that we have gone through all the work to locate the faces, we might as well do something with them. For instance, we could superimpose them. And as a result, here is the average face from 11,000 large faces from nineteenth-century British paintings. The superimposed images of thousands of faces also give us some confidence in the robustness and quality of the face extraction process.
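Superimposing faces amounts to a pixel-wise mean. The sketch below shows the idea in plain Python, assuming that the face crops have already been resized to a common size and converted to grayscale values stored as nested lists; it is an illustration, not the code that produced the image below.

```python
def average_image(images):
    """Pixel-wise mean of equally sized grayscale images (nested lists of numbers)."""
    if not images:
        raise ValueError("need at least one image")
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(image[r][c] for image in images) / len(images) for c in range(cols)]
        for r in range(rows)
    ]
```

Features that line up across many crops (eyes, nose, mouth) survive the averaging, while details that vary from painting to painting blur out, which is why a clean average face is a useful sanity check of the face extraction.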

Average face from 11,000 large faces from nineteenth-century British paintings

Given a face from a nineteenth-century painting, which (famous) living person looks similar? Using Classify["NotablePerson",…], we can quickly find some unexpected facial similarities of living celebrities to people shown in older British paintings. The function findSimilarNotablePerson takes as the argument the abbreviated URL of a page from the Your Paintings website, imports the painting, extracts the face, and then finds the most similar notable person from the built-in database.

Using functions Classify, NotablePerson, findSimilarNotablePerson matching nineteenth-century faces with current living celebrities

Bob Dylan and Charles Kemble

William Shatner and Reverend William Morris

Mr. T and Sancho Panza

Here is a Demonstration that shows a few more similar pairs (please see the attached notebook to look through the different pairings).

Demonstration with similar pairs

The eyeline heights in newer paintings—more ϕaithful than ROTen

Now let us look at some more modern paintings. We find 15,000 modern portraits at Saatchi. Faces in modern portraits can look quite abstract, but FindFaces still is able to locate a fair number of them. Here are some concrete examples.

Using FindFaces to locate faces in modern portraits at Saatchi
sans titre The portraitist an ordinary person 19

In mozaik / One of us Model Jeanine

Dive into the Question #11 Eden PORTRAIT OF ANTON AT THE AGE OF 10

And here is an array of 144 randomly selected faces in modern art paintings. From a distance, one recognizes human faces, but deviations due to stylistic differences become less visible.

Array of 144 randomly selected faces in modern art paintings

If we again superimpose all faces, we get a quite normal-looking human face. Compared to the nineteenth-century British paintings, the overall face has more female characteristics (e.g. a softer jawline and fuller lips). The fact that the average face looks quite “normal” is surprising when looking at the above 12×12 matrix of faces.

Faces from modern paintings superimposed

If we do not simply add all color values but instead combine them with random positive and negative weights, we get much more modern-art-like average faces.

Adding all color values and random positive and negative weights

Now concerning the main question of this blog: what are the face positions in these modern portraits? It turns out that they follow the golden ratio much more frequently than the ROT. About 30% more paintings have the eyeline at 1/ϕ±1% compared to 2/3±1%.
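A comparison of this kind only needs two counts. Here is a minimal Python sketch, assuming a plain list of relative eyeline heights; the function names are hypothetical and the numbers in the usage note are made up for illustration, not taken from the Saatchi data.

```python
import math

GOLDEN_HEIGHT = 2 / (1 + math.sqrt(5))  # 1/phi, about 0.618
THIRDS_HEIGHT = 2 / 3


def count_near(heights, target, tolerance=0.01):
    """Number of eyeline heights within target plus or minus tolerance."""
    return sum(1 for h in heights if abs(h - target) <= tolerance)


def golden_to_thirds_ratio(heights, tolerance=0.01):
    """Ratio of faces near 1/phi to faces near 2/3."""
    near_thirds = count_near(heights, THIRDS_HEIGHT, tolerance)
    if near_thirds == 0:
        raise ValueError("no eyelines near 2/3")
    return count_near(heights, GOLDEN_HEIGHT, tolerance) / near_thirds
```

For the made-up list [0.615, 0.62, 0.625, 0.66, 0.67, 0.5], three heights fall near 1/ϕ and two near 2/3, so the ratio is 1.5. A ratio of about 1.3 would correspond to the "30% more" statement above.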

Face positioning in modern portraits

The mean and standard deviation for all eyeline heights are 0.60±0.16, and the median is 0.62. This is a clearly lower-centered and narrower distribution.

And if we plot the PDF of the eyeline height versus the relative face size, we clearly see a sweet spot at eyeline height 2/3 and relative face area 1/5. Smaller faces with relative size of about 5% occur higher, at eyeline height about 3/4.

Eyeline height versus relative face size in modern paintings

And here is again the corresponding 3D graphic that shows the 1/ϕ eyeline height for larger relative faces is quite pronounced.

3D graphic 1/ϕ eyeline height for larger relative faces

We should check with another data source to confirm that more modern paintings have a more ϕaithful eyeline. The site Fine Art America offers thousands of modern paintings of celebrities. Here is the average of 5,000 such celebrity paintings (equal amounts politicians, actors and actresses, musicians, and athletes). Again we clearly see the maximum of the PDF at 1/ϕ rather than at 2/3.

5,000 celebrity paintings from Fine Art America

For individual celebrities, the distribution might be different. Here is a small piece of code that uses some functions defined in the last section to analyze portrait paintings of individual persons.

Code used to analyze portrait paintings of individual persons

Here are some examples. (We used about 150 paintings per person.)

Jimi Hendrix

Mick Jagger

Perhaps unexpectedly, Jimi Hendrix is nearly perfectly ϕaithful, while Mick Jagger seems perfectly ROTen. Obama and Jesus obey nearly exactly the rule of thirds in its classic form.



The eyeline heights in photographs by professional photographers

Now, for comparison to the eyeline positions in paintings, let us look at some sets of photographs and determine the positions of the faces in these. Let’s start with professional portrait photographs. The Getty Image collection is a premier collection of good photographs. In contrast to the paintings, the maximum for large faces is much closer to 2/3 (ROT) than to 1/ϕ for a random selection of 200,000 portrait photographs.

Eyeline positions in photographs from Getty Image collection

And here is again the distribution in the eyeline height/relative face size plane. For very large relative face sizes, the most common eyeline height even drops below 1/ϕ.

Distribution in the eyeline height/relative face size plane for Getty images

And here is the corresponding heat map arising from overlaying 300,000 head rectangles.

Heat map arising from overlaying 300,000 head rectangles

So what about other photographs, those aesthetically less perfect than Getty Images? The Shutterstock website has many photos. Selecting photos with subjects of various tags, we quite robustly (meaning independent of the concrete tags) see the maximum of the eyeline height PDF near 2/3. This time, we display the results for portraits showing groups of identically tagged people.

These are the eyeline height distributions and the average faces of 100,000 male and female portraits. (The relatively narrow peak in the twin-peak structure of the distribution between 0.5 and 0.55 comes from photos that are close-up headshots that don’t show the entire face.)

Eyeline height distributions and the average faces of 100,000 male and female portraits

Restricting the photograph selection even more, e.g. to over 10,000 photographs of persons tagged with nerd or beard, again shows ROTen-ness.

Eyeline height distributions and the average faces of over 10,000 photographs of persons tagged with nerd or beard

The next two rows show photos tagged with happy or sad.

Eyeline height distributions and the average faces of photographs tagged with happy or sad

All of the last six tag types (male, female, nerd, beard, happy, sad) of photographs show a remarkable robustness of the position of the eyeline maximum. It is always in the interval [1/ϕ,2/3], with a trend toward 2/3 (ROT).

But where are the babies (the baby eyeline, to be precise)? The two peaks are now even more pronounced, with the first peak even bigger than the second—the reason being that many more baby pictures are just close-ups of the baby’s whole face.

Eyeline height on photographs of babies

Next we’ll have a look at the eyeline height PDFs for two professional photographers: Peggy Sirota and Mario Testino. Because both artists often photograph models, the whole human body will be in the photograph, which shifts the eyeline height well above 2/3. (We will come back to this phenomenon later.)

Eyeline height in Peggy Sirota's photographs

Eyeline height in Mario Testino's photographs

The eyeline heights in selfies—maybe too high?

After looking at professionally made photos, we should, of course, also have a look at the pinnacle of modern amateur portraiture: the selfie. (For a nice summary of the history of the selfie, see Saltz. For a detailed study of the increase in selfie popularity over the last three years by nearly three orders of magnitude, see Souza et al.) Using some of the service connections, e.g. the “Flickr” connection, we can immediately download a sample of selfies. Here are five selfies from the last week in September around the Eiffel Tower. Not all images tagged as “selfies” are just close-ups of faces.

Selfies from Flickr from around the Eiffel Tower

Every day, more than 100,000 selfies are added to Instagram (one can easily browse them here)—this is a perfect source for selfies. Here are the eyeline height distributions for 100,000 selfie thumbnails.

Eyeline height distributions for 100,000 selfies from Instagram

Compared with the professional photographs, we see that the maximum of the eyeline height distributions is clearly above 2/3 for photos that contain a face larger than 1/12 of the total photo. So the next time you take a selfie, position your face a bit lower in the picture to better obey the ROT and ϕ. (Systematic deviations of selfies from established photographic aesthetic principles have already been observed by Bruno et al.)

The eyeline height in a selfie changes much less with the total face area as compared to professional photographs.

Eyeline height compared to face size in selfies

And again, the corresponding heat map.

Heat map for selfies

The total area of the faces in selfies is, not unexpectedly, constrained by the camera distance, which is bounded by about one meter due to the finite length of the human arm or of typical telescopic selfie sticks. So selfies with very small faces are scarcer than photographs or paintings with small faces.

Total area of the faces in selfies

What does the average selfie face look like? The left image is the average over all faces, the middle image the average over all male faces, and the right image the average over all female faces. (Genders were heuristically determined by matching the genders associated with a given name to user names.) The fact that the average selfie looks female arises from the fact that a larger number of selfies are of female faces. This was also found in the recent study by Manovich et al.

Average of all selfie faces (left), average of male selfie faces (middle), average of female selfie faces (right)

Now, it could be that the relative height of the eyeline depends on the concrete person portrayed. We give the full code in case the reader wants to experiment with people not investigated here. We measure eyeline heights in images from the Getty website that are tagged with the keywords passed to the function positionSummary.

Full code for determining eyeline height

Now it takes just a minute to get the average eyeline height of people seen in the news, each based on analyzing 600 portrait shots of Lady Gaga, Taylor Swift, Brad Pitt, and Donald Trump. Lady Gaga’s eyeline is, on average, clearly higher, quite similar to typical selfie positions. On the other hand, Taylor Swift’s eyeline is peaked at the modern painting-like maximum at 1/ϕ.

Lady Gaga

Taylor Swift

Brad Pitt

Donald Trump

Many more types of photographs could be analyzed. But we end here and leave further exploration and more playtime to the reader.

LinkedIn profile photos—men seem to be more ϕaithful

Many LinkedIn profile pages have photographs of the page owners. These photographs are another data source for our eyeline height investigations. Taking 25,000 male and 25,000 female profile photos, we obtain the following results. Because the vast majority of LinkedIn photographs are close-up shots, the curve for faces occupying more than 1/12 of the whole area is quite similar to the curve of all faces, and so we show only the distribution of all faces. This time, the yellow curve shows all faces that occupy between 10% and 30% of the total area.

Here are the eyeline height PDF, the bivariate PDF, and the average face for 10,000 male members from LinkedIn. Based on the frequency of male first names in the US, Bing image searches restricted to the LinkedIn domain were carried out, and the images found were collected.

Eyeline height PDF, bivariate PDF, and the average face for male members

Interestingly, the global maximum of the eyeline height distribution occurs clearly below 1/ϕ, the opposite effect compared to the selfies analyzed above. The center graph shows the distribution of the eyeline height as a function of the face area. The global maximum appears at a face area of 1/5 and at an eyeline height quite close to 1/ϕ. This means the low global maximum is mostly caused by photographs where the face rectangles occupy more than 30% of the total area. The most typical LinkedIn photograph has a face rectangle area of 1/5 of the total area and an eyeline height of 1/ϕ.

The corresponding distribution over all female US first names is quite similar to the curve for males. But for faces that occupy a larger fraction of the image, the female distribution is visibly different. The average eyeline height of these photos of women on LinkedIn is a few percent lower than that of the corresponding photos of men.

Eyeline height PDF, bivariate PDF, and the average face for female members

With the large number of members on LinkedIn, it even becomes feasible to look for eyeline height distribution for individual names. We carry out a facial profiling for three names: Josh, Raj, and Mei. Taking 2,500 photos for each name, we obtain the following distributions and average faces.

Eyeline height distribution for Josh, Raj, and Mei

The distributions agree quite well with the corresponding gender distributions above.

After observing the remarkable peak of the eyeline height PDF at 1/ϕ, I was wondering which of my Wolfram Research or Wolfram|Alpha coworkers obey the ϕaithful rule. And indeed I found more of my male coworkers have the 1/ϕ height than female coworkers. Not unexpectedly, our design director’s is among the ϕaithful. The next input imports photos from the LinkedIn pages of other Wolfram employees and draws a red line at height 1/ϕ.

Eyeline height distribution for Wolfram Research employees

Let us compare the peak distribution with the one from the current members of Congress. We import photos of all members of Congress.

Importing photos of members of Congress

Here are some example photos.

Photos of members of Congress

Similar to the LinkedIn profile photos, the maximum of the eyeline PDF is slightly lower than 2/3. We also show the face of the averaged member of Congress.

Eyeline height distribution, heat map, and average face for members of Congress

Weekly magazine covers—tending to be ϕaithful over the last three decades

After having analyzed the face positions of amateur and professional photographs, a next natural area for exploration is magazine covers: their photographs are carefully made, selected, and placed. TIME magazine maintains a special website for their 4,800 covers covering over ninety years of published issues. (For a quick view of all covers, see Manovich’s cover analysis from a few years ago.)

It is straightforward to download the covers, and then find and extract the faces.

Downloading TIME magazine covers and extracting the faces

These are the two resulting distributions for the eyelines.

Eyeline distributions for faces on TIME magazine covers

The maximum occurs at a height smaller than 1/2. This is mostly caused by the title “TIME” on top of the cover. Newer editions have partial overlaps between the magazine title and the image. The following plot shows the yearly average of the eyeline height over time. Since the 1980s, there has been a trend for higher eyeline positions on the cover.

Yearly average of eyeline height over time

If we calculate the PDFs of the eyeline positions of all issues from the last twenty-five years, we see quite a different distribution with a bimodal structure. One of the peaks is nearly exactly at 1/ϕ.

Eyeline height positions of all issues in the last 25 years

And here are the average faces per decade. We see also that the covers of the first two decades were in black and white.

Average faces per decade

For a second example, we will look at the German magazine SPIEGEL. It is again straightforward to download all the covers, locate the faces, and extract the eyelines.

Downloading covers and extracting faces from SPIEGEL

Again, because of the title text “SPIEGEL” on top of the cover, the maximum of the PDF of the eyeline height on the cover occurs at relatively low heights (≈0.56).

Eyeline height distribution for SPIEGEL magazine covers

A heat map of the face positions shows this clearly.

Heat map for SPIEGEL magazine covers

Taking into account both that the magazine title “SPIEGEL” is typically 13% of the cover height and that there is whitespace at the bottom, the renormalized peak of the eyeline height is nearly exactly at 1/ϕ.
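The renormalization can be made explicit with a short Python sketch. It assumes that the eyeline height should be measured relative to the image area left over after removing the title strip at the top and the whitespace at the bottom. The text gives only the title fraction (13%), so the bottom fraction is a free, hypothetical parameter here.

```python
def renormalized_eyeline(raw_height, title_fraction, bottom_fraction=0.0):
    """Eyeline height relative to the usable image area of a cover.

    raw_height is measured from the bottom of the full cover (0 to 1),
    title_fraction is the strip taken by the title at the top, and
    bottom_fraction is the whitespace at the bottom (an assumed input,
    not a value given in the text).
    """
    usable = 1.0 - title_fraction - bottom_fraction
    if usable <= 0:
        raise ValueError("title and bottom fractions leave no usable area")
    return (raw_height - bottom_fraction) / usable
```

Accounting for the title alone moves the raw peak at 0.56 to 0.56/0.87 ≈ 0.64; any additional whitespace at the bottom lowers this value further toward 1/ϕ.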

Average faces by decade from SPIEGEL covers

For a third, not-so-politically-oriented magazine, we chose the biweekly Rolling Stone. They too have a collection of their covers (through 2013) online. The eyeline height distribution is again bimodal, with the largest peak at 1/ϕ. So Rolling Stone is a ϕaithful magazine.

Eyeline height distribution for Rolling Stone magazine

By year, the average eyeline height shows some regularities within an eight-year period.

Average eyeline height for Rolling Stone magazine

The cumulative mean of the eyeline heights is very near to 1/ϕ, and the average through 2013 deviates only 0.4% from 1/ϕ.
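A cumulative mean and its deviation from 1/ϕ can be computed as in the following Python sketch. The input is assumed to be a chronologically ordered list of eyeline heights, and the 0.4% figure is read here as a relative deviation, which is an assumption about how it was computed.

```python
import math

GOLDEN_HEIGHT = 2 / (1 + math.sqrt(5))  # 1/phi, about 0.618


def cumulative_mean(values):
    """Running mean of values: entry k is the mean of the first k + 1 values."""
    means, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


def relative_deviation(value, reference=GOLDEN_HEIGHT):
    """Absolute deviation of value from reference, as a fraction of reference."""
    return abs(value - reference) / reference
```

The last entry of cumulative_mean is the overall average, and relative_deviation of that entry gives the quantity to compare with the 0.4% quoted above.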

Cumulative mean of eyeline heights from Rolling Stone magazine

To parallel the earlier two magazines, here are the averaged faces by decade.

Average faces by decade from Rolling Stone magazine

Comic book covers—where are the eyelines of the superheroes?

Comic covers are another fairly large source of images to analyze. The Comic Book Database has a large collection of comic book covers. Here we restrict ourselves to Marvel Comics and DC Comics, totaling about 72,000 covers. Because comics are not photographs, recognizing faces is now a harder job. But even so, we successfully extract about 90,000 faces.

Here are our typical characterizations (eyeline height PDF, face position heat map, average face) for Marvel Comics.

Eyeline height PDF, face position heat map, average face for Marvel

And the same for DC Comics.

Eyeline height PDF, face position heat map, average face for DC Comics

All three characteristics show remarkable consistency between the two comic publishers.

Daily newspapers, fashion magazines, …—where are the eyelines now?

Many more collections of faces can now be investigated for the eyeline positions. It is straightforward to write a small crawler function that starts with a given website and extracts images and links to pages with more images. (This is just a straightforward implementation. Many optimizations, such as parallel retrieval, could be implemented to improve this function.)
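To illustrate the idea, here is a minimal crawler sketch in Python using only the standard library. It is not the Wolfram Language function from this post: the class and function names are hypothetical, the page fetcher is passed in as a callable (so that the sketch stays independent of any particular HTTP library), and no filtering by image size or domain is done.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageAndLinkParser(HTMLParser):
    """Collects absolute URLs of <img src=...> and <a href=...> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.images.append(urljoin(self.base_url, attributes["src"]))
        elif tag == "a" and attributes.get("href"):
            self.links.append(urljoin(self.base_url, attributes["href"]))


def crawl_images(start_url, fetch_html, max_pages=10):
    """Breadth-first crawl from start_url; returns image URLs in discovery order.

    fetch_html is any callable mapping a URL to the page's HTML text.
    """
    seen, queue, images = {start_url}, deque([start_url]), []
    pages_visited = 0
    while queue and pages_visited < max_pages:
        url = queue.popleft()
        pages_visited += 1
        parser = ImageAndLinkParser(url)
        parser.feed(fetch_html(url))
        for image_url in parser.images:
            if image_url not in images:
                images.append(image_url)
        for link in parser.links:
            if link not in seen:  # visit every page at most once
                seen.add(link)
                queue.append(link)
    return images
```

In practice one would restrict the links to the starting domain and drop small images before running face detection; both steps are left out to keep the sketch short.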

Extracting images from a given website with links to pages with more images

For example, here is the resulting average data for all images (larger than 200 pixels) from The New York Times website from February 8, 2016. The eyeline PDF maximum is between 2/3 and 1/ϕ.

Eyeline height distribution, heat map, and average face from February 8, 2016 on The New York Times website

And here from the weekly German newspaper, Die Zeit. This time, the eyeline maximum is clearly 2/3 for larger faces.

Eyeline height distribution, heat map, and average face from Die Zeit

Here is a snapshot of 1,000 images from CNN.

Eyeline height distribution, heat map, and average face from 1,000 photos from CNN

The eyeline heights in fashion magazines show a totally different distribution. Here are the results of 1,000 images from Vogue. Because many images on the site show the stylishly dressed models from head to toe, the head is small and the eyeline very high in the images. As a result, we get the strong, narrow peak of the blue curve.

Eyeline height distribution, heat map, and average face from 1,000 images from Vogue

GQ Magazine also shows a global eyeline height peak at 2/3 for large faces.

Eyeline height distribution, heat map, and average face from GQ Magazine

The maximum of the eyeline in the magazine People is again at 2/3 for large faces.

Eyeline height distribution, heat map, and average face from People

And here are the results for Ebony magazine. This time, the large face eyeline height has a peak at 1/ϕ.

Eyeline height distribution, heat map, and average face from Ebony

Using a bodybuilding magazine, as with the Vogue images, we see a very high eyeline, again because often whole-body images are shown. The average face looks different from the previous averages.

Eyeline height distribution, heat map, and average face from Flex magazine

We obtain a softer-looking face with an eyeline maximum greater than 2/3 from Allure magazine.

Eyeline height distribution, heat map, and average face from Allure magazine

And goths from the Gothic Beauty magazine are on average ROTen, but large goths are more ϕaithful.

Eyeline height distribution, heat map, and average face from Gothic Beauty

The magazine 20/20 specializes in glasses. Not unexpectedly, the average face shows pronounced sunglasses and an eyeline height greater than 2/3.

Eyeline height distribution, heat map, and average face from 20/20

Movie posters—the eyelines of film stars

Movie posters are a good-sized source of a wide variety of drawn and photographed images. The site Movie Posters has 35,000 posters going back to the 1920s.

Cumulative distribution for movie posters

More interesting is a plot of the mean over time. Before the 1980s, eyelines were more in the center of posters. Since then, the average eyeline position is more in the interval [1/ϕ,2/3].

Eyeline height over time in movie posters

The shift in average eyeline height in movie posters is even more clearly visible in the corresponding face heat maps.

Face heat maps from movie posters

Here is the average face from all movie posters from the last five years.

Average face from all movie posters

Movies—the eyelines in motion picture frames

In the last blog post, we ended with plots of the evolution of the average movie aspect ratio, so this time we will also end by analyzing some movies. The Internet Archive has a collection of 20,000 movies that are available for download. We will look at the face positions in two well-known classics: Buster Keaton’s The General from 1926 and Fritz Lang’s Metropolis from 1927.

We start with The General. The average eyeline height of all faces (without taking size into account) is at 2/3, and the large faces clearly appear lower.

Eyeline distribution for faces in The General

Not every frame of a movie contains faces, so it is natural to ask if the mean (windowed) eyeline height changes as the movie progresses. Here is a different kind of heat map that shows the mean eyeline height over time. The colors indicate the number of frames that contain identified faces.

Heat map of mean eyeline height over time in The General
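The windowed mean behind a plot like this can be sketched in a few lines of Python. This is only an illustration, not the code used for the post (which relied on the Wolfram Language): the function name `windowed_eyeline_means`, the use of non-overlapping windows, and the sample data are all assumptions.

```python
from statistics import mean


def windowed_eyeline_means(frame_eyelines, window):
    """Mean relative eyeline height per window of consecutive frames.

    frame_eyelines holds one list per frame with the relative eyeline
    heights (0 = bottom edge, 1 = top edge) of the faces found in that
    frame; frames without faces contribute an empty list.
    Returns one (mean_height_or_None, frames_with_faces) pair per window.
    Non-overlapping windows are an assumption of this sketch.
    """
    results = []
    for start in range(0, len(frame_eyelines), window):
        chunk = frame_eyelines[start:start + window]
        heights = [h for frame in chunk for h in frame]
        frames_with_faces = sum(1 for frame in chunk if frame)
        results.append((mean(heights) if heights else None, frames_with_faces))
    return results


# Hypothetical data: six frames, three of them without detected faces.
frames = [[0.70], [], [0.60, 0.80], [], [], [0.50]]
summary = windowed_eyeline_means(frames, window=3)
```

Each pair gives the value to plot for one window and the face-frame count that could drive the color, as in the heat map described above.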

Because the main character in the film moves a lot, the heat map of the face position now has much more structure than the above heat maps of photographs and paintings.

Heat map for The General

Fritz Lang’s Metropolis, although made only one year after The General, was shot in quite a different style. Just by quickly skimming through the movie, one observes that the majority of faces appear much higher in the frame. This impression is confirmed by the actual data about the eyeline positions.

Heat map of mean eyeline height over time in Metropolis

The PDF of all eyeline positions shows that large faces in particular appear high in the frames.

Eyeline distribution in Metropolis

We compare with a modern TV series production—episode nine of season nine from The Big Bang Theory, “The Platonic Permutation”. Most faces appear above the 2/3 height.

Heat map of mean eyeline height over time in The Big Bang Theory

But the PDF of the eyeline position of larger faces peaks very near to 2/3, and the average face shows characteristic facial features of the show’s main characters.

Eyeline distribution for The Big Bang Theory

Or, for a very recent example, here is the eyeline PDF for episode one of Amazon’s The Man in the High Castle. The peak of the eyeline of larger faces is nearer to 1/ϕ than to 2/3.

Eyeline position for The Man in the High Castle
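The two reference heights are close together (1/ϕ ≈ 0.618 and 2/3 ≈ 0.667), so deciding which one a peak is "nearer to" comes down to a small comparison. A minimal Python sketch, where the helper name `nearest_reference` and the sample peak values are hypothetical:

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden ratio
REFERENCES = {"1/phi": 1 / PHI, "2/3": 2 / 3}


def nearest_reference(peak):
    """Name of the reference height closest to a measured peak position."""
    return min(REFERENCES, key=lambda name: abs(REFERENCES[name] - peak))


# Hypothetical peak positions, given as fractions of the frame height.
label_low = nearest_reference(0.63)   # closer to 1/phi
label_high = nearest_reference(0.66)  # closer to 2/3
```

The dividing line is the midpoint of the two references, roughly 0.642, so a peak has to be located to within about ±0.02 for the distinction to be meaningful.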

We end with a third TV series example, episode eight of season six of The Walking Dead. For larger faces, we see a well-pronounced bimodal eyeline height distribution, with the two maxima at 1/ϕ and 2/3.

Eyeline height distribution and average faces from The Walking Dead


In this second part of our explorations of the golden ratio in the visual arts, we looked at the height of the eyeline of human faces and the face position. Using the function FindFaces and approximate rules for determining the eyeline height in faces, we computed averages of more than a million faces and eyeline heights.

The maxima of the eyeline height distribution for photographs and paintings are predominantly in the range of 0.6 to 0.67. Older paintings and modern photographs have maxima near 2/3, as the rule of thirds predicts (demands). Interestingly, modern art portraits show the eyeline height PDF peak at 1/ϕ for large faces. (We used >1/12 of the total area to define “large” faces.) The peak eyeline position in selfies is about 0.7, higher than in paintings and many professional photographs. The magazine covers we analyzed, especially those of the past few decades, seem to have a peak of the eyeline position PDF at 1/ϕ. Similarly, the photos from various newspaper sites show a peak at 1/ϕ. For LinkedIn photos, clear gender differences in eyeline height were found: men turned out to be more ϕaithful. And the analyzed movies show that faces, especially smaller ones, quite often appear significantly above the 2/3 height. But modern TV series show peaks at either 1/ϕ or 2/3, or even at both simultaneously.
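For readers who want to try the "large face" criterion on their own data, here is a rough Python sketch of the >1/12 area filter followed by a simple histogram peak. It is not the code used for this post, which relied on the Wolfram Language function FindFaces. The tuple layout, the helper names `is_large` and `peak_eyeline`, the choice of 20 bins, and the sample values are assumptions, and the face area is taken to be the area of the detected bounding box.

```python
from collections import Counter


def is_large(face_w, face_h, image_w, image_h, min_fraction=1 / 12):
    """True if the face box covers more than min_fraction of the image area."""
    return face_w * face_h > min_fraction * image_w * image_h


def peak_eyeline(heights, bins=20):
    """Center of the most populated histogram bin for heights in [0, 1]."""
    counts = Counter(min(int(h * bins), bins - 1) for h in heights)
    best_bin = max(counts, key=counts.get)
    return (best_bin + 0.5) / bins


# Hypothetical detections: (face_w, face_h, image_w, image_h, eyeline height).
faces = [
    (400, 400, 1000, 1000, 0.62),  # covers 16% of the image: large
    (350, 300, 1000, 1000, 0.63),  # covers 10.5% of the image: large
    (100, 100, 1000, 1000, 0.90),  # covers 1% of the image: small
    (300, 300, 1000, 1000, 0.67),  # covers 9% of the image: large
]
large_heights = [eye for fw, fh, iw, ih, eye in faces if is_large(fw, fh, iw, ih)]
peak = peak_eyeline(large_heights)
```

A smoothed density estimate would give a finer peak location than fixed bins; the histogram is used here only to keep the sketch free of dependencies.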

Download this post as a Computable Document Format (CDF) file.
