Wolfram Blog

Literary Analysis and the Wolfram Language: Jumping Down a Reading Rabbit Hole

October 1, 2015
Peter Barendse, Senior Wolfram|Alpha Developer
Emily Suess, Technical Writer, Technical Communications and Strategy Group

As summer wraps up and students are hitting the books once again here in the US, it’s fun to explore how the Wolfram Language can be used in the classroom to analyze texts.

Take the beloved classic Alice in Wonderland by Lewis Carroll as an example. In just a few lines of code, you can create a word cloud from its text, browse its numerous covers, and visualize its emotional content.

Jump right in by creating a WordCloud:

Creating a WordCloud for Alice in Wonderland
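A minimal sketch of this step, assuming the text is taken from the built-in ExampleData collection and split into a word list with TextWords so that every word is counted (the variable name alice is only illustrative):

```wolfram
(* Assumption: the book text comes from the built-in example data *)
alice = ExampleData[{"Text", "AliceInWonderland"}];

(* Split the text into words and size each word by how often it occurs *)
WordCloud[TextWords[alice]]
```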

WordCloud includes low-information words like “the” and “a”; using DeleteStopwords to eliminate them gives a more meaningful word cloud:

Using DeleteStopwords in WordCloud
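One way this could look, again assuming the text comes from ExampleData and is split into words first:

```wolfram
(* Assumption: the book text comes from the built-in example data *)
alice = ExampleData[{"Text", "AliceInWonderland"}];

(* Drop common function words such as "the" and "a" before counting *)
WordCloud[DeleteStopwords[TextWords[alice]]]
```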

Even if you’ve never read Alice in Wonderland, you can get a one-sentence summary from the word cloud: Alice is the star of a story with lots of talking, royalty, and animals.

You can see the covers for different editions of Alice in Wonderland by browsing them at Open Library.

First, fetch the URL:

Alice in Wonderland book covers from Open Library

Then look through the page to find cover image locations and import them.
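As a rough sketch of that search, the helper below collects the image addresses on a page and imports the ones that look like covers. The function name coverThumbnails and the host name used as a filter are assumptions about how Open Library serves its covers, not details from the post, and the sketch does no error handling:

```wolfram
(* Hypothetical helper: import every image on a page whose address
   points at the (assumed) Open Library cover server *)
coverThumbnails[pageURL_String] :=
  Import /@ Select[
    DeleteDuplicates[Import[pageURL, "ImageLinks"]],
    StringContainsQ[#, "covers.openlibrary.org"] &];
```

Calling coverThumbnails on the address fetched in the previous step should then return a list of images, provided the page layout matches these assumptions.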

Uh-oh! These covers are only thumbnails—too small. There is no potion to drink to make them magically larger, but you can fine-tune the code, locating edition-specific pages from the main URL and importing the corresponding full-size cover images:

Importing the full-size cover images
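A hedged sketch of that refinement follows. The helper names, the "/books/OL" pattern for edition pages, and the "-L.jpg" suffix for large covers are all assumptions about Open Library's layout rather than anything stated in the post:

```wolfram
(* Hypothetical helpers; the URL patterns below are assumptions *)

(* Links on the main page that appear to lead to individual editions *)
editionPages[mainURL_String] :=
  DeleteDuplicates[
    Select[Import[mainURL, "Hyperlinks"], StringContainsQ[#, "/books/OL"] &]];

(* First large cover image found on an edition page, if there is one *)
largeCover[editionURL_String] :=
  With[{links = Select[Import[editionURL, "ImageLinks"],
      StringEndsQ[#, "-L.jpg"] &]},
   If[links === {}, Missing["NotAvailable"], Import[First[links]]]];
```

With these in place, DeleteMissing[largeCover /@ editionPages[url]] would give the list of full-size covers, where url stands for the address of the main page.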

You can browse through the different covers with a drop-down control that uses the thumbnails as its menu entries:

Using Manipulate to browse through different covers
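To illustrate the idea without downloading anything, this sketch uses plain gray placeholder images in place of the imported covers. The variable names are illustrative only:

```wolfram
(* Placeholder images standing in for the imported covers *)
covers = Table[Image[ConstantArray[k/6., {180, 120}]], {k, 5}];

(* Small versions of the same images, used as menu entries *)
thumbs = Thumbnail[#, 32] & /@ covers;

(* A pop-up menu whose entries are thumbnails; choosing one shows the full cover *)
Manipulate[
 covers[[i]],
 {{i, 1, "cover"}, Thread[Range[Length[covers]] -> thumbs],
  ControlType -> PopupMenu}]
```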

You can also create a cartoon summary of the emotional content expressed in the sentences of a book. With a summary, you can tell at a glance whether that Shakespeare play you’re reading is a comedy or a tragedy. Or, in this case, find out what’s happening with Alice.

In the output, neutral text is represented by an underscore, positive content by a blue smiley face, and negative content by a red face. With Tooltip, you can hover over a symbol to read its corresponding sentence:

Summary of emotional content using Tooltip
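A sketch of how such a summary might be built, assuming the built-in "Sentiment" classifier and the ExampleData text. Anything the classifier does not label as clearly positive or negative is drawn as an underscore here:

```wolfram
(* Assumption: the book text comes from the built-in example data *)
sentences = TextSentences[ExampleData[{"Text", "AliceInWonderland"}]];

(* Classify every sentence with the built-in sentiment classifier *)
sentiments = Classify["Sentiment", sentences];

(* Symbols for the two non-neutral classes *)
faces = <|
   "Positive" -> Style["\[HappySmiley]", Blue],
   "Negative" -> Style["\[SadSmiley]", Red]|>;

(* One symbol per sentence; hovering over it shows the sentence itself *)
Row[MapThread[
  Tooltip[Lookup[faces, #1, "_"], #2] &, {sentiments, sentences}]]
```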

Diving further down the rabbit hole, you can also graph swings of emotion. Here, each positive or negative sentence adds or subtracts from the cumulative emotion, while each neutral one brings the emotion closer to zero:

Graphing swings of emotion
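The running total described above can be sketched as a fold over the list of sentiments. The step size of 1 and the decay factor of 0.9 for neutral sentences are assumed values, since the post does not state them:

```wolfram
(* Assumption: the book text comes from the built-in example data *)
sentences = TextSentences[ExampleData[{"Text", "AliceInWonderland"}]];
sentiments = Classify["Sentiment", sentences];

(* One step of the running total; the factor 0.9 is an assumed decay rate *)
step[total_, "Positive"] := total + 1;
step[total_, "Negative"] := total - 1;
step[total_, _] := 0.9 total;

(* Fold the step function over the sentiments, starting from zero *)
emotion = FoldList[step, 0, sentiments];

ListLinePlot[emotion, AxesLabel -> {"sentence", "cumulative emotion"}]
```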

Finally, you can classify the sentences in Alice with the built-in "FacebookTopic" classifier, a useful tool for identifying themes within any book. The output gives the number of sentences assigned to each topic, and the same approach could be used to generate ideas for a term paper on The Catcher in the Rye or a literary critique of Pride and Prejudice:

Using FacebookTopic to identify themes
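A minimal sketch of the tally, assuming the ExampleData text and one topic per sentence:

```wolfram
(* Assumption: the book text comes from the built-in example data *)
sentences = TextSentences[ExampleData[{"Text", "AliceInWonderland"}]];

(* Assign a topic to every sentence *)
topics = Classify["FacebookTopic", sentences];

(* Number of sentences per topic, most common first *)
Reverse[Sort[Counts[topics]]]
```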

Next, take the six most common topics and color the data by those topics, leaving everything else gray:

Color coding topics by popularity
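One possible way to set up the coloring is shown below. The palette (ColorData[97]) and the variable names are illustrative choices, not taken from the post:

```wolfram
(* Assumption: the book text comes from the built-in example data *)
sentences = TextSentences[ExampleData[{"Text", "AliceInWonderland"}]];
topics = Classify["FacebookTopic", sentences];

(* The (up to) six topics with the most sentences *)
topSix = Keys[TakeLargest[Counts[topics], UpTo[6]]];

(* One color per top topic; the palette is an arbitrary choice *)
palette = AssociationThread[topSix,
   ColorData[97] /@ Range[Length[topSix]]];

(* A color for every sentence; topics outside the top six become gray *)
colors = Lookup[palette, topics, Gray];

SwatchLegend[Values[palette], Keys[palette]]
```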

Putting these colors on the previous “cumulative emotion” graph allows you to compare topic and emotional content:

Cumulative emotion graph
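Putting the pieces together, a self-contained sketch of the colored graph might look like this. As before, the decay factor, the palette, and all variable names are assumptions made for illustration:

```wolfram
(* Assumption: the book text comes from the built-in example data *)
sentences = TextSentences[ExampleData[{"Text", "AliceInWonderland"}]];
sentiments = Classify["Sentiment", sentences];
topics = Classify["FacebookTopic", sentences];

(* Running emotion per sentence; the factor 0.9 is an assumed decay rate *)
step[total_, "Positive"] := total + 1;
step[total_, "Negative"] := total - 1;
step[total_, _] := 0.9 total;
emotion = Rest[FoldList[step, 0, sentiments]];

(* Colors for the six most common topics, gray for everything else *)
topSix = Keys[TakeLargest[Counts[topics], UpTo[6]]];
palette = AssociationThread[topSix,
   ColorData[97] /@ Range[Length[topSix]]];
colors = Lookup[palette, topics, Gray];

(* One colored point per sentence, with a legend for the top topics *)
Legended[
 Graphics[
  MapThread[{#1, Point[{#2, #3}]} &,
   {colors, Range[Length[emotion]], emotion}],
  Axes -> True, AspectRatio -> 1/GoldenRatio],
 SwatchLegend[Values[palette], Keys[palette]]]
```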

No matter what books you’re reading—for study or for pleasure—you’ll find new ways to explore them with the Wolfram Language.

Download this post as a Computable Document Format (CDF) file.

One Comment


Arno

Hi, can you tell me how the sentiment classifier was trained (i.e. on what texts?). I’m going to assume FaceBook posts.

Posted by Arno    October 2, 2015 at 10:19 am