Literary Analysis and the Wolfram Language: Jumping Down a Reading Rabbit Hole
As summer wraps up and students are hitting the books once again here in the US, it’s fun to explore how the Wolfram Language can be used in the classroom to analyze texts.
Take the beloved classic Alice in Wonderland by Lewis Carroll as an example. In just a few lines of code, you can create a word cloud from its text, browse its numerous covers, and visualize its emotional content.
Jump right in by creating a WordCloud:
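A minimal sketch of this step, assuming the text is taken from the built-in ExampleData collection (where the text is loaded from is an assumption here):

```wolfram
(* Assumption: use the copy of the book that ships with ExampleData *)
alice = ExampleData[{"Text", "AliceInWonderland"}];

(* Build a word cloud straight from the raw text *)
WordCloud[alice]
```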
WordCloud includes low-information words like “the” and “a”; using DeleteStopwords to eliminate them gives a more meaningful word cloud:
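As a sketch, again assuming the ExampleData copy of the text, the filtering can be applied to the string before it reaches WordCloud:

```wolfram
(* Assumption: the text comes from ExampleData *)
alice = ExampleData[{"Text", "AliceInWonderland"}];

(* DeleteStopwords strips common function words such as "the" and "a" *)
WordCloud[DeleteStopwords[alice]]
```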
Even if you’ve never read Alice in Wonderland, you can get a one-sentence summary from the word cloud: Alice is the star of a story with lots of talking, royalty, and animals.
You can see the covers for different editions of Alice in Wonderland by browsing them at Open Library.
First, fetch the URL:
Then look through the page to find cover image locations and import them.
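The two steps above might look roughly like the following. The search URL and the helper name coverLinks are assumptions made for illustration, and the filter on "covers.openlibrary.org" may need adjusting to match how the page actually references its images:

```wolfram
(* Assumption: an Open Library search page stands in for the page used in the post *)
url = "https://openlibrary.org/search?q=alice+in+wonderland";

(* Hypothetical helper: keep only links that point at the cover image host *)
coverLinks[links_List] :=
  Select[links, StringQ[#] && StringContainsQ[#, "covers.openlibrary.org"] &]

(* "ImageLinks" returns the addresses of the images referenced by the page *)
thumbLinks = coverLinks[Import[url, "ImageLinks"]];

(* Import the first few thumbnails to see what came back *)
thumbs = Import /@ Take[thumbLinks, UpTo[5]]
```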
Uh-oh! These covers are only thumbnails—too small. There is no potion to drink to make them magically larger, but you can fine-tune the code, locating edition-specific pages from the main URL and importing the corresponding full-size cover images:
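One possible shortcut, which is not the edition-page approach described above: Open Library's cover service encodes the image size as a suffix (S, M, or L) in the file name, so a thumbnail address can often be rewritten to request the large version. The helper name largeCoverURL and the sample address with ID 12345 are hypothetical:

```wolfram
(* Hypothetical helper: swap a small or medium size suffix for the large one *)
largeCoverURL[thumb_String] :=
  StringReplace[thumb, RegularExpression["-[SM]\\.jpg$"] -> "-L.jpg"]

(* Made-up cover ID, used only to show the rewrite *)
largeCoverURL["https://covers.openlibrary.org/b/id/12345-M.jpg"]

(* With real thumbnail links you could then import the results:
   covers = Import /@ (largeCoverURL /@ thumbLinks) *)
```

Addresses that do not end in a size suffix pass through unchanged, so the helper is safe to map over a mixed list of links.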
You can browse through the different covers using a drop-down control that shows the thumbnails as its menu items:
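Here is one way such a control could be put together. To keep the sketch self-contained, standard test images stand in for the book covers, so the image names below are placeholders rather than anything from Open Library:

```wolfram
(* Stand-in images; replace with the imported full-size covers *)
covers = ExampleData[{"TestImage", #}] & /@ {"Mandrill", "Peppers", "House"};

(* Small versions to use as the menu labels *)
thumbs = ImageResize[#, 60] & /@ covers;

(* The menu shows thumbnails; picking one displays the matching full image *)
DynamicModule[{choice = First[covers]},
 Column[{
   PopupMenu[Dynamic[choice], Thread[covers -> thumbs]],
   Dynamic[choice]
 }]
]
```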
You can also create a cartoon summary of the emotional content expressed in the sentences of a book. With a summary, you can tell at a glance whether that Shakespeare play you’re reading is a comedy or a tragedy. Or, in this case, find out what’s happening with Alice.
In the output, neutral text is represented by an underscore, positive content by a blue smiley face, and negative content by a red face. With Tooltip, you can hover over a symbol to read its corresponding sentence:
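A sketch of how that output could be produced, assuming the built-in "Sentiment" classifier and the ExampleData copy of the text. The helper name symbol and the choice of smiley characters are assumptions, and only the first 200 sentences are used to keep the run short:

```wolfram
alice = ExampleData[{"Text", "AliceInWonderland"}];
sentences = Take[TextSentences[alice], UpTo[200]];

(* One label per sentence: "Positive", "Negative", or "Neutral" *)
sentiments = Classify["Sentiment", sentences];

(* Hypothetical mapping from label to display symbol *)
symbol["Positive"] = Style["\[HappySmiley]", Blue];
symbol["Negative"] = Style["\[SadSmiley]", Red];
symbol[_] = "_";  (* neutral or undetermined *)

(* Pair each symbol with its sentence so hovering reveals the text *)
Row[MapThread[Tooltip[symbol[#1], #2] &, {sentiments, sentences}]]
```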
Diving further down the rabbit hole, you can also graph swings of emotion. Here, each positive or negative sentence adds or subtracts from the cumulative emotion, while each neutral one brings the emotion closer to zero:
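The rule for pulling neutral sentences toward zero is not spelled out, so the sketch below assumes a simple one: positive adds 1, negative subtracts 1, and neutral multiplies the running total by 9/10. The names step and cumulativeEmotion are hypothetical:

```wolfram
(* Assumed update rule for the running emotion value *)
step[e_, "Positive"] := e + 1
step[e_, "Negative"] := e - 1
step[e_, _] := 9/10 e  (* neutral: decay toward zero *)

(* Fold the rule over the labels, dropping the initial 0 *)
cumulativeEmotion[labels_List] := Rest[FoldList[step, 0, labels]]

alice = ExampleData[{"Text", "AliceInWonderland"}];
sentences = Take[TextSentences[alice], UpTo[200]];
sentiments = Classify["Sentiment", sentences];

ListLinePlot[cumulativeEmotion[sentiments], Filling -> Axis]
```

A different decay factor, or a fixed step toward zero, would change the shape of the curve but not the overall idea.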
Finally, you can classify the sentences in Alice by "FacebookTopic", which is a useful tool for identifying themes within any book. The output gives the number of sentences assigned to each topic, and the same approach could be used to generate ideas for a term paper on The Catcher in the Rye or a literary critique of Pride and Prejudice:
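A sketch of the classification and tally, assuming the ExampleData copy of the text and a 200-sentence sample to keep the run short:

```wolfram
alice = ExampleData[{"Text", "AliceInWonderland"}];
sentences = Take[TextSentences[alice], UpTo[200]];

(* One topic label per sentence from the built-in classifier *)
topics = Classify["FacebookTopic", sentences];

(* Number of sentences per topic, most frequent first *)
ReverseSort[Counts[topics]]
```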
We take the six most popular topics and color the data by those topics, leaving everything else gray:
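One way to pick the top topics and assign colors is sketched below. The helper name topTopics and the use of the default plot palette (ColorData[97]) are assumptions:

```wolfram
(* Hypothetical helper: the n most frequent labels, most frequent first *)
topTopics[labels_List, n_Integer] :=
  Keys[Take[ReverseSort[Counts[labels]], UpTo[n]]]

alice = ExampleData[{"Text", "AliceInWonderland"}];
sentences = Take[TextSentences[alice], UpTo[200]];
topics = Classify["FacebookTopic", sentences];

top = topTopics[topics, 6];

(* One distinct color per top topic *)
palette = AssociationThread[top, ColorData[97] /@ Range[Length[top]]];

(* Every sentence gets its topic's color, or gray if the topic is not in the top six *)
colors = Lookup[palette, topics, Gray]
```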
Putting these colors on the previous “cumulative emotion” graph allows you to compare topic and emotional content:
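Combining the pieces could look like this. The sketch repeats the assumed decay rule for neutral sentences (multiply by 9/10) and the assumed palette, and draws the points directly with Graphics so each can carry its own color:

```wolfram
alice = ExampleData[{"Text", "AliceInWonderland"}];
sentences = Take[TextSentences[alice], UpTo[200]];
sentiments = Classify["Sentiment", sentences];
topics = Classify["FacebookTopic", sentences];

(* Assumed update rule for the cumulative emotion *)
step[e_, "Positive"] := e + 1
step[e_, "Negative"] := e - 1
step[e_, _] := 9/10 e
emotion = Rest[FoldList[step, 0, sentiments]];

(* Colors for the six most frequent topics; gray for the rest *)
top = Keys[Take[ReverseSort[Counts[topics]], UpTo[6]]];
palette = AssociationThread[top, ColorData[97] /@ Range[Length[top]]];
colors = Lookup[palette, topics, Gray];

(* One point per sentence: x is sentence index, y is cumulative emotion *)
Graphics[
 {PointSize[Medium],
  MapThread[{#3, Point[{#1, #2}]} &,
   {Range[Length[emotion]], emotion, colors}]},
 Axes -> True, AspectRatio -> 1/GoldenRatio]
```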
No matter what books you’re reading—for study or for pleasure—you’ll find new ways to explore them with the Wolfram Language.
Hi, can you tell me how the sentiment classifier was trained (i.e., on what texts)? I'm going to assume Facebook posts.