Wolfram Blog
Comparing Apples and Oranges with the Wolfram Language

June 30, 2016 — Andrew Steinacher, Wolfram|Alpha Developer, Wolfram|Alpha Scientific Content

We’ve all heard the phrase “You can’t compare apples and oranges.” Well, the “impossible” can now be done within the Wolfram Language. With the help of new features and new data, you can finally compare the two fruits from the inside out. Along with a variety of interactive visualizations, the real difference between apples and oranges—or between frozen and chain pizzas, or even food-related Pokémon—is just a few lines of code away.

Difference between an apple and an orange

Comparing apples and oranges

The biggest difference most people might notice between apples and oranges is in their appearance. In case you’ve forgotten what they look like, a simple EntityValue call can jog your memory:

Using EntityValue to see apples and oranges
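
A call of roughly the following shape produces such images. This is a sketch rather than the post's exact input: it assumes the fruits are addressed as "FoodType" entities and that "Image" is available as a property for them.

```wolfram
(* Assumed entity specifications; not taken verbatim from the post *)
fruits = {Entity["FoodType", "Apple"], Entity["FoodType", "Orange"]};

(* Request one property for both entities at once; the result is a list in the same order *)
EntityValue[fruits, "Image"]
```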

And although “a picture is worth a thousand words,” we can go further. For example, we can even “see inside” the fruits by asking for their typical inside and outside colors:

Typical inside and outside colors of apples and oranges

But appearance isn’t the only way to compare apples to oranges. We can also get an overview of their nutritional profiles by generating nutrition labels based on typical serving sizes with the Wolfram Language’s built-in natural language processing capabilities:

Nutrition labels for apples and oranges

It looks like the fruits are not all that different, nutritionally speaking. However, oranges do have well over the recommended daily value of vitamin C.

Let’s see if we can find other nutritional differences that aren’t on the nutrition label by gathering nutritional properties on a per-gram basis:

Gathering nutritional properties on a per-gram basis

Using the new implicit Entity syntax, we can consider all foods that are apples and oranges:

Considering all foods that are apples and oranges
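
As a rough sketch of what an implicit entity can look like: instead of a fixed name, the second argument of Entity is a list of property constraints. The property name "FoodType" on the "Food" entity type and the use of ContainsExactly are assumptions here, not something the text above confirms.

```wolfram
(* Implicit entity: every "Food" whose food type is exactly apple (assumed property name) *)
apples = Entity["Food",
   {EntityProperty["Food", "FoodType"] -> ContainsExactly[{Entity["FoodType", "Apple"]}]}];

(* The same constraint pattern for oranges *)
oranges = Entity["Food",
   {EntityProperty["Food", "FoodType"] -> ContainsExactly[{Entity["FoodType", "Orange"]}]}];
```

Because the constraint describes a set of foods rather than one named food, the same expression keeps matching new varieties as data is added.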

We can gather all of the nutritional data for these foods, keeping only “comparable” properties (i.e. those with data available for both apples and oranges):

Comparing nutritional values for comparable properties
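
The filtering step can be illustrated on two small associations. This is a minimal sketch with placeholder property names and values (not real nutrition data); it assumes unavailable values appear as Missing[...] expressions.

```wolfram
(* Placeholder data for illustration only; keys and numbers are hypothetical *)
appleData  = <|"A" -> 1.2, "B" -> Missing["NotAvailable"], "C" -> 0.|>;
orangeData = <|"A" -> 0.8, "B" -> 2.5, "C" -> 4.1|>;

(* A property is comparable only if neither fruit reports it as Missing *)
comparable = Select[Keys[appleData],
   FreeQ[{appleData[#], orangeData[#]}, _Missing] &]
(* {"A", "C"} *)

(* Restrict both data sets to the comparable properties *)
KeyTake[appleData, comparable]
KeyTake[orangeData, comparable]
```

Note that a value of zero is kept: a measured zero is still data, while a Missing value is not.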

With a little more work, we can visualize all of the nonzero relative differences between an apple and an orange in a single plot with tooltips to show the properties (download the attached notebook to use tooltips):

Nonzero relative differences between an apple and an orange

Negative numbers here indicate that an orange has more of that nutrient than an apple. From this, we can see that oranges are more nutritious than apples, particularly with respect to vitamins. It’s also important to note the extreme cases at ±200% relative difference. These are nutrients where one fruit has no measurable content, but the other fruit does. We can see exactly what these are by making a grid with the five largest differences on each end of the spectrum:

Five largest differences in vitamin content
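
The ±200% extremes are what a symmetric relative difference produces when one of the two values is zero. The text above does not spell out the formula, so the definition below is an assumption: the difference divided by the mean of the two values, with the apple value first so that negative results mean the orange has more.

```wolfram
(* Assumed definition: difference divided by the mean of the two values.
   The result always lies between -2 and 2, i.e. between -200% and +200% *)
relativeDifference[apple_, orange_] := (apple - orange)/Mean[{apple, orange}]

relativeDifference[3, 1]  (* (3 - 1)/2 = 1, i.e. +100%: the apple has more *)
relativeDifference[0, 5]  (* -5/(5/2) = -2, i.e. -200%: only the orange has any *)
relativeDifference[5, 0]  (* 5/(5/2) = 2, i.e. +200%: only the apple has any *)
```

When both values are zero the expression is undefined (0/0), so such properties have to be dropped before plotting, which fits with the plot showing only nonzero differences.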

So although people say “an apple a day keeps the doctor away,” it looks like eating an orange every day would be healthier. But how well do oranges stack up against other fruits and vegetables?

I’ve deployed an interactive Manipulate in the Wolfram Cloud where you can see just that. Here’s a screenshot of it comparing oranges and the “superfood” kale, which outshines oranges in vitamins and minerals (download the attached notebook to use the Manipulate):

Comparing fruits and vegetables in a Manipulate

So far, we’ve been comparing all apple varieties to all orange varieties. But what happens when we look at individual apples and oranges with different varieties and attributes? We can do this by starting with EntityList, which returns the ever-growing list of specific foods that match:

Apples and oranges with different varieties and attributes

We can then compare the individual fruits in nutrient spaces, such as this plot in potassium-calcium space:

Plot of fruits in potassium-calcium space

But let’s not be limited by the fruits themselves—we can also look at their juices. Using the Ctrl+Equal interface makes this very simple:

Fruit juices

We can easily see which brands sell both apple juice and orange juice:

Brands that sell both apple and orange juices

We can even use nutrient spaces again to visualize the individual juices, but this time we can do it in three dimensions:

Comparison of fruit juices in nutrient spaces

There is definitely an outlier, which corresponds to frozen, undiluted orange juice concentrate:

Frozen, undiluted orange juice concentrate

Using non-nutritional properties, we can learn more about this food:

Learning more about frozen, undiluted orange juice through non-nutritional properties

Based on this kind of information, we can exclude foods with undesirable characteristics. In this particular case, we should exclude undiluted juices, which we can do by tweaking the implicit entity’s constraints with ContainsNone:

Excluding undesirable characteristics with ContainsNone
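
The behavior of ContainsNone itself is easy to check on plain lists. In this sketch the attribute strings are hypothetical stand-ins for whatever values the food data actually uses.

```wolfram
(* ContainsNone[list1, list2] is True when list1 shares no elements with list2 *)
ContainsNone[{"frozen", "undiluted"}, {"undiluted"}]  (* False: this food would be excluded *)
ContainsNone[{"canned"}, {"undiluted"}]               (* True: this food would be kept *)

(* Operator form, the shape that goes on the right-hand side of a constraint rule *)
ContainsNone[{"undiluted"}][{"canned"}]               (* True *)
```

In an implicit entity, the operator form is paired with a property on the left of the rule, so only foods whose attribute list avoids every listed value remain.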

After the adjustment, the plot gives better results:

Comparison of fruit juices with excluded juices

We can also extend this idea and compare apple juices and orange juices to other fruit juices:

Comparison of apple, orange, grape, pineapple, lemon, and grapefruit juices

Beyond apples and oranges

Of course, the Wolfram Language is useful for more than just comparing fruits. With non-nutritional properties and images for thousands of foods, many applications are possible. For example, we can search for foods with specific non-nutritional properties, such as yellow, approximately spherical food types:

Searching for foods with specific non-nutritional properties

Or make an image collage of vegetables that have white or cream-colored insides, weighted by iron content:

Image collage of vegetables that have white or cream-colored insides, weighted by iron content

In addition to whole foods, the Wolfram Language also has thousands of packaged foods with UPCs and images. With a little help from BarcodeRecognize, it’s easy to make a grocery store barcode scanner:

Making a grocery store barcode scanner
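
The decoding half of such a scanner can be sketched without any food data. The example below generates a barcode image from a sample valid UPC-A code (used only for illustration) and reads it back; looking up the decoded string among packaged foods is left out because the relevant property name is not given above.

```wolfram
(* Build a test image from a sample UPC-A code, then decode it *)
barcode = BarcodeImage["036000291452", "UPC"];

(* BarcodeRecognize returns the encoded string found in the image *)
BarcodeRecognize[barcode]
```

With a camera, the generated image would be replaced by a captured one, for example from CurrentImage[].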

Remember when we plotted apples and oranges in 3D space? We can do this again, but this time with packaging images for nearly 100 pizzas:

Plot of nutritional information for nearly 100 pizzas

Of course, we can also visualize the pizzas in a typical histogram of calories per 100 grams. Additionally, we can compare them to a typical pizza from a pizza chain, as reported by USDA estimates. The typical pizza calorie content is shown in the following plot as a red line:

Calorie content of pizzas
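
A plot of this kind can be sketched with a histogram and a vertical reference line. The numbers below are synthetic stand-ins, not real pizza data, and the reference value is hypothetical.

```wolfram
(* Synthetic stand-in values in kcal per 100 g; not real pizza data *)
calories = {210, 225, 240, 240, 250, 255, 260, 265, 270, 280, 295, 310};
typical = 275;  (* hypothetical reference value for a chain pizza *)

(* Epilog draws a vertical red line at the reference value on top of the bars *)
Histogram[calories, Automatic,
 Epilog -> {Red, Thick, InfiniteLine[{typical, 0}, {0, 1}]},
 AxesLabel -> {"kcal per 100 g", "count"}]
```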

It’s interesting to note that most packaged pizzas appear to have fewer calories than those from pizza chains. One possible explanation is that common pizza-chain toppings (e.g. pepperoni and sausage) are high in calories, especially compared to the many vegetarian and lower-calorie options available in packaged pizzas.

There are many other complicated visualization tools available, but nothing beats a good word cloud. This one is for leafy vegetables weighted according to their vitamin A content:

Leafy vegetables weighted according to their vitamin A content

And, of course, we’re also working to align our food data to other domains. For example, consider this query for cherry-like Pokémon:

Cherry-like Pokémon

You can even make an image collage of food-related Pokémon:

Collage of food-related Pokémon

We’re also continuing to add more data, so keep an eye out for updates!

Download this post as a Computable Document Format (CDF) file. New to CDF? Get your copy for free with this one-time download.

Posted in: Wolfram Language

One Comment

Barrie Stokes

Hi Andrew

Thanks for another very impressive Wolfram Blog, again highlighting the breadth of the Wolfram Language. The possibility of extended dietary analyses is intriguing … .

Posted by Barrie Stokes    July 3, 2016 at 6:18 pm
