Comparing Apples and Oranges with the Wolfram Language
June 30, 2016 — Andrew Steinacher, Lead Developer, Wolfram|Alpha Scientific Content
We’ve all heard the phrase “You can’t compare apples and oranges.” Well, the “impossible” can now be done within the Wolfram Language. With the help of new features and new data, you can finally compare the two fruits from the inside out. Along with a variety of interactive visualizations, the real difference between apples and oranges—or between frozen and chain pizzas, or even food-related Pokémon—is just a few lines of code away.
Comparing apples and oranges
The biggest difference most people might notice between apples and oranges is in their appearance. In case you’ve forgotten what they look like, a simple EntityValue call can jog your memory:
And although “a picture is worth a thousand words,” we can go further. For example, we can even “see inside” the fruits by asking for their typical inside and outside colors:
But appearance isn’t the only way to compare apples to oranges. We can also get an overview of their nutritional profiles by generating nutrition labels based on typical serving sizes with the Wolfram Language’s built-in natural language processing capabilities:
It looks like the fruits are not all that different, nutritionally speaking. However, oranges do have well over the daily recommended value of vitamin C.
Let’s see if we can find other nutritional differences that aren’t on the nutrition label by gathering nutritional properties on a per-gram basis:
Using the new implicit Entity syntax, we can consider all foods that are apples and oranges:
We can gather all of the nutritional data for these foods, keeping only “comparable” properties (i.e., those that have data available for both apples and oranges):
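The "comparable properties" filter is a simple idea that can be illustrated outside the Wolfram Language. The plain-Python sketch below is not the code used in this post: the property names and values are made-up placeholders rather than Wolfram Knowledgebase data, `None` stands in for a missing value, and `comparable_properties` is a hypothetical helper.

```python
# Placeholder per-gram values; None marks a missing value.
# These numbers are illustrative only, not real nutrition data.
apple = {"vitamin_c": 1.0, "fiber": 2.0, "lycopene": None, "potassium": 4.0}
orange = {"vitamin_c": 9.0, "fiber": 2.0, "lycopene": 0.0, "potassium": None}


def comparable_properties(a, b):
    """Return the properties that have a non-missing value in both foods."""
    shared = a.keys() & b.keys()
    return sorted(k for k in shared if a[k] is not None and b[k] is not None)


print(comparable_properties(apple, orange))
```

Note that a value of zero counts as data here; only properties with no value at all for either fruit are dropped.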
With a little more work, we can visualize all of the nonzero relative differences between an apple and an orange in a single plot with tooltips to show the properties (download the attached notebook to use tooltips):
Negative numbers here indicate that an orange has more of that nutrient than an apple. From this, we can see that oranges are more nutritious than apples, particularly with respect to vitamins. It’s also important to note the extreme cases at ±200% relative difference. These are nutrients where one fruit has no measurable content, but the other fruit does. We can see exactly what these are by making a grid with the five largest differences on each end of the spectrum:
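The ±200% extremes are what one would expect if the relative difference is measured against the mean of the two values. The post does not spell out the formula, so that definition is an assumption here. Under it, any nutrient present in only one fruit lands at exactly +200% or -200%. The Python sketch below uses placeholder nutrient names and amounts (not real data) to show the arithmetic and the "five largest on each end" selection.

```python
def relative_difference(a, b):
    """Percent difference of a and b relative to their mean (assumed definition).

    Assumes non-negative amounts, as nutrient contents are.
    """
    if a == b:
        return 0.0  # also covers the 0-vs-0 case, avoiding division by zero
    return 100.0 * (a - b) / ((a + b) / 2.0)


# Placeholder (apple, orange) amounts; illustrative only.
nutrients = {
    "nutrient_a": (0.0, 3.0),  # found only in the orange
    "nutrient_b": (2.0, 0.0),  # found only in the apple
    "nutrient_c": (1.0, 3.0),
    "nutrient_d": (5.0, 5.0),  # identical, so the difference is zero
}

diffs = {name: relative_difference(a, o) for name, (a, o) in nutrients.items()}
nonzero = {name: d for name, d in diffs.items() if d != 0}

# Sort from most orange-heavy (negative) to most apple-heavy (positive).
ranked = sorted(nonzero.items(), key=lambda item: item[1])
print(ranked[:5])   # up to five largest differences in favor of the orange
print(ranked[-5:])  # up to five largest differences in favor of the apple
```

With this definition the result is bounded between -200% and +200% for non-negative inputs, which is why the plot has hard limits at both ends.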
So although people say “an apple a day keeps the doctor away,” it looks like eating an orange every day would be healthier. But how well do oranges stack up against other fruits and vegetables?
I’ve deployed an interactive Manipulate in the Wolfram Cloud where you can see just that. Here’s a screenshot of it comparing oranges and the “superfood” kale, which outshines oranges in vitamins and minerals (download the attached notebook to use the Manipulate):
So far, we’ve been comparing all apple varieties to all orange varieties. But what happens when we look at individual apples and oranges with different varieties and attributes? We can do this by starting with EntityList, which lists the specific foods in our ever-growing collection:
We can then compare the individual fruits in nutrient spaces, such as this plot in potassium-calcium space:
But let’s not be limited by the fruits themselves—we can also look at their juices. Using the Ctrl+Equal interface makes this very simple:
We can easily see which brands sell both apple juice and orange juice:
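Finding the brands that sell both juices amounts to a set intersection. As a language-neutral illustration (not the Wolfram Language query used here), the Python sketch below works on hypothetical `(brand, juice type)` records with made-up brand names.

```python
# Hypothetical (brand, juice type) records; the brand names are made up.
juices = [
    ("Brand A", "apple"),
    ("Brand A", "orange"),
    ("Brand B", "apple"),
    ("Brand C", "orange"),
    ("Brand D", "orange"),
    ("Brand D", "apple"),
]

apple_brands = {brand for brand, kind in juices if kind == "apple"}
orange_brands = {brand for brand, kind in juices if kind == "orange"}

# Brands that appear in both sets sell both juices.
both = sorted(apple_brands & orange_brands)
print(both)
```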
We can even use nutrient spaces again to visualize the individual juices, but this time we can do it in three dimensions:
There is definitely an outlier, which corresponds to frozen, undiluted orange juice concentrate:
Using non-nutritional properties, we can learn more about this food:
Based on this kind of information, we can exclude foods with undesirable characteristics. In this particular case, we should exclude undiluted juices, which we can do by tweaking the implicit entity’s constraints with ContainsNone:
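The exclusion logic can be pictured as a set test: ContainsNone is satisfied when a list shares no element with the excluded list. The Python sketch below mirrors only that semantics. The food records and attribute names are hypothetical placeholders, and `contains_none` is an illustrative helper, not part of any library.

```python
def contains_none(values, excluded):
    """True if `values` shares no element with `excluded`."""
    return set(values).isdisjoint(excluded)


# Hypothetical juice records; attribute names are assumptions for illustration.
juices = [
    {"name": "orange juice A", "attributes": ["refrigerated"]},
    {"name": "orange juice concentrate", "attributes": ["frozen", "undiluted"]},
    {"name": "apple juice B", "attributes": []},
]

# Keep only the juices whose attributes include nothing from the excluded set.
kept = [j["name"] for j in juices if contains_none(j["attributes"], {"undiluted"})]
print(kept)
```

A food with no recorded attributes passes the test, so the filter removes only foods explicitly marked with an excluded attribute.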
After the adjustment, the plot gives better results:
We can also extend this idea and compare apple juices and orange juices to other fruit juices:
Beyond apples and oranges
Of course, the Wolfram Language is useful for more than just comparing fruits. With non-nutritional properties and images for thousands of foods, many applications are possible. For example, we can search for foods with specific non-nutritional properties, such as yellow, approximately spherical food types:
Or make an image collage of vegetables that have white or cream-colored insides, weighted by iron content:
In addition to whole foods, the Wolfram Language also has thousands of packaged foods with UPCs and images. With a little help from BarcodeRecognize, it’s easy to make a grocery store barcode scanner:
Remember when we plotted apples and oranges in 3D space? We can do this again, but this time with packaging images for nearly 100 pizzas:
Of course, we can also visualize the pizzas in a typical histogram of calories per 100 grams. Additionally, we can compare them to a typical pizza from a pizza chain, as reported by USDA estimates. The typical pizza calorie content is shown in the following plot as a red line:
It’s interesting to note that most packaged pizzas appear to have fewer calories than those from pizza chains. One explanation for this is that the common pizza-chain toppings (e.g., pepperoni and sausage) are high in calories, especially compared to the many vegetarian and lower-calorie options available in packaged pizzas.
There are many other complicated visualization tools available, but nothing beats a good word cloud. This one is for leafy vegetables weighted according to their vitamin A content:
And, of course, we’re also working to align our food data to other domains. For example, consider this query for cherry-like Pokémon:
You can even make an image collage of food-related Pokémon:
We’re also continuing to add more data, so keep an eye out for updates!