New in the Wolfram Language: WikipediaData
Since the inception of Wolfram|Alpha, Wikipedia has held a special place in its development pipeline. We usually use it not as a primary source for data, but rather as an essential resource for improving our natural language understanding, particularly for mining the common and colloquial ways people refer to entities and concepts in various domains.
We’ve developed many internal tools over the years to help us analyze and extract information from Wikipedia, and now we’ve also added a Wikipedia “integrated service” to the latest version of the Wolfram Language, making it easy for anyone to incorporate Wikipedia content into Wolfram Language workflows.
You can simply grab the text of an article, of course, and feed it into some of the Wolfram Language’s new functions for text processing and visualization:
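As a minimal sketch of that idea, the snippet below fetches an article's plain text and passes it to a word cloud. The article title "Computer" is only an illustrative choice, not necessarily the one used in the original post, and the call requires network access to Wikipedia.

```wolfram
(* Fetch the plain text of a Wikipedia article; the title is an illustrative choice *)
text = WikipediaData["Computer"];

(* Remove common stopwords, then visualize the most frequent remaining words *)
WordCloud[DeleteStopwords[text]]
```

Since `WikipediaData` returns the text as an ordinary string here, it can be passed to any other string or text-processing function in the same way.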
Or if you don’t have a specific article in mind, you can search by title or content:
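A short sketch of both kinds of search, assuming the `WikipediaSearch` function with its `"Title"` and `"Content"` keys; the query strings are made up for illustration.

```wolfram
(* Find article titles that match a phrase *)
WikipediaSearch["Title" -> "Wolfram"]

(* Find articles whose body text mentions a phrase *)
WikipediaSearch["Content" -> "cellular automaton"]
```

Each call should return a list of article titles, which can then be fed back into `WikipediaData`.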
You can even use Wolfram Language entities directly in WikipediaData to, say, get equivalent page titles in any of the dozens of available Wikipedia language versions:
One of my favorite functions allows you to explore article links out from (or pointing in toward) any given article or category—either in the form of a simple list of titles, or as a list of rules that can be used with the Wolfram Language’s powerful functions for graph visualization. In fact, with just a few lines of code, you can create a beautiful and interesting visualization of the shared links between any set of Wikipedia articles:
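The following is a rough sketch of such a shared-link visualization, not the code from the original post. It assumes the `"LinksRules"` property and the `"MaxLevel"` and `"MaxLevelItems"` options behave as described in the `WikipediaData` documentation. The three article titles and the limit of 50 links per article are arbitrary choices.

```wolfram
(* Starting articles; these titles are illustrative assumptions *)
articles = {"Graph theory", "Network science", "Combinatorics"};

(* Outgoing links for each article as rules of the form source -> linked title *)
rules = DeleteDuplicates @ Flatten[
    WikipediaData[#, "LinksRules", "MaxLevel" -> 1, "MaxLevelItems" -> 50] & /@ articles];

(* Keep only link targets that are reached from at least two starting articles *)
shared = Keys @ Select[Counts[Last /@ rules], # >= 2 &];
sharedRules = Select[rules, MemberQ[shared, Last[#]] &];

(* Draw the shared-link graph with article titles as vertex labels *)
Graph[sharedRules, VertexLabels -> "Name"]
```

Raising `"MaxLevelItems"` gives a denser graph at the cost of longer download times. If the chosen articles have no links in common within the limit, the resulting graph is empty.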
There’s a lot of useful functionality here, and we’ve really only scratched the surface. Watch for many more integrated services to follow throughout the coming year.
Version 10.1 of the Wolfram Language is now supported in Mathematica and rolling out in all other Wolfram products.
Download this post as a Computable Document Format (CDF) file.
this is great!
This is great, as are the other previews of new functionality. Is there a timetable for when we might see this?
Thanks for your comment. This post is a preview of upcoming functionality; be sure to keep checking the blog for more updates!
I cannot find the documentation for “WikipediaData” and it does not yet work in Mathematica 10.0.2 or the Programming Cloud. Could you tell me where to find the new version and / or when it will be live?
Thank you for your comment. This post is a preview of upcoming functionality; be sure to keep checking the blog for more updates!
The Arabic entry in the table is completely wrong: the characters are not connected and are not written from right to left, as they should be.
Love it. I imagine this will be fun to play with.
Are you actually using Wikidata? http://wikidata.org/
Thank you for your comment. At this time, WikipediaData does not pull from wikidata.org.
Thanks for your comment. The features in this blog post are part of our 10.1 update, which was recently released.