Wolfram Blog

New in the Wolfram Language: WikipediaData

March 20, 2015 — Alan Joyce, Director, Content Development

Since the inception of Wolfram|Alpha, Wikipedia has held a special place in its development pipeline. We usually use it not as a primary source for data, but rather as an essential resource for improving our natural language understanding, particularly for mining the common and colloquial ways people refer to entities and concepts in various domains.

We’ve developed a lot of internal tools over the years to help us analyze and extract information from Wikipedia, but now we’ve also added a Wikipedia “integrated service” to the latest version of the Wolfram Language, making it easy for anyone to incorporate Wikipedia content into Wolfram Language workflows.

You can simply grab the text of an article, of course, and feed it into some of the Wolfram Language’s new functions for text processing and visualization:

[Image: sentences extracted from the text returned by WikipediaData]

[Image: word cloud built from the text returned by WikipediaData]
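
As a minimal sketch of those two steps: the article title "Computer" is only an illustrative choice, not necessarily the one used in the original post, and the calls need an internet connection.

```wolfram
(* Fetch the plain text of an article; "Computer" is an assumed example title *)
text = WikipediaData["Computer"];

(* Split the text into sentences and keep the first two *)
Take[TextSentences[text], 2]

(* Remove common stopwords, then draw a word cloud of what remains *)
WordCloud[DeleteStopwords[text]]
```

TextSentences, DeleteStopwords, and WordCloud are the kind of text-processing and visualization functions the paragraph refers to; any other function that accepts a string could be substituted.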

Or if you don’t have a specific article in mind, you can search by title or content:

[Image: WikipediaSearch by content or title]
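
A short sketch of both kinds of search might look like this. The query strings are made-up examples, and the results depend on the live state of Wikipedia.

```wolfram
(* Search article titles; "rainbow" is an assumed example query *)
WikipediaSearch["Title" -> "rainbow"]

(* Search article content instead of titles *)
WikipediaSearch["Content" -> "double rainbow"]
```

Each call returns a list of matching article titles, which can then be passed to WikipediaData.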

You can even use Wolfram Language entities directly in WikipediaData to, say, get equivalent page titles in any of the dozens of available Wikipedia language versions:

[Image: using entities in WikipediaData]
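
For illustration, an entity-based request could be sketched as follows. The entity is an arbitrary example, and the property name "TitleTranslationRules" is an assumption that should be checked against the WikipediaData documentation; it may not be the property used in the original post.

```wolfram
(* Assumed example entity; any entity with a Wikipedia article should work *)
mars = Entity["Planet", "Mars"];

(* Assumed property name: rules pairing each language with the
   corresponding page title in that language's Wikipedia *)
WikipediaData[mars, "TitleTranslationRules"]
```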

One of my favorite functions allows you to explore article links out from (or pointing in toward) any given article or category—either in the form of a simple list of titles, or as a list of rules that can be used with the Wolfram Language’s powerful functions for graph visualization. In fact, with just a few lines of code, you can create a beautiful and interesting visualization of the shared links between any set of Wikipedia articles:

[Image: WikisSharedLinks applied to a given article or category]
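
The function in the image above is not recoverable here, so the following is only a rough sketch of the same idea, not a reconstruction of it. The article titles and the "MaxLevelItems" limit of 200 are assumptions chosen for illustration.

```wolfram
(* Assumed example articles *)
articles = {"Alan Turing", "John von Neumann", "Claude Shannon"};

(* Outgoing links of each article as rules of the form source -> target *)
rules = DeleteDuplicates[
   Flatten[WikipediaData[#, "LinksRules", "MaxLevelItems" -> 200] & /@ articles]];

(* Targets that are linked from at least two of the articles *)
shared = Cases[Tally[Last /@ rules], {title_, n_ /; n >= 2} :> title];

(* Keep only the edges that point at a shared target and draw them *)
Graph[Select[rules, MemberQ[shared, Last[#]] &], VertexLabels -> "Name"]
```

Replacing "LinksRules" with "BacklinksRules" would follow links pointing in toward each article instead of out from it.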

There’s a lot of useful functionality here, and we’ve really only scratched the surface. Watch for many more integrated services to follow throughout the coming year.

Version 10.1 of the Wolfram Language is now supported in Mathematica and rolling out in all other Wolfram products.



12 Comments


p-bear

this is great!

Posted by p-bear    March 20, 2015 at 1:04 pm
George Woodrow III

This is great, as well as other previews of new functionality. Is there any timetable for when we might see this?

Posted by George Woodrow III    March 21, 2015 at 2:34 am
    The Wolfram Team

    Thanks for your comment. This post is a preview of upcoming functionality; be sure to keep checking the blog for more updates!

    Posted by The Wolfram Team    March 25, 2015 at 12:55 pm
Marbaehr

I cannot find the documentation for “WikipediaData” and it does not yet work in Mathematica 10.0.2 or the Programming Cloud. Could you tell me where to find the new version and / or when it will be live?

Posted by Marbaehr    March 22, 2015 at 5:45 am
    The Wolfram Team

    Thank you for your comment. This post is a preview of upcoming functionality; be sure to keep checking the blog for more updates!

    Posted by The Wolfram Team    March 25, 2015 at 12:56 pm
Vittorio G. Caffa

The Arabic entry in the table is completely wrong: the characters are not connected and not written from right to left, as they should be.

Posted by Vittorio G. Caffa    March 22, 2015 at 12:32 pm
Michael Stern

Love it. I imagine this will be fun to play with.

Posted by Michael Stern    March 26, 2015 at 3:59 pm
Mark Holmes

Most importantly (to me), the stopping power data (and Mass Attenuation data) is now in Mathematica. I can retire my cobble-code with the web queries to NIST. Thank you!

Posted by Mark Holmes    March 31, 2015 at 2:28 pm
Nemo

Are you actually using Wikidata? http://wikidata.org/

Posted by Nemo    April 2, 2015 at 1:32 am
    The Wolfram Team

    Thank you for your comment. At this time, our WikipediaData does not pull from wikidata.org.

    Posted by The Wolfram Team    April 13, 2015 at 10:17 am
    The Wolfram Team

    Thanks for your comment. The features in this blog post are part of our 10.1 update, which was recently released.

    Posted by The Wolfram Team    April 28, 2015 at 12:32 pm

