September 24, 2019 — Suba Thomas, Software Engineer, Algorithms R&D
Real-time filters work like magic. Usually out of sight, they clean data to make it useful for the larger system they are part of, and sometimes even for human consumption. A fascinating thing about these filters is that they don’t have a big-picture perspective. They work wonders with only a small window into the data that is streaming in. On the other hand, if I had a stream of numbers flying across my screen, I would at the very least need to plot it to make sense of the data. These types of filters are very simple as well.
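To make the "small window" idea concrete, here is a minimal sketch of one of the simplest real-time filters, a causal moving average. It is an illustration only, not necessarily the kind of filter the post goes on to discuss; the names `MovingAverageFilter` and `smooth` are hypothetical.

```python
from collections import deque


class MovingAverageFilter:
    """Causal moving average that only ever sees the last `window` samples."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        # The deque is the filter's entire view of the stream.
        self._buffer = deque(maxlen=window)
        self._total = 0.0

    def update(self, sample):
        """Feed one new sample and return the current filtered value."""
        if len(self._buffer) == self._buffer.maxlen:
            # The oldest sample is about to fall out of the window.
            self._total -= self._buffer[0]
        self._buffer.append(sample)
        self._total += sample
        return self._total / len(self._buffer)


def smooth(stream, window=3):
    """Run the filter over any iterable of numbers, one sample at a time."""
    filt = MovingAverageFilter(window)
    return [filt.update(x) for x in stream]
```

Calling `smooth([1, 2, 3, 4, 5], window=3)` processes the samples strictly in arrival order. At no point does the filter hold more than three values.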
September 5, 2019 — Daniel Lichtblau, Symbolic Algorithms Developer, Algorithms R&D
A Year Ago Today
On September 5 of last year, The New York Times took the unusual step of publishing an op-ed anonymously. It began “I Am Part of the Resistance inside the Trump Administration,” and quickly became known as the “Resistance” op-ed. From the start, there was wide‐ranging speculation as to who might have been the author(s); to this day, that has not been settled. (Spoiler alert: it will not be settled in this blog post, either. But that’s getting ahead of things.) When I learned of this op-ed, the first thing that came to mind, of course, was, “I wonder if authorship attribution software could….” This was followed by, “Well, of course it could. If given the right training data.” When time permitted, I had a look on the internet into where one might find training data, and for that matter who were the people to consider for the pool of candidate authors. I found at least a couple of blog posts that mentioned the possibility of using tweets from administration officials. One gave a preliminary analysis (with President Trump himself receiving the highest score, though by a narrow margin—go figure). It even provided a means of downloading a dataset that the poster had gone to some work to cull from the Twitter site.
The code from that blog was in a language/script in which I am not fluent. My coauthor on two authorship attribution papers (and other work), Catalin Stoean, was able to download the data successfully. I first did some quick validation (to be seen) and got solid results. Upon setting the software loose on the op-ed in question, a clear winner emerged. So for a short time I “knew” who wrote that piece. Except. I decided more serious testing was required.
August 22, 2019 — Sjoerd Smit, Technical Consultant, European Sales
Readers who follow the Mathematica Stack Exchange (which I highly recommend to any Wolfram Language user) may have seen this post recently, in which I showed a function I wrote to make Bayesian linear regression easy to do. Since finishing that function, I have been playing around with it to get a better feel for what it can do and how it compares against regular fitting algorithms such as those used by Fit. In this blog post, I don’t want to focus too much on the underlying technicalities (check out my previous blog post to learn more about Bayesian neural network regression); rather, I will show you some of the practical applications and interpretations of Bayesian regression, and share some of the surprising results you can get from it.
August 8, 2019 — Jesse Friedman, Software Engineer, Engine Connectivity Engineering
Cerne Abbas Walk is an artwork by Richard Long, in the collection of the Tate Modern in London and on display at the time of this writing. Several of Long’s works involve geographic representations of his walks, some abstract and some concrete. Cerne Abbas Walk is described by the artist as “a six-day walk over all roads, lanes and double tracks inside a six-mile-wide circle centred on the Giant of Cerne Abbas.” The Tate catalog notes that “the map shows his route, retracing and re-crossing many roads to stay within a predetermined circle.”
The Giant in question is a 180-foot-high chalk figure carved into a hill near the village of Cerne Abbas in South West England. Some archaeologists believe it to be of Iron Age pedigree, some think it to date from the Roman or subsequent Saxon periods and yet others find the bulk of evidence to indicate a 17th-century origin as a political satire. (I find the last theory to be both the most amusing and the most convincing.)
I found the geographic premise of Cerne Abbas Walk intriguing, so I decided to replicate it computationally.
August 2, 2019 — Bob Sandheinrich, Development Manager, Document & Media Systems
Every summer, I play in a recreational Ultimate Frisbee league—just “Ultimate” to those who play. It’s a fun, relaxed, coed league where I tend to win more friends than games.
The league is organized by volunteers, and one year, my friend and teammate Nate was volunteered to coordinate it. A couple weeks before the start of the season, Nate came to me with some desperation in his voice over making the teams. The league allows each player to request to play with up to eight other players—disparagingly referred to as their “baggage.” And Nate discovered that with over 100 players in a league, each one requesting a different combination of teammates, creating teams that would please everyone seemed to become more complicated by the minute.
Luckily for him, the Wolfram Language has a suite of graph and network tools for things like social media. I recognized that this seemingly overwhelming problem was actually a fairly simple graph problem. I asked Nate for the data, spent an evening working in a notebook and sent him the teams that night.
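One way to see why this is a graph problem: treat players as vertices and each teammate request as an edge, so that players linked by a chain of requests form a connected component that should stay together. The sketch below assumes that reading. The excerpt does not say which graph method was actually used, and the function name `baggage_groups` and the sample data in the usage note are hypothetical.

```python
from collections import defaultdict


def baggage_groups(players, requests):
    """Group players into connected components of the request graph.

    players  -- iterable of player names (vertices)
    requests -- iterable of (player, requested_teammate) pairs (edges)
    """
    parent = {p: p for p in players}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in requests:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for p in players:
        groups[find(p)].append(p)

    # Largest groups first, since they are the hardest to place on a team.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))
```

For example, with players Ann, Bo, Cy, Di, Ed and Flo and requests Ann–Bo, Bo–Cy and Di–Ed, the function returns three groups: Ann, Bo and Cy; Di and Ed; and Flo alone. Assigning whole groups to teams while balancing team sizes would still be a separate step.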
July 19, 2019 — Jamie Peterson, Technical Programs Manager, Wolfram U
Wolfram U’s latest interactive course, Multiparadigm Data Science, gives a comprehensive overview of Multiparadigm Data Science (MPDS) through a series of videos, quizzes and live computations, all running from the Wolfram Cloud. Using real-world examples, this free course provides an introduction to MPDS, strategies for improving your process and building your ideal toolkit, and the Wolfram Language functionality that makes it easy to implement.
June 20, 2019 — Brian Wood, Lead Technical Writer, Document and Media Systems
Mapping an Ancient Empire
Geocomputation is an indispensable modern tool for analyzing and viewing large-scale data such as population demographics, natural features and political borders. And if you’ve read some of my other posts, you can probably tell that I like working with maps. Recently, a Wolfram Community member asked:
“How do I make an interactive map of the Byzantine Empire through the years?”
To figure out a solution, we’ll tap into the Wolfram Knowledgebase for some historical entities, as well as some of the high-level geocomputation and visualizations of the Wolfram Language. Once we’ve created our brand-new function, we’ll submit it to the Wolfram Function Repository for anyone to use.
May 28, 2019 — Daniel Lichtblau, Symbolic Algorithms Developer, Algorithms R&D
Several Months Ago…
I wrote a blog post about the disputed Federalist Papers. These were the 12 essays (out of a total of 85) with authorship claimed by both Alexander Hamilton and James Madison. Ever since the landmark statistical study by Mosteller and Wallace published in 1963, the consensus opinion has been that all 12 were written by Madison (the Adair article of 1944, which also takes this position, discusses the long history of competing authorship claims for these essays). The field of work that gave rise to the methods used often goes by the name of “stylometry,” and it lies behind most methods for determining authorship from text alone (that is to say, in the absence of other information such as a physical typewritten or handwritten note). In the case of the disputed essays, the pool size, at just two, is as small as can be. Even so, these essays have been regarded as difficult for authorship attribution due to many statistical similarities in style shared by Hamilton and Madison.
May 23, 2019 — Brian Wood, Lead Technical Writer, Document and Media Systems
Just as Wolfram was doing AI before it was cool, so have we been doing data science since before it was mainstream. A prime example is the creation of Wolfram|Alpha—a massive project that involved engineering, modeling, analyzing, visualizing and interfacing with terabytes of data, developing a natural language interface, and deploying results in a sensible way. Wolfram|Alpha itself is a tool for doing data science, and its continued success is largely because of the underlying strategy we used to build it: a multiparadigm approach driven by natural curiosity, exploring all kinds of data, using advanced methods from a range of areas and automating as much as possible.
Any approach to data science can only be as effective as the computational tools driving it; luckily for us, we had the Wolfram Language at our disposal. Leveraging its universal symbolic representation, high-level automation and human readability—as well as its broad range of built-in computation, knowledge and interfaces—streamlined our process to help bring Wolfram|Alpha to fruition. In this post, I’ll discuss some key tenets of the multiparadigm approach, then demonstrate how they combine with the computational intelligence of the Wolfram Language to make the ideal workflow for not only discovering and presenting insights from your data, but also for creating scalable, reusable applications that optimize your data science processes.
April 26, 2019 — Tim Shedelbower, Visualization Developer, Algorithms R&D
Connect the dots. It was exciting to draw from number to number until the sudden discovery of a hidden cartoon. That was my inadvertent introduction to graph theory very early in school. Little did I know that adults used the same concept to discover hidden patterns and solve problems, such as proving that no walk can cross each of the seven Königsberg bridges joining four land masses exactly once, whereas coloring a map distinctly with four colors is always possible. These problems inspired the methods we know today as graph theory. And in honor of the work of the late mathematician and connect-the-dot author Elwyn Berlekamp, we see how sophisticated this “child’s play” can be by examining the different styles and themes we can apply to graphs.
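The Königsberg result comes down to counting: by Euler's classic argument, a connected graph has a walk that uses every edge exactly once only if zero or two of its vertices have odd degree. Here is a minimal sketch of that check. The labels A through D for the land masses and the helper names are my own, and the edge list encodes the standard textbook layout of the seven bridges.

```python
from collections import Counter

# Seven bridges joining four land masses; A is the central island.
KONIGSBERG = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_vertices(edges):
    """Return the vertices touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """Euler's degree test; assumes the graph is connected."""
    return len(odd_degree_vertices(edges)) in (0, 2)
```

All four land masses have odd degree (5, 3, 3 and 3), so `has_euler_walk(KONIGSBERG)` is `False`.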