Wolfram Blog: News, views, & ideas from the front lines at Wolfram Research

Creating Escher-Inspired Art with Mathematica
July 30, 2014 | Wolfram Blog
http://blog.wolfram.com/2014/07/30/creating-escher-inspired-art-with-mathematica/

Kenzo Nakamura uses Mathematica to create Escher-inspired mathematical art. His trademark piece, Three-Circle Mandala, depicts a large circle covered by three smaller, repeating circles that form a Sierpinski gasket.

When Nakamura began using Mathematica, he didn’t originally intend to use it for his artistic endeavors. He found the program by chance at a seminar while looking for the right tool to help him write his master’s thesis.

Now, in addition to using Mathematica for technical and operations research, Nakamura uses it to create Mathematica-derived visual illusions. Although his works are static drawings, their infinite properties create the illusion of movement.

Watch Nakamura discuss using Mathematica to create his drawings, and see a few of his creations.


(Video on YouTube, in Japanese)

Customer story page: http://www.wolfram.com/mathematica/customer-stories/mathematicas-role-in-creating-art.html

Nakamura, who plans to continue pursuing math as art, says, “Through these drawings, my dream is to create even one drawing that surpasses Escher’s drawings.”

You can view other Mathematica stories on our Customer Stories pages.

Announcing Wolfram SystemModeler 4
July 23, 2014 | Roger Germundsson
http://blog.wolfram.com/2014/07/23/announcing-wolfram-systemmodeler-4/

Today we are proud to announce the release of Wolfram SystemModeler 4.

Wolfram SystemModeler Logo

For SystemModeler 4, we have expanded the supported model libraries to cover many new areas. We’ve also improved workflows for everything from learning the software to developing models to analyzing and deploying them.

People have been using SystemModeler in an astonishing variety of areas. Many of those have been well supported by built-in libraries, but many are totally new domains where models typically need to be built from scratch.

For most applications, using existing model libraries gives a real boost to productivity, but developing a good library takes a lot of effort. There are many aspects to think of: the best structure for easy modeling, the right level of detail, the interfaces to other components, which components to include, documentation, etc. And you may very well have to refactor the library more than once before you’re done. Reusing components and interfaces from already tested and documented libraries not only speeds up development and learning, but also improves quality.

So we’ve made SystemModeler‘s already broad collection of built-in libraries even larger. For instance, we’ve added Digital, for digital electronics following the VHDL multivalued logic standard; QuasiStationary, for efficient approximate modeling of large analog circuits; and FundamentalWave, for modeling multiphase electrical machines. There are also many improvements to existing libraries, such as support for thermal ports in the Rotational and Translational mechanics libraries so that heat losses can be captured.

Some of the new and updated libraries in SystemModeler 4

But we also wanted to make it easy to access all the other great existing and future model libraries. So we decided to create a marketplace for both free and paid libraries, the SystemModeler Library Store. With the Library Store you get easy download and automatic installation, and we are even working with the Modelica Association to get a new standard accepted for such library bundles to enable this simple workflow more generally. All the libraries in the store are verified to work with SystemModeler 4, and we are working with developers to bring you new and updated libraries on an ongoing basis.

SystemModeler Library Store

So what sorts of modeling libraries can you already find in the Library Store? Well, they cover a variety of areas—for example, Hydraulic, for hydraulic actuators and circuits as in an excavator arm or flight controls; BioChem, for biochemical systems as in compartmental or pathway models; SmartCooling, for cooling circuits such as battery stacks or combustion engines; SystemDynamics, for sociotechnical models such as energy markets, disease propagation, and logistics; and PlanarMechanics for constrained 2D mechanical systems such as revolute joints in robots.

Some libraries in the SystemModeler Library Store

The world of modeling libraries and areas covered by SystemModeler just got bigger. And with the Library Store we expect to expand on the available libraries continuously. Interestingly, we’ve interacted with many R&D groups that have in-depth knowledge of different areas—from tire modeling for off-road machinery, to chemical reactors, to classes of disease pathways, etc.—for which libraries don’t yet exist. With the SystemModeler Library Store, there is an actual marketplace where such knowledge—if made into a library—can readily be made available. There really is no limit to the areas that can be made accessible with well-designed model libraries.

So, with more built-in libraries and a dedicated Library Store, you have a more powerful modeling tool. But how do you find out what is in one of these libraries? And how do you learn to use the software? How do you learn to model in the first place? With SystemModeler 4 we have created a new Documentation Center (online and in-product) as the hub from which questions like these can be answered.

SystemModeler 4's new Documentation Center

The Documentation Center makes it easy to browse and search all product and library documentation, which includes video and text tutorials as well as the more-structured library pages. But it also provides access to additional resources, such as free online training courses, other SystemModeler users in the Wolfram Community, technical support, and technical consulting.

The documentation is extensively cross-linked so that when you, for instance, look up a component, you will immediately find links to connectors, parameters, subcomponents, and—particularly useful—a list of examples that make use of that component. And for simulatable models, you will find links to all components that they use, as well as the ability to directly simulate the models from the documentation in SystemModeler.

 

So learning about libraries and how to use them has become much easier. But what do you do when there is no library? SystemModeler is set up to also support modeling from the ground up using the Modelica language. For SystemModeler 3 we pointed people to the Modelica book by Michael Tiller as the most accessible resource. But the book was getting out of date with recent developments, and fortuitously Michael came to us with the idea of producing an updated Creative Commons version of the book. A little later he launched a Kickstarter project, which we’re happy to say we were one of the first gold sponsors for. The project got funded and the first version became available this spring, and we are now including this book, Modelica by Example, as part of the Documentation Center in SystemModeler. This is a great resource for when you want to learn more about the Modelica language.

Model libraries are all about reusing and connecting component models in SystemModeler. But is there a way that you can reuse models outside of SystemModeler in other software? SystemModeler provides a standalone executable for simulatable models that can be called using a TCP/IP-based API. This means it can be integrated into most software systems by using the appropriate API calls.

But for simulation software we can make this easier and do away with programming. Functional Mockup Interface (FMI) is an industry standard that we and other modeling and simulation companies have been developing for this very purpose. The idea is that by standardizing the interfaces we can enable model exchange without the user needing to do any programming. This means there can be complementary tools that make use of these models, including things like system integration tools that integrate both software and hardware modules. SystemModeler 4 now supports FMI export, which can be used in several dozen other software systems immediately and in many more to come.

FMI standard

So SystemModeler on its own is a very powerful system, but when used together with Mathematica you open up a whole new world of uses, including programmatic control of most tasks in SystemModeler; support for model calibration, linearization, and control design; access to the world’s largest web of algorithms and data; interactive notebooks; cloud integration; and more.

The integration between SystemModeler and Mathematica has been improved throughout, so things are generally faster and smoother. One noticeable change is that any model is now displayed using its diagram or icon, which you can even use as input to other functions.

Mathematica accepts SystemModeler model diagrams as input to other functions

In the plot above, the model is compiled and simulated automatically. And with SystemModeler 4, you can now perform real-time simulation and visualization of models. You can even use input controls such as sliders, joysticks, and so on to affect the simulation model in real time, and you can conveniently use gauges and other real-time visualizations to display simulation states. This means you can easily build a mockup of a system using inexpensive input devices and have the system react like the real thing, whether for a virtual lab, an actual intended product, or whatever.

You can use input devices with SystemModeler 4

In SystemModeler 4, you can now create models automatically from differential equations in Mathematica without writing the detailed Modelica code. You can also create components, which are models with interfaces so they can be connected to other components. This makes it simple to derive a key component in Mathematica and then rely on SystemModeler for other parts of the overall model. In fact, you can even connect components programmatically from Mathematica, making it easy to explore whole worlds of modeling alternatives, for instance replacing one or several components with ones that have different failure behaviors.

SystemModeler 4 integrates with Mathematica

One particularly interesting type of model that is algorithmically derived from others is a control system. In the real-time simulation above, a human can directly interact with a model through input control devices. But many “smart” systems don’t have a human in the loop, but rather a controller that automatically decides inputs for the model based on measurements and an internal model, just like the familiar cruise control for a car, or autopilot for a plane. An important task in many system designs is to derive such a control algorithm. Mathematica has a full suite of control design algorithms with many new capabilities added in Mathematica 9 and 10, including automated PID tuning, support for descriptor, delay, and nonlinear systems. So in SystemModeler 4 you can now design the controller, create the corresponding model component, connect it to the rest of the system, and simulate the closed-loop system at full fidelity.
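
As a small taste of the Mathematica side of this workflow, here is a hedged sketch of automated PID tuning with the built-in control functions; the second-order plant is invented for the example, and wiring the resulting controller into a SystemModeler model is not shown:

    (* a made-up second-order plant *)
    plant = TransferFunctionModel[1/(s^2 + 2 s + 1), s];

    (* automatically tuned full PID controller *)
    controller = PIDTune[plant, "PID"]

    (* closed-loop response of the tuned system to a step reference *)
    csys = PIDTune[plant, "PID", "ReferenceOutput"];
    OutputResponse[csys, UnitStep[t], {t, 0, 10}]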

This is just a sampling of the myriad ways SystemModeler and Mathematica can be used together. Some of the uses we’ve seen include deriving and testing model equations, performing more advanced analysis of models (like sensitivity analysis or computing aggregate measures of performance or efficiency for different subsystems), and creating informative visualizations, animations, and manipulations, as well as presentation material to communicate designs to students, managers, and customers. Many of our users have already adopted this way of working, and of course we use it extensively in developing SystemModeler.

For more on what’s new in SystemModeler 4 as well as examples, free courses, and trial software, check out the SystemModeler website.

How Citizen Computation Changes Democracy: Conrad Wolfram at TEDxHousesofParliament
July 22, 2014 | Wolfram Blog
http://blog.wolfram.com/2014/07/22/how-citizen-computation-changes-democracy-conrad-wolfram-at-tedxhousesofparliament/

Conrad Wolfram at TEDxHOP
Photography by Tracy Howl and Paul Clarke

Has the newfound mass availability of data improved decisions and led to better democracy around the world? Most would say, “It’s highly questionable.”

Conrad Wolfram’s TEDx UK Parliament talk poses this question and explains how computation can be key to the answer, bridging the divide between the availability and the practical accessibility of data, individualized answers, and the democratization of new knowledge generation. This transformation will be critical not only to government efficiency and business effectiveness, but also to education, society, and democracy as a whole.

Wolfram|Alpha and Mathematica 10 demos feature throughout—including a live Wolfram Language-generated tweet.

More about Wolfram’s solutions for your organization’s data »

Launching Mathematica 10—with 700+ New Functions and a Crazy Amount of R&D
July 9, 2014 | Stephen Wolfram
http://blog.wolfram.com/2014/07/09/launching-mathematica-10-with-700-new-functions-and-a-crazy-amount-of-rd/

We’ve got an incredible amount of new technology coming out this summer. Two weeks ago we launched Wolfram Programming Cloud. Today I’m pleased to announce the release of a major new version of Mathematica: Mathematica 10.

Wolfram Mathematica 10

We released Mathematica 1 just over 26 years ago—on June 23, 1988. And ever since we’ve been systematically making Mathematica ever bigger, stronger, broader and deeper. But Mathematica 10—released today—represents the single biggest jump in new functionality in the entire history of Mathematica.

At a personal level it is very satisfying to see after all these years how successful the principles that I defined at the very beginning of the Mathematica project have proven to be. And it is also satisfying to see how far we’ve gotten with all the talent and hard work that has been poured into Mathematica over nearly three decades.

We’ll probably never know whether our commitment to R&D over all these years makes sense at a purely commercial level. But it has always made sense to me—and the success of Mathematica and our company has allowed us to take a very long-term view, continually investing in building layer upon layer of long-term technology.

One of the recent outgrowths—from combining Mathematica, Wolfram|Alpha and more—has been the creation of the Wolfram Language. And in effect Mathematica is now an application of the Wolfram Language.

But Mathematica still very much has its own identity too—as our longtime flagship product, and the system that has continually redefined technical computing for more than a quarter of a century.

And today, with Mathematica 10, more is new than in any single previous version of Mathematica. It is satisfying to see such a long curve of accelerating development—and to realize that there are more new functions being added with Mathematica 10 than there were functions altogether in Mathematica 1.
Mathematica functions over time, by version

So what is the new functionality in Mathematica 10? It’s a mixture of completely new areas and directions (like geometric computation, machine learning and geographic computation)—together with extensive strengthening, polishing and expanding of existing areas. It’s also a mixture of things I’ve long planned for us to do—but which had to wait for us to develop the necessary technology—together with things I’ve only fairly recently realized we’re in a position to tackle.

New functionality in Mathematica 10

When you first launch Mathematica 10 there are some things you’ll notice right away. One is that Mathematica 10 is set up to connect immediately to the Wolfram Cloud. Unlike Wolfram Programming Cloud—or the upcoming Mathematica Online—Mathematica 10 doesn’t run its interface or computations in the cloud. Instead, it maintains all the advantages of running these natively on your local computer—but connects to the Wolfram Cloud so it can have cloud-based files and other forms of cloud-mediated sharing, as well as the ability to access cloud-based parts of the Wolfram Knowledgebase.

If you’re an existing Mathematica user, you’ll notice some changes when you start using notebooks in Mathematica 10. Like there’s now autocompletion everywhere—for option values, strings, wherever. And there’s also a hovering help box that lets you immediately get function templates or documentation. And there’s also—as much requested by the user community—computation-aware multiple undo. It’s horribly difficult to know how and when you can validly undo Mathematica computations—but in Mathematica 10 we’ve finally managed to solve this to the point of having a practical multiple undo.

Another very obvious change in Mathematica 10 is that plots and graphics have a fresh new default look (you can get the old look with an option setting, of course):

Some new default styles in Mathematica 10

And as in lots of other areas, that’s just the tip of the iceberg. Underneath, there’s actually a whole powerful new mechanism of “plot themes”—where instead of setting lots of individual options, you can for example now just specify an overall theme for a plot—like “web” or “minimal” or “scientific”.

Plot themes in Mathematica 10
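
For instance, a single option setting restyles a whole plot; "Scientific" and "Web" are among the theme names that ship with version 10:

    (* one PlotTheme setting replaces lots of individual style options *)
    Plot[{Sin[x], Sin[2 x]}, {x, 0, 2 Pi}, PlotTheme -> "Scientific"]
    ListLinePlot[RandomReal[1, 25], PlotTheme -> "Web"]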

But what about more algorithmic areas? There’s an amazing amount there that’s new in Mathematica 10. Lots of new algorithms—including many that we invented in-house. Like the algorithm that lets Mathematica 10 routinely solve systems of numerical polynomial equations that have 100,000+ solutions. Or the cluster of algorithms we invented that for the first time give exact symbolic solutions to all sorts of hybrid differential equations or differential delay equations—making such equations now as accessible as standard ordinary differential equations.

Solving differential equations in Mathematica 10
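
Here is a toy-sized flavor of both claims; the real showcases involve vastly larger systems, and the delay equation call assumes version 10's DSolve support for such equations:

    (* a small numerical polynomial system; the same approach scales to huge ones *)
    NSolve[{x^3 + y^2 == 2, 3 x^2 - y^3 == 1}, {x, y}]

    (* a delay differential equation with constant history, solved exactly *)
    DSolve[{x'[t] == -x[t - 1], x[t /; t <= 0] == 1}, x[t], t]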

Of course, when it comes to developing algorithms, we’re in a spectacular position these days. Because our multi-decade investment in coherent system design now means that in any new algorithm we develop, it’s easy for us to bring together algorithmic capabilities from all over our system. If we’re developing a numerical algorithm, for example, it’s easy for us to do sophisticated algebraic preprocessing, or use combinatorial optimization or graph theory or whatever. And we get to make new kinds of algorithms that mix all sorts of different fields and approaches in ways that were never possible before.

From the very beginning, one of our central principles has been to automate as much as possible—and to create not just algorithms, but complete meta-algorithms that automate the whole process of going from a computational goal to a specific computation done with a specific algorithm. And it’s been this kind of automation that’s allowed us over the years to “consumerize” more and more areas of computation—and to take them from being accessible only to experts, to being usable by anyone as routine building blocks.

And in Mathematica 10 one important area where this is happening is machine learning. Inside the system there are all kinds of core algorithms familiar to experts—logistic regression, random forests, SVMs, etc. And all kinds of preprocessing and scoring schemes. But to the user there are just two highly automated functions: Classify and Predict. And with these functions, it’s now easy to call on machine learning whenever one wants.

Machine learning in Mathematica 10
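
A minimal sketch of both functions on made-up data; the method choice, preprocessing, and scoring all happen automatically:

    (* Classify learns a discrete label from examples *)
    c = Classify[{1.1 -> "A", 1.3 -> "A", 3.9 -> "B", 4.2 -> "B"}];
    c[3.5]                      (* predicted class *)
    c[3.5, "Probabilities"]     (* class probabilities *)

    (* Predict learns a numerical value the same way *)
    p = Predict[{1 -> 1.9, 2 -> 4.1, 3 -> 6.2, 4 -> 7.8}];
    p[5]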

There are huge new algorithmic capabilities in Mathematica 10 in graph theory, image processing, control theory and lots of other areas. Sometimes one’s not surprised that it’s at least possible to have such-and-such a function—even though it’s really nice to have it be as clean as it is in Mathematica 10. But in other cases it at first seems somehow impossible that the function could work.

There are all kinds of issues. Maybe the general problem is undecidable, or theoretically intractable. Or it’s ill conditioned. Or it involves too many cases. Or it needs too much data. What’s remarkable is how often—by being algorithmically sophisticated, and by leveraging what we’ve built in Mathematica and the Wolfram Language—it’s possible to work around these issues, and to build a function that covers the vast majority of important practical cases.

Another important issue is just how much we can represent and do computation on. Expanding this is a big emphasis in the Wolfram Language—and Mathematica 10 has access to everything that’s been developed there. And so, for example, in Mathematica 10 there’s an immediate symbolic representation for dates, times and time series—as well as for geolocations and geographic data.

An example of geographic visualization in Mathematica 10
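
A quick sketch of these representations, with made-up coordinates near Champaign, Illinois:

    (* dates, time series, and geo positions are first-class symbolic objects *)
    d = DateObject[{2014, 7, 9}];
    ts = TimeSeries[{{1, 1.2}, {2, 1.9}, {3, 3.1}}];
    GeoGraphics[GeoDisk[GeoPosition[{40.11, -88.24}], Quantity[5, "Miles"]]]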

The Wolfram Language has ways to represent a very broad range of things in the real world. But what about data on those things? Much of that resides in the Wolfram Knowledgebase in the cloud. Soon we’re going to be launching the Wolfram Discovery Platform, which is built to allow large-scale access to data from the cloud. But since that’s not the typical use of Mathematica, basic versions of Mathematica 10 are just set up for small-scale data access—and need explicit Wolfram Cloud Credits to get more.

Still, within Mathematica 10 there are plenty of spectacular new things that will be possible by using just small amounts of data from the Wolfram Knowledgebase.

A little while ago I found a to-do list for Mathematica that I wrote in 1991. Some of the entries on it were done in just a few years. But most required the development of towers of technology that took many years to build. And at least one has been a holdout all these years—until now.

On the to-do it was just “PDEs”. But behind those four letters are centuries of mathematics, and a remarkably complex tower of algorithmic requirements. Yes, Mathematica has been able to handle various kinds of PDEs (partial differential equations) for 20 years. But in Mathematica we always seek as much generality and robustness as possible, and that’s where the challenge has been. Because we’ve wanted to be able to handle PDEs in any kind of geometry. And while there are standard methods—like finite element analysis—for solving PDEs in different geometries, there’s been no good way to describe the underlying geometry in enough generality.

Over the years, we’ve put immense effort into the design of Mathematica and what’s now the Wolfram Language. And part of that design has involved developing broad computational representations for what have traditionally been thought of as mathematical concepts. It’s difficult—if fascinating—intellectual work, in effect getting “underneath” the math to create new, often more general, computational representations.

A few years ago we did it for probability theory and the whole cluster of things around it, like statistical distributions and random processes. Now in Mathematica 10 we’ve done it for another area: geometry.

Geometry examples in Mathematica 10

What we’ve got is really a fundamental extension to the domain of what can be represented computationally, and it’s going to be an important building block for many things going forward. And in Mathematica 10 it delivers some very powerful new functionality—including PDEs and finite elements.

So, what’s hard about representing geometry computationally? The problem is not in handling specific kinds of cases—there are a variety of methods for doing that—but rather in getting something that’s truly general, and extensible, while still being easy to use in specific cases. We’ve been thinking about how to do this for well over a decade, and it’s exciting to now have a solution.

It turns out that math in a sense gets us part of the way there—because it recognizes that there are various kinds of geometrical objects, from points to lines to surfaces to volumes, that are just distinguished by their dimensions. In computer systems, though, these objects are typically represented rather differently. 3D graphics systems, for example, typically handle points, lines and surfaces, but don’t really have a notion of volumes or solids. CAD systems, on the other hand, handle volumes and solids, but typically don’t handle points, lines and surfaces. GIS systems do handle both boundaries and interiors of regions—but only in 2D.

So why can’t we just “use the math”? The problem is that specific mathematical theories—and representations—tend once again to handle, or at least be convenient in, only specific kinds of cases. So, for example, one can describe geometry in terms of equations and inequalities—in effect using real algebraic geometry—but this is only convenient for simple “math-related” shapes. One can use combinatorial topology, which is essentially based on mesh regions, and which is quite general, but difficult to use directly—and doesn’t readily cover things like non-bounded regions. Or one could try using differential geometry—which may be good for manifolds, but doesn’t readily cover geometries with mixed dimensions, and isn’t closed under Boolean operations.

What we’ve built in effect operates “underneath the math”: it’s a general symbolic representation of geometry, which makes it convenient to apply any of these different mathematical or computational approaches. And so instead of having all sorts of separate “point in polygon”, “point in mesh”, “point on line” etc. functions, everything is based on a single general RegionMember function. And similarly Area, Volume, ArcLength and all their generalizations are just based on a single RegionMeasure function.

The result is a remarkably smooth and powerful way of doing geometry, which conveniently spans from middle-school triangle math to being able to describe the most complex geometrical forms for engineering and physics. What’s also important—and typical of our approach to everything—is that all this geometry is deeply integrated with the rest of the system. So, for example, one can immediately find equation solutions within a geometric region, or compute a maximum in it, or integrate over it—or, for that matter, solve a partial differential equation in it, with all the various kinds of possible boundary conditions conveniently being described.
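
A minimal sketch of this uniformity, using a region built from two primitives:

    (* a Boolean combination of primitives is itself a region *)
    reg = RegionDifference[Disk[{0, 0}, 2], Rectangle[{0, 0}, {2, 2}]];

    RegionMember[reg, {-1, 0}]                     (* point-in-region test *)
    RegionMeasure[reg]                             (* area, since reg is 2D *)
    NIntegrate[x^2 + y^2, Element[{x, y}, reg]]    (* integrate over the region *)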

The geometry language we have is very clean. But underneath it is a giant tower of algorithmic functionality—that relies on a fair fraction of the areas that we’ve been developing for the past quarter century. To the typical user there are few indications of this complexity—although perhaps the multi-hundred-page treatise on the details of going beyond automatic settings for finite elements in Mathematica 10 provides a clue.

Geometry is just one new area. The drive for generality continues elsewhere too. Like in image processing, where we’re now supporting most image processing operations not only in 2D but also in 3D images. Or in graph computation, where everything works seamlessly with directed graphs, undirected graphs, mixed graphs, multigraphs and weighted graphs. As usual, it’s taken developing all sorts of new algorithms and methods to deal with cases that in a sense cross disciplines, and so haven’t been studied before, even though it’s obvious they can appear in practice.

As I’ve mentioned, there are some things in Mathematica 10 that we’ve been able to do essentially because our technology stack has now reached the point where they’re possible. There are others, though, that in effect have taken solving a problem, and often a problem that we’ve been thinking about for a decade or two. An example of this is the system for handling formal math operators in Mathematica 10.

In a sense what we’re doing is to take the idea of symbolic representation one more step. In math, we’ve always allowed a variable like x to be symbolic, so it can correspond to any possible value. And we’ve allowed functions like f to be symbolic too. But what about mathematical operators like derivative? In the past, these have always been explicit—so for example they actually take derivatives if they can. But now we have a new notion of “inactive” functions and operators, which gives us a general way to handle mathematical operators purely symbolically, so that we can transform and manipulate expressions formally, while still maintaining the meaning of these operators.

Inactive functionality in Mathematica 10

This makes possible all sorts of new things—from conveniently representing complicated vector analysis expressions, to doing symbolic transformations not only on math but also on programs, to being able to formally manipulate objects like integrals, with built-in implementations of all the various generalizations of things like Leibniz’s rule.
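
A small sketch of the idea with the derivative operator:

    (* an inactive derivative stays symbolic and can be transformed formally... *)
    expr = Inactive[D][f[x] g[x], x]

    (* ...until Activate actually applies the operator *)
    Activate[expr]    (* gives g[x] f'[x] + f[x] g'[x] *)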

In building Mathematica 10, we’ve continued to push forward into uncharted computational—and mathematical—territory. But we’ve also worked to make Mathematica 10 even more convenient for areas like elementary math. Sometimes it’s a challenge to fit concepts from elementary math with the overall generality that we want to maintain. And often it requires quite a bit of sophistication to make it work. But the result is a wonderfully seamless transition from the elementary to the advanced. And in Mathematica 10, we’ve once again achieved this for things like curve computations and function domains and ranges.

The development of the Wolfram Language has had many implications for Mathematica—first visible now in Mathematica 10. In addition to all sorts of interaction with real-world data and with external systems, there are some fundamental new constructs in the system itself. An example is key-value associations, which in effect introduce “named parts” throughout the system. Another example is the general templating system, important for programmatically constructing strings, files or web pages.

Using associations in Mathematica 10
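
A minimal sketch of both constructs:

    (* associations give named parts throughout the system *)
    assoc = <|"name" -> "Mathematica", "version" -> 10|>;
    assoc["name"]

    (* the templating system fills named slots from an association *)
    TemplateApply[StringTemplate["`name` `version` is out"], assoc]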

With the Wolfram Language there are vast new areas of functionality—supporting new kinds of programming, new structures and new kinds of data, new forms of deployment, and new ways to integrate with other systems. And with all this development—and all the new products it’s making possible—one might worry that the traditional core directions of Mathematica would be left behind. But nothing is further from the truth. And in fact all the new Wolfram Language development has made possible even more energetic efforts in traditional Mathematica areas.

Partly that is the result of new software capabilities. Partly it is the result of new understanding that we’ve developed about how to push forward the design of a very large knowledge-based system. And partly it’s the result of continued strengthening and streamlining of our internal R&D processes.

We’re still a fairly small company (about 700 people), but we’ve been steadily ramping up our R&D output. And it’s amazing to see what we’ve been able to build for Mathematica 10. In the 19 months (588 days) since Mathematica 9 was released, we’ve finished more than 700 new functions that are now in Mathematica 10—and we’ve added countless enhancements to existing functions.

I think the fact that this is possible is a great tribute to the tools, team and organization we’ve built—and the strength of the principles under which we’ve been operating all these years.

To most people in the software business, if they knew the amount of R&D that’s gone into Mathematica 10, it would seem crazy. Most would assume that a 26-year-old product would be in a maintenance mode, with tiny updates being made every few years. But that’s not the story with Mathematica at all. Instead, 26 years after its initial release, its rate of growth is still accelerating. There’s going to be even more to come.

But today I’m pleased to announce that the fruits of a “crazy” amount of R&D are here: Mathematica 10.

Hungry for More Pi?
July 2, 2014 | Wolfram Blog
http://blog.wolfram.com/2014/07/02/hungry-for-more-pi/

We recently wrote some blog posts for our friends over at raspberrypi.org, sharing some of the cool things you can do with the Wolfram Language on the Raspberry Pi. Combined with the amazing projects and ideas being shared at Wolfram Community, we’re doing some seriously cool stuff on this little computer!

Raspberry Pi + Mathematica

If you haven’t been following the latest, here’s a recap of our favorite Raspberry Pi + Wolfram Language creations:

Wolfram Language One-Liners

The signature competition from our 2011 and 2012 Wolfram Technology Conferences has made its debut on the Raspberry Pi. What’s the most complex program you can write on your RPi that’s no longer than a Twitter tweet? Read the post for some inspiration!

Modeling Physics on the Raspberry Pi

Creating simple physics models is as straightforward as using the built-in functions of the Wolfram Language. Find out how you can easily visualize complex systems on your Raspberry Pi.

Hooking up Vernier Sensors

Looking for ways to incorporate the Raspberry Pi into your classroom or science lab? Vernier is one of the leading companies in scientific measurement tools—and connecting its sensors to your Raspberry Pi using the Wolfram Language lets you discover dynamic and educational ways to analyze the data you collect with them.

Cooking with Raspberry Pi

Feed your inner coder and your palate with this affordable DIY slow cooker setup. Watch the demo and see how you can use the Raspberry Pi to master the science of sous-vide cooking. Who knew programming could be so tasty?

New projects are also regularly cropping up on Community: Build a spectrometer with a RaspiCam, measure moment of inertia, or use a motion sensor to make a security system. Join the Raspberry Pi Group to keep up with the latest and greatest ideas from brilliant users just like you!

World Cup Follow-Up: Update of Winning Probabilities and Betting Results
June 26, 2014 | Etienne Bernard
http://blog.wolfram.com/2014/06/26/world-cup-follow-up-update-of-winning-probabilities-and-betting-results/

Find out Etienne’s initial predictions by visiting last week’s World Cup blog post.

The World Cup is halfway through: the group phase is over, and the knockout phase is beginning. Let’s update the winning probabilities for the remaining teams, and analyze how our classifier performed on the group-phase matches.

Of the 32 initial teams, 16 have qualified for the knockout phase:

16 teams are qualified for the knockout phase

There have been some surprises: of our 10 favorite teams, 3 have been eliminated (Portugal, England, and, most surprisingly, Spain). But most of the main teams are still there.

Using our classifier, we again compute the winning probabilities of each team. To do so, we update the team features to include the latest matches (that is, we update the Elo ratings and the goal-average features), and then we run 100,000 Monte Carlo simulations of the World Cup starting from the round of 16. Here are the probabilities we obtained:

Winning probabilities
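
For flavor, here is a toy version of that Monte Carlo step; the 50/50 weights below are placeholders, whereas the real simulation weights every pairing with the classifier's predicted probabilities:

    (* one knockout round: each adjacent pair plays, a random winner advances *)
    playRound[teams_List] := Table[
       RandomChoice[{0.5, 0.5} -> teams[[{2 i - 1, 2 i}]]],
       {i, Length[teams]/2}];

    champion[teams_] := First[NestWhile[playRound, teams, Length[#] > 1 &]];

    (* estimated title chances from 10,000 simulated brackets *)
    Tally[Table[champion[{"BRA", "COL", "FRA", "GER"}], {10000}]]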

Again, Brazil is the favorite, now with a 32% chance of winning. After its impressive victory against Spain, the Netherlands’ probability jumped to 23.5%: it is now the second favorite. Germany (21.6%) and Argentina (8.6%) follow. There is thus, according to our model, an 86% chance that one of these four teams will be champion.

Let’s now look at the possible final matches:

Possible final matches

The most probable finals are Brazil vs. Netherlands (21.5%) and Germany vs. Netherlands (16.7%). It is, however, impossible to have a Brazil vs. Germany final, since these teams are on the same side of the tournament tree. Here is the most likely tournament tree:

Tournament tree

In the knockout phase, the position in the tournament tree matters: teams on the same side as Brazil and Germany (such as France and Colombia) will have a hard time reaching the final. On the other hand, the United States, which is on the weakest side of the tree, has about a 6% chance of reaching its first World Cup final.

Finally, let’s see how far in the competition teams can hope to go. The following plots show, for the 9 favorite teams, the probabilities of reaching (in blue), and of being eliminated at (in orange), a given stage of the competition:

How far can nine favorite teams make it?

We see that Germany has a 35% chance of being eliminated at the semi-final stage (probably against Brazil), while France and Colombia will probably be stopped at the quarter-final stage (probably against Germany and Brazil, respectively).

Let’s now analyze how our classifier performed on the group-phase matches. Forty-eight matches have been played, and the classifier correctly predicted the outcome of about 62.5% of them:

62.5% prediction accuracy

This is close to the 59% accuracy obtained on the test set in the previous post. Accuracy is an interesting property to measure, but it does not reveal the full power of the classifier (we could have obtained a similar accuracy by always predicting the victory of the highest Elo-ranked team). It is more interesting to look at how reliable the probabilities computed by the classifier are. For example, let’s compute the likelihood of the classifier on past matches, that is, the probability it attributed to the sequence of actual match outcomes, P(outcome1) × P(outcome2) × … × P(outcome48):

measurer["Likelihood"]
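
The measurer object above is presumably a ClassifierMeasurements object; here is a sketch of how such numbers are obtained, where classifier and testSet are the objects built in the previous post (not defined here):

    measurer = ClassifierMeasurements[classifier, testSet];
    measurer["Accuracy"]
    measurer["Likelihood"]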

This value can be compared to the likelihood computed from commonly believed probabilities: bookmakers’ odds. Bookmakers tune their odds in order to always win money: if $3 has been bet on A, $2 on B, and $5 on C, they will set the odds (the amount you get if you bet $1 on the corresponding outcome) for A, B, and C a bit under:

Bookmakers' odds

Therefore, if we invert the odds, we can obtain the outcome probabilities believed by bettors. So, can our classifier compete with this “collective intelligence”?
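
A minimal sketch of that inversion for one match, with made-up decimal odds for Team1/Draw/Team2; the normalization strips out the bookmaker's margin:

    (* implied probabilities from decimal odds, rescaled to sum to 1 *)
    oddsToProbabilities[odds_List] := With[{inv = 1/odds}, inv/Total[inv]]

    oddsToProbabilities[{1.8, 3.6, 4.5}]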

We scraped the World Cup betting odds as they were right before each match from http://www.oddsportal.com/soccer/world/world-cup-2014/results and converted them to probabilities. We obtained a likelihood of 1.33209 × 10^-20, which is more than five times smaller than the likelihood of our classifier: there is thus about an 85% chance that our probabilities are “better” than the bookmakers’. The simple fact that our classifier’s probabilities compete with the bookmakers’ is remarkable, as we only used a few simple features to create the classifier. It is thus surprising to see that our classifier probably outperforms the bookmakers’ odds: we might even be able to make money!

To test this, let’s imagine that we bet $1 on every match using the classifier (setting the value of UtilityFunction as explained in the previous post). Here are matches that we would have got right, and their corresponding gains:

Betting with UtilityFunction

The classifier only got 38% of its bets right. However, it often chose to bet on the underdog in order to increase its expected gain. In the end, we obtained $16 of profit, which is about 33% of our stake! Have we been lucky? To answer this, we compute the probability distribution of gains (through Monte Carlo simulations) according to our probabilities and to bookmakers’:

Distribution of profits: bookmaker vs. our model

The average profit, according to our model, is $14. We have thus been a bit lucky with this $16 of profit. Can we at least conclude that our probabilities outperform the bookmakers’? Again, we can’t be sure, but by computing the probability density of obtaining a profit of $16 in both models, we see that there is a 65% chance that our model actually allows us to make money in the future…. To be tested at the next international competition!

Wolfram Programming Cloud Is Live!
June 23, 2014 | Stephen Wolfram
http://blog.wolfram.com/2014/06/23/wolfram-programming-cloud-is-live/

Twenty-six years ago today we launched Mathematica 1.0. And I am excited that today we have what I think is another historic moment: the launch of Wolfram Programming Cloud—the first in a sequence of products based on the new Wolfram Language.

Wolfram Programming Cloud

My goal with the Wolfram Language in general—and Wolfram Programming Cloud in particular—is to redefine the process of programming, and to automate as much as possible, so that once a human can express what they want to do with sufficient clarity, all the details of how it is done should be handled automatically.

I’ve been working toward this for nearly 30 years, gradually building up the technology stack that is needed—at first in Mathematica, later also in Wolfram|Alpha, and now in definitive form in the Wolfram Language. The Wolfram Language, as I have explained elsewhere, is a new type of programming language: a knowledge-based language, whose philosophy is to build in as much knowledge about computation and about the world as possible—so that, among other things, as much as possible can be automated.

The Wolfram Programming Cloud is an application of the Wolfram Language—specifically for programming, and for creating and deploying cloud-based programs.

How does it work? Well, you should try it out! It’s incredibly simple to get started. Just go to the Wolfram Programming Cloud in any web browser, log in, and press New. You’ll get what we call a notebook (yes, we invented those more than 25 years ago, for Mathematica). Then you just start typing code.

Type code in Wolfram Programming Cloud

It’s all interactive. When you type something, you can immediately run it, and see the result in the notebook.

Like let’s say you want to build a piece of code that takes text, figures out what language it’s in, then shows an image based on the flag of the largest country where it’s spoken.

First, you might want to try out the machine-learning language classifier built into the Wolfram Language:

Wolfram Language has a built-in machine-learning classifier
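
This is the built-in "Language" classifier that ships with version 10:

    Classify["Language", "bonjour tout le monde"]   (* identifies French *)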

OK. That’s a good start. Now we have to find the largest country where it’s spoken:

Find the largest country that speaks a given language

Now we can get a flag:

Find the country's flag
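
The exact language-to-country lookup used in the post isn't visible in this scraped copy, but the flag itself is one data call away:

    CountryData["France", "Flag"]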

Notebooks in the Wolfram Programming Cloud can mix text and code and anything else, so it’s easy to document what you’re doing:

Notebooks in the Wolfram Programming Cloud let you mix text, code, and more

We’re obviously already making pretty serious use of the knowledge-based character of the Wolfram Language. But now let’s say that we want to make a custom graphic, in which we programmatically superimpose a language code on the flag.

It took me about 3 minutes to write a little function to do this, using image processing:

Image processing function to put a language code on a flag
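
The function itself is shown only as an image above; here is a hedged reconstruction (the name labeledFlag and the sizes are guesses, not the post's actual code):

    (* superimpose a language code on a country's flag *)
    labeledFlag[code_String, country_String] := Module[{flag, label},
       flag = ImageResize[Rasterize[CountryData[country, "Flag"]], 400];
       label = Rasterize[Style[code, White, Bold, 64], Background -> None];
       ImageCompose[flag, label]]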

And now we can test the function:

Testing the labeled-flag function

It’s interesting to see what we’ve got going on here. There’s a bit of machine learning, some data about human languages and about countries, some typesetting, and finally some image processing. What’s great about the Wolfram Language is that all this—and much much more—is built in, and the language is designed so that all these pieces fit perfectly together. (Yes, that design discipline is what I personally have spent a fair fraction of the past three decades of my life on.)

But OK, so we’ve got a function that does something. Now what can we do with it? Well, this is one of the big things about the Wolfram Programming Cloud: it lets us use the Wolfram Language to deploy the function to the cloud.

One way we can do that is to make a web API. And that’s very straightforward to do in the Wolfram Language. We just specify a symbolic API function—then deploy it to the cloud:

Specify a symbolic API function and deploy it to the cloud
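
A hedged reconstruction of that deployment; langFlag stands in for the post's end-to-end text-to-flag function, and the "PNG" result format makes the API return an image:

    (* langFlag is a hypothetical wrapper around the steps built above *)
    api = APIFunction[{"text" -> "String"}, langFlag[#text] &, "PNG"];
    CloudDeploy[api]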

And now from anywhere on the web, if we call this API by going to the appropriate URL, our Wolfram Language code will run in the Wolfram Cloud—and we’ll get a result back on the web, in this case as a PNG:

Say "bonjour," get a French flag

There are certainly lots of bells and whistles that we can add to this. We can make a fancier image. We can make the code more efficient by precomputing things. And so on. But to me it’s quite spectacular—and extremely useful—that in a matter of seconds I’m able to deploy something to the cloud that I can use from any website, web program, etc.

Here’s another example. This time I’m setting up a URL which, every time it’s visited, gives the computed current number of minutes until the next sunset, for the inferred location of the user:

Deployment code for the computed number of minutes until sunset
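
The deployed code is shown only as an image; here is a plausible reconstruction (the exact formulation of the "minutes until the next local sunset" computation is an assumption):

    CloudDeploy[
      APIFunction[{},
        ToString[Round[QuantityMagnitude[
           DateDifference[Now, Sunset[Here], "Minute"]]]] &,
        "String"]]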

Every time you visit this URL, then, you get a number, as a piece of text. (You can also get JSON and lots of other things if you want.)

It’s easy to set up a dashboard too. Like here’s a countdown timer for sunset, which, web willing, updates every half second:

Deploy a counter for the number of seconds until sunset

How many seconds until sunset?

What about forms? Those are easy too. This creates a form that generates a map of a given location, with a disk of a given radius:

A line of code makes a web form to generate maps marked with disks
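
A sketch of what that one-liner plausibly looks like; "Location" and "Number" are built-in interpreter types, while the field names are guesses:

    CloudDeploy[
      FormFunction[{"location" -> "Location", "radius" -> "Number"},
        GeoGraphics[GeoDisk[#location, Quantity[#radius, "Miles"]]] &,
        "PNG"]]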

Here’s the form:

The map-generating form deployed on the web

And here’s the result of submitting the form:

A map with a two-mile disk centered on the Empire State Building—it's that easy

There’s a lot of fancy technology being used here. Like even the fields in the form are “Smart Fields” (as indicated by their little icons), because they can accept not just literal input, but hundreds of types of arbitrary natural language—which gets interpreted by the same Natural Language Understanding technology that’s at the heart of Wolfram|Alpha. And, by the way, if, for example, your form needs a color, the Wolfram Programming Cloud will automatically create a field with a color picker. Or you can have radio buttons, or a slider, or whatever.

OK, but at this point, professional programmers may be saying, “This is all very nice, but how do I use this in my particular environment?” Well, we’ve gone to a lot of effort to make that easy. For example, with forms, the Wolfram Language has a very clean mechanism for letting you build them out of arbitrary XML templates, to give them whatever look and feel you want.

And when it comes to APIs, the Wolfram Programming Cloud makes it easy to create “embed code” for calling an API from any standard language:

Embed code for calling an API from any standard language

Soon it’ll also be easy to deploy to a mobile app. And in the future there’ll be Embedded Wolfram Engines and other things too.

So what does it all mean? I think it’s pretty important, because it really changes the whole process—and economics—of programming. I’ve even seen it quite dramatically within our own company. As the Wolfram Language and the Wolfram Programming Cloud have been coming together, there’ve been more and more places where we’ve been able to use them internally. And each time, it’s been amazing to see programming tasks that used to take weeks or months suddenly get done in days or less.

But much more than that, the whole knowledge-based character of the Wolfram Language makes feasible for the first time all sorts of programming that were basically absurd to consider before. And indeed within our own organization, that’s for example how it became possible to build Wolfram|Alpha—which is now millions of lines of Wolfram Language code.

But the exciting thing today is that with the launch of the Wolfram Programming Cloud, all this technology is now available to anyone, for projects large and small.

It’s set up so that anyone can just go to a web browser and—for free—start writing Wolfram Language code, and even deploying it on a small scale to the Wolfram Cloud. There are then a whole sequence of options available for larger deployments—including having your very own Wolfram Private Cloud within your organization.

Something to mention is that you don’t have to do everything in a web browser. It’s been a huge challenge to implement the Wolfram Programming Cloud notebook interface on the web—and there are definite limitations imposed by today’s web browsers and tools. But there’s also a native desktop version of the Wolfram Programming Cloud—which benefits from the 25+ years of interface engineering that we’ve done for Mathematica and CDF.

Wolfram Desktop

It’s very cool—and often convenient—to be able to use the Wolfram Programming Cloud purely on the web. But at least for now you get the very best experience by combining desktop and cloud, and running the native Wolfram Desktop interface connected to the Wolfram Cloud. What’s really neat is that it all fits perfectly together, so you can seamlessly transfer notebooks between cloud and desktop.

I’ve built some pretty complex software systems in my time. But the Wolfram Programming Cloud is the most complex I’ve ever seen. Of course, it’s based on the huge technology stack of the Wolfram Language. But the collection of interactions that have to go on in the Wolfram Programming Cloud between the Wolfram Language kernel, the Wolfram Knowledgebase, the Wolfram Natural Language Understanding System, the Wolfram Cloud, and all sorts of other subsystems are amazingly complex.

There are certainly still rough edges (and please don’t be shy in telling us about them!). Many things will, for example, get faster and more efficient. But I’m very pleased with what we’re able to launch today as the Wolfram Programming Cloud.

So if you’re going to try it out, what should you actually do? First, go to the Wolfram Programming Cloud on the web:

Wolfram Programming Cloud on the web

There’s a quick Getting Started video there. Or you can check out the Examples Gallery. Or you can go to Things to Try—and just start running Wolfram Language examples in the Wolfram Programming Cloud. If you’re an experienced programmer, I’d strongly recommend going through the Fast Introduction for Programmers:

The Wolfram Language: A Fast Introduction for Programmers

This should get you up to speed on the basic principles and concepts of the Wolfram Language, and quickly get you to the point where you can read most Wolfram Language code and just start “expanding your vocabulary” across its roughly 5000 built-in functions:

Function categories for the Wolfram Language

Today is an important day not only for our company and our technology, but also, I believe, for programming in general. There’s a lot that’s new in the Wolfram Programming Cloud—some in how far it’s been possible to take things, and some in basic ideas and philosophy. And in addition to dramatically simplifying and automating many kinds of existing programming, I think the Wolfram Programming Cloud is going to make possible whole new classes of software applications—and, I suspect, a wave of new algorithmically based startups.

For me, it’s been a long journey. But today I’m incredibly excited to start a new chapter—and to be able to see what people will be able to do with the Wolfram Language and the Wolfram Programming Cloud.

Predicting Who Will Win the World Cup with Wolfram Language
June 20, 2014 | Etienne Bernard
http://blog.wolfram.com/2014/06/20/predicting-who-will-win-the-world-cup-with-wolfram-language/

Check out Etienne’s updated predictions from Thursday, June 26 here.

The FIFA World Cup is underway. From June 12 to July 13, 32 national football teams play against each other to determine the FIFA world champion for the next four years. Who will succeed? Experts and fans all have their opinions, but is it possible to answer this question in a more scientific way? Football is an unpredictable sport: few goals are scored, the supposedly weaker team often manages to win, and referees make mistakes. Nevertheless, by investigating the data of past matches and using the Wolfram Language’s new machine learning functions Predict and Classify, we can attempt to predict the outcomes of matches.

The first step is to gather data. FIFA results will soon be accessible from Wolfram|Alpha, but for now we have to do it the hard way: scrape the data from the web. Fortunately, many websites gather historical data (www.espn.co.uk, www.rsssf.com, www.11v11.com, etc.) and all the scraping and parsing can be done with Wolfram Language functions. We first stored web pages locally using URLSave and then imported these pages using Import[myfile,"XMLObject"] (and Import[myfile,"Hyperlinks"] for the links). Using XML objects allows us to keep the structure of the page, and the content can be parsed using Part and pattern-matching functions such as Cases. After the scraping, we cleaned and interpreted the data: for example, we had to infer the country from a large number of cities and used Interpreter to do so:

Spelling interpretation of Dhaka
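
A compressed sketch of that pipeline; the page URL, file name, and the table-cell pattern are invented, since real pages need page-specific patterns:

    (* save a page once, then parse the local copy *)
    URLSave["http://www.rsssf.com/results.html", "results.html"];
    xml = Import["results.html", "XMLObject"];
    links = Import["results.html", "Hyperlinks"];

    (* pattern-match table cells out of the XML structure *)
    cells = Cases[xml, XMLElement["td", _, {content_String}] :> content, Infinity];

    (* interpret a city name as a city entity, from which its country can be looked up *)
    Interpreter["City"]["Dhaka"]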

From scraping various websites, we obtained a dataset of about 30,000 international matches played by 203 teams between 1950 and 2014, covering 75,000 players. Loaded into the Wolfram Language, it amounts to about 200 MB of data. Here is a match and a player example stored in a Dataset:

Match and a player example stored in a dataset

Matches include score, date, location, competition, players, referee, and so on, along with players’ birth dates, heights, weights, numbers of selections for their national teams, etc. However, the dataset contains missing elements: most players have missing characteristics, for example. Fortunately, machine learning functions such as Predict and Classify can handle missing data automatically.

Before starting to construct a predictive model, let’s compute some amusing statistics about football matches and players.

The mean number of goals per match is 2.8 (which corresponds to one goal every 30 minutes on average). Here is the distribution of this variable:

Distribution of variable

It can be roughly approximated by a PoissonDistribution with mean 2.8, which tells us that the rate at which goals are scored is about the same in most matches. Another interesting analysis is the evolution of the mean number of goals per match from the 1950s to the present day:

Mean number of goals per match from the 1950s to present day

We see that in the ’50s almost four goals were scored on average, while nowadays, sadly, it is only about 2.5 goals per match. As a result, the probability of a tie is now higher (almost 25% of matches end in draws now, against 20% in the ’50s).
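
Returning to the Poisson approximation above, a quick numerical check:

    dist = PoissonDistribution[2.8];
    Table[{k, N[PDF[dist, k]]}, {k, 0, 6}]   (* e.g. about 6% of matches have 0 goals *)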

Here is the evolution of the (estimated) probabilities of winning when teams play in their home country and when they play away:

Evolutions of the (estimated) probabilities to win at home vs. away

The effect of playing at home is substantial: teams have about a 50% chance of winning when they are at home, but only a 27% chance when they are away! A naive prediction strategy might then be to always predict a victory for the home team. But there is not always a home team: in this World Cup, the only home team is Brazil.

Let’s now analyze what we can determine about players. Here is the average player height for matches played in a given year:

Mean height of players in a given year

As expected, players are getting taller (matching the growth of the population as a whole). However, they have not gotten heavier (at least not in the last 30 years); in fact, they are getting leaner. Here is their average Body Mass Index (BMI, computed as weight/height²) as a function of time:

Mean Body Mass Index (BMI) of players

We can see that in the ’70s, players’ average BMI increased from 23 kg/m² to 24 kg/m². In the ’80s, the average BMI stayed roughly the same, and since the ’90s it has been steadily decreasing, down to 22.8 kg/m² in 2014. It is hard to interpret the reasons for this behavior, though one could argue that in modern football, speed and agility are preferred over impact skills.

Let’s now dive into predicting football matches. In order to predict the winning probabilities for the World Cup, we need to be able to predict the results of individual matches. Predicting the exact score would be interesting, but it is not necessary for our problem. Instead, we predict whether the first team will win (labeled Team1), the second team will win (labeled Team2), or the match will end in a draw (labeled Draw). We thus want a classifier for the classes Team1, Team2, and Draw.

A first baseline would be to pick a class at random with a uniform distribution, which would give 33% accuracy. To do better, we can use some of the statistics we gathered earlier: for example, we know that only 23% of matches end in a draw, so we could predict either Team1 or Team2 at random, which would give (1 - 0.23)/2 = 38.5% accuracy. To improve upon these naive baselines, we need to start using information about the matches and teams, that is, to extract “features” and feed them to machine learning algorithms.

With our dataset, we can construct many features to feed to machine learning algorithms: the number of goals scored in previous matches, the fact that a team plays at home, etc. These algorithms try to find statistical patterns in the features, which are then used to predict the outcome of matches. With the new functions Classify and Predict, we don’t have to worry about how these algorithms work or which one to choose, but only about which features we want to give them. In our problem, we want to predict classes, and thus we will use the Classify function.

We saw in the previous analyses that teams playing in their own country have a greater chance of winning. This effect is also present at the continent level (although much weaker). We thus construct a first classifier that uses features indicating whether teams play in their own country or continent. The Country feature is set to Team1 if the first team plays in its own country, Team2 if the second team plays in its own country, and Neutral if both teams play away. The same goes for the Continent feature (when both teams are from the same continent, the feature is also set to Neutral). Our dataset uses associations to provide named features; here is a sample of it:

RandomSample using Associations
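In code, one such training example might look like this (a hypothetical sample, with the match outcome as the label):

  (* features as an Association, outcome as the class label *)
  <|"Country" -> "Team1", "Continent" -> "Neutral"|> -> "Team1"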

In order to assess the quality of our classifier, we split the dataset into a training set and a test set composed of the 2,000 most recent matches (the dataset is sorted by date here):

Training and test sets
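A minimal sketch of this split, assuming dataset is the date-sorted list of example rules:

  trainingSet = Drop[dataset, -2000];
  testSet = Take[dataset, -2000];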

We can now train the classifier with a simple command:

Training classifier
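Something along these lines (c1 is our name for the resulting ClassifierFunction):

  c1 = Classify[trainingSet]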

With this dataset, the k-nearest neighbors algorithm has been selected by Classify. We can now evaluate the classification performance on the test set:

Evaluate classification performance on test set
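A hedged sketch of this measurement, with testSet as a list of features -> outcome rules:

  (* fraction of test examples whose predicted class matches the true one *)
  N@Mean[Boole[c1[First[#]] === Last[#]] & /@ testSet]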

We obtain about 48% accuracy, which roughly matches the 50% accuracy of always predicting a home win (the test set also contains matches played at neutral venues, which explains the small difference).

Let’s now add a very valuable feature: the Elo ratings of the teams. Originally developed for chess, the Elo rating system has been adapted for football (see “World Football Elo Ratings”). This system rates teams according to how good they are. The rating has a probabilistic interpretation: if D = Elo(team1) - Elo(team2), then the predicted probability for team1 to win is P(D) = 1/(1 + 10^(-D/400)).

Every team’s Elo rating starts at 1500 (this value is arbitrary). After a team plays a match, its rating is updated according to the formula Elo_new = Elo_old + K (r - P(D)), where P(D) is the probability for the team to win, r is 1 if the team won, 0 if it lost, and 0.5 for a draw, and K is a coefficient that depends on the match type and the goal difference. Here is an implementation of the rating update in the Wolfram Language:

Implementation of rating update in Wolfram Language
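A hedged reconstruction of such an update (matchWeight and goalFactor are hypothetical helpers, splitting K into a competition weight and a goal-difference factor):

  winProbability[d_] := 1/(1 + 10^(-d/400))

  updateElo[elo_, eloOpponent_, r_, matchType_, goalDiff_] :=
   elo + matchWeight[matchType]*goalFactor[goalDiff]*
     (r - winProbability[elo - eloOpponent])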

where matchWeight gives a weight depending on the competition (60 for World Cup finals, 20 for friendly matches, etc.). Here are the computed Elo ratings with our dataset (restricted to matches before the World Cup):

Elo ratings dataset

and the time evolution of Elo ratings for some selected teams:

Time evolution of Elo ratings for selected teams

We then compute, before each match, the Elo ratings of both teams and add them as features. Here is a training example:

Compute Elo ratings of both teams

Again we train a classifier and test its accuracy:

Train a classifier and test accuracy

This time, Classify chose the logistic regression method. With this new classifier, about 58.3% of test set examples are correctly classified, which is a great improvement upon the previous classifier. In matches where draws are forbidden (in the knockout phase, for example), this classifier obtains 75.7% accuracy.

Let’s now add some extra features that we think are relevant in order to build a better classifier. Adding more features can lead to overfitting (that is, modeling patterns that are just statistical fluctuations, which reduces how well our predictions generalize to new examples). Fortunately, Classify has automatic regularization methods to avoid overfitting, so we should not be too concerned about that. We choose to add four extra features for each team:

– goal average over the last three matches
– mean age of the players
– mean number of national-team selections of the players
– mean Body Mass Index of the players

Here is a training example of the dataset:

Training example of dataset

Let’s now train our final classifier:

Train final classifier

Logistic regression has again been selected. We now generate a ClassifierMeasurements[...] object in order to query various performance results:

Generate ClassifierMeasurements object to query for various results
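In sketch form (c3 is a hypothetical name for this final classifier):

  cm = ClassifierMeasurements[c3, testSet];
  cm["Accuracy"]  (* -> about 0.589 *)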

We now have 58.9% accuracy on the test set, and 76.5% accuracy on knockout-type matches. As we can see, this is only a marginal improvement over the previous classifier. It confirms how powerful the Elo rating feature is, and it is a sign that further accuracy gains will be hard to come by. However, we should keep in mind that our dataset contained many missing values for these extra features.

Let’s now have a look at the confusion matrix for the classification on the test set:

ConfusionMatrixPlot

This matrix shows the counts c_ij of class-i examples classified as class j. The rows represent the true classes, while the columns represent the predicted classes. For example, we can read that among the 779 matches won by Team1, two have been classified as Draw, 600 as Team1, and 177 as Team2. Interestingly, the classifier predicts Draw very rarely. This is due to the low proportion of tied matches (only 23%), but it does not mean the classifier excludes the possibility of a draw; here are the classification probabilities for one example:

Classification probabilities

Is it possible to improve upon this classifier? Certainly, but we would probably need more and better-quality data. It would be interesting to have access to national championship results, to infer individual players’ skills, to model how players interact as a team, etc. With our data, the prospects for improvement seem limited, so we will continue using this classifier to predict World Cup matches.

Our goal is to predict the probabilities for each team to reach a given stage of the competition (round of 16, quarter-finals, semi-finals, final, and victory). We must infer these probabilities from the outcome probabilities of individual matches given by the classifier. One way to do so would be to compute the probabilities of all possible World Cup results. Unfortunately, the number of possible configurations grows exponentially with the number of matches, so this would be far too slow. Instead, we will simulate World Cup results with Monte Carlo simulations: for each match, we randomly pick one of the outcomes (with RandomChoice) according to their predicted distribution. We can then simulate many imaginary World Cups and count how many times a given team reaches a given stage.

We first compute the features associated with each team (continent, Elo rating, mean age, etc.). Here are the features for Brazil:

Computing features of Brazil team

Using this, we construct a function converting the features of both teams into features used by the classifier:

Convert features of team into classifier
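A rough sketch of what such a conversion could look like (teamFeatures, homeStatus, and continentStatus are hypothetical helpers; teamFeatures returns an Association of per-team features):

  toMatchFeatures[team1_, team2_, location_] := Join[
    KeyMap[# <> "1" &, teamFeatures[team1]],
    KeyMap[# <> "2" &, teamFeatures[team2]],
    <|"Country" -> homeStatus[team1, team2, location],
      "Continent" -> continentStatus[team1, team2, location]|>
  ]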

In the group stage, a win is worth three points, a draw one point, and a defeat zero points. Only the first- and second-place teams qualify. Here is a function that simulates which teams qualify for the “round of 16”:

Simulating qualified teams for "round 16"

Since we cannot predict goal averages, if two teams end up with an equal number of points, their order is chosen randomly.
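In outline, the group simulation might look like this (a hedged sketch; matchOutcome is assumed to return "Team1", "Team2", or "Draw" for a given pair of teams):

  points[outcome_] := Switch[outcome, "Team1", {3, 0}, "Team2", {0, 3}, "Draw", {1, 1}]

  simulateGroup[teams_] := Module[{pts, p},
    pts = Association[# -> 0 & /@ teams];
    Do[
     p = points[matchOutcome[pair[[1]], pair[[2]]]];
     pts[pair[[1]]] += p[[1]];
     pts[pair[[2]]] += p[[2]],
     {pair, Subsets[teams, {2}]}];
    (* shuffle first so that equal-point ties are broken randomly *)
    Take[SortBy[RandomSample[teams], -pts[#] &], 2]
  ]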

We then code a function that simulates a knockout round from a list of countries. To do so, we use the option ClassPriors in order to tell the classifier that the probability of Draw in this phase is 0:

Using ClassPriors
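A sketch of the corresponding match-level decision, assuming the hypothetical toMatchFeatures from above and that ClassPriors can be passed at query time (the host here is Brazil):

  knockoutWinner[team1_, team2_] := Module[{probs},
    probs = c3[toMatchFeatures[team1, team2, "Brazil"], "Probabilities",
      ClassPriors -> <|"Team1" -> 0.5, "Team2" -> 0.5, "Draw" -> 0|>];
    If[RandomChoice[Values[probs] -> Keys[probs]] === "Team1", team1, team2]
  ]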

We can now assemble the full simulation function:

Full simulation function

Here is one simulation and the corresponding plot of the tournament tree:

Simulation of tournament tree plot

We can now perform many trials and count how many times each team reaches a given level of the competition.

estimateProbabilities
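A bare-bones version of the estimation loop (simulateWorldCup is assumed to return the champion of one simulated tournament; here we count titles only):

  champions = Table[simulateWorldCup[], {10^5}];
  Reverse[SortBy[Tally[champions], Last]]  (* teams ranked by number of titles *)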

After performing 100,000 simulations, here is what we obtained for winning probabilities:

Winning probabilities

As one might expect, Brazil is the favorite, with a 42.5% probability of winning. This striking result is due to the fact that Brazil has both the highest Elo rating and home advantage. Spain and Germany follow as the most serious challengers, with about 21.5% and 15.6% probabilities of winning, respectively. According to our model, there is an almost 80% chance that one of these three teams will win the World Cup.

Let’s now look at the probabilities to get out of the group phase:

Group phase qualification probabilities

This ranking mirrors the ranking of victory probabilities. There are some interesting things to note: while Germany and Argentina have about the same probability of getting out of their groups, Germany is more than three times as likely to win the cup. This is partly because Germany has strong opponents in its group (Portugal, USA, and Ghana), while Argentina is in quite a weak group.

Finally, here are plots of the probabilities of reaching each stage of the competition for the nine most favored teams:

Predictions for the nine most favored teams

We can see the domination of Europe and South America in football.

At the time of writing (June 17), some matches have already been played. Let’s see how our classifier would have predicted them:

Classifier predictions

Of the first 15 matches, 11 have been correctly classified, which gives 73.3% accuracy. This is higher than expected; we have been lucky. We will report the final accuracy on all matches after the World Cup is over.

So what else can we do with this classifier? Besides being disappointed that our favorite team has little chance of winning, one straightforward application is betting. How could we do that? Let’s say we just want to bet on the result of matches (Team1 wins, Team2 wins, or Draw). The naive approach would be to bet on the outcome predicted by the classifier, but this is not the best strategy. What we really want is to maximize our gain given the probabilities predicted by the classifier and the bookmaker’s odds. To do so, we can use the option UtilityFunction, which sets the utility function of the classifier. This function defines our utility for each pair of actual and predicted classes; to make a decision, the classifier maximizes the expected utility. By default, the utility is 1 when an example is correctly classified and 0 otherwise, so the most likely class is predicted. In our case, the utility should be our monetary gain: the betting odds of the predicted outcome if the prediction is correct, and 0 otherwise. Here is how we can construct such a utility function using associations:

bettingUtilityFunction
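A hedged reconstruction (odds is an Association from outcome to decimal odds; a correct prediction earns its payout, anything else earns 0):

  bettingUtilityFunction[odds_] := Association[Table[
     actual -> Association[Table[
        predicted -> If[actual === predicted, odds[actual], 0.],
        {predicted, Keys[odds]}]],
     {actual, Keys[odds]}]]

It can then be passed to the classifier as UtilityFunction -> bettingUtilityFunction[odds].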

Now let’s say that the odds of Switzerland vs. France (June 20) are:
– Switzerland: 4.20
– Draw: 3.30
– France: 2.05

The predicted probabilities are:

Switzerland versus France

And the predicted outcome is that France will win:

France wins prediction

However, if we add the betting odds in the utility, the decision is the opposite:

Switzerland wins prediction

It thus seems reasonable to bet on Switzerland. Now, should we blindly follow the decision of the classifier? There are some counterarguments. First, this method does not take our risk aversion into account: it chooses the maximum expected utility no matter the risk. Such a strategy wins in the long run, but can lead to severe losses along the way. We also have to consider the quality of the predictions: are they better than the bookmakers’ odds? Betting odds reflect what people think, and people often put feelings into their bets (e.g., they tend to bet on their favorite team). In that sense, a cold machine learning algorithm should perform better. On the other hand, many bettors already use algorithms to bet, and those are probably more sophisticated than this one. So use at your own risk!

]]>
http://blog.wolfram.com/2014/06/20/predicting-who-will-win-the-world-cup-with-wolfram-language/feed/ 28
Wolfram Technology Conference 2014: Register Now! http://blog.wolfram.com/2014/06/17/wolfram-technology-conference-2014-register-now/ http://blog.wolfram.com/2014/06/17/wolfram-technology-conference-2014-register-now/#comments Tue, 17 Jun 2014 17:12:50 +0000 Wolfram Blog http://blog.internal.wolfram.com/?p=19554 Stephen Wolfram speaking at 2013 WTC

It’s been a productive 2014 already here at Wolfram, with tons of new technology released and a whole new world of possibilities opening up. One great way to learn more about these accomplishments is to join us at the 2014 Wolfram Technology Conference.
The conference takes place Wednesday, October 22 through Friday, October 24, in Champaign, Illinois (our headquarters). This year’s talks will highlight the Wolfram Language and the thriving ecosystem growing around it, including the new Wolfram Programming Cloud, Mathematica, Wolfram|Alpha, SystemModeler, and more.

At the conference, you’ll hear from Stephen Wolfram himself. Plus, our top Wolfram developers will cover exciting new features in depth, while industry experts will show you how you can use Wolfram technologies in your everyday work to accomplish more, and do so more efficiently.

For those not familiar with the Technology Conference, there will be several types of talks and activities rounding out a full schedule:

  • Wolfram talks: presentations on key topics and new features
  • Expert talks: presentations detailing practical applications of Wolfram technologies in education, science, industry, and business
  • Hands-on workshops: sessions dedicated to helping you create custom solutions for the problems you face every day
  • Collaborative Meet-Ups: your chance to sit down with Wolfram developers and like-minded individuals to discuss best practices and come up with innovative solutions to your most challenging problems
  • Networking events: your opportunity to connect with the Wolfram team and other participants at lunch roundtable discussions, cocktail hour, and more

Images of 2013 WTC

To get a feel for each of the different kinds of talks, check out the conference videos from 2013.
There’s never a dull moment during the three days of the conference. Stephen Wolfram delivers the opening keynote, demoing new developments just to conference participants. (By attending the conference, you’ll get the inside scoop before the general public.)
Then there’s the conference dinner and Wolfram Innovator Award ceremony. While you enjoy food, drinks, and great company, Stephen recognizes pioneers who are using Wolfram technologies in innovative ways in their fields. After the award ceremony, Stephen opens the floor for a Q&A session, a rare opportunity to have him answer your questions on any topic.

If that’s not enough, there’s also the conference one-liner competition, where you’ll have the opportunity to dazzle everyone with your ability to create powerful programs in a single line of code.

All in all, the Wolfram Technology Conference is an enjoyable, information-packed event. If you’re a fan of any of our products, you won’t want to miss it.

To ensure the best possible experience for conference participants, we limit registration to just 250 people. Register now to secure your spot!

]]>
http://blog.wolfram.com/2014/06/17/wolfram-technology-conference-2014-register-now/feed/ 1
How the Wolfram Language Measures Up http://blog.wolfram.com/2014/06/04/how-the-wolfram-language-measures-up/ http://blog.wolfram.com/2014/06/04/how-the-wolfram-language-measures-up/#comments Wed, 04 Jun 2014 15:13:59 +0000 Wolfram Blog http://blog.internal.wolfram.com/?p=19310 Back in 2012, Jon McLoone wrote a program that analyzed the coding examples of over 500 programming languages that were compiled on the wiki site Rosetta Code. He compared the programming language of Mathematica (now officially named the Wolfram Language) to 14 of the most popular and relevant languages, and found that most programs can be written in the Wolfram Language with 1/2 to 1/10 as much code—even as tasks become larger and more complex.

We were curious to see how the Wolfram Language continues to stack up, since a lot has happened in the last two years. So we updated and re-ran Jon’s code, and, much to our excitement (though we really weren’t all that surprised), the Wolfram Language remains largely superior by all accounts!

Keep in mind that the programming tasks at Rosetta Code are the typical kinds of exercises that you can write in conventional programming languages: editing text, implementing quicksort, or solving the Towers of Hanoi. You wouldn’t even think of dashing off a program in C to do handwriting recognition, yet that’s a one-liner in the Wolfram Language. And since the Wolfram Language’s ultra-high-level constructs are designed to match the way people think about solving problems, writing programs in it is usually easier than in other languages. In spite of the Rosetta Code tasks being relatively low-level applications, the Wolfram Language still wins handily on code length compared to every other language.

Here’s the same graph as in Jon’s 2012 post comparing the Wolfram Language to C. Each point gives the character counts of the same task programmed in the Wolfram Language and C. Notice the Wolfram Language still remains shorter for almost every task, staying mostly underneath the dashed one-to-one line:

Wolfram Language versus C

The same holds true for Python:

Wolfram Language versus Python

Although coding languages are typically compared by character count or line count, these measures are not reliable for the Wolfram Language. Lines are fluid and arbitrary in the Wolfram Language, and it has long, descriptive function names. On the plus side, this makes the language very straightforward and easy to understand—but it can also skew the data when trying to quantify coding efficiency by character count or line count. Instead, we can compare “tokens”: any string of letters and numbers not interrupted by a space or punctuation. This lets us measure length in “units of syntax,” which, while not perfect, gives a clearer picture of the number of distinct elements required to build a function or program.
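As a rough illustration of this metric (a sketch, not necessarily the exact tokenizer used in our analysis):

  (* count maximal runs of letters and digits *)
  tokenCount[code_String] :=
   Length[StringCases[code, RegularExpression["[a-zA-Z0-9]+"]]]

  tokenCount["f[x_] := x^2 + 2 x + 1"]  (* -> 7 *)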

And so, using tokens as our metric to compare the Wolfram Language and Python, we see a slightly different spread, but the points still sit mostly below the one-to-one line, implying that Wolfram Language programs are still comparatively shorter.

Wolfram Language versus Python using tokens

Applying a MovingMedian helps smooth out some of the noise in these results. Below, the Wolfram Language appears, on average, to increase in token count at a slower rate than Python. Using FindFit, we can estimate that a typical Python program requiring x tokens can be written in the Wolfram Language with about 3.48√x tokens, meaning a Python program that requires 1,000 tokens would need just 110 tokens in the Wolfram Language.
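In sketch form, assuming pairs holds the {pythonTokens, wolframTokens} point for each task:

  fit = FindFit[pairs, a Sqrt[x], {a}, x]  (* -> {a -> 3.48} for our data *)
  a Sqrt[1000.] /. fit  (* about 110 tokens *)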

A Python program that requires 1,000 tokens would require just 110 tokens in the Wolfram Language

Similarly, in the four comparisons below, the number of tokens naturally increases for both languages as the tasks become larger—but the Wolfram Language grows at a slower pace. (Their respective coefficients: C++ -> 2.85, C -> 2.36, Java -> 3.53, MATLAB -> 4.16.)

C++, Python, Java, MATLAB, and Wolfram Language

We can also look at the data in a table of ratios, comparing the languages across the top to the languages on the left. Numbers greater than 1 mean the language on top requires more tokens.

All tasks--token count ratio

The Wolfram Language does even better compared to every other language when looking specifically at large tasks.

Large tasks--token count ratio

To see more data, or to experiment with the code yourself, download the notebook at the end of this post. And to get a more in-depth look at the process we used to perform this analysis, give Jon’s blog post a read!

Download this post as a Computable Document Format (CDF) file.

]]>
http://blog.wolfram.com/2014/06/04/how-the-wolfram-language-measures-up/feed/ 18