Any approach to data science can only be as effective as the computational tools driving it; luckily for us, we had the Wolfram Language at our disposal. Leveraging its universal symbolic representation, high-level automation and human readability, along with its broad range of built-in computation, knowledge and interfaces, streamlined our process and helped bring Wolfram|Alpha to fruition. In this post, I’ll discuss some key tenets of the multiparadigm approach, then demonstrate how they combine with the computational intelligence of the Wolfram Language to make an ideal workflow not only for discovering and presenting insights from your data, but also for creating scalable, reusable applications that optimize your data science processes.

Given all the buzzwords floating around over the past few years—automated machine learning, edge AI, adversarial neural networks, natural language processing—you might be tempted to grab a sleek new method from arXiv and call it a solution. And while this can be a convenient way to solve a specific problem at hand, it also tends to limit the range of questions you can answer. One main goal of the multiparadigm approach is to remove those kinds of constraints from your workflow, instead letting questions and curiosity drive your analysis.

Leading with questions is easiest when you start from a high level. Wolfram Language functions use built-in parameter selection, which lets you focus on the overarching task rather than the technical details. Your workflow might require data from any number of sources and formats; `Import` automatically detects the file type and structure of your data:

Import["surveydata.csv"] // Shallow

`SemanticImport` goes even further, interpreting expressions in each field and displaying everything in an easy-to-view `Dataset`:

data = SemanticImport["surveydata.csv"]

Apply `FindDistribution` to your data, and it auto-selects a fitting method to give you an approximate distribution:

dist = FindDistribution[ages = Normal@data[All, "Age"]]

Another quick line of code generates a `SmoothHistogram` plot comparing the actual data to the computed distribution:

SmoothHistogram[{ages, RandomVariate[dist, Length[ages]]}, PlotLegends -> {"Data", "Computed"}]

You can then ask yourself, “What is going on in that graph?”, drill down with more computation and get more info about the fit:

DistributionFitTest[ages, dist, "TestDataTable"]

Go deeper by trying specific distributions:

Column[Table[DistributionFitTest[ages, FindDistribution[ages, TargetFunctions -> {d}], "TestDataTable"], {d, {NormalDistribution, PascalDistribution, NegativeBinomialDistribution}}]]

The simple input-output flow of a Wolfram Notebook helps move the process forward, with each step building on the previous computation. Every evaluation gives clear output that can be used immediately in further analysis, letting you code as fast as you think (or, at least, as fast as you type).

Using a question-answer workflow with human-readable functions and interactive coding gives you unprecedented freedom for computational exploration.

Although it’s easy to think of data science as a numbers game, the best insights often come from exploring images, audio, text, graphs and other data. In most systems, this involves either tracking down and switching between frameworks to support different data types or writing custom code to convert everything to the appropriate type and structure.

But again, underlying technical details shouldn’t be the focus of your data science workflow. To simplify the process, the Wolfram Data Framework (WDF) expresses everything the same way: tables and matrices, text, images, graphics, graphs and networks are all represented as symbols:

This reduces the time and effort for reading standard data types, sorting through unstructured data or handling mixed data types. Wolfram Language functions generally work on a variety of data types, such as `LearnDistribution` (big brother to `FindDistribution`), which can generate a probability distribution for a set of images:

images = ResourceData["CIFAR-100"][[All, 1]];

ld = LearnDistribution[images]

The distribution can be used to examine the likelihood of a given image being from the set:

Grid[Table[{i, RarerProbability[ld, i]}, {i, CloudGet["https://wolfr.am/DLdsWtJA"]}]]

You can even generate new representative samples:

RandomVariate[ld, 10]

WDF also makes it easy to construct high-level entities with uniformly structured data. This is especially useful for representing complex real-world concepts:

us = Entity["Country", "UnitedStates"]; RandomSample[us["Properties"], 5]

You can use high-level natural language input for immediate access to entities and their properties:

(* the free-form input "all us states" resolves to the entity class below *) states = EntityList[EntityClass["AdministrativeDivision", "AllUSStatesPlusDC"]]

Entities and associated data values can then be used and combined in computations:

GeoRegionValuePlot[states -> EntityProperty["AdministrativeDivision", "Population"], GeoRange -> us]

You can build entire computational workflows based on this curated knowledge. Custom entities made with `EntityStore` have the same flexibility. With data unified through WDF, you won’t have to worry about the size, type or structure of data—leaving more time for finding answers.
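As a rough illustration of the custom-entity idea, here is a minimal sketch of an `EntityStore`. The "Sensor" type, its two entities and all property values are hypothetical, invented only for this example, and `EntityRegister` assumes a recent version of the language.

```wolfram
(* Hypothetical "Sensor" entity type; every name and value here is made up *)
store = EntityStore["Sensor" -> <|
    "Entities" -> <|
      "S1" -> <|"Location" -> "front axle", "SampleRate" -> Quantity[100, "Hertz"]|>,
      "S2" -> <|"Location" -> "rear axle", "SampleRate" -> Quantity[250, "Hertz"]|>
    |>,
    "Properties" -> <|
      "Location" -> <|"Label" -> "location"|>,
      "SampleRate" -> <|"Label" -> "sample rate"|>
    |>
  |>];

(* Make the store visible to Entity, EntityValue and related functions *)
EntityRegister[store];

(* Query the custom entities the same way as curated ones *)
rate = EntityValue[Entity["Sensor", "S1"], "SampleRate"]
place = EntityValue[Entity["Sensor", "S2"], "Location"]
```

Once registered, the custom entities can be passed to the same functions used above for countries and cities.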

Discovery comes from trying new ideas, so a truly discovery-focused data science workflow should go beyond standard areas like statistics and machine learning. You can find more by testing and combining different computational techniques on your data—borrowing from geocomputation, time series analysis, signal processing, network analysis and engineering. To do that, you need algorithms for every subject and discipline, as well as the ability to change techniques without a major code rewrite.

Fortunately, Wolfram Language code uses the same symbolic structure as its data, ensuring maximum compatibility with no preprocessing required. Computational methods (e.g. model selection, data resampling, plot styles) are also standardized and automated across a range of functionality. Syntax, logic and conventions are uniform no matter what domain you’re exploring:

For a case in point, look at our data exploration of sensors from the ThrustSSC supersonic car. Standard data partitioning and time series techniques were useful in translating and understanding the data. But we also opted to try a few unconventional approaches, such as using `CommunityGraphPlot` to group together similar datasets:

A dive into signal processing—specifically, wavelet analysis—led to the discovery of certain discontinuities in the vibrational frequency of a wheel:

As it turned out, these gaps represented the wheel’s top edge crossing the sound barrier—a phenomenon that was understood by the engineers but had not been verified by previous analyses. Even a quick exploration using a broad, high-level toolset can provide better insights with less expertise (and a *lot* less code) than more siloed approaches.

Data science doesn’t stop with the discovery of answers; you need to interpret and share your results before anyone can act on them. That means presenting the right information to the right people in the right way, whether it’s sending a few static visualizations, deploying an interactive desktop or web app, generating an automated report or making a full write-up of your project. One major goal of the multiparadigm approach is to express insights in the clearest way possible, regardless of context.

For the basic cases, the Wolfram Language has visualizations for a range of analyses, with high-level plot themes and the ability to add labels, frames and other details all inline:

cities = EntityClass["City", {EntityProperty["City", "Country"] -> Entity["Country", "UnitedStates"], EntityProperty["City", "Population"] -> TakeLargest[10]}]["Entities"];

BubbleChart[EntityValue[cities, {"GiniIndex", "Area", "PerCapitaIncome"}], PlotTheme -> "Marketing", ColorFunction -> "TemperatureMap", ChartLabels -> Callout[EntityValue[cities, "Name"]], PlotLabel -> Style["Gini Index Data: 10 Largest U.S. Cities", "Title", 24]]

And using functions like `Manipulate`, you can interactively explore additional parameters to find patterns in the data:

With[{params = {EntityProperty["City", "HousingAffordability"], EntityProperty["City", "MedianHouseholdIncome"], EntityProperty["City", "PopulationByEducationalAttainment"], EntityProperty["City", "PublicTransportationAnnualPassengerMiles"], EntityProperty["City", "UnemploymentRate"], EntityProperty["City", "RushHours"]}}, Manipulate[BubbleChart[EntityValue[cities, {"GiniIndex", y, z}]], {y, params}, {z, params}]]

Documenting your project’s workflow can be equally important; a clear, detailed narrative gives real-world context to an analysis and makes it understandable to a broad audience. The combination of interactive, human-readable code with plain language properly organized creates what Stephen Wolfram calls a computational essay:

This kind of high-level document is easy to produce using Wolfram Notebooks, which let you mix code, text, images, interfaces and other expressions in a hierarchical cell structure. With built-in interactivity and live code, the audience can follow the same discovery process as the author.

Computational essays are typical of the kind of output produced by the multiparadigm approach. But sometimes you need more information with fewer words—say, a financial dashboard:

From there, you can send the notebook to anyone for interactive viewing with Wolfram Player. Or deploy it as a web form in the Wolfram Cloud, add permissions for your colleagues and set up an automated report. You could even set up an API so others can create their own interfaces. It’s all built into the language.
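The API route mentioned above might look something like the following sketch. The parameter names `x` and `y` and the ratio computation are placeholders chosen for illustration, and the deployment step assumes you are already signed in to the Wolfram Cloud.

```wolfram
(* Toy API: takes two numbers and returns their ratio.
   The parameter names "x" and "y" are illustrative only. *)
ratio = APIFunction[{"x" -> "Number", "y" -> "Number"}, #x/#y &];

(* Try the function locally before deploying it *)
local = ratio[<|"x" -> "1", "y" -> "4"|>]

(* Deploy to the Wolfram Cloud; requires a cloud login (CloudConnect[]).
   Permissions -> "Private" keeps the endpoint restricted to your account. *)
api = CloudDeploy[ratio, Permissions -> "Private"]
```

The deployed object is a URL endpoint, so colleagues could call it from their own interfaces once permissions are widened.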

Every interface has its unique use for data scientists and end users. The Wolfram Language gives you access to the full spectrum of interfaces for analyzing and reporting on your data—and they can be made permanently accessible for interactive viewing from anywhere, making them ideal for sharing ideas with the wider world.

Following these principles leads to an optimized workflow that includes every aspect of the data science process, from data to deployment. Taking it a step further, the multiparadigm approach uses automation as much as possible, both simplifying the task at hand and making subsequent explorations easier.

This brings us back to Wolfram|Alpha: an adaptive web application that accepts a broad range of input styles, automatically chooses the appropriate data source for a given task and runs optimized computations on that data to provide answers at scale. When viewed as a whole, the system exemplifies the multiparadigm approach.

For instance, take a question involving revenue and GDP:

In this case, the system must first interpret the natural language statement, retrieving the entities and functions necessary to compute the ratio of revenue to GDP during a given time period. But beyond having access to the right data, it must be able to bring those different data sources together instantaneously when a response is needed.

Another example is revenue forecasting:

On top of the steps from the previous example, this computation also uses automated model selection, in this case choosing log-normal random walks. And in both cases, the system returns additional information to the user, all organized in a high-level report.

Wolfram|Alpha can be used in this way for all kinds of analysis, at any scale, always using the latest algorithms and data—making the full data science process available through simple natural language queries.

Ultimately, the best insights come from augmenting human curiosity with intelligent computation—and that’s exactly what multiparadigm data science in the Wolfram Language achieves. The result is a scalable, start-to-finish computational workflow designed around human understanding and usability.

Making high-level computation more accessible leads to the democratization of data science, giving anyone with questions immediate access to answers. The multiparadigm approach is more than just a new method for data science; creating and sharing high-level tools for interactive exploration make possible a new generation of data science.

So what kind of insights does your data hold? There’s only one way to find out: start exploring!

Preview Wolfram U’s upcoming open online course to learn more about the multiparadigm data science project workflow.

It happens far too often. I’ll be talking to a software developer, and they’ll be saying how great they think our technology is, and how it helped them so much in school, or in doing R&D. But then I’ll ask them, “So, are you using Wolfram Language and its computational intelligence in your production software system?” Sometimes the answer is yes. But too often, there’s an awkward silence, and then they’ll say, “Well, no. Could I?”

I want to make sure the answer to this can always be: “Yes, it’s easy!” And to help achieve that, today we’re releasing the Free Wolfram Engine for Developers. It’s a full engine for the Wolfram Language that can be deployed on any system and called from programs, languages, web servers or anything else.

The Wolfram Engine is the heart of all our products. It’s what implements the Wolfram Language, with all its computational intelligence, algorithms, knowledgebase, and so on. It’s what powers our desktop products (including Mathematica), as well as our cloud platform. It’s what’s inside Wolfram|Alpha—as well as an increasing number of major production systems out in the world. And as of today, we’re making it available for anyone to download, for free, to use in their software development projects.

Many people know the Wolfram Language (often in the form of Mathematica) as a powerful system for interactive computing—and for doing R&D, education, data science and “computational X” for many X. But increasingly it’s also being used “behind the scenes” as a key component in building production software systems. And what the Free Wolfram Engine for Developers now does is to package it so it’s convenient to insert into a whole range of software engineering environments and projects.

It’s worth explaining a bit about how I see the Wolfram Language these days. (By the way, you can run it immediately on the web in the Wolfram Language Sandbox.) The most important thing is to realize that the Wolfram Language as it now exists is really a new kind of thing: a full-scale computational language. Yes, it’s an extremely powerful and productive (symbolic, functional, …) programming language. But it’s much more than that. Because it’s got the unique feature of having a huge amount of computational knowledge built right into it: knowledge about algorithms, knowledge about the real world, knowledge about how to automate things.

We’ve been steadily building up what’s now the Wolfram Language for more than 30 years—and one thing I’m particularly proud of (though it’s hard work; e.g. check out the livestreams!) is how uniform, elegant and stable a design we’ve been able to maintain across the whole language. There are now altogether 5000+ functions in the language, covering everything from visualization to machine learning, numerics, image computation, geometry, higher math and natural language understanding—as well as lots of areas of real-world knowledge (geo, medical, cultural, engineering, scientific, etc.).

In recent years, we’ve also introduced lots of hardcore software engineering capabilities—instant cloud deployment, network programming, web interaction, database connectivity, import/export (200+ formats), process control, unit testing, report generation, cryptography, blockchain, etc. (The symbolic nature of the language makes these particularly clean and powerful.)

The goal of the Wolfram Language is simple, if ambitious: have everything be right there, in the language, and be as automatic as possible. Need to analyze an image? Need geographic data? Audio processing? Solve an optimization problem? Weather information? Generate 3D geometry? Anatomical data? NLP entity identification? Find anomalies in a time series? Send a mail message? Get a digital signature? All these things (and many, many more) are just functions that you can immediately call in any program you write in Wolfram Language. (There are no libraries to hunt down; everything is just integrated into the language.)

Back on the earliest computers, all one had was machine code. But then came simple programming languages. And soon one could also take it for granted that one’s computer would have an operating system. Later also networking, then a user interface, then web connectivity. My goal with the Wolfram Language is to provide a layer of computational intelligence that in effect encapsulates the computational knowledge of our civilization, and lets people take it for granted that their computer will know how to identify objects in an image, or how to solve equations, or what the populations of cities are, or countless other things.

And now, today, what we want to do with the Free Wolfram Engine for Developers is to make this something ubiquitous, and immediately available to any software developer.

The Free Wolfram Engine for Developers implements the full Wolfram Language as a software component that can immediately be plugged into any standard software engineering stack. It runs on any standard platform (Linux, Mac, Windows, RasPi, …; desktop, server, virtualized, distributed, parallelized, embedded). You can use it directly with a script, or from a command line. You can call it from programming languages (Python, Java, .NET, C/C++, …), or from other systems (Excel, Jupyter, Unity, Rhino, …). You can call it through sockets, ZeroMQ, MQTT or its own native WSTP (Wolfram Symbolic Transfer Protocol). It reads and writes hundreds of formats (CSV, JSON, XML, …), and connects to databases (SQL, RDF/SPARQL, Mongo, …), and can call external programs (executables, libraries, …), browsers, mail servers, APIs, devices, and languages (Python, NodeJS, Java, .NET, R, …). Soon it’ll also plug directly into web servers (J2EE, aiohttp, Django, …). And you can edit and manage your Wolfram Language code with standard IDEs, editors and tools (Eclipse, IntelliJ IDEA, Atom, Vim, Visual Studio Code, Git, …).
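To make the "use it directly with a script" case concrete, here is a minimal sketch of a script file. The file name `summary.wls` and the sample numbers are hypothetical; the sketch assumes the `wolframscript` launcher that ships with the engine is on your path.

```wolfram
#!/usr/bin/env wolframscript
(* Hypothetical script file: summary.wls *)

(* Made-up sample values standing in for real input *)
data = {2.1, 3.4, 5.6, 1.2};

(* Compute a summary statistic and write it to standard output *)
result = Mean[data];
Print[result]
```

Saved to a file, this could be run from a shell with `wolframscript -file summary.wls`, which is one way to slot the engine into an existing build or batch pipeline.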

The Free Wolfram Engine for Developers has access to the whole Wolfram Knowledgebase, through a free Basic subscription to the Wolfram Cloud. (Unless you want real-time data, everything can be cached, so you can run the Wolfram Engine without network connectivity.) The Basic subscription to the Wolfram Cloud also lets you deploy limited APIs in the cloud.

A key feature of the Wolfram Language is that you can run the exact same code anywhere. You can run it interactively using Wolfram Notebooks—on the desktop, in the cloud, and on mobile. You can run it in a cloud API (or scheduled task, etc.), on the public Wolfram Cloud, or in a Wolfram Enterprise Private Cloud. And now, with the Wolfram Engine, you can also easily run it deep inside any standard software engineering stack.

(Of course, if you want to use our whole hyperarchitecture spanning desktop, server, cloud, parallel, embedded, mobile—and interactive, development and production computing—then a good entry point is Wolfram|One, and, yes, there are trial versions available.)

OK, so how does the licensing for Free Wolfram Engine for Developers work? For the past 30+ years, our company has had a very straightforward model: we license our software to generate revenue that allows us to continue our long-term mission of continuous, energetic R&D. We’ve also made many important things available for free—like our main Wolfram|Alpha website, Wolfram Player and basic access to the Wolfram Cloud.

The Free Wolfram Engine for Developers is intended for use in pre-production software development. You can use it to develop a product for yourself or your company. You can use it to conduct personal projects at home, at school or at work. And you can use it to explore the Wolfram Language for future production projects. (Here’s the actual license, if you’re curious.)

When you have a system ready to go into production, then you get a Production License for the Wolfram Engine. Exactly how that works will depend on what kind of system you’ve built. There are options for local individual or enterprise deployment, for distributing the Wolfram Engine with software or hardware, for deploying in cloud computing platforms—and for deploying in the Wolfram Cloud or Wolfram Enterprise Private Cloud.

If you’re making a free, open-source system, you can apply for a Free Production License. Also, if you’re part of a Wolfram Site License (of the type that, for example, most universities have), then you can freely use Free Wolfram Engine for Developers for anything that license permits.

We haven’t worked out all the corners and details of every possible use of the Wolfram Engine. But we are committed to providing predictable and straightforward licensing for the long term (and we’re working to ensure the availability and vitality of the Wolfram Language in perpetuity, independent of our company). We’ve now had consistent pricing for our products for 30+ years, and we want to stay as far away as possible from the many variants of bait-and-switch which have become all too prevalent in modern software licensing.

I’m very proud of what we’ve created with Wolfram Language, and it’s been wonderful to see all the inventions, discoveries and education that have happened with it over decades. But in recent years there’s been a new frontier: the increasingly widespread use of the Wolfram Language inside large-scale software projects. Sometimes the whole project is built in Wolfram Language. Sometimes Wolfram Language is inserted to add some critical computational intelligence, perhaps even just in a corner of the project.

The goal of the Free Wolfram Engine for Developers is to make it easy for anyone to use the Wolfram Language in any software development project—and to build systems that take advantage of its computational intelligence.

We’ve worked hard to make the Free Wolfram Engine for Developers as easy to use and deploy as possible. But if there’s something that doesn’t work for you or your project, please send me mail! Otherwise, please use what we’ve built—and do something great with it!

*To comment, please visit the copy of this post at the Stephen Wolfram Blog »*

Today it’s 10 years since we launched Wolfram|Alpha. At some level, Wolfram|Alpha is a never-ending project. But it’s had a great first 10 years. It was a unique and surprising achievement when it first arrived, and over its first decade it’s become ever stronger and more unique. It’s found its way into more and more of the fabric of the computational world, both realizing some of the long-term aspirations of artificial intelligence, and defining new directions for what one can expect to be possible. Oh, and by now, a significant fraction of a billion people have used it. And we’ve been able to keep it private and independent, and its main website has stayed free and without external advertising.

For me personally, the vision that became Wolfram|Alpha has a very long history. I first imagined creating something like it more than 47 years ago, when I was about 12 years old. Over the years, I built some powerful tools—most importantly the core of what’s now Wolfram Language. But it was only after some discoveries I made in basic science in the 1990s that I felt emboldened to actually try building what’s now Wolfram|Alpha.

It was—and still is—a daunting project. To take all areas of systematic knowledge and make them computable. To make it so that any question that can in principle be answered from knowledge accumulated by our civilization can actually be answered, immediately and automatically.

Leibniz had talked about something like this 350 years ago; Turing 70 years ago. But while science fiction (think the *Star Trek* computer) had imagined it, and AI research had set it as a key goal, 50 years of actual work on question-answering had failed to deliver. And I didn’t know for sure if we were in the right decade—or even the right century—to be able to build what I wanted.

But I decided to try. And it took lots of ideas, lots of engineering, lots of diverse scholarship, and lots of input from experts in a zillion fields. But by late 2008 we’d managed to get Wolfram|Alpha to the point where it was beginning to work. Day by day we were making it stronger. But eventually there was no sense in going further until we could see how people would actually use it.

And so it was that on May 18, 2009, we officially opened Wolfram|Alpha up to the world. And within hours we knew it: Wolfram|Alpha really worked! People asked all kinds of questions, and got successful answers. And it became clear that the paradigm we’d invented of generating synthesized reports from natural language input by using built-in computational knowledge was very powerful, and was just what people needed.

Perhaps because the web interface to Wolfram|Alpha was just a simple input field, some people assumed it was like a search engine, finding content on the web. But Wolfram|Alpha isn’t searching anything; it’s computing custom answers to each particular question it’s asked, using its own built-in computational knowledge—that we’ve spent decades amassing. And indeed, quite soon, it became clear that the vast majority of questions people were asking were ones that simply didn’t have answers already written down anywhere on the web; they were questions whose answers had to be computed, using all those methods and models and algorithms—and all that curated data—that we’d so carefully put into Wolfram|Alpha.

As the years have gone by, Wolfram|Alpha has found its way into intelligent assistants like Siri, and now also Alexa. It’s become part of chatbots, tutoring systems, smart TVs, NASA websites, smart OCR apps, talking (toy) dinosaurs, smart contract oracles, and more. It’s been used by an immense range of people, for all sorts of purposes. Inventors have used it to figure out what might be possible. Leaders and policymakers have used it to make decisions. Professionals have used it to do their jobs every day. People around the world have used it to satisfy their curiosity about all sorts of peculiar things. And countless students have used it to solve problems, and learn.

And in addition to the main, public Wolfram|Alpha, there are now all sorts of custom “enterprise” Wolfram|Alphas operating inside large organizations, answering questions using not only public data and knowledge, but also the internal data and knowledge of those organizations.

It’s fun when I run into high-school and college kids who notice my name and ask “Are you related to Wolfram|Alpha?” “Well”, I say, “actually, I am”. And usually there’s a look of surprise, and a slow dawning of the concept that, yes, Wolfram|Alpha hasn’t always existed: it had to be created, and there was an actual human behind it. And then I often explain that actually I first started thinking about building it a long time ago, when I was even younger than them…

When I started building Wolfram|Alpha I certainly couldn’t prove it would work. But looking back, I realize there was a collection of key things, mostly quite unique to us and our company, that ultimately made it possible. Some were technical, some were conceptual, and some were organizational.

On the technical side, the most important was that we had what was then Mathematica, but is now the Wolfram Language. And by the time we started building Wolfram|Alpha, it was clear that the unique symbolic programming paradigm that we’d invented to be the core of the Wolfram Language was incredibly general and powerful—and could plausibly succeed at the daunting task of providing a way to represent all the computational knowledge in the world.

It also helped a lot that there was so much algorithmic knowledge already built into the system. Need to solve a differential equation to compute a trajectory? Just use the built-in `NDSolve` function! Need to solve a difficult recurrence relation? Just use `RSolve`. Need to simplify a piece of logic? Use `BooleanMinimize`. Need to do the combinatorial optimization of finding the smallest number of coins to give change? Use `FrobeniusSolve`. Need to find out how long to cook a turkey of a certain weight? Use `DSolve`. Need to find the implied volatility of a financial derivative? Use `FinancialDerivative`. And so on.
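Two of the cases above can be sketched in a few lines. The coin denominations and the target amount of 37 cents are assumptions chosen for the example, as is the Boolean expression being simplified.

```wolfram
(* All ways to make 37 cents from pennies, nickels, dimes and quarters;
   each solution lists the count of every denomination in that order *)
solutions = FrobeniusSolve[{1, 5, 10, 25}, 37];

(* Keep the solution that uses the fewest coins *)
fewest = First[MinimalBy[solutions, Total]]
(* {2, 0, 1, 1}: two pennies, one dime, one quarter *)

(* Simplify a piece of logic: (a && b) || (a && !b) reduces to a *)
simplified = BooleanMinimize[(a && b) || (a && ! b)]
(* a *)
```

`FrobeniusSolve` enumerates every combination, so picking the one with the smallest total is a separate step here.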

But what about all that actual data about the world? All the information about cities and movies and food and so on? People might have thought we’d just be able to forage the web for it. But I knew very quickly this wouldn’t work: the data—if it even existed on the web—wouldn’t be systematic and structured enough for us to be able to correctly do actual computations from it, rather than just, for example, displaying it.

So this meant there wouldn’t be any choice but to actually dive in and carefully deal with each different kind of data. And though I didn’t realize it with so much clarity at the time, this is where our company had another extremely rare and absolutely crucial advantage. We’ve always been a very intellectual company (no doubt to our commercial detriment)—and among our staff we, for example, have PhDs in a wide range of subjects, from chemistry to history to neuroscience to architecture to astrophysics. But more than that, among the enthusiastic users of our products we count many of the world’s top researchers across a remarkable diversity of fields.

So when we needed to know about proteins or earthquakes or art history or whatever, it was easy for us to find an expert. At first, I thought the main issue would just be “Where is the best source of the relevant data?” Sometimes that source would be very obvious; sometimes it would be very obscure. (And, yes, it was always fun to run across people who’d excitedly say things like: “Wow, we’ve been collecting this data for decades and nobody’s ever asked for it before!”)

But I soon realized that having raw data was only the beginning; after that came the whole process of understanding it. What units are those quantities in? Does -99 mean that data point is missing? How exactly is that average defined? What is the common name for that? Are those bins mutually exclusive or combined? And so on. It wasn’t enough just to have the data; one also had to have an expert-level dialog with whoever had collected the data.
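The "-99 means missing" question shows why this matters for computation. The sketch below uses made-up readings and assumes the data provider has confirmed that -99 is a missing-value sentinel; it is not a description of how Wolfram|Alpha's curation pipeline actually works.

```wolfram
(* Made-up readings in which -99 is assumed to be a sentinel for "no data" *)
readings = {12.1, -99, 13.4, -99, 11.8};

(* A naive average treats the sentinels as real measurements *)
naive = Mean[readings];

(* Replace the sentinel with an explicit Missing marker at the top level *)
cleaned = Replace[readings, x_ /; x == -99 -> Missing["NotAvailable"], {1}];

(* Average only the readings that are actually present *)
corrected = Mean[DeleteMissing[cleaned]]
```

The naive mean comes out strongly negative, while the corrected mean sits among the three valid readings, which is the kind of discrepancy that only surfaces once the encoding is understood.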

But then there was another issue: people want answers to questions, not raw data. It’s all well and good to know the orbital parameters for a television satellite, but what most people will actually want to know is where the satellite is in the sky at their location. And to work out something like that requires some method or model or algorithm. And this is where experts were again crucial.

My goal from the beginning was always to get the best research-level results for everything. I didn’t consider it good enough to use the simple formula or the rule of thumb. I wanted to get the best answers that current knowledge could give—whether it was for time to sunburn, pressure in the ocean, mortality curves, tree growth, redshifts in the early universe, or whatever. Of course, the good news was that the Wolfram Language almost always had the built-in algorithmic power to do whatever computations were needed. And it was remarkably common to find that the original research we were using had actually been done with the Wolfram Language.

As we began to develop Wolfram|Alpha we dealt with more and more domains of data, and more and more cross-connections between them. We started building streamlined frameworks for doing this. But one of the continuing features of the Wolfram|Alpha project has been that however good the frameworks are, every new area always seems to involve new and different twists—that can be successfully handled only because we’re ultimately using the Wolfram Language, with all its generality.

Over the years, we’ve developed an elaborate art of data curation. It’s a mixture of automation (these days, often using modern machine learning), management processes, and pure human tender loving care applied to data. I have a principle that there always has to be an expert involved—or you’ll never get the right answer. But it’s always complicated to allocate resources and to communicate correctly across the phases of data curation—and to inject the right level of judgement at the right points. (And, yes, in an effort to make the complexities of the world conveniently amenable to computation, there are inevitably judgement calls involved: “Should the Great Pyramid be considered a building?”, “Should Lassie be considered a notable organism or a fictional character?” “What was the occupation of Joan of Arc?”, and so on.)

When we started building Wolfram|Alpha, there’d already been all sorts of thinking about how large-scale knowledge should best be represented computationally. And there was a sense that—much like logic was seen as somehow universally applicable—so also there should be a universal and systematically structured way to represent knowledge. People had thought about ideas based on set theory, graph theory, predicate logic, and more—and each had had some success.

Meanwhile, I was no stranger to global approaches to things—having just finished a decade of work on my book *A New Kind of Science*, which at some level can be seen as being about the theory of all possible theories. But partly because of the actual science I discovered (particularly the idea of computational irreducibility), and partly because of the general intuition I had developed, I had what I now realize was a crucial insight: there’s not going to be a useful general theory of how to represent knowledge; the best you can ever ultimately do is to think of everything in terms of arbitrary computation.

And the result of this was that when we started developing Wolfram|Alpha, we began by just building up each domain “from its computational roots”. Gradually, we did find and exploit all sorts of powerful commonalities. But it’s been crucial that we’ve never been stuck having to fit all knowledge into a “data ontology graph” or indeed any fixed structure. And that’s a large part of why we’ve successfully been able to make use of all the rich algorithmic knowledge about the world that, for example, the exact sciences have delivered.

Perhaps the most obviously AI-like part of my vision for Wolfram|Alpha was that you should be able to ask it questions purely in natural language. When we started building Wolfram|Alpha there was already a long tradition of text retrieval (from which search engines had emerged), as well as of natural language processing and computational linguistics. But although these all dealt with natural language, they weren’t trying to solve the same problem as Wolfram|Alpha. Because basically they were all taking existing text, and trying to extract from it things one wanted. In Wolfram|Alpha, what we needed was to be able to take questions given in natural language, and somehow really understand them, so we could compute answers to them.

In the past, exactly what it meant for a computer to “understand” something had always been a bit muddled. But what was crucial for the Wolfram|Alpha project was that we were finally in a position to give a useful, practical definition: “understanding” for us meant translating the natural language into precise Wolfram Language. So, for example, if a user entered “What was the gdp of france in 1975?” we wanted to interpret this as the Wolfram Language symbolic expression `Entity["Country", "France"][Dated["GDP", 1975]]`.

And while it was certainly nice to have a precise representation of a question like that, the real kicker was that this representation was immediately computable: we could immediately use it to actually compute an answer.
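To make the distinction between representing and computing concrete, the expression above can be split into its symbolic parts and then evaluated. This is only a sketch: the variable names are illustrative, and the final step needs a connection to the Wolfram Knowledgebase.

```wolfram
(* The entity and the dated property are inert symbolic expressions *)
country = Entity["Country", "France"];
property = Dated["GDP", 1975];

(* Applying the entity to the property is what triggers the actual computation *)
country[property]
```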

In the past, a bane of natural language understanding had always been the ambiguity of things like words in natural language. When you say “apple”, do you mean the fruit or the company? When you say “3 o’clock”, do you mean morning or afternoon? On which day? When you say “springfield”, do you mean “Springfield, MA” or one of the 28 other possible Springfield cities?

But somehow, in Wolfram|Alpha this wasn’t such a problem. And it quickly became clear that the reason was that we had something that no previous attempt at natural language understanding had ever had: we had a huge and computable knowledgebase about the world. So “apple” wasn’t just a word for us: we had extensive data about the properties of apples as fruit and Apple as a company. And we could immediately tell that “apple vitamin C” was talking about the fruit, “apple net income” about the company, and so on. And for “Springfield” we had data about the location and population and notoriety of every Springfield. And so on.
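A toy sketch of this kind of disambiguation: each sense of a word is listed with the properties the knowledgebase has data for, and a query is resolved to whichever sense knows about the property being asked for. The association and the `disambiguate` function are hypothetical illustrations, not how Wolfram|Alpha is actually implemented.

```wolfram
(* Toy knowledgebase: for each word, the properties known for each sense (all hypothetical) *)
senses = <|
   "apple" -> <|
     "Food" -> {"vitamin c", "calories", "sugar"},
     "Company" -> {"net income", "revenue", "employees"}
   |>
|>;

(* Return the senses of a word that have data for the requested property *)
disambiguate[word_String, property_String] :=
  Keys @ Select[Lookup[senses, word, <||>], MemberQ[#, property] &]

disambiguate["apple", "vitamin c"]
disambiguate["apple", "net income"]
```

With this toy data, the first query resolves to the food sense and the second to the company sense; a word that is not in the knowledgebase simply yields an empty list.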

It’s an interesting case where things were made easier by solving a much larger problem: we could be successful at natural language understanding because we were also solving the huge problem of having broad and computable knowledge about the world. And also because we had built the whole symbolic language structure of the Wolfram Language.

There were still many issues, however. At first, I’d wondered if traditional grammar and computational linguistics would be useful. But they didn’t apply well to the often-not-very-grammatical inputs people actually gave. And we soon realized that instead, the basic science I’d done in *A New Kind of Science* could be helpful—because it gave a conceptual framework for thinking about the interaction of many different simple rules operating on a piece of natural language.

And so we added the strange new job title of “linguistic curator”, and set about effectively curating the semantic structure of natural language, and creating a practical way to turn natural language into precise Wolfram Language. (And, yes, what we did might shed light on how humans understand language—but we’ve been so busy building technology that we’ve never had a chance to explore this.)

OK, so we can solve the difficult problem of taking natural language and turning it into Wolfram Language. And with great effort we’ve got all sorts of knowledge about the world, and we can compute all kinds of things from it. But given a particular input, what output should we actually generate? Yes, there may be a direct answer to a question (“42”, “yes”, whatever). And in certain circumstances (like voice output) that may be the main thing you want. But particularly when visual display is possible, we quickly discovered that people find richer outputs dramatically more valuable.

And so, in Wolfram|Alpha we use the computational knowledge we have to automatically generate a whole report about the question you asked:

We’ve worked hard on both the structure and content of the information presentation. There’d never been anything quite like it before, so everything had to be invented. At the top, there are sometimes “Assumings” (“Which Springfield did you mean?”, etc.)—though the vast majority of the time, our first choice is correct. We found it worked very well to organize the main output into a series of “pods”, often with graphical or tabular contents. Many of the pods have buttons that allow for drilldown, or alternatives.

Everything is generated programmatically. And which pods are there, with what content, and in what sequence, is the result of lots of algorithms and heuristics—including many that I personally devised. (Along the way, we basically had to invent a whole area of “computational aesthetics”: automatically determining what humans will find aesthetic and easy to interpret.)

In most large software projects, one’s building things to precise specifications. But one of the complexities of Wolfram|Alpha is that so much of what it does is heuristic. There’s no “right answer” to exactly what to plot in a particular pod, over what range. It’s a judgement call. And the overall quality of Wolfram|Alpha directly depends on doing a good job at making a vast number of such judgement calls.

But who should make these judgement calls? It’s not something pure programmers are used to doing. It takes real computational thinking skills, and it also usually takes serious knowledge of each content area. Sometimes similar judgement calls get repeated, and one can just say “do it like that other case”. But given how broad Wolfram|Alpha is, it’s perhaps not surprising that there are an incredible number of different things that come up.

And as we approached the launch of Wolfram|Alpha I found myself making literally hundreds of judgement calls every day. “How many different outputs should we generate here?” “Should we add a footnote here?” “What kind of graphic should we produce in that case?”

In my long-running work on designing Wolfram Language, the goal is to make everything precise and perfect. But for Wolfram|Alpha, the goal is instead just to have it behave as people want—regardless of whether that’s logically perfect. And at first, I worried that with all the somewhat arbitrary judgement calls we were making to achieve that, we’d end up with a system that felt very incoherent and unpredictable. But gradually I came to understand a sort of logic of heuristics, and we developed a good rhythm for inventing heuristics that fit together. And in the end—with a giant network of heuristic algorithms—I think we’ve been very successful at creating a system that broadly just automatically does what people want and expect.

Looking back now, more than a decade after the original development of Wolfram|Alpha, it begins to seem even more surprising—and fortuitous—that the project ended up being possible at all. For it is clear now that it critically relied on a whole collection of technical, conceptual and organizational capabilities that we (and I) happened to have developed by just that time. And had even one of them been missing, it would probably have made the whole project impossible.

But even given the necessary capabilities, there was the matter of actually doing the project. And it certainly took a lot of leadership and tenacity from me—as well as all sorts of specific problem solving—to pursue a project that most people (including many of those working on it) thought, at least at first, was impossible.

How did the project actually get started? Well, basically I just decided one day to do it. And, fortunately, my situation was such that I didn’t really have to ask anyone else about it—and as a launchpad I already had a successful, private company without outside investors that had been running well for more than a decade.

From a standard commercial point of view, most people would have seen the Wolfram|Alpha project as a crazy thing to pursue. It wasn’t even clear it was possible, and it was certainly going to be very difficult and very long term. But I had worked hard to put myself in a position where I could do projects just because I thought they were intellectually valuable and important—and this was one I had wanted to do for decades.

One awkward feature of Wolfram|Alpha as a project is that it didn’t work, until it did. When I tried to give early demos, too little worked, and it was hard to see the point of the whole thing. And this led to lots of skepticism, even from my own management team. So I decided it was best to do the project quietly, without saying much about it. And though it wasn’t my intention, things ramped up to the point where a couple hundred people were working completely under the radar (in our very geographically distributed organization) on the project.

But finally, Wolfram|Alpha really started to work. I gave a demo to my formerly skeptical management team, and by the end of an hour there was uniform enthusiasm, and lots of ideas and suggestions.

And so it was that in the spring of 2009, we prepared to launch Wolfram|Alpha.

On March 4, 2009, the wolframalpha.com domain lit up, with a simple:

On March 5, I posted a short (and, in the light of the past decade, satisfyingly prophetic) blog that began:

We were adding features and fixing bugs at a furious pace. And rack by rack we were building infrastructure to actually support the system (yes, below all those layers of computational intelligence there are ultimately computers with power cables and network connectors and everything else):

At the beginning, we had about 10,000 cores set up to run Wolfram|Alpha (back then, virtualization wasn’t an option for the kind of performance we wanted). But we had no real idea if this would be enough—or what strange things missed by our simulations might happen when real people started using the system.

We could just have planned to put up a message on the site if something went wrong. But I thought it would be more interesting—and helpful—to actually show people what was going on behind the scenes. And so we decided to do something very unusual—and livestream to the internet the process of launching Wolfram|Alpha.

We planned our initial go-live to occur on the evening of Friday, May 15, 2009 (figuring that traffic would be lower on a Friday evening). And we built our version of a “Mission Control” to coordinate everything:

There were plenty of last-minute issues, many of them captured on the livestream. But in classic Mission Control style, each of our teams finally confirmed that we were “go for launch”—and at 9:33:50 pm CT, I pressed the big “Activate” button, and soon all network connections were open, and Wolfram|Alpha was live to the world.

Queries immediately started flowing in from around the world, and within a couple of hours it was clear that the concept of Wolfram|Alpha was a success and that people found it very useful. It wasn’t long before bugs and suggestions started coming in too. And for a decade we’ve been told that we should give answers about the strangest things (“How many teeth does a snail have?” “How many spiders does the average American eat?” “Which superheroes can hold Thor’s hammer?” “What is the volume of a dog’s eyeball?”).

After our initial go-live on Friday evening, we spent the weekend watching how Wolfram|Alpha was performing (and fixing some hair-raising issues, for example about the routing of traffic to our different colos). And then, on Monday May 18, 2009, we declared Wolfram|Alpha officially launched.

So what’s happened over the past decade? Every second, there’s been new data flowing into Wolfram|Alpha. Weather. Stock prices. Aircraft positions. Earthquakes. Lots and lots more. Some things update only every month or every year (think: government statistics). Other things update when something happens (think: deaths, elections, etc.). Every week, there are administrative divisions that change in some country around the world. And, yes, occasionally there’s even a new official country (actually, only South Sudan in the past decade).

Wolfram|Alpha has got both broader and deeper in the past decade. There are new knowledge domains. About cat breeds, shipwrecks, cars, battles, radio stations, mines, congressional districts, anatomical structures, function spaces, glaciers, board games, mythological entities, yoga poses and many, many more. Of course, the most obvious domains, like countries, cities, movies, chemicals, words, foods, people, materials, airlines and mountains were already present when Wolfram|Alpha first launched. But over the past decade, we’ve dramatically extended the coverage of these.

What a decade ago was a small or fragmentary area of data, we’ve now systematically filled out—often with great effort. 140,000+ new kinds of food. 350,000 new notable people. 170+ new properties about 58,000 public companies. 100+ new properties about species (tail lengths, eye counts, etc.). 1.6 billion new data points from the US Census. Sometimes we’ve found existing data providers to work with, but quite often we’ve had to painstakingly curate the data ourselves.

It’s amazing how much in the world can be made computable if one puts in the effort. Like military conflicts, for example, which required both lots of historical work, and lots of judgement. And with each domain we add, we’ve put more and more effort into ensuring that it connects with other domains (What was the geolocation of the battle? What historical countries were involved? Etc.).

From even before Wolfram|Alpha launched, we had a wish list of domains to add. Some were comparatively easy. Others—like military conflicts or anatomical structures—took many years. Often, we at first thought a domain would be easy, only to discover all sorts of complicated issues (I had no idea how many different categories of model, make, trim, etc. are important for cars, for example).

In earlier years, we did experiments with volunteer and crowd-sourced data collection and curation. And in some specific areas this worked well, like local information from different countries (how do shoe sizes work in country X?), and properties of fictional characters (who were Batman’s parents?). But as we’ve built out more sophisticated tools, with more automation—as well as tuning our processes for making judgement calls—it’s become much more difficult for outside talent to be effective.

For years, we’ve been the world’s most prolific reporter of bugs in data sources. But with so much computable data about so many things, as well as so many models about how things work, we’re now in an absolutely unique position to validate and cross-check data, and to use the latest machine learning to discover patterns and detect anomalies.
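One simple form of cross-checking can be sketched as follows: compare the same property as reported by two independent sources, and flag the items where they disagree by more than a tolerance. The item names, values and 5% threshold are all hypothetical, and this is not a description of the actual validation pipeline.

```wolfram
(* Hypothetical values of one property as reported by two independent sources *)
sourceA = <|"item1" -> 100., "item2" -> 250., "item3" -> 40.|>;
sourceB = <|"item1" -> 101., "item2" -> 2500., "item3" -> 40.|>;

(* Relative disagreement for every item that appears in both sources *)
relativeDifference = Merge[
   KeyIntersection[{sourceA, sourceB}],
   Abs[First[#] - Last[#]]/Max[Abs[#]] &];

(* Items whose sources differ by more than an assumed 5% tolerance *)
suspicious = Keys @ Select[relativeDifference, # > 0.05 &]
```

Here only the second item is flagged, which is the kind of factor-of-ten discrepancy that usually deserves a look by a human expert.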

Of course, data is just one part of the Wolfram|Alpha story. Because Wolfram|Alpha is also full of algorithms—both precise and heuristic—for computing all kinds of things. And over the past decade, we’ve added all sorts of new algorithms, based on recent advances in science. We’ve also been able to steadily polish what we have, covering all those awkward corner cases (“Are angle units really dimensionless or not?”, “What is the country code of a satphone?”, and so on).

One of the big unknowns when we first launched Wolfram|Alpha was how people would interact with it, and what forms of linguistic input they would give. Many billions of queries later, we know a lot about that. We know a thousand ways to ask how much wood a woodchuck can chuck, etc. We know all the bizarre variants people use to specify even simple arithmetic with units. Every day we collect the “fallthroughs”—inputs we didn’t understand. And for a decade now we’ve been steadily extending our knowledgebase and our natural language understanding system to address them.

Ever since we first launched what’s now the Wolfram Language 30+ years ago, we’ve supported things that would now be called machine learning. But over the past decade, we’ve also become leaders in modern neural nets and deep learning. And in some specific situations, we’ve now been able to make good use of this technology in Wolfram|Alpha.

But there’s been no magic bullet, and I don’t expect one. If one wants to get data that’s systematically computable, one can’t forage it from the web, even with the finest modern machine learning. One can use machine learning to make suggestions in the data curation pipeline, but in the end, if you want to get the right answer, you need a human expert who can exercise judgement based on the accumulated knowledge of a field. (And, yes, the same is true of good training sets for many machine learning tasks.)

In the natural language understanding we need to do for Wolfram|Alpha, machine learning can sometimes help, especially in speeding things up. But if one wants to be certain about the symbolic interpretation of natural language, then—a bit like for doing arithmetic—to get good reliability and efficiency there’s basically no choice but to use the systematic algorithmic approach that we’ve been developing for many years.

Something else that’s advanced a lot since Wolfram|Alpha was launched is our ability to handle complex questions that combine many kinds of knowledge and computation. To do this has required several things. It’s needed more systematically computable data, with consistent structure across domains. It’s needed an underlying data infrastructure that can handle more complex queries. And it’s needed the ability to handle more sophisticated linguistics. None of these have been easy—but they’ve all steadily advanced.

By this point, Wolfram|Alpha is one of the more complex pieces of software and data engineering that exists in the world. It helps that it’s basically all written in Wolfram Language. But over time, different parts have outgrown the frameworks we originally built for them. And an important thing we’ve done over the past decade is to take what we’ve learned from all our experience, and use it to systematically build a sequence of more efficient and more general frameworks. (And, yes, it’s never easy refactoring a large software system, but the high-level symbolic character of the Wolfram Language helps a lot.)

There’s always new development going on in the Wolfram|Alpha codebase—and in fact we normally redeploy a new version every two weeks. Wolfram|Alpha is a very complex system to test. Partly that’s because what it does is so diverse. Partly that’s because the world it’s trying to represent is a complex place. And partly it’s because human language usage is so profoundly non-modular. (“3 chains” is probably—at least for now—a length measurement, “2 chains” is probably a misspelling of a rapper, and so on.)

What should Wolfram|Alpha know about? My goal has always been to have it eventually know about everything. But obviously one’s got to start somewhere. And when we were first building Wolfram|Alpha we started with what we thought were the “most obvious” areas. Of course, once Wolfram|Alpha was launched, the huge stream of actual questions that people ask has defined a giant to-do list, which we’ve steadily been working through, now for a decade.

When Wolfram|Alpha gets used in a new environment, new kinds of questions come up. Sometimes they don’t make sense (like “Where did I put my keys?” asked of Wolfram|Alpha on a phone). But often they do. Like asking Wolfram|Alpha on a device in a kitchen “Can dogs eat X?”. (And, yes, we’ll be trying to give the best answer current science can provide.)

But I have to admit that, particularly before we launched Wolfram|Alpha, I was personally one of our main sources of “we should know about this” input. I collected reference books, seeing what kinds of things they covered. Wherever I went, I looked for informational posters to see what was on them. And whenever I wondered about pretty much anything, I’d try to see how we could compute about it.

“How long will it take me to read this document?” “What country does that license plate come from?” “What height percentile are my kids at?” “How big is a typical 50-year-old oak tree?” “How long can I stay in the sun today?” “What planes are overhead now?” And on and on. Thousands upon thousands of different kinds of questions.

Often we’d be contacting world experts on different, obscure topics—always trying to get definitive computational knowledge about everything. Sometimes it’d seem as if we’d gone quite overboard, working out details nobody would ever possibly care about. But then we’d see people using those details, and sometimes we’d hear “Oh, yes, I use it every day; I don’t know anyplace else to get this right”. (I’ve sometimes thought that if Wolfram|Alpha had been out before 2008, and people could have seen our simulations, they wouldn’t have been caught with so many adjustable-rate mortgages.)

And, yes, it’s a little disappointing when one realizes that some fascinating piece of computational knowledge that took considerable effort to get right in Wolfram|Alpha will—with current usage patterns—probably only be used a few times in a century. But I view the Wolfram|Alpha project in no small part as a long-term effort to encapsulate the knowledge of our civilization, regardless of whether any of it happens to be popular right now.

So even if few people make queries about caves or cemeteries or ocean zones right now, or want to know about different types of paper, or custom screw threads, or acoustic absorption in different materials, I’m glad we’ve got all these things in Wolfram|Alpha. Because now it’s computational knowledge that can be used by anyone, anytime in the future.

We’ve put—and continue to put—an immense amount of effort into developing and running Wolfram|Alpha. So how do we manage to support doing that? What’s the business model?

The main Wolfram|Alpha website is simply free for everyone. Why? Because we want it to be that way. We want to democratize computational knowledge, and let anyone anywhere use what we’ve built.

Of course, we hope that people who use the Wolfram|Alpha website will want to buy other things we make. But on the website itself there’s simply no “catch”: we’re not monetizing anything. We’re not running external ads; we’re not selling user data; we’re just keeping everything completely private, and always have.

But obviously there are ways in which we are monetizing Wolfram|Alpha—otherwise we wouldn’t be able to do everything we’re doing. At the simplest level, there are subscription-based Pro versions on the website that have extra features of particular interest to students and professionals. There’s a Wolfram|Alpha app that has extra features optimized for mobile devices. There are also about 50 specialized apps (most for both mobile and web) that support more structured access to Wolfram|Alpha, convenient for students taking courses, hobbyists with particular interests, and professionals with standard workflows they repeatedly follow.

Then there are Wolfram|Alpha APIs—which are widely licensed by companies large and small (there’s a free tier for hobbyists and developers). There are multiple different APIs. Some are optimized for spoken results, some for back-and-forth conversation, some for visual display, and so on. Sometimes the API is used for some very specific purpose (calculus, particular socioeconomic data, tide computations, whatever). But more often it’s just set up to take any natural language query that arrives. (These days, specialized APIs are actually usually better built directly with Wolfram Language, as I’ll discuss a bit later.) Most of the time, the Wolfram|Alpha API runs on our servers, but some of our largest customers have private versions running inside their infrastructure.

When people access Wolfram|Alpha from different parts of the world, we automatically use local conventions for things like units, currency and so on. But when we first built Wolfram|Alpha we fundamentally did it for English language only. I always believed, though, that the methods for natural language understanding that we invented would work for other languages too, despite all their differences in structure. And it turns out that they do.

Each language is a lot of work, though. Even the best automated translation helps only a little; to get reliable results one has to actually build up a new algorithmic structure for each language. But that’s only the beginning. There’s also the issue of automatic natural language generation for output. And then there’s localized data relevant for the countries that use a particular language.

But we’re gradually working on building versions of Wolfram|Alpha for other languages. Nearly five years ago we actually built a full Wolfram|Alpha for Chinese—but, sadly, regulatory issues in China have so far prevented us from deploying it there. Recently we released a version for Japanese (right now set up to handle mainly student-oriented queries). And we’ve got versions for five other languages in various stages of completion (though we’ll typically need local partners to deploy them properly).

Beyond Wolfram|Alpha on the public web, there are also private versions of Wolfram|Alpha. In the simplest case, a private Wolfram|Alpha is just a copy of the public Wolfram|Alpha, but running inside a particular organization’s infrastructure. Data updates flow into the private Wolfram|Alpha from the outside, but no queries for the private Wolfram|Alpha ever need to leave the organization.

Ordinary Wolfram|Alpha deals with public computational knowledge. But the technology of Wolfram|Alpha can also be applied to private data in an organization. And in recent years an important part of the business story of Wolfram|Alpha is what we call Enterprise Wolfram|Alpha: custom versions of Wolfram|Alpha that answer questions using both public computational knowledge, and private knowledge inside an organization.

For years I’ve run into CEOs who look at Wolfram|Alpha and say, “I wish I could do that kind of thing with my corporate data; it’d be so much easier for my company to make decisions…” Well, that’s what Enterprise Wolfram|Alpha is for. And over the past several years we’ve been installing Enterprise Wolfram|Alpha in some of the world’s largest companies in all sorts of industries, from healthcare to financial services, retail, and so on.

For a few years now, there’s been a lot of talk (and advertising) about the potential for “applying AI in the enterprise”. But I think it’s fair to say that with Enterprise Wolfram|Alpha we’ve got a serious, enterprise use of AI up and running right now—delivering very successful results.

The typical pattern is that you ask a question in natural language, and Enterprise Wolfram|Alpha then generates a report about the answer, using a mixture of public and private knowledge. “What were our sales of foo-pluses in Europe between Christmas and New Year?” Enterprise Wolfram|Alpha has public knowledge about what dates we’re talking about, and what Europe is. But then it’s got to figure out the internal linguistics of what foo-pluses are, and then go query an internal sales database about how many were sold. Finally, it’s got to generate a report that gives the answer (perhaps both the number of units and dollar amount), as well as, probably, a breakdown by country (perhaps normalized by GDP), comparisons to previous years, maybe a time series of sales by day, and so on.
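The query step in that pattern can be sketched on a toy dataset. Everything here is hypothetical: the records, the product names, the two-country stand-in for "Europe" and the exact date range are made up for illustration, and a real deployment would draw them from curated public and private knowledge rather than hard-coded lists.

```wolfram
(* Hypothetical internal sales records; products, countries, dates and units are made up *)
sales = {
   <|"Product" -> "foo-plus", "Country" -> "France", "Date" -> {2018, 12, 27}, "Units" -> 40|>,
   <|"Product" -> "foo-plus", "Country" -> "Germany", "Date" -> {2018, 12, 30}, "Units" -> 25|>,
   <|"Product" -> "foo-plus", "Country" -> "France", "Date" -> {2019, 1, 1}, "Units" -> 10|>,
   <|"Product" -> "foo-plus", "Country" -> "Japan", "Date" -> {2018, 12, 28}, "Units" -> 70|>,
   <|"Product" -> "foo-plus", "Country" -> "Germany", "Date" -> {2018, 12, 20}, "Units" -> 15|>,
   <|"Product" -> "bar-max", "Country" -> "France", "Date" -> {2018, 12, 26}, "Units" -> 99|>
};

(* Stand-ins for public knowledge: which countries count as Europe, and the date range *)
europe = {"France", "Germany"};
{start, end} = AbsoluteTime /@ {{2018, 12, 25}, {2019, 1, 1}};

(* Keep only the records that match product, region and period (both end dates inclusive) *)
matches = Select[sales,
   #Product === "foo-plus" && MemberQ[europe, #Country] &&
     start <= AbsoluteTime[#Date] <= end &];

(* Headline number, then a breakdown by country *)
totalUnits = Total[#Units & /@ matches]
byCountry = GroupBy[matches, (#Country &) -> (#Units &), Total]
```

With this data, three records survive the filter; the Japanese sale, the pre-Christmas sale and the other product are excluded. The same grouped result could then feed the comparisons and time series a full report would include.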

Needless to say, there’s plenty of subtlety in getting a useful result. Like what the definition of Europe is. Or the fact that Christmas (and New Year’s) can be on different dates in different cultures (and, of course, Wolfram|Alpha has all the necessary data and algorithms). Oh, and then one has to start worrying about currency conversion rates (which of course Wolfram|Alpha has)—as well as about conventions about conversion dates that some particular company may use.

Like any sophisticated piece of enterprise software, Enterprise Wolfram|Alpha has to be configured for each particular customer, and we have a business unit called Wolfram Solutions that does that. The goal is always to map the knowledge in an organization to a clear symbolic Wolfram Language form, so it becomes computable in the Wolfram|Alpha system. Realistically, for a large organization, it’s a lot of work. But the good news is that it’s possible—because Wolfram Solutions gets to use the whole curation and algorithm pipeline that we’ve developed for Wolfram|Alpha.

Of course, we can use all the algorithmic capabilities of the Wolfram Language too. So if we have to handle textual data we’re ready with the latest NLP tools, or if we want to be able to make predictions we’re ready with the latest statistics and machine learning, and so on.

Businesses started routinely putting their data onto computers more than half a century ago. But now across pretty much every industry, more acutely than ever, the challenge is to actually use that data in meaningful ways. Eventually everyone will take for granted that they can just ask about their data, like on *Star Trek*. But the point is that with Enterprise Wolfram|Alpha we have the technology to finally make this possible.

It’s a very successful application of Wolfram|Alpha technology, and the business potential for it is amazing. But for us the main limiting factor is that as a business it’s so different from the rest of what we do. Our company is very much focused on R&D—but Enterprise Wolfram|Alpha requires a large-scale customer-facing organization, like a typical enterprise software company. (And, yes, we’re exploring working with partners for this, but setting up such things has proved to be a slow process!)

By the way, people sometimes seem to think that the big opportunity for AI in the enterprise is in dealing with unstructured corporate data (such as free-form text), and finding “needles in haystacks” there. But what we’ve consistently seen is that in typical enterprises most of their data is actually stored in very structured databases. And the challenge, instead, is to answer unstructured queries.

In the past, it’s been basically impossible to do this in anything other than very simple ways. But now we can see why: because you basically need the whole Wolfram|Alpha technology stack to be able to do it. You need natural language understanding, you need computational knowledge, you need automated report generation, and so on. But that’s what Enterprise Wolfram|Alpha has. And so it’s finally able to solve this problem.

But what does it mean? It’s a little bit like when we first introduced Mathematica 30+ years ago. Before then, a typical scientist wouldn’t expect to use a computer themselves for a computation: they’d delegate it to an expert. But one of the great achievements of Mathematica is that it made things easy enough that scientists could actually compute for themselves. And so, similarly, typical executives in companies don’t directly compute answers themselves; instead, they ask their IT department to do it—then hope the results they get back a week later make sense. But the point is that with Enterprise Wolfram|Alpha, executives can actually get questions answered themselves, immediately. And the consequences of that for making decisions are pretty spectacular.

The Wolfram Language is what made Wolfram|Alpha possible. But over the past decade Wolfram|Alpha has also given back big time to Wolfram Language, delivering both knowledgebase and natural language understanding.

It’s interesting to compare Wolfram|Alpha and Wolfram Language. Wolfram|Alpha is for quick computations, specified in a completely unstructured way using natural language, and generating as output reports intended for human consumption. Wolfram Language, on the other hand, is a precise symbolic language intended for building up arbitrarily complex computations—in a way that can be systematically understood by computers and humans.

One of the central features of the Wolfram Language is that it can deal not only with abstract computational constructs, but also with things in the real world, like cities and chemicals. But how should one specify these real-world things? Documentation listing the appropriate way to specify every city wouldn’t be practical or useful. But what Wolfram|Alpha provided was a way to specify real-world things, using natural language.

Inside Wolfram|Alpha, natural language input is translated to Wolfram Language. And that’s what’s now exposed in the Wolfram Language, and in Wolfram Notebooks. Type Ctrl+= and a piece of natural language (like “LA”). The output—courtesy of Wolfram|Alpha natural language understanding technology—is a symbolic entity representing Los Angeles. And that symbolic entity is then a precise object that the Wolfram Language can use in computations.

I didn’t particularly anticipate it, but this interplay between the do-it-however-you-want approach of Wolfram|Alpha and the precise symbolic approach of the Wolfram Language is exceptionally powerful. It gets the best of both worlds—and it’s an important element in allowing the Wolfram Language to assume its unique position as a full-scale computational language.

What about the knowledgebase of Wolfram|Alpha, and all the data it contains? Over the past decade we’ve spent immense effort fully integrating more and more of this into the Wolfram Language. It’s always difficult to get data to the point where it’s computable enough to use in Wolfram|Alpha—but it’s even more difficult to make it fully and systematically computable in the way that’s needed for the Wolfram Language.

Imagine you’re dealing with data about oceans. To make it useful for Wolfram|Alpha you have to get it to the point where if someone asks about a specific named ocean, you can systematically retrieve or compute properties of that ocean. But to make it useful for Wolfram Language, you have to get it to the point where someone can do computations about all oceans, with none missing.
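The difference between those two requirements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AREA` table is deliberately incomplete, its values are rough figures in millions of square kilometers, and the function names are made up for this example.

```python
# Illustrative, deliberately incomplete data: rough areas in millions of km^2.
OCEANS = ("Arctic", "Atlantic", "Indian", "Pacific", "Southern")
AREA = {"Atlantic": 106, "Indian": 71, "Pacific": 165}


def area_of(ocean):
    """Single-entity lookup: enough to answer a question about one ocean."""
    return AREA.get(ocean)  # None means "no data available"


def total_area():
    """Whole-class computation: refuses to run unless every ocean is covered."""
    missing = [name for name in OCEANS if name not in AREA]
    if missing:
        raise ValueError("no area for: " + ", ".join(missing))
    return sum(AREA[name] for name in OCEANS)


print(area_of("Pacific"))
try:
    total_area()
except ValueError as err:
    print(err)
```

A lookup for one named ocean succeeds with partial data, while the computation over all oceans cannot be trusted until every gap is filled.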

A while ago I invented a 10-step hierarchy of data curation. For data to work in Wolfram|Alpha, you have to get it to level 9 in the hierarchy. But to get it to work in Wolfram Language, you have to get it all the way to level 10. And if it takes a few months to get some data to level 9, it can easily take another year to get it to level 10.

So it’s been a big achievement that over the past decade we’ve managed to get the vast majority of the Wolfram|Alpha knowledgebase up to the level where it can be integrated into the Wolfram Language. So all that data is now not only good enough for human consumption, but also good enough that one can systematically build up computations using it.

All the integration into the Wolfram Language means it’s in some sense now possible to “implement Wolfram|Alpha” in a single line of Wolfram Language code. But it also means that it’s easy to make Wolfram Language instant APIs that do more specific Wolfram|Alpha-like things.

There’s an increasing amount of interconnection between Wolfram|Alpha and Wolfram Language. For example, on the Wolfram|Alpha website most output pods have an “Open Code” button, which opens a Wolfram Notebook in the Wolfram Cloud, with Wolfram Language input that corresponds to what was computed in that pod.

In other words, you can use results from Wolfram|Alpha to “seed” a Wolfram Notebook, in which you can then edit or add inputs to do a complete, multi-step Wolfram Language computation. (By the way, you can always generate full Wolfram|Alpha output inside a Wolfram Notebook too.)

When Wolfram|Alpha first launched nobody had seen anything like it. A decade later, people have learned to take some aspects of it for granted, and have gotten used to having it available in things like intelligent assistants. But what will the future of Wolfram|Alpha now be?

Over the past decade we’ve progressively strengthened essentially everything about Wolfram|Alpha—to the point where it’s now excellently positioned for steady long-term growth in future decades. But with Wolfram|Alpha as it exists today, we’re now also in a position to start attacking all sorts of major new directions. And—important as what Wolfram|Alpha has achieved in its first decade has been—I suspect that in time it will be dwarfed by what comes next.

A decade ago, nobody had heard of “fake news”. Today, it’s ubiquitous. But I’m proud that Wolfram|Alpha stands as a beacon of accurate knowledge. And it’s not just knowledge that humans can use; it’s knowledge that’s computable, and suitable for computers too.

More and more is being done these days with computational contracts—both on blockchains and elsewhere. And one of the central things such contracts require is a way to know what’s actually happened in the world—or, in other words, a systematic source of computational facts.

But that’s exactly what Wolfram|Alpha uniquely provides. And already the Wolfram|Alpha API has become the *de facto* standard for computational facts. But one’s going to see a lot more of Wolfram|Alpha here in the future.

It’s going to put increasing pressure on the reliability of the computational knowledge in Wolfram|Alpha. Because it won’t be long before there will routinely be whole chains of computational contracts—that do important things in the world—and that trigger as soon as Wolfram|Alpha has delivered some particular fact on which they depend.

We’ve developed all sorts of procedures to validate facts. Some are automated—and depend on “theorems” that must be true about data, or cross-correlations or statistical regularities that should exist. Others ultimately rely on human judgement. (A macabre example is our obituary feed: we automatically detect news reports about deaths of people in our knowledgebase. These are then passed to our 24/7 site monitors, who confirm, or escalate the judgement call if needed. Somehow I’m on the distribution list for confirmation requests—and over the past decade there’ve been far too many times when this is how I’ve learned that someone I know has died.)
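As a rough illustration of the automated side, the following Python sketch checks a few "theorems" that should hold for any person record and routes failures to a human. The rules, the 125-year threshold and the records themselves are assumptions invented for this example; they are not the checks actually used for Wolfram|Alpha.

```python
from datetime import date


def violations(record):
    """Return the consistency rules that one (hypothetical) person record breaks."""
    broken = []
    born, died = record.get("born"), record.get("died")
    if born is None:
        broken.append("birth date missing")
    elif died is not None:
        if died < born:
            broken.append("death precedes birth")
        elif died.year - born.year > 125:
            broken.append("implausibly long lifespan")
    return broken


# Invented records; names and dates are placeholders, not real people.
records = [
    {"name": "person-a", "born": date(1930, 5, 1), "died": date(2019, 3, 2)},
    {"name": "person-b", "born": date(1975, 1, 9), "died": date(1957, 1, 9)},
    {"name": "person-c", "born": None, "died": date(2018, 7, 4)},
]

# Records that pass go through automatically; the rest are queued for a human.
needs_review = {}
for record in records:
    broken = violations(record)
    if broken:
        needs_review[record["name"]] = broken

print(needs_review)
```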

We take our responsibility as the world’s source of computational facts very seriously, and we’re planning more and more ways to add checks and balances—needless to say, defining what we’re doing using computational contracts.

When we first started developing Wolfram|Alpha, nobody was talking about computational contracts (though, to be fair, I had already thought about them as a potential application of my computational ideas). But now it turns out that Wolfram|Alpha is central to what can be done with them. And as a core component in the long history of the development of systematic knowledge, I think it’s inevitable that over time there will be all sorts of important uses of Wolfram|Alpha that we can’t yet foresee.

In the early days of artificial intelligence, much of what people imagined AI would be like is basically what Wolfram|Alpha has now delivered. So what can now be done with this?

We can certainly put “general knowledge AI” everywhere. Not just in phones and cars and televisions and smart speakers, but also in augmented reality and head- and ear-mounted devices and many other places too.

One of the Wolfram|Alpha APIs we provide is a “conversational” one, that can go back and forth clarifying and extending questions. But what about a full Wolfram|Alpha Turing test–like bot? Even after all these years, general-purpose bots have tended to be disappointing. And if one just connects Wolfram|Alpha to them, there tends to be quite a mismatch between general bot responses and “smart facts” from Wolfram|Alpha. (And, yes, in a Turing test competition, the presence of Wolfram|Alpha is a dead giveaway—because it knows much more than any human would.) But with progress in my symbolic discourse language, and probably some modern machine learning, I suspect it’ll be possible to make a more successful general-purpose bot that’s more integrated with Wolfram|Alpha.

But what I think is critical in many future applications of Wolfram|Alpha is to have additional sources of data and input. If one’s making a personal intelligent assistant, for example, then one wants to give it access to as much personal history data (messages, sensor data, video, etc.) as possible. (We already did early experiments on this back in 2011 with Facebook data.)

Then one can use Wolfram|Alpha to ask questions not only about the world in general, but also about one’s own interaction with it, and one’s own history. One can ask those questions explicitly with natural language—or one can imagine, for example, preemptively delivering answers based on video or some other aspect of one’s current environment.

Beyond personal uses, there are also organizational and enterprise ones. And indeed we already have Enterprise Wolfram|Alpha—making use of data inside organizations. So far, we’ve been building Enterprise Wolfram|Alpha systems mainly for some of the world’s largest companies—and every system has been unique and extensively customized. But in time—especially as we deal with smaller organizations that have more commonality within a particular industry—I expect that we’ll be able to make Enterprise Wolfram|Alpha systems that are much more turnkey, effectively by curating the possible structures of businesses and their IT systems.

And, to be clear, the potential here is huge. Because basically every organization in the world is today collecting data. And Enterprise Wolfram|Alpha will provide a realistic way for anyone in an organization to ask questions about their data, and make decisions based on it.

There are so many sources of data for Wolfram|Alpha that one can imagine. It could be photographs from drones or satellites. It could be video feeds. It could be sensor data from industrial equipment or robots. It could be telemetry from inside a game or a virtual world (like from our new UnityLink). It could be the results of a simulation of some system (say in Wolfram SystemModeler). But in all cases, one can expect to use the technology of Wolfram|Alpha to provide answers to free-form questions.

One can think of Wolfram|Alpha as enabling a kind of AI-powered human interface. And one can imagine using it not only to ask questions about existing data, but also as a way to control things, and to get actions taken. We’ve done experiments with Wolfram|Alpha-based interfaces to complex software systems. But one could as well do this with consumer devices, industrial systems, or basically anything that can be controlled through a connection to a computer.

Not everything is best done with pure Wolfram|Alpha—or with something like natural language. Many things are better done with the full computational language that we have in the Wolfram Language. But when we’re using this language, we’re of course still using the Wolfram|Alpha technology stack.

Wolfram|Alpha is already well on its way to being a ubiquitous presence in the computational infrastructure of the world. And between its direct use, and its use in Wolfram Language, I think we can expect that in the future we’ll all end up encountering Wolfram|Alpha all the time.

For many decades our company—and I—have been single-mindedly pursuing the goal of realizing the potential of computation and the computational paradigm. And in doing this, I think we’ve built a very unique organization, with very unique capabilities.

And looking back a decade after the launch of Wolfram|Alpha, I think it’s no surprise that Wolfram|Alpha has such a unique place in the world. It is, in a sense, the kind of thing that our company is uniquely built to create and develop.

I’ve wanted Wolfram|Alpha for nearly 50 years. And it’s tremendously satisfying to have been able to create what I think will be a defining intellectual edifice in the long history of systematic knowledge. It’s been a good first decade for Wolfram|Alpha. And I begin its second decade with great enthusiasm for the future and for everything that can be done with Wolfram|Alpha.

Happy 10th birthday, Wolfram|Alpha.

*To comment, please visit the copy of this post at the Stephen Wolfram Blog »*

The Wolfram Language gives programmers a unique computational language with an enormous array of sophisticated algorithms and built-in real-world knowledge. For many years, people have asked us how to access all the power of our technology from other software environments and programming languages. And over the years, we have built many such connections, like Wolfram CloudConnector for Excel, WSTP (Wolfram Symbolic Transfer Protocol) for C/C++ programs and, of course, J/Link, which provides access to the Wolfram Language directly from Java.

So today we’re happy to formally announce a new and often-requested connection that allows you to call the Wolfram Language directly and efficiently from Python: the Wolfram Client Library for Python. And, even better, this client library is fully open source as the WolframClientForPython git repository under the MIT License, so you can clone it and use it any way you see fit.

The Wolfram Client Library makes it easy to integrate the large collection of Wolfram Language algorithms as well as the Wolfram Knowledgebase directly into any Python code that you already have. This saves you considerable time and effort when developing new code. In this post, we’ll first show you how to set up a connection from Python to the Wolfram Language. Next, we’ll explore a few methods and examples you can use to do a computation in the Wolfram Language and then call it for use in your Python session. For a complete introductory tutorial and full reference documentation, visit the documentation home page for the Wolfram Client Library for Python.

Let’s start with a simple example, which computes the mean and standard deviation of one million numbers drawn from a normal distribution. This example shows how to call a Wolfram Language function from Python and compares the results from Python and the Wolfram Language to show that they are numerically close to one another.

First, to connect to the Wolfram Language, you need to create a new session with the Wolfram Engine:

```
>>> from wolframclient.evaluation import WolframLanguageSession
>>> session = WolframLanguageSession()
```

To call Wolfram Language functions, you need to import the `wl` factory:

```
>>> from wolframclient.language import wl
```

Now you can evaluate any Wolfram Language code. Set the Python variable `sample` to a list of one million random numbers drawn from the normal distribution, with a mean of 0 and a standard deviation of 1:

```
>>> sample = session.evaluate(wl.RandomVariate(
...     wl.NormalDistribution(0, 1), 10**6))
```

You can take a look at the first five of them:

```
>>> sample[:5]
[0.44767075774581,
0.9662810005828261,
-1.327910570542906,
-0.2383857558557122,
1.1826399551062043]
```

You can compute the mean value of this sample with the Wolfram Language. As expected, it is close to zero:

```
>>> session.evaluate(wl.Mean(sample))
0.0013371607703851515
```

You can also directly compute this in Python, to verify that you get a numerically similar result:

```
>>> from statistics import mean
>>> mean(sample)
0.0013371607703851474
```

Similarly, you can compute the standard deviation of sample with the Wolfram Language:

```
>>> session.evaluate(wl.StandardDeviation(sample))
1.0014296230797068
```

Again, run the following code in Python to verify that you get a similar result:

```
>>> from statistics import stdev
>>> stdev(sample)
1.0014296230797068
```

It’s good to see that these results are comparable. Now you know how to call some simple Wolfram Language functions from Python. Let’s continue with a more exciting example.
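If you want to try the "numerically close" comparison without a Wolfram Engine at hand, here is a standard-library sketch of the same idea. It is an assumption-laden stand-in, not the library workflow: the sample comes from Python's own `random.gauss`, and the second implementation is a hand-written formula playing the role that the Wolfram Language plays above. Both sides use the sample standard deviation with an n - 1 denominator.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so that repeated runs see the same sample
sample = [random.gauss(0, 1) for _ in range(100000)]

# First implementation: the statistics module.
m, s = mean(sample), stdev(sample)

# Second, independent implementation written directly from the definitions.
n = len(sample)
m2 = math.fsum(sample) / n
s2 = math.sqrt(math.fsum((x - m2) ** 2 for x in sample) / (n - 1))

# Two correct implementations should agree up to floating-point rounding.
close = math.isclose(m, m2, abs_tol=1e-9) and math.isclose(s, s2, rel_tol=1e-9)
print(m, s, close)
```

Comparing with a tolerance rather than `==` matters here, because two correct implementations can legitimately differ in the last few digits.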

Let’s take a look at a built-in Wolfram Language function that’s not readily available in Python, `WolframAlpha`:

```
>>> moons = session.evaluate(wl.WolframAlpha('moons of Saturn', 'Result'))
```

The `WolframAlpha` function is one of the high-level functions in the Wolfram Language that interacts with the Wolfram|Alpha servers via a web API. You can use this API directly from Python, but doing it by calling the `WolframAlpha` function is much more powerful and convenient because you can access all the data framework functions from the Wolfram Language directly. Let’s take a look at what the Python variable moons contains:

```
>>> moons
EntityClass['PlanetaryMoon', 'SaturnMoon']
```

The output here is the Python representation of a Wolfram Language expression, which can be reused in any subsequent evaluation. For example, if you want to get the list of Saturn’s first four moons (by proximity) explicitly, you can do this:

```
>>> session.evaluate(wl.EntityList(moons))[:4]
[Entity['PlanetaryMoon', 'S2009S1'],
Entity['PlanetaryMoon', 'Pan'],
Entity['PlanetaryMoon', 'Daphnis'],
Entity['PlanetaryMoon', 'Atlas']]
```

Or you can easily get the four largest moons of Saturn by mass with this small snippet of code:

```
>>> bigmoons = session.evaluate(wl.EntityList(
wl.SortedEntityClass(moons, wl.Rule("Mass","Descending"),4)))
>>> bigmoons
[Entity['PlanetaryMoon', 'Titan'],
Entity['PlanetaryMoon', 'Rhea'],
Entity['PlanetaryMoon', 'Iapetus'],
Entity['PlanetaryMoon', 'Dione']]
```

And you can get a simple array of strings with the names of these moons like this:

```
>>> session.evaluate(wl.Map(wl.Function(wl.Slot()("Name")), bigmoons))
['Titan', 'Rhea', 'Iapetus', 'Dione']
```
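For comparison, this is roughly what the direct web API route mentioned earlier looks like from plain Python. The sketch only builds a request URL for the Wolfram|Alpha Short Answers API and does not send it; `DEMO-APPID` is a placeholder for your own AppID, and `short_answer_url` is a helper name made up for this example.

```python
from urllib.parse import urlencode

# Short Answers API endpoint; check the API documentation before relying on it.
BASE_URL = "https://api.wolframalpha.com/v1/result"


def short_answer_url(query, appid):
    """Build the GET URL for a free-form query (no request is made here)."""
    return BASE_URL + "?" + urlencode([("appid", appid), ("i", query)])


url = short_answer_url("moons of Saturn", "DEMO-APPID")
print(url)
```

A call like this returns plain text rather than a symbolic `EntityClass`, so follow-up steps such as sorting the moons by mass would have to be done by hand.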

This is all pretty cool. Let’s take a look at another example, using the Wolfram Language’s built-in image processing and machine learning functions.

First, let’s switch over to another mode to do evaluations directly in the Wolfram Language. So far you’ve used the `wl` factory to build up Wolfram Language expressions in Python. But you can also evaluate Python strings containing Wolfram Language code, and sometimes this is easier to read:

```
>>> from wolframclient.language import wlexpr
```

For example, you can evaluate 1+1 in the Wolfram Language by sending it as a string:

```
>>> session.evaluate('1+1')
2
```

Using this method, you can write a small snippet of Wolfram Language code that takes an image and uses the built-in face-detection algorithm to find the location of a face in an image. Here, the image we’re using is the famous painting titled *Girl with a Pearl Earring* by the Dutch painter Johannes Vermeer (but it works on almost any image with recognizable faces). Because the Python terminal interface does not support the display of images, we’ll need to use a Jupyter notebook instead, together with the Python Imaging Library (PIL) package, to help with displaying the result:

```
from PIL import Image
import io

# Run the image processing in the Wolfram Language session
session.evaluate(wlexpr('''
    image = ImageResize[Import["Girl_with_a_Pearl_Earring.jpg"], 300];
    boxes = FindFaces[image];
    face = ImageAssemble[{{image, HighlightImage[image, boxes, "Blur"]}}];
'''))

# Transfer the result as PNG bytes and open it with PIL
data = session.evaluate(wlexpr('ExportByteArray[face, "PNG"]'))
Image.open(io.BytesIO(data))
```

Quite easy and powerful. But what if you don’t have a local installation of the Wolfram Engine, and want to use the Wolfram Client Library for Python? You can still use the Wolfram Language directly in the Wolfram Cloud.

The Wolfram Cloud provides easy access to the Wolfram Language without needing to install it locally. The Wolfram Cloud provides various services, including a notebook web interface for Wolfram Language programming as well as the capability to deploy arbitrary Wolfram Language web APIs.

Here you’ll make use of the latter, deploying a Wolfram Language web API. This particular API accepts the names of two countries (`country1` and `country2`), finds the capital city for each country and then computes the distance between them (in kilometers):

```
CloudDeploy[
  APIFunction[
    {"country1" -> "String", "country2" -> "String"},
    QuantityMagnitude[
      GeoDistance[
        EntityValue[Entity["Country", #country1], "CapitalCity"],
        EntityValue[Entity["Country", #country2], "CapitalCity"]
      ],
      "Kilometers"
    ] &,
    "WXF"
  ],
  CloudObject["api/public/capital_distance"],
  Permissions -> "Public"
]
```

After the deployment of this API, you can start a new Wolfram Language session, but this time you connect to the Wolfram Cloud instead of the local desktop engine:

```
>>> from wolframclient.evaluation import WolframCloudSession
>>> cloud = WolframCloudSession()
```

To call the API, you have to provide the username (user1) and the API endpoint (api/public/capital_distance). With that information, you can call the cloud…

```
>>> api = ('user1', 'api/public/capital_distance')
>>> result = cloud.call(api, {'country1': 'Netherlands',
'country2': 'Spain'})
```

… and get the result:

```
>>> result.get()
1481.4538329484521
```

Once again, easy and useful.
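As a sanity check on the number returned by the API, you can approximate the same distance in plain Python with the haversine formula. This is a sketch under stated assumptions: it treats the Earth as a sphere with a mean radius of 6371 km, and the coordinates for Amsterdam and Madrid are rounded values typed in by hand, so only rough agreement with `GeoDistance` should be expected.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates of the two capitals (assumed, rounded values).
AMSTERDAM = (52.37, 4.90)
MADRID = (40.42, -3.70)

distance = great_circle_km(*AMSTERDAM, *MADRID)
print(round(distance))
```

The result lands in the same neighborhood as the value returned by the cloud API, which is a useful cross-check that the deployed function is doing what you expect.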

If you want to keep your deployed Wolfram Language API private, so that only you can use it, you can deploy the API with `Permissions -> "Private"`. Then, to authenticate yourself to the private API, you can generate (in the Wolfram Language) a secured authentication key:

```
key = GenerateSecuredAuthenticationKey["myapp"]
```

Copy the outputs from these two inputs:

```
key["ConsumerKey"]
```

```
key["ConsumerSecret"]
```

Then paste them into your Python session:

```
>>> from wolframclient.evaluation import SecuredAuthenticationKey
>>> sak = SecuredAuthenticationKey(
... '<<paste-consumer-key-here>>',
... '<<paste-consumer-secret-here>>')
```

And finally, start a new authenticated cloud session:

```
>>> cloud = WolframCloudSession(credentials=sak)
>>> cloud.start()
>>> cloud.authorized()
True
```

That’s it. At this point you (and only you) can use any Wolfram Language API that you have deployed privately.

To make everything very fast and efficient, the Wolfram Client Library for Python uses the open WXF format to exchange expressions between Python and the Wolfram Language. WXF is a binary format for faithfully serializing Wolfram Language expressions, in a form suitable for interchange with external programs. The library function `export` can serialize Python objects to string input form and WXF, and natively supports a set of built-in Python classes such as `dict`, `list` and strings:

```
>>> from wolframclient.serializers import export
>>> export({
'list': [1,2,3],
'string': u'abc',
'etc': [0, None, -1.2]
})
b'<|"list" -> {1, 2, 3}, "string" -> "abc", "etc" -> {0, None, -1.2}|>'
```
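To see what the mapping to input form involves, here is a toy serializer for a small subset of Python types. It is not how the library implements `export`; `to_input_form` is a hypothetical helper written for this post, it returns `str` rather than bytes, and it ignores cases such as floats in exponent notation that a real serializer has to handle.

```python
def to_input_form(value):
    """Render a small subset of Python values as Wolfram Language input form."""
    if value is None:
        return "None"
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return "True" if value else "False"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(to_input_form(v) for v in value) + "}"
    if isinstance(value, dict):
        rules = [to_input_form(k) + " -> " + to_input_form(v)
                 for k, v in value.items()]
        return "<|" + ", ".join(rules) + "|>"
    raise TypeError("unsupported type: " + type(value).__name__)


text = to_input_form({
    "list": [1, 2, 3],
    "string": "abc",
    "etc": [0, None, -1.2],
})
print(text)
```

Lists become Wolfram Language lists and dictionaries become associations made of rules, which is the same shape as the `export` output shown above.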

WXF represents numeric arrays with packed data, allowing efficient support for NumPy arrays.

Create a new array of 255 unsigned 8-bit integers:

```
>>> import numpy
>>> array=numpy.arange(255, dtype='uint8')
```

Serialize it to WXF bytes and compute the byte count:

```
>>> wxf=export(array, target_format='wxf')
>>> len(wxf)
262
```

NumPy arrays back many Python libraries. Therefore, an efficient and compact serialization helps in interfacing the Python ecosystem with the Wolfram Language. A direct consequence of supporting NumPy is that the serialization of PIL images is in general very efficient; most of the pixel data *modes* map to one of the numeric array types, specified by `NumericArrayType`. It’s also worth mentioning that pandas Series and DataFrame are supported natively. The library also provides an extensible mechanism for serializing arbitrary classes.

Install the latest version of the Wolfram Client Library for Python with pip:

```
$ pip install wolframclient
```

It requires Python 3.5.3 (or above) and Wolfram Language 11.3 (or above). Check out the full documentation on the Wolfram Client Library for Python. The entire source code is hosted in the WolframClientForPython repository on the Wolfram Research GitHub site. And if you see a way to improve it, you can help us make it better by contributing pull requests to this repository.

We’re very excited about this release and hope you find it useful. Let us know what you think in the comments section or on Wolfram Community, and we’ll do our best to respond to questions.

I’ve sometimes found it a bit of a struggle to explain what the Wolfram Language really is. Yes, it’s a computer language—a programming language. And it does—in a uniquely productive way, I might add—what standard programming languages do. But that’s only a very small part of the story. And what I’ve finally come to realize is that one should actually think of the Wolfram Language as an entirely different—and new—kind of thing: what one can call a *computational language*.

So what is a computational language? It’s a language for expressing things in a computational way—and for capturing computational ways of thinking about things. It’s not just a language for telling computers what to do. It’s a language that both computers and humans can use to represent computational ways of thinking about things. It’s a language that puts into concrete form a computational view of everything. It’s a language that lets one use the computational paradigm as a framework for formulating and organizing one’s thoughts.

It’s only recently that I’ve begun to properly internalize just how broad the implications of having a computational language really are—even though, ironically, I’ve spent much of my life engaged precisely in the consuming task of building the world’s only large-scale computational language.

It helps me to think about a historical analog. Five hundred years ago, if people wanted to talk about mathematical ideas and operations, they basically had to use human natural language, essentially writing out everything in terms of words. But the invention of mathematical notation about 400 years ago (starting with +, ×, =, etc.) changed all that—and began to provide a systematic structure and framework for representing mathematical ideas.

The consequences were surprisingly dramatic. Because basically it was this development that made modern forms of mathematical thinking (like algebra and calculus) feasible—and that launched the mathematical way of thinking about the world as we know it, with all the science and technology that’s come from it.

Well, I think it’s a similar story with computational language. But now what’s happening is that we’re getting a systematic way to represent—and talk about—computational ideas, and the computational way of thinking about the world. With standard programming languages, we’ve had a way to talk about the low-level operation of computers. But with computational language, we now have a way to apply the computational paradigm directly to almost anything: we have a language and a notation for doing computational X, for basically any field “X” (from archaeology to zoology, and beyond).

There’ve been some “mathematical X” fields for a while, where typically the point is to formulate things in terms of traditional mathematical constructs (like equations), that can then “mechanically” be solved (at least, say, with Mathematica!). But a great realization of the past few decades has been that the computational paradigm is much broader: much more can be represented computationally than just mathematically.

Sometimes one’s dealing with very simple abstract programs (and, indeed, I’ve spent years exploring the science of the computational universe of such programs). Often, though, one’s interested in operations and entities that relate to our direct experience of the world. And the crucial point here is that, as we’ve learned in building the Wolfram Language, it’s possible to represent such things in a computational way. In other words, it’s possible to have a computational language that can talk about the world in computational terms.

And that’s what’s needed to really launch all those possible “computational X” fields.

Let’s say we want to talk about planets. In the Wolfram Language, planets are just symbolic entities:

✕
EntityList[EntityClass["Planet", All]] |

We can compute things about them (here, the mass of Jupiter divided by the mass of Earth):

✕
Entity["Planet", "Jupiter"]["Mass"]/Entity["Planet", "Earth"]["Mass"] |

Let’s make an image collage in which the mass of each planet determines how big it’s shown:

✕
ImageCollage[ EntityClass["Planet", All]["Mass"] -> EntityClass["Planet", All]["Image"]] |

To talk about the real world in computational terms, you have to be able to compute things about it. Like here, the Wolfram Language is computing the current position (as I write this) of the planet Mars:

✕
Entity["Planet", "Mars"][EntityProperty["Planet", "HelioCoordinates"]] |

And here it’s making a 3D plot of a table of its positions for each of the next 18 months:

✕
ListPointPlot3D[Table[ Entity["Planet", "Mars"][ Dated[EntityProperty["Planet", "HelioCoordinates"], Now + Quantity[n, "Months"]]], {n, 18}]] |

Let’s do another example. Take an image, and find the human faces in it:

✕
FacialFeatures[CloudGet["https://wolfr.am/DpadWvjE"], "Image"] |

As another example of computation-meets-the-real-world, we can make a histogram (say, in 5-year bins) of the estimated ages of people in the picture:

✕
Histogram[ FacialFeatures[CloudGet["https://wolfr.am/DpadWvjE"], "Age"], {5}] |

It’s amazing what ends up being computable. Here are rasterized images of each letter of the Greek alphabet distributed in “visual feature space”:

✕
FeatureSpacePlot[Rasterize /@ Alphabet["Greek"]] |

Yes, it is (I think) impressive what the Wolfram Language can do. But what’s more important here is to see how it lets one specify what to do. Because this is where computational language is at work—giving us a way to talk computationally about planets and human faces and visual feature spaces.

Of course, once we’ve formulated something in computational language, we’re in a position (thanks to the whole knowledgebase and algorithmbase of the Wolfram Language) to actually do a computation about it. And, needless to say, this is extremely powerful. But what’s also extremely powerful is that the computational language itself gives us a way to formulate things in computational terms.

Let’s say we want to know how efficient the Roman numeral system was. How do we formulate that question computationally? We might think about knowing the string lengths of Roman numerals, and comparing them to the lengths of modern integers. It’s easy to express that in Wolfram Language. Here’s a Roman numeral:

✕
RomanNumeral[188] |

And here’s its string length:

✕
StringLength[RomanNumeral[188]] |

Now here’s a plot of all Roman numeral lengths up to 200, divided by the corresponding integer lengths—with callouts automatically showing notable values:

✕
ListLinePlot[Table[ Callout[StringLength[RomanNumeral[n]]/IntegerLength[n], n], {n, 200}]] |

It’s easy enough to make a histogram for all numbers up to 1000:

✕
Histogram[Table[ StringLength[RomanNumeral[n]]/IntegerLength[n], {n, 1000}]] |

But of course in actual usage, some numbers are more common than others. So how can we capture that? Well, here’s one (rather naive) computational approach. Let’s just analyze the Wikipedia article about arithmetic, and see what integers it mentions. Again, that computational concept is easy to express in the Wolfram Language: finding cases of numbers in the article, then selecting those that are interpreted as integers:

✕
Select[IntegerQ][ TextCases[WikipediaData["arithmetic"], "Number" -> "Interpretation"]] |

There are some big numbers, with Roman-numeral representations for which the notion of “string length” doesn’t make much sense:

✕
RomanNumeral[7485696] |

And then there’s 0, for which the Romans didn’t have an explicit representation. But restricting to “Roman-stringable” numbers, we can make our histogram again:

✕
Histogram[ Map[StringLength[RomanNumeral[#]]/IntegerLength[#] &][ Select[IntegerQ[#] && 0 < # < 5000 &][ TextCases[WikipediaData["arithmetic"], "Number" -> "Interpretation"]]]] |

And what’s crucial here is that—with Wolfram Language—we’re in a position to formulate our thinking in terms of computational concepts, like `StringLength` and `TextCases` and `Select` and `Histogram`. And we’re able to use the computational language to express our computational thinking—in a way that humans can read, and the computer can compute from.

As a practical matter, the examples of computational language we’ve just seen look pretty different from anything one would normally do with a standard programming language. But what is the fundamental difference between a computational language and a programming language?

First and foremost, it’s that a computational language tries to intrinsically be able to talk about whatever one might think about in a computational way—while a programming language is set up to intrinsically talk only about things one can directly program a computer to do. So for example, a computational language can intrinsically talk about things in the real world—like the planet Mars or New York City or a chocolate chip cookie. A programming language can intrinsically talk only about abstract data structures in a computer.

Inevitably, a computational language has to be vastly bigger and richer than a programming language. Because while a programming language just has to know about the operation of a computer, a computational language tries to know about everything—with as much knowledge and computational intelligence as possible about the world and about computation built into it.

To be fair, the Wolfram Language is the sole example that exists of a full-scale computational language. But one gets a sense of magnitude from it. While the core of a standard programming language typically has perhaps a few tens of primitive functions built in, the Wolfram Language has more than 5600—with many of those individually representing major pieces of computational intelligence. And in its effort to be able to talk about the real world, the Wolfram Language also has millions of entities of all sorts built into it. And, yes, the Wolfram Language has had more than three decades of energetic, continuous development put into it.

Given a programming language, one can of course start programming things. And indeed many standard programming languages have all sorts of libraries of functions that have been created for them. But the objective of these libraries is not really the same as the objective of a true computational language. Yes, they’re providing specific “functions to call”. But they’re not trying to create a way to represent or talk about a broad range of computational ideas. To do that requires a coherent computational language—of the kind I’ve been building in the Wolfram Language all these years.

A programming language is (needless to say) intended as something in which to write programs. And while it’s usually considered desirable for humans to be able—at least at some level—to read the programs, the ultimate point is to provide a way to tell a computer what to do. But computational language can also achieve something else. Because it can serve as an expressive medium for communicating computational ideas to humans as well as to computers.

Even when one’s dealing with abstract algorithms, it’s common with standard programming languages to want to talk in terms of some kind of “pseudocode” that lets one describe the algorithms without becoming enmeshed in the (often fiddly) details of actual implementation. But part of the idea of computational language is always to have a way to express computational ideas directly in the language: to have the high-level expressiveness and readability of pseudocode, while still having everything be precise, complete and immediately executable on a computer.

Looking at the examples above, one thing that’s immediately obvious is that having the computational language be symbolic is critical. In most standard programming languages, `x` on its own without a value doesn’t mean anything; it has to stand for some structure in the memory of the computer. But in a computational language, one’s got to be able to have things that are purely symbolic, and that represent, for example, entities in the real world—that one can operate on just like any other kind of data.
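As a small illustration of what “purely symbolic” means in practice, here is a minimal sketch. The variable names `expr`, `deriv` and `planet` are just illustrative; the point is that a symbol with no value, and an entity standing for something in the real world, are both ordinary expressions that can be stored and operated on:

```wolfram
(* x has no assigned value; it is simply a symbol that can be manipulated *)
ClearAll[x];
expr = x + 2 x       (* stays symbolic: 3 x *)
deriv = D[x^3, x]    (* symbolic derivative: 3 x^2 *)

(* A real-world entity is also just a symbolic expression *)
planet = Entity["Planet", "Mars"];
Head[planet]         (* Entity *)
```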

There’s a whole cascade of wonderful unifications that flow from representing everything as a symbolic expression, crucial in being able to coherently build up a full-scale computational language. And to make a computational language as readily absorbable by humans as possible, there are also all sorts of detailed issues of interface—like having hierarchically structured notebooks, allowing details of computational language to be iconized for display, and so on.

Particularly in this age of machine learning one might wonder why one would need a precisely defined computational language at all. Why not just use natural language for everything?

Wolfram|Alpha provides a good example (indeed, probably the most sophisticated one that exists today) of what can be done purely with natural language. And indeed for the kinds of short questions that Wolfram|Alpha normally handles, it proves that natural language can work quite well.

But what if one wants to build up something more complicated? Just like in the case of doing mathematics without notation, it quickly becomes impractical. And I could see this particularly clearly when I was writing an introductory book on the Wolfram Language—and trying to create exercises for it. The typical form of an exercise is: “Take this thing described in natural language, and implement it in Wolfram Language”. Early in the book, this worked OK. But as soon as things got more complicated, it became quite frustrating. Because I’d immediately know what I wanted to say in Wolfram Language, but it took a lot of effort to express it in natural language for the exercise, and often what I came up with was hard to read and reminiscent of legalese.

One could imagine that with enough back-and-forth, one might be able to explain things to a computer purely in natural language. But to get any kind of clear idea of what the computer has understood, one needs some more structured representation—which is precisely what computational language provides.

And it’s certainly no coincidence that the way Wolfram|Alpha works is first to translate whatever natural language input it’s given to precise Wolfram Language—and only then to compute answers from it.

In a sense, using computational language is what lets us leverage the last few centuries of exact science and systematic knowledge. Earlier in history, one imagined that one could reason about everything just using words and natural language. But three or four centuries ago—particularly with mathematical notation and other mathematical ideas—it became clear that one could go much further if one had a structured, formal way of talking about the world. And computational language now extends that—bringing a much wider range of things into the domain of formal computational thinking, and going still further beyond natural language.

Of course, one argument for trying to use natural language is that “everybody already knows it”. But the whole point is to be able to apply computational thinking—and to do that systematically, one needs a new way of expressing oneself, which is exactly what computational language provides.

Computational language is something quite different from natural language, but in its construction it still uses natural language and people’s understanding of it. Because in a sense the “words” in the computational language are based on words in natural language. So, for example, in the Wolfram Language, we have functions like `StringLength`, `TextCases` and `FeatureSpacePlot`.

Each of these functions has a precise computational definition. But to help people understand and remember what the functions do, we use (very carefully chosen) natural language words in their names. In a sense, we’re leveraging people’s understanding of natural language to be able to create a higher level of language. (By the way, with our “code captions” mechanism, we’re able to at least annotate everything in lots of natural languages beyond English.)

It’s a slightly different story when it comes to the zillions of real-world entities that a computational language has to deal with. For a function like `TextCases`, you both have to know what it’s called, and how to use it. But for an entity like New York City, you just have to somehow get hold of it—and then it’s going to work the same as any other entity. And a convenient way to get hold of it is just to ask for it, by whatever (natural language) name you know for it.

For example, in the Wolfram Language you can just use a “free-form input box”. Type `nyc` and it’ll get interpreted as the official New York City entity:

✕
Entity["City", {"NewYork", "NewYork", "UnitedStates"}] |

You can use this entity to do computations:

✕
GeoArea[Entity["City", {"NewYork", "NewYork", "UnitedStates"}]] |

Of course, this kind of free-form input can be ambiguous. Type `ny` and the first interpretation is New York state:

✕
Entity["AdministrativeDivision", {"NewYork", "UnitedStates"}] |

Press the little dots and you get to say you want New York City instead:

✕
Entity["City", {"NewYork", "NewYork", "UnitedStates"}] |

For convenience, the inputs here are natural language. But the outputs—sometimes after a bit of disambiguation—are precise computational language, ready to be used wherever one wants.
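The free-form input box is a notebook interface feature. As a rough programmatic counterpart, the built-in `Interpreter` function can turn a natural language string into an entity of a stated type. This is only a sketch: it assumes network access to Wolfram's servers may be needed for the interpretation and for the entity data, and the variable name `nyc` is just illustrative:

```wolfram
(* Interpret the string as a city; stating the type avoids the state/city ambiguity *)
nyc = Interpreter["City"]["nyc"]

(* Once interpreted, the result is an ordinary symbolic entity *)
GeoArea[nyc]
```

Stating the expected type up front plays the same role as the disambiguation step in the free-form input box.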

And in general, it’s very powerful to be able to use natural language to specify small chunks of computational language. To express large-scale computational thinking, one needs the formality and structure of computational language. But “small utterances” can be given in natural language—like in Wolfram|Alpha—then translated to precise computational language:

✕
Entity["City", {"NewYork", "NewYork", "UnitedStates"}][EntityProperty["City", "Population"]] (* from free-form input "population of nyc" *) |

✕
EntityClass[ "City", {EntityProperty["City", "Population"] -> TakeLargest[5]}] // EntityList |

✕
Plot[Cos[x], {x, -6.6, 6.6}, PlotStyle -> Purple] |

I think by now there’s little doubt the introduction of the computational paradigm is the single most important intellectual development of the past century. And going forward, I think computational language is going to be crucial in being able to broadly make use of that paradigm—much as many centuries ago, mathematical notation was crucial to launching the widespread use of the mathematical paradigm.

How should one express and communicate the ideas of a “computational X” field? Blobs of low-level programming language code won’t do it. Instead, one needs something that can talk directly about things in the field—whether they are genes, animals, words, battles or whatever. And one also needs something that humans can readily read and understand. And this is precisely what computational language can provide.

Of course, computational language also has the giant bonus that computers can understand it, and that it can be used to specify actual computations to do. In other words, by being able to express something in computational language, you’re not only finding a good way to communicate it to humans, you’re also setting up something that can leverage the power of actual computation to automatically produce things.

And I suspect that in time it will become clear that the existence of computational language as a communication medium is what ultimately succeeded in launching a huge range of computational X fields. Because it’s what will allow the ideas in these fields to be put in a concrete form that people can think in terms of.

How will the computational language be presented? Often, I suspect, it will be part of what I call computational essays. A computational essay mixes natural language text with computational language—and with the outputs of actual computations described by the computational language. It’s a little like how for the past couple of centuries, technical papers have typically relied on mixing text and formulas.

But a computational essay is something much more powerful. For one thing, people can not only read the computational language in a computational essay, but they can also immediately reuse it elsewhere. In addition, when one writes a computational essay, it’s a computer-assisted activity, in which one shares the load with the computer. The human has to write the text and the computational language, but then the computer can automatically generate all kinds of results, infographics, etc. as described by the computational language.

In practice it’s important that computational essays can be presented in Wolfram Notebooks, in the cloud and on the desktop, and that these notebooks can contain all sorts of dynamic and computational elements.

One can expect to use computational essays for a wide range of things—whether papers, reports, exercises or whatever. And I suspect that computational essays, written with computational language, will become the primary form of communication for computational X fields.

I doubt we can yet foresee even a fraction of the places where computational language will be crucial. But one place that’s already clear is in defining computational contracts. In the past, contracts have basically always been written in natural language—or at least in the variant that is legalese. But computational language provides an alternative.

With the Wolfram Language as it is today, we can’t cover everything in every contract. But it’s already clear how we can use computational language to represent many kinds of things in the world that are the subject of contracts. And the point is that with computational language we can write a precise contract that both humans and machines can understand.

In time there’ll be computational contracts everywhere: for commerce, for defining goals, for AI ethics, and so on. And computational language is what will make them all possible.

When literacy in natural language began to become widespread perhaps 500 years ago, it led to sweeping changes in how the world could be organized, and in the development of civilization. In time I think it’s inevitable that there’ll also be widespread literacy in computational language. Certainly that will lead to much broader application of computational thinking (and, for example, the development of many “computational X” fields). And just as our world today is full of written natural language, so in the future we can expect that there will be computational language everywhere—that both defines a way for us humans to think in computational terms, and provides a bridge between human thinking and the computation that machines and AIs can do.

I’ve talked a lot about the general concept of computational language. But in the world today, there’s actually only one example that exists of a full-scale computational language: the Wolfram Language. At first, it might seem strange that one could say this so categorically. With all the technology out there in the world, how could something be that unique?

But it is. And I suppose this becomes a little less surprising when one realizes that we’ve been working on the Wolfram Language for well over thirty years—or more than half of the whole history of modern computing. And indeed, the span of time over which we’ve been able to consistently pursue the development of the Wolfram Language is now longer than for almost any other software system in history.

Did I foresee the emergence of the Wolfram Language as a full computational language? Not entirely. When I first started developing what’s now the Wolfram Language I wanted to make it as general as possible, and as flexible as possible in representing computational ideas and processes.

At first, its most concrete applications were to mathematics, and to various kinds of modeling. But as time went on, I realized that more and more types of things could fit into the computational framework that we’d defined. And gradually this started to include things in the real world. Then, about a decade and a half ago, I realized that, yes, with the whole symbolic language we’d defined, we could just start systematically representing all those things like cities and chemicals in pretty much the same way as we’d represented abstract things before.

I’d always had the goal of putting as much knowledge as possible into the language, and of automating as much as possible. But from the beginning I made sure that the language was based on a small set of principles—and that as it grew it maintained a coherent and unified design.

Needless to say, this wasn’t easy. And indeed it’s been my daily activity now for more than 30 years (with, for example, 300+ hours of it livestreamed over the past year). It’s a difficult process, involving both deep understanding of every area the language covers and a string of complicated judgement calls. But it’s the coherence of design that this achieves that has allowed the language to maintain its unity even as it has grown to encompass all sorts of knowledge about the real world, as well as all those other things that make it a full computational language.

Part of what’s made the Wolfram Language possible is the success of its principles and basic framework. But to actually develop it has also involved the creation of a huge tower of technology and content—and the invention of countless algorithms and meta-algorithms, as well as the acquisition and curation of immense amounts of data.

It’s been a strange mixture of intellectual scholarship and large-scale engineering—that we’ve been fortunate enough to be able to consistently pursue for decades. In many ways, this has been a personal mission of mine. And along the way, people have often asked me how to pigeonhole what we’re building. Is it a calculation system? Is it an encyclopedia-like collection of data? Is it a programming language?

Well, it’s all of those things. But they’re only part of the story. And as the Wolfram Language has developed, it’s become increasingly clear how far away it is from existing categories. And it’s only quite recently that I’ve finally come to understand what it is we’ve managed to build: the world’s only full computational language. Having understood this, it starts to be easier to see just how what we’ve been doing all these years fits into the arc of intellectual history, and what some of its implications might be going forward.

From a practical point of view, it’s great to be able to respond to that obvious basic question: “What is the Wolfram Language?” Because now we have a clear answer: “It’s a computational language!” And, yes, that’s very important!


Neural networks are a programming approach that is inspired by the neurons in the human brain and that enables computers to learn from observational data, be it images, audio, text, labels, strings or numbers. They try to model some unknown function (for example, $y = f(x)$) that maps this data to numbers or classes by recognizing patterns in the data. They are built from these components:

- Encoders and decoders (to convert the input data to numeric tensors, and the output tensors back to the desired data type)

- Layers (to perform operations on these tensors, depending on the applications)

- Containers (to hold these operations in a sensible way)

Once the neural network is built from these components, it needs to be trained (in other words, optimized).

As you might have guessed, the optimization (or minimizing the “loss” of the network) is done through stochastic gradient descent in an iterative fashion. The inputs are fed to the net repeatedly; the error/loss is computed each time and is then used to update the model’s parameters using back propagation. Back propagation, or “back error propagation,” involves distributing the error computed during forward propagation back to the network’s layers.
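The loop described above can be sketched in a few lines. This is only an illustration of the idea, not how `NetTrain` is implemented: it fits a single weight to toy data with plain full-batch gradient descent on a squared loss (a stochastic version would use a random subset of the data at each step). The names `xs`, `ys`, `loss`, `grad` and the learning rate 0.05 are assumptions chosen for this example.

```wolfram
(* Toy data generated by y = 3 x; the goal is to recover the weight 3 *)
xs = {1., 2., 3., 4.};
ys = 3. xs;

(* Mean squared loss and its derivative with respect to the weight w *)
loss[w_] := Mean[(w xs - ys)^2];
grad[w_] := Mean[2 (w xs - ys) xs];

(* Repeatedly step against the gradient, starting from w = 0 *)
wFinal = Nest[# - 0.05 grad[#] &, 0., 200];

{wFinal, loss[wFinal]}
```

Each step shrinks the distance between the weight and 3 by a constant factor, so the loss falls toward zero. A real network does the same thing with millions of weights, using back propagation to obtain the gradient.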

The input data provided in any form needs to be converted to numeric tensors. Here are a few examples of tensors and their corresponding ranks (dimensions):

- Rank 0 (scalars): 0.0

- Rank 1 (vectors): {0.0, 1.0}

- Rank 2 (matrices): {{1., 2., 3.}, {3., 2., 1.}}

- Rank-*n* tensors: {… {… {1., 2., 3.}…}…}
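As a quick check of the ranks listed above, `ArrayDepth` reports the rank of an array and `Dimensions` reports its shape. The three sample values are the ones from the list:

```wolfram
(* Rank of a scalar, a vector and a matrix *)
ranks = ArrayDepth /@ {0.0, {0.0, 1.0}, {{1., 2., 3.}, {3., 2., 1.}}}
(* → {0, 1, 2} *)

(* Shape of the rank-2 example: 2 rows, 3 columns *)
shape = Dimensions[{{1., 2., 3.}, {3., 2., 1.}}]
(* → {2, 3} *)
```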

These examples provide valuable insight into how we can transform images into the corresponding ranked tensors:

Take an image and convert it to tensors using a `NetEncoder` function, applied to images:

✕
image = CloudGet["https://wolfr.am/De49Ylam"]; |

✕
enc = NetEncoder[{"Image", {64, 64}, ColorSpace -> "Grayscale" }] |

The variable `encoded` indeed contains a 64×64 matrix with a single channel (corresponding to the grayscale `ColorSpace`). We can confirm that by looking at the dimensions of this array:

✕
encoded = enc[image]; Dimensions@encoded |

`NetDecoder`, on the other hand, can take a numeric tensor and convert it to the data type of your choice. Typically, one takes the output of the neural net and feeds it to the net decoder. Here, to illustrate the workings of a net decoder, let us simply feed back the “encoded” image to see if the decoder works as expected. Because we get our original image back in the specified 64×64 matrix, we have confirmation that it’s working:

✕
dec = NetDecoder[{"Image", ColorSpace -> "Grayscale"}] |

✕
dec[encoded] |

A neural network consists of “layers” through which information is processed from the input to the output tensor. Each layer is defined by its mathematical operation. So, mathematically, we can define a linear layer as an affine transformation *y* = *Wx* + *b*, where *W* is the “weight matrix” and the vector *b* is the “bias vector”:

✕
data = {2, 10, 3}; layer = NetInitialize@LinearLayer[2, "Input" -> 3]; layer[data] |

✕
linear[data_, weight_, bias_] := weight.data + bias |

✕
linear[data, Normal@NetExtract[layer, "Weights"], Normal@NetExtract[layer, "Biases"]] |

The Wolfram Language provides many kinds of layers. You can start exploring them, each of which is defined by its associated mathematical operation, to see what tasks they accomplish and how they do it.

Generally, most people use the following rule of thumb: “When you don’t know how to fine-tune your neural network, stack more layers.” So it makes us wonder: how do you stack them? The answer leads us to the third component mentioned earlier: containers.

Once the data is converted into numeric tensors using `NetEncoder` and the layers are chosen for particular applications, they need to be “stitched” together using the containers. `NetChain` connects layers one after another in a chain, while `NetGraph` connects layers in an arbitrary graph, so that one layer can feed several others. Let’s try out some ways to use these containers to create networks.

Here we construct a chained network with nonlinear activations, which takes vector inputs of size 2 and produces vector outputs of size 3:

✕
NetChain[{LinearLayer[30], ElementwiseLayer[Tanh], LinearLayer[3], ElementwiseLayer[Tanh], LinearLayer[3], ElementwiseLayer[LogisticSigmoid]}, "Input" -> 2] |

Combine layers with `NetGraph`, using `ThreadingLayer` to merge their outputs elementwise:

✕
net = NetGraph[ {ElementwiseLayer[Ramp], ElementwiseLayer[Tanh], ThreadingLayer[Times], ThreadingLayer[Plus]}, {1 -> 2 -> 3, 1 -> 3 -> 4, 2 -> 4}] |
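To read the edge list, it can help to write the same computation as an ordinary function. The sketch below is a hand translation of the graph above, not something extracted from the net itself; `graphByHand` is a hypothetical name. The edges say that layer 1 feeds layers 2 and 3, layer 2 feeds layers 3 and 4, and layer 3 feeds layer 4.

```wolfram
graphByHand[x_] := Module[{r, t},
  r = Ramp[x];   (* layer 1: applied to the input *)
  t = Tanh[r];   (* layer 2: fed by layer 1 *)
  t*r + t        (* layer 3 multiplies layers 2 and 1; layer 4 adds layer 2 to that product *)
]

graphByHand[{-1., 0.5, 2.}]
```

Negative inputs are clipped to zero by `Ramp`, so they produce zero at the output as well.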

Stacking more layers brings us to a very serious problem: overfitting. When you keep stacking layers, you increase the number of tunable parameters in a network to thousands and even millions. The “deep” networks that are well suited to perform tasks like image classification, object detection and natural language processing are composed of lots of stacked layers, and contain millions of parameters. When you have millions of parameters in your network, the network tends to “memorize” your data. In other words, the network will closely fit your data, learn the eccentricities (or the “noise”) of your data and fail to generalize to other data. In an ideal world, where you have access to infinite data, the problem of overfitting would not arise. However, since we do not have access to infinite data (or the infinite training time required for infinite data), we need to make the best of what we have.

To tackle overfitting, we first divide our data into training, validation and test sets, keeping good enough percentages for the latter two. We can also perform data augmentation: `ImageAugmentationLayer` augments the images in your dataset by performing random cropping and other operations. Next, we use various regularization techniques. Apart from the very familiar concepts of L1 and L2 regularization, which can be found as options for `NetTrain`, `DropoutLayer` can also be used to tackle the problem of overfitting. `DropoutLayer[p]` sets the input elements to 0 with probability *p* during training, multiplying the remainder by 1/(1 − *p*), and works as an effective and widely used regularization technique. Other methods like early stopping and multi-task learning can also be useful.
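The dropout rule can be imitated on a plain list to see why the surviving elements are rescaled. This is a sketch of the idea only, not the internals of `DropoutLayer`; `dropoutByHand`, `input` and the probability 0.5 are hypothetical choices for illustration.

```wolfram
(* Zero each element with probability p and rescale the survivors by 1/(1 - p) *)
dropoutByHand[x_List, p_] := Module[{keep},
  keep = RandomChoice[{p, 1 - p} -> {0, 1}, Length[x]];
  x keep/(1 - p)
]

SeedRandom[1];
input = {1., 2., 3., 4.};

(* One random draw: some elements are zeroed, the others are doubled *)
dropoutByHand[input, 0.5]

(* Averaged over many draws, the result stays close to the original input *)
Mean[Table[dropoutByHand[input, 0.5], 10000]]
```

Rescaling keeps the expected value of each element unchanged, so the layers that follow see inputs of the same typical size whether or not dropout is active.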

The current consensus in the neural net community is that building your own net architecture is unnecessary for the majority of neural net applications, and will usually hurt performance. Rather, adapting a pretrained net to your own problem is almost always a better approach in terms of performance. Luckily, this approach has the added benefit of being much easier to work with!

Thus, having a large neural net repository is absolutely key to being productive with the neural net framework, as it allows you to look for a net close to the problem you are solving, do minimal amounts of “surgery” on the net to adapt it to your specific problem and then train it.

The Wolfram Neural Net Repository gives Wolfram Language users easy access to the latest net architectures and pretrained nets, representing thousands of hours of computation time on powerful GPUs. The repository consists of publicly available models converted from other neural net frameworks (such as Caffe, Torch, MXNet, TensorFlow, etc.) into the Wolfram neural network format. In addition, we have trained a number of nets ourselves, which you can find in the Wolfram Neural Net Repository and the introductory blog. All the neural network models in the repository can be programmatically accessed via `NetModel`. Here’s a sample of 10 random models:

✕
RandomSample[NetModel[], 10] |

Let’s look at an example of the net surgery process to solve the cat-versus-dog classification problem. First, obtain a net similar to our problem:

✕
net = NetModel["ResNet-50 Trained on ImageNet Competition Data"] |

The last two layers are specialized for the `ImageNet` classification task, and won’t be needed for our purposes. So we simply remove the last two layers using `NetDrop`:

✕
netFeature = NetDrop[net, -2] |

Note that it is particularly easy to do “net surgery” in the Wolfram Language: nets are symbolic expressions that can be manipulated using a large set of surgery functions, such as `NetTake`, `NetDrop`, `NetAppend`, `NetJoin`, etc. Now we simply need to define a new `NetChain` that will classify an image as “dog” or “cat”:

✕
netNew = NetChain[ <|"feature" -> netFeature, "classifier" -> LinearLayer[], "probabilities" -> SoftmaxLayer[]|>, "Output" -> NetDecoder[{"Class", {"dog", "cat"}}]] |

✕
NetTrain[netNew, catdogTrain, "FinalPlots", LearningRateMultipliers -> {"classifier" -> 1, _ -> 0}, ValidationSet -> catdogTest, MaxTrainingRounds -> 10, Method -> "SGD" ] |

Neural networks are data-driven algorithms, so the first step is to investigate your data thoroughly. Various statistical and visualization techniques can be used to see patterns and variations in the data. Once you have a better understanding of your data, decide on your network. The best bet is to start from networks that have been trained and validated by established researchers, or at least take inspiration from the various “building units” in them. A great place to start is the Wolfram Neural Net Repository, where you can play with various network surgery functions. Once you have created the architecture, start experimenting with various parameters, initializations and losses. It is absolutely okay to overfit at this stage! Finally, you can apply the regularization techniques used in the original model, or the ones discussed above, to help your model generalize.

Visit the Wolfram Data Repository and Neural Net Repository for a combination of immediately useful resources for getting started.

- Wolfram U class: Exploring the Neural Net Framework from Building to Training

- Wolfram Blog: “Launching the Wolfram Neural Net Repository”

- Get started: Wolfram Neural Net Repository

- Need data? Wolfram Data Repository

Graph theory has a history dating back to 1735, when Swiss mathematician Leonhard Euler proved that the Königsberg bridge problem has no solution. Today it touches us in many ways, from discovering the shortest route while on vacation to returning relevant links with a web search. Smooth traffic flow, efficient package delivery and reliable power grids utilize graph theory and affect our daily lives.

The Wolfram Language provides an extensive set of tools to reveal the underlying structures of a particular graph. Styling plays a significant role in the ability to analyze these graph structures easily. The `PlotTheme` family of themes, already used by the Wolfram Language’s visualizations and gauges, has found its way to the graph functions and provides a simple way to apply various styles.

As with any family, each member has its own unique personality. The personality of each theme was designed to help with the visual challenges that surface during graph analysis. Some themes work well for locating paths in a complex graph, while others add visual excitement to an otherwise mundane graph.

`PlotTheme` sets a theme for Wolfram Language visualizations. A theme is a list of option values called by a single string name.

✕
CompleteGraph[5, PlotTheme -> "Web"] |

Included are the eight original base themes plus three feature themes found only in `Graph`: `"LargeGraph"`, `"ClassicLabeled"` and `"IndexLabeled"`.
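One way to compare themes side by side is to build the same graph once per theme. The sketch below uses only theme names that appear in this post; the choice of `CompleteGraph[5]` and the image size are arbitrary.

```wolfram
(* Theme names mentioned in this post *)
themes = {"Web", "Minimal", "Marketing", "LargeGraph", "ClassicLabeled", "IndexLabeled"};

(* The same five-vertex complete graph, once per theme, labeled with the theme name *)
Table[
 Labeled[CompleteGraph[5, PlotTheme -> t, ImageSize -> 100], t],
 {t, themes}]
```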

Each theme automatically handles its own highlighting style.

`PlotTheme` does not overwhelm with an unlimited number of themes. Instead, each theme includes specific useful features.

For example, these features can help locate a path in a complex network:

Spice up a graph with different colors and shapes.

Locate a specific vertex with labels.

Produce artwork suitable for one-color printing.

Gratify an artistic side with color variety.

And of course, for the sentimental, the original styles remain available.

A `PlotTheme` combined with `GraphHighlightStyle` eases the task of finding that perfect style. As illustrated here, it helps the `"LargeGraph"` theme reveal a path in a complex graph:
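A minimal sketch of that combination, assuming a grid graph as a stand-in for a complex network. The graph, the endpoints and the `"Thick"` highlight style are illustrative choices, not the original example.

```wolfram
(* An 8×8 grid graph drawn with the "LargeGraph" theme *)
g = GridGraph[{8, 8}, PlotTheme -> "LargeGraph"];

(* A shortest path between two opposite corners *)
path = FindShortestPath[g, 1, 64];

(* Highlight the path; GraphHighlightStyle controls how it stands out *)
HighlightGraph[g, PathGraph[path], GraphHighlightStyle -> "Thick"]
```

The `"LargeGraph"` theme keeps the background graph visually quiet, and the highlight style does the work of making the path readable.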

This section is for those who came to this blog expecting a function to connect the dots. The function utilizes `Graph` and `PlotTheme` and is included with the downloadable notebook of this blog.

✕
connectTheDots[CloudGet["https://wolfr.am/D4Xf2C4o"], PlotTheme -> "Minimal"] |

Set the `"ShowSolution"` option to `True` to see the result.

✕
connectTheDots[CloudGet["https://wolfr.am/D4Xf2C4o"], "ShowSolution" -> True] |

Other themes provide unique styling.

✕
connectTheDots[CloudGet["https://wolfr.am/D4Xf2C4o"], PlotTheme -> "Marketing"] |

Visually simplifying a problem with graph theory is a necessity in our world of growing complexity. Explore examples in the Wolfram Demonstrations Project to see what others have discovered. Visit the Graph Visualization and `PlotTheme` documentation pages to learn more.

Today we’re releasing Version 12 of Wolfram Language (and Mathematica) on desktop platforms, and in the Wolfram Cloud. We released Version 11.0 in August 2016, 11.1 in March 2017, 11.2 in September 2017 and 11.3 in March 2018. It’s a big jump from Version 11.3 to Version 12.0. Altogether there are 278 completely new functions, in perhaps 103 areas, together with thousands of different updates across the system:

In an “integer release” like 12, our goal is to provide fully-filled-out new areas of functionality. But in every release we also want to deliver the latest results of our R&D efforts. In 12.0, perhaps half of our new functions can be thought of as finishing areas that were started in previous “.1” releases—while half begin new areas. I’ll discuss both types of functions in this piece, but I’ll be particularly emphasizing the specifics of what’s new in going from 11.3 to 12.0.

I must say that now that 12.0 is finished, I’m amazed at how much is in it, and how much we’ve added since 11.3. In my keynote at our Wolfram Technology Conference last October I summarized what we had up to that point—and even that took nearly 4 hours. Now there’s even more.

What we’ve been able to do is a testament both to the strength of our R&D effort, and to the effectiveness of the Wolfram Language as a development environment. Both these things have of course been building for three decades. But one thing that’s new with 12.0 is that we’ve been letting people watch our behind-the-scenes design process—livestreaming more than 300 hours of my internal design meetings. So in addition to everything else, I suspect this makes Version 12.0 the very first major software release in history that’s been open in this way.

OK, so what’s new in 12.0? There are some big and surprising things—notably in chemistry, geometry, numerical uncertainty and database integration. But overall, there are lots of things in lots of areas—and in fact even the basic summary of them in the Documentation Center is already 19 pages long:

Although nowadays the vast majority of what the Wolfram Language (and Mathematica) does isn’t what’s usually considered math, we still put immense R&D effort into pushing the frontiers of what can be done in math. And as a first example of what we’ve added in 12.0, here’s the rather colorful `ComplexPlot3D`:

✕
ComplexPlot3D[Gamma[z],{z,-4-4I,4+4I}] |

It’s always been possible to write Wolfram Language code to make plots in the complex plane. But only now have we solved the math and algorithm problems that are needed to automate the process of robustly plotting even quite pathological functions in the complex plane.

Years ago I remember painstakingly plotting the dilogarithm function, with its real and imaginary parts. Now `ReImPlot` just does it:

✕
ReImPlot[PolyLog[2, x], {x, -4, 4}] |

The visualization of complex functions is (pun aside) a complex story, with details making a big difference in what one notices about a function. And so one of the things we’ve done in 12.0 is to introduce carefully selected standardized ways (such as named color functions) to highlight different features:

✕
ComplexPlot[(z^2+1)/(z^2-1),{z,-2-2I,2+2I},ColorFunction->"CyclicLogAbsArg"] |

Measurements in the real world often have uncertainty that gets represented as values with ± errors. We’ve had add-on packages for handling “numbers with errors” for ages. But in Version 12.0 we’re building in computation with uncertainty, and we’re doing it right.

The key is the symbolic object `Around[x, δ]`, which represents a value “around *x*”, with uncertainty *δ*:

✕
Around[7.1,.25] |

You can do arithmetic with `Around`, and there’s a whole calculus for how the uncertainties combine:

✕
Sqrt[Around[7.1,.25]]+Around[1,.1] |
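To get a feel for how the uncertainties combine, the same result can be approximated by hand. This sketch assumes first-order propagation with independent uncertainties (an uncertainty δ in *x* becomes |*f*′(*x*)| δ, and independent uncertainties add in quadrature); `sqrtValue` and `sqrtError` are names made up for this example.

```wolfram
(* Sqrt applied to 7.1 ± 0.25: the derivative of Sqrt[x] is 1/(2 Sqrt[x]) *)
sqrtValue = Sqrt[7.1];
sqrtError = 0.25/(2 Sqrt[7.1]);

(* Adding an independent 1 ± 0.1: values add, uncertainties add in quadrature *)
{sqrtValue + 1, Sqrt[sqrtError^2 + 0.1^2]}
(* approximately {3.66, 0.11} *)
```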

If you plot `Around` numbers, they’ll be shown with error bars:

✕
ListPlot[Table[Around[n,RandomReal[Sqrt[n]]],{n,20}]] |

There are lots of options—like here’s one way to show uncertainty in both *x* and *y*:

✕
ListPlot[Table[Around[RandomReal[10],RandomReal[1]],20,2],IntervalMarkers->"Ellipses"] |

You can have `Around` quantities:

✕
1/Around[Quantity[3, "Metres"], Quantity[3.5, "Centimetres"]] |

And you can also have symbolic `Around` objects:

✕
Around[x,Subscript[δ, x]]+Around[y,Subscript[δ, y]] |

But what really is an `Around` object? It’s something where there are certain rules for combining uncertainties, that are based on uncorrelated normal distributions. But there’s no statement being made that `Around[x, δ]` represents anything that actually in detail follows a normal distribution.

OK, so let’s say you make a bunch of measurements of some value. You can get an estimate of the value—together with its uncertainty—using `MeanAround` (and, yes, if the measurements themselves have uncertainties, these will be taken into account in weighting their contributions):

✕
MeanAround[{1.4,1.7,1.8,1.2,1.5,1.9,1.7,1.3,1.7,1.9,1.0,1.7}] |
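For plain numbers like these, a natural estimate to compare against is the sample mean together with the standard error of the mean. Treat the correspondence with `MeanAround` as an assumption to check rather than a statement about its implementation; `values` is a name introduced for this sketch.

```wolfram
values = {1.4, 1.7, 1.8, 1.2, 1.5, 1.9, 1.7, 1.3, 1.7, 1.9, 1.0, 1.7};

(* Sample mean, and the standard deviation divided by the square root of the sample size *)
{Mean[values], StandardDeviation[values]/Sqrt[Length[values]]}
```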

Functions all over the system, notably in machine learning, are starting to have the option `ComputeUncertainty -> True`, which makes them give `Around` objects rather than pure numbers.

`Around` might seem like a simple concept, but it’s full of subtleties—which is the main reason it’s taken until now for it to get into the system. Many of the subtleties revolve around correlations between uncertainties. The basic idea is that the uncertainty of every `Around` object is assumed to be independent. But sometimes one has values with correlated uncertainties—and so in addition to `Around`, there’s also `VectorAround`, which represents a vector of potentially correlated values with a specified covariance matrix.

There’s even more subtlety when one’s dealing with things like algebraic formulas. If one replaces `x` here with an `Around`, then, following the rules of `Around`, each instance is assumed to be uncorrelated:

✕
(Exp[x]+Exp[x/2])/.x->Around[0,.3] |

But probably one wants to assume here that even though the value of `x` may be uncertain, it’s going to be the same for each instance, and one can do this using the function `AroundReplace` (notice the result is different):

✕
AroundReplace[Exp[x]+Exp[x/2],x->Around[0,.3]] |
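The difference between the two results can be reproduced by hand under a first-order approximation, which is assumed here for illustration. In the first case the two exponentials are treated as having independent uncertainties. In the second, the whole expression is differentiated once with respect to a single shared variable. The names `dx`, `u`, `independent` and `shared` are hypothetical.

```wolfram
dx = 0.3;  (* uncertainty of the variable, which has value 0 *)

(* Independent occurrences: Exp[u] contributes 1*dx and Exp[u/2] contributes (1/2)*dx, added in quadrature *)
independent = Sqrt[(Exp[0] dx)^2 + ((Exp[0]/2) dx)^2];

(* One shared variable: the derivative of the full expression at 0 is 1 + 1/2 *)
shared = Abs[D[Exp[u] + Exp[u/2], u] /. u -> 0] dx;

{independent, shared}
(* approximately {0.335, 0.45} *)
```

The shared case gives the larger uncertainty because both terms move in the same direction when the variable changes, so their errors cannot partially cancel.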

There’s lots of subtlety in how to display uncertain numbers. Like how many trailing 0s should you put in:

✕
Around[1,.0006] |

Or how much precision of the uncertainty should you include (there’s a conventional breakpoint when the trailing digits are 35):

✕
{Around[1.2345,.000312],Around[1.2345,.00037]} |

In rare cases where lots of digits are known (think, for example, some physical constants), one wants to go to a different way to specify uncertainty:

✕
Around[1.23456789,.000000001] |

And it goes on and on. But gradually `Around` is going to start showing up all over the system. By the way, there are lots of other ways to specify `Around` numbers. This is a number with 10% relative error:

✕
Around[2,Scaled[.1]] |

This is the best `Around` can do in representing an interval:

✕
Around[Interval[{2,3}]] |

For a distribution, `Around` computes variance:

✕
Around[NormalDistribution[2,1]] |

It can also take into account asymmetry by giving asymmetric uncertainties:

✕
Around[LogNormalDistribution[2,1]] |

In making math computational, it’s always a challenge to both be able to “get everything right”, and not to confuse or intimidate elementary users. Version 12.0 introduces several things to help. First, try solving an irreducible quintic equation:

✕
Solve[x^5 + 6 x + 1 == 0, x] |

In the past, this would have shown a bunch of explicit `Root` objects. But now the `Root` objects are formatted as boxes showing their approximate numerical values. Computations work exactly the same, but the display doesn’t immediately confront people with having to know about algebraic numbers.

When we say `Integrate`, we mean “find an integral”, in the sense of an antiderivative. But in elementary calculus, people want to see explicit constants of integration (as they always have in Wolfram|Alpha), so we added an option for that (and C[*n*] also has a nice, new output form):

✕
Integrate[x^3,x,GeneratedParameters->C] |
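A simple sanity check is that the generated constant disappears again on differentiation; `antiderivative` is just a name chosen for this sketch.

```wolfram
antiderivative = Integrate[x^3, x, GeneratedParameters -> C];

(* Differentiating removes the constant of integration and recovers the integrand *)
D[antiderivative, x]
(* → x^3 *)
```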

When we benchmark our symbolic integration capabilities we do really well. But there’s always more that can be done, particularly in terms of finding the simplest forms of integrals (and at a theoretical level this is an inevitable consequence of the undecidability of symbolic expression equivalence). In Version 12.0 we’ve continued to pick away at the frontier, adding cases like:

✕
\[Integral]Sqrt[ Sqrt[x] + Sqrt[2 x + 2 Sqrt[x] + 1] + 1] \[DifferentialD]x |

✕
\[Integral]x^2/(ProductLog[a/x] + 1) \[DifferentialD]x |

In Version 11.3 we introduced asymptotic analysis, being able to find asymptotic values of integrals and so on. Version 12.0 adds asymptotic sums, asymptotic recurrences and asymptotic solutions to equations:

✕
AsymptoticSum[1/Sqrt[k], {k, 1, n}, {n, \[Infinity], 5}] |

✕
AsymptoticSolve[x y^4 - (x + 1) y^2 + x == 1, y, {x, 0, 3}, Reals] |

One of the great things about making math computational is that it gives us new ways to explain math itself. And something we’ve been doing is to enhance our documentation so that it explains the math as well as the functions. For example, here’s the beginning of the documentation about `Limit`—with diagrams and examples of the core mathematical ideas:

Polygons have been part of the Wolfram Language since Version 1. But in Version 12.0 they’re getting generalized: now there’s a systematic way to specify holes in them. A classic geographic use case is the polygon for South Africa—with its hole for the country of Lesotho.

In Version 12.0, much like `Root`, `Polygon` gets a convenient new display form:

✕
RandomPolygon[20] |

You can compute with it just as before:

✕
Area[%] |

`RandomPolygon` is new too. You can ask, say, for 5 random convex polygons, each with 10 vertices, in 3D:

✕
Graphics3D[RandomPolygon[3->{"Convex",10},5]] |

There are lots of new operations on polygons. Like `PolygonDecomposition`, which can, for example, decompose a polygon into convex parts:

✕
RandomPolygon[8] |

✕
PolygonDecomposition[%, "Convex"] |

Polygons with holes introduce a need for other kinds of operations too, like `OuterPolygon`, `SimplePolygonQ`, and `CanonicalizePolygon`.
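As a small illustration of the hole syntax, here is a square with a square hole, written in the `outer -> {holes}` form. The coordinates and the names `outer`, `hole` and `withHole` are made up for this example.

```wolfram
outer = {{0, 0}, {4, 0}, {4, 4}, {0, 4}};   (* 4×4 square *)
hole = {{1, 1}, {3, 1}, {3, 3}, {1, 3}};    (* 2×2 square cut out of the middle *)

withHole = Polygon[outer -> {hole}];

(* Area with the hole removed, and area of the outer boundary alone *)
{Area[withHole], Area[OuterPolygon[withHole]]}
```

The expected areas are 16 − 4 = 12 with the hole and 16 for the outer polygon on its own.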

Polygons are pretty straightforward to specify: you just give their vertices in order (and if they have holes, you also give the vertices for the holes). Polyhedra are a bit more complicated: in addition to giving the vertices, you have to say how these vertices form faces. But in Version 12.0, `Polyhedron` lets you do this in considerable generality, including voids (the 3D analog of holes), etc.

But first, recognizing their 2000+ years of history, Version 12.0 introduces functions for the five Platonic solids:

✕
Graphics3D[Dodecahedron[]] |

And given the Platonic solids, one can immediately start computing with them:

✕
Volume[Dodecahedron[]] |
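Since `Dodecahedron[]` has unit edge length, the result can be compared with the textbook closed form (15 + 7√5)/4 for the volume of a regular dodecahedron with edge 1:

```wolfram
(* Compare the computed volume with the classical closed form for unit edge length *)
FullSimplify[Volume[Dodecahedron[]] == (15 + 7 Sqrt[5])/4]

(* Numerical value, roughly 7.66 *)
N[(15 + 7 Sqrt[5])/4]
```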

Here’s the solid angle subtended at vertex 1 (since it’s Platonic, all the vertices give the same angle):

✕
PolyhedronAngle[Dodecahedron[],1] |

Here’s an operation done on the polyhedron:

✕
Graphics3D[BeveledPolyhedron[Dodecahedron[],1]] |

✕
Volume[DualPolyhedron[BeveledPolyhedron[Dodecahedron[],1]]] |

Beyond the Platonic solids, Version 12 also builds in all the “uniform polyhedra” (*n* edges and *m* faces meet at each vertex)—and you can also get symbolic `Polyhedron` versions of named polyhedra from `PolyhedronData`:

✕
Graphics3D[AugmentedPolyhedron[PolyhedronData["Spikey","Polyhedron"],2]] |

You can make any polyhedron (including a “random” one, with `RandomPolyhedron`), then do whatever computations you want on it:

✕
RegionUnion[Dodecahedron[{0,0,0}],Dodecahedron[{1,1,1}]] |

✕
SurfaceArea[%] |

Mathematica and the Wolfram Language are very powerful at doing both explicit computational geometry and geometry represented in terms of algebra. But what about geometry the way it’s done in Euclid’s *Elements*—in which one makes geometric assertions and then sees what their consequences are?

Well, in Version 12, with the whole tower of technology we’ve built, we’re finally able to deliver a new style of mathematical computation—that in effect automates what Euclid was doing 2000+ years ago. A key idea is to introduce symbolic “geometric scenes” that have symbols representing constructs such as points, and then to define geometric objects and relations in terms of them.

For example, here’s a geometric scene representing a triangle *a*, *b*, *c*, and a circle through *a*, *b* and *c*, with center *o*, with the constraint that *o* is at the midpoint of the line from *a* to *c*:

✕
GeometricScene[{a,b,c,o},{Triangle[{a,b,c}],CircleThrough[{a,b,c},o],o==Midpoint[{a,c}]}] |

On its own, this is just a symbolic thing. But we can do operations on it. For example, we can ask for a random instance of it, in which *a*, *b*, *c* and *o* are made specific:

✕
RandomInstance[GeometricScene[{a,b,c,o},{Triangle[{a,b,c}],CircleThrough[{a,b,c},o],o==Midpoint[{a,c}]}]] |

You can generate as many random instances as you want. We try to make the instances as generic as possible, with no coincidences that aren’t forced by the constraints:

✕
RandomInstance[GeometricScene[{a,b,c,o},{Triangle[{a,b,c}],CircleThrough[{a,b,c},o],o==Midpoint[{a,c}]}],3] |

OK, but now let’s “play Euclid”, and find geometric conjectures that are consistent with our setup:

✕
FindGeometricConjectures[GeometricScene[{a,b,c,o},{Triangle[{a,b,c}],CircleThrough[{a,b,c},o],o==Midpoint[{a,c}]}]] |

For a given geometric scene, there may be many possible conjectures. We try to pick out the interesting ones. In this case we come up with two—and what’s illustrated is the first one: that the line *ba* is perpendicular to the line *cb*. As it happens, this result actually appears in Euclid (it’s in Book 3, as part of Proposition 31)— though it’s usually called Thales’s theorem.
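The first conjecture can be checked with explicit coordinates. This sketch places the circle at the origin with radius 1, which is an assumption made for convenience; `pa`, `pb`, `pc` and the angle `t` are names introduced here so that the scene symbols *a*, *b*, *c* stay unassigned.

```wolfram
(* pa and pc are the ends of a diameter; pb is any point on the unit circle *)
pa = {-1, 0};
pc = {1, 0};
pb = {Cos[t], Sin[t]};

(* The dot product of the two chords meeting at pb vanishes, so the angle there is a right angle *)
Simplify[(pa - pb).(pc - pb)]
(* → 0 *)
```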

In 12.0, we now have a whole symbolic language for representing typical things that appear in Euclid-style geometry. Here’s a more complex situation—corresponding to what’s called Napoleon’s theorem:

✕
RandomInstance[ GeometricScene[{"C", "B", "A", "C'", "B'", "A'", "Oc", "Ob", "Oa"}, {Triangle[{"C", "B", "A"}], TC == Triangle[{"A", "B", "C'"}], TB == Triangle[{"C", "A", "B'"}], TA == Triangle[{"B", "C", "A'"}], GeometricAssertion[{TC, TB, TA}, "Regular"], "Oc" == TriangleCenter[TC, "Centroid"], "Ob" == TriangleCenter[TB, "Centroid"], "Oa" == TriangleCenter[TA, "Centroid"], Triangle[{"Oc", "Ob", "Oa"}]}]] |

In 12.0 there are lots of new and useful geometric functions that work on explicit coordinates:

✕
CircleThrough[{{0,0},{2,0},{0,3}}] |

✕
TriangleMeasurement[Triangle[{{0,0},{1,2},{3,4}}],"Inradius"] |

For triangles there are 12 types of “centers” supported, and, yes, there can be symbolic coordinates:

✕
TriangleCenter[Triangle[{{0,0},{1,2},{3,y}}],"NinePointCenter"] |

And to support setting up geometric statements we also need “geometric assertions”. In 12.0 there are 29 different kinds—such as `"Parallel"`, `"Congruent"`, `"Tangent"`, `"Convex"`, etc. Here are three circles asserted to be pairwise tangent:

✕
RandomInstance[GeometricScene[{a,b,c},{GeometricAssertion[{Circle[a],Circle[b],Circle[c]},"PairwiseTangent"]}]] |

Version 11.3 introduced `FindEquationalProof` for generating symbolic representations of proofs. But what axioms should be used for these proofs? Version 12.0 introduces `AxiomaticTheory`, which gives axioms for various common axiomatic theories.

Here’s my personal favorite axiom system:

✕
AxiomaticTheory["WolframAxioms"] |

What does this mean? In a sense it’s a more symbolic symbolic expression than we’re used to. In something like 1 + `x` we don’t say what the value of `x` is, but we imagine that it can have a value. In the expression above, a, b and c are pure “formal symbols” that serve an essentially structural role, and can’t ever be thought of as having concrete values.

What about the · (center dot)? In 1 + `x` we know what + means. But the · is intended to be a purely abstract operator. The point of the axiom is in effect to define a constraint on what · can represent. In this particular case, it turns out that the axiom is an axiom for Boolean algebra, so that · can represent `Nand` and `Nor`. But we can derive consequences of the axiom completely formally, for example with `FindEquationalProof`:

✕
FindEquationalProof[p·q==q·p,AxiomaticTheory["WolframAxioms"]] |
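The claim that the operator can stand for `Nand` or `Nor` can be tested with a truth-table check. This sketch assumes the axiom has the form ((*a*·*b*)·*c*)·(*a*·((*a*·*c*)·*a*)) = *c*; `wolframAxiom` is a hypothetical helper, and `p`, `q`, `r` are ordinary Boolean variables rather than formal symbols.

```wolfram
(* The assumed axiom, with the abstract operator replaced by a concrete Boolean function op *)
wolframAxiom[op_] := Equivalent[op[op[op[p, q], r], op[p, op[op[p, r], p]]], r];

(* True means the equation holds for every assignment of truth values *)
TautologyQ /@ {wolframAxiom[Nand], wolframAxiom[Nor]}
(* → {True, True} *)
```

This only confirms that `Nand` and `Nor` satisfy the axiom. Showing that the axiom implies all of Boolean algebra is a separate and much harder result.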

There’s quite a bit of subtlety in all of this. In the example above, it’s useful to have · as the operator, not least because it displays nicely. But there’s no built-in meaning to it, and `AxiomaticTheory` lets you give something else (here `f`) as the operator:

✕
AxiomaticTheory[{"WolframAxioms",<|"Nand"->f|>}] |

What’s the “`Nand`” doing there? It’s a name for the operator (but it shouldn’t be interpreted as anything to do with the value of the operator). In the axioms for group theory, for example, several operators appear:

✕
AxiomaticTheory["GroupAxioms"] |

This gives the default representations of the various operators here:

✕
AxiomaticTheory["GroupAxioms","Operators"] |

`AxiomaticTheory` knows about notable theorems for particular axiomatic systems:

✕
AxiomaticTheory["GroupAxioms","NotableTheorems"] |

The basic idea of formal symbols was introduced in Version 7, for doing things like representing dummy variables in generated constructs like these:

✕
PDF[NormalDistribution[0,1]] |

✕
Sum[2^n n!, n] |

✕
Entity["Surface", "Torus"][EntityProperty["Surface", "AlgebraicEquation"]] |

You can enter a formal symbol using `\[FormalA]` or `Esc` .a `Esc`, etc. But back in Version 7, `\[FormalA]` was rendered as a. And that meant the expression above looked like:

✕
Function[{\[FormalA], \[FormalC]}, Function[{\[FormalX], \[FormalY], \[FormalZ]}, \[FormalA]^4 - 2 \[FormalA]^2 \[FormalC]^2 + \[FormalC]^4 - 2 \[FormalA]^2 \[FormalX]^2 - 2 \[FormalC]^2 \[FormalX]^2 + \[FormalX]^4 - 2 \[FormalA]^2 \[FormalY]^2 - 2 \[FormalC]^2 \[FormalY]^2 + 2 \[FormalX]^2 \[FormalY]^2 + \[FormalY]^4 - 2 \[FormalA]^2 \[FormalZ]^2 + 2 \[FormalC]^2 \[FormalZ]^2 + 2 \[FormalX]^2 \[FormalZ]^2 + 2 \[FormalY]^2 \[FormalZ]^2 + \[FormalZ]^4]] |

I always thought this looked incredibly complicated. And for Version 12 we wanted to simplify it. We tried many possibilities, but eventually settled on single gray underdots—which I think look much better.

In `AxiomaticTheory`, both the variables and the operators are “purely symbolic”. But one thing that’s definite is the arity of each operator, which one can ask `AxiomaticTheory` for:

✕
AxiomaticTheory["BooleanAxioms"] |

✕
AxiomaticTheory["BooleanAxioms","OperatorArities"] |

Conveniently, the representation of operators and arities can immediately be fed into `Groupings`, to get possible expressions involving particular variables:

✕
Groupings[{a,b},%] |

Axiomatic theories represent a classic historical area for mathematics. Another classical historical area—much more on the applied side—is the *n*-body problem. Version 12.0 introduces `NBodySimulation`, which gives simulations of the *n*-body problem. Here’s a three-body problem (think Earth-Moon-Sun) with certain initial conditions (and inverse-square force law):

✕
NBodySimulation["InverseSquare",{<|"Mass"->1,"Position"->{0,0},"Velocity"->{0,.5}|>, <|"Mass"->1,"Position"->{1,1},"Velocity"->{0,-.5}|>, <|"Mass"->1,"Position"->{0,1},"Velocity"->{0,0}|>},4] |

You can ask about various aspects of the solution; this plots the positions as a function of time:

✕
ParametricPlot[Evaluate[%[All, "Position", t]], {t, 0, 4}] |

Underneath, this is just solving differential equations, but—a bit like `SystemModel`—`NBodySimulation` provides a convenient way to set up the equations and handle their solutions. And, yes, standard force laws are built in, but you can define your own.

We’ve been polishing the core of the Wolfram Language for more than 30 years now, and in each successive version we end up introducing some new extensions and conveniences.

We’ve had the function `Information` ever since Version 1.0, but in 12.0 we’ve greatly extended it. It used to just give information about symbols (although that’s been modernized as well):

✕
Information[Sin] |

But now it also gives information about lots of kinds of objects. Here’s information on a classifier:

✕
Information[Classify["NotablePerson"]] |

Here’s information about a cloud object:

✕
Information[CloudPut[100!]] |

Hover over the labels in the “information box” and you can find out the names of the corresponding properties:

✕
Information[CloudPut[100!],"FileHashMD5"] |

For entities, `Information` gives a summary of known property values:

✕
Information[Entity["Element", "Tungsten"]] |

Over the past few versions, we’ve introduced a lot of new summary display forms. In Version 11.3 we introduced `Iconize`, which is essentially a way of creating a summary display form for anything. `Iconize` has proved to be even more useful than we originally anticipated. It’s great for hiding unnecessary complexity both in notebooks and in pieces of Wolfram Language code. In 12.0 we’ve redesigned how `Iconize` displays, particularly to make it “read nicely” inside expressions and code.

You can explicitly iconize something:

✕
{a,b,Iconize[Range[10]]} |

Press the + and you’ll see some details:

Press the un-iconize button and you’ll get the original expression again:

If you have lots of data you want to reference in a computation, you can always store it in a file, or in the cloud (or even in a data repository). It’s usually more convenient, though, to just put it in your notebook, so you have everything in the same place. One way to avoid the data “taking over your notebook” is to put it in closed cells. But `Iconize` provides a much more flexible and elegant way to do this.

When you’re writing code, it’s often convenient to “iconize in place”. The right-click menu now lets you do that:

✕
Plot[Sin[x], {x, 0, 10}, PlotStyle -> Red, Filling -> Axis, FillingStyle -> LightYellow] |

Talking of display, here’s something small but convenient that we added in 12.0:

✕
PercentForm[0.3] |

And here are a couple of other “number conveniences” that we added:

✕
NumeratorDenominator[11/4] |

✕
MixedFractionParts[11/4] |
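Both functions are shorthand for pairs of values one could already compute separately. Here is a quick sketch of the correspondence for a positive rational, using only long-standing primitives:

```wolfram
r = 11/4;

(* The same pair that NumeratorDenominator[11/4] returns *)
numDen = {Numerator[r], Denominator[r]}           (* {11, 4} *)

(* Integer part and remaining proper fraction, matching MixedFractionParts[11/4] *)
mixed = {IntegerPart[r], FractionalPart[r]}       (* {2, 3/4} *)
```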

Functional programming has always been a central part of the Wolfram Language. But we’re continually looking to extend it, and to introduce new, generally useful primitives. An example in Version 12.0 is `SubsetMap`:

✕
SubsetMap[Reverse, {a, b, c, xxx, yyy, zzz}, {2, 5}] |

✕
SubsetMap[Reverse@*Map[f], {a, b, c, xxx, yyy, zzz}, {2, 5}] |

Functions are normally things that can take several inputs, but always give a single piece of output. In areas like quantum computing, however, one’s interested instead in having *n* inputs and *n* outputs. `SubsetMap` effectively implements *n* → *n* functions, picking up inputs from specified positions in a list, applying some operation to them, then putting back the results at the same positions.
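The pick-up, apply, put-back description can be mimicked in a few lines with ordinary part assignment. This is only a sketch of the idea for flat lists of positions, and `subsetMapSketch` is a hypothetical name, not a built-in function:

```wolfram
(* subsetMapSketch is a hypothetical name, not a built-in function *)
subsetMapSketch[f_, list_List, pos_List] :=
  Module[{result = list},
   (* pick up the inputs at the given positions, apply f to them as a list,
      and write the outputs back to the same positions *)
   result[[pos]] = f[list[[pos]]];
   result];

subsetMapSketch[Reverse, {a, b, c, xxx, yyy, zzz}, {2, 5}]
(* {a, yyy, c, xxx, b, zzz} *)
```

The function `f` has to return as many elements as it was given, since each output needs a position to go back into.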

I started formulating what’s now `SubsetMap` about a year ago. And I quickly realized that actually I could really have used this function in all sorts of places over the years. But what should this particular “lump of computational work” be called? My initial working name was `ArrayReplaceFunction` (which I shortened to `ARF` in my notes). In a sequence of (livestreamed) meetings we went back and forth. There were ideas like `ApplyAt` (but it’s not really `Apply`) and `MutateAt` (but it’s not doing mutation in the lvalue sense), as well as `RewriteAt`, `ReplaceAt`, `MultipartApply` and `ConstructInPlace`. There were ideas about curried “function decorator” forms, like `PartAppliedFunction`, `PartwiseFunction`, `AppliedOnto`, `AppliedAcross` and `MultipartCurry`.

But somehow when we explained the function we kept on coming back to talking about how it was operating on a subset of a list, and how it was really like `Map`, except that it was operating on multiple elements at a time. So finally we settled on the name `SubsetMap`. And—in yet another reinforcement of the importance of language design—it’s remarkable how, once one has a name for something like this, one immediately finds oneself able to reason about it, and see where it can be used.

For many years we’ve worked hard to make the Wolfram Language the highest-level and most automated system for doing state-of-the-art machine learning. Early on, we introduced the “superfunctions” `Classify` and `Predict` that do classification and prediction tasks in a completely automated way, automatically picking the best approach for the particular input given. Along the way, we’ve introduced other superfunctions—like `SequencePredict`, `ActiveClassification` and `FeatureExtract`.

In Version 12.0 we’ve got several important new machine learning superfunctions. There’s `FindAnomalies`, which finds “anomalous elements” in data:

✕
FindAnomalies[{1.2, 2.5, 3.2, 107.6, 4.6, 5, 5.1, 204.2}] |

Along with this, there’s `DeleteAnomalies`, which deletes elements it considers anomalous:

✕
DeleteAnomalies[{1.2, 2.5, 3.2, 107.6, 4.6, 5, 5.1, 204.2}] |

There’s also `SynthesizeMissingValues`, which tries to generate plausible values for missing pieces of data:

✕
SynthesizeMissingValues[{{1.1,1.4},{2.3,3.1},{3,4},{Missing[],5.4},{8.7,7.5}}] |

How do these functions work? They’re all based on a new function called `LearnDistribution`, which tries to learn the underlying distribution of data, given a certain set of examples. If the examples were just numbers, this would essentially be a standard statistics problem, for which we could use something like `EstimatedDistribution`. But the point about `LearnDistribution` is that it works with data of any kind, not just numbers. Here it is learning an underlying distribution for a collection of colors:

✕
dist = LearnDistribution[{RGBColor[0.5172966964096541, 0.4435322033449375, 1.], RGBColor[0.3984626930847484, 0.5592892024442906, 1.], RGBColor[0.6149389612362844, 0.5648721294502163, 1.], RGBColor[0.4129156497559272, 0.9146065592632544, 1.], RGBColor[0.7907065846445507, 0.41054133291260947`, 1.], RGBColor[0.4878854162550912, 0.9281119680196579, 1.], RGBColor[0.9884362181280959, 0.49025178842859785`, 1.], RGBColor[0.633242503827218, 0.9880985331612835, 1.], RGBColor[0.9215182482568276, 0.8103084921468551, 1.], RGBColor[0.667469513641223, 0.46420827644204676`, 1.]}] |

Once we have this “learned distribution”, we can do all sorts of things with it. For example, this generates 20 random samples from it:

✕
RandomVariate[dist,20] |

But now think about `FindAnomalies`. What it has to do is to find out which data points are anomalous relative to what’s expected. Or, in other words, given the underlying distribution of the data, it finds what data points are outliers, in the sense that they should occur only with very low probability according to the distribution.

And just like for an ordinary numerical distribution, we can compute the PDF for a particular piece of data. Purple is pretty likely given the distribution of colors we’ve learned from our examples:

✕
PDF[dist, RGBColor[ 0.6323870562875563, 0.3525878887878987, 1.0002083564175581`]] |

But red is really really unlikely:

✕
PDF[dist, RGBColor[1, 0, 0]] |

For ordinary numerical distributions, there are concepts like CDF that tell us cumulative probabilities, say that we’ll get results that are “further out” than a particular value. For spaces of arbitrary things, there isn’t really a notion of “further out”. But we’ve come up with a function we call `RarerProbability`, that tells us what the total probability is of generating an example with a smaller PDF than something we give:

✕
RarerProbability[dist, RGBColor[ 0.6323870562875563, 0.3525878887878987, 1.0002083564175581`]] |

✕
RarerProbability[dist, RGBColor[1, 0, 0]] |

Now we’ve got a way to describe anomalies: they’re just data points that have a very small rarer probability. And in fact `FindAnomalies` has an option `AcceptanceThreshold` (with default value 0.001) that specifies what should count as “very small”.
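For a one-dimensional normal distribution the rarer probability can be written down directly, which makes the threshold easier to picture. The sketch below only illustrates the definition and is not a description of how `RarerProbability` is implemented; `rarerNormal` is a made-up helper name:

```wolfram
(* For a symmetric, single-peaked density, the points with smaller PDF than x
   are exactly those farther from the mean than x, so the rarer probability
   is the total weight of the two tails beyond |x - mu| *)
rarerNormal[mu_, sigma_, x_] :=
  2 (1 - CDF[NormalDistribution[mu, sigma], mu + Abs[x - mu]]);

rarerNormal[0, 1, 1.]            (* roughly 0.32: an unremarkable point *)
rarerNormal[0, 1, 3.5] < 0.001   (* True: rare enough to count as anomalous at the default threshold *)
```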

OK, but let’s see this work on something more complicated than colors. Let’s train an anomaly detector by looking at 1000 examples of handwritten digits:

✕
AnomalyDetection[RandomSample[ResourceData["MNIST"][[All,1]],1000]] |

Now `FindAnomalies` can tell us which examples are anomalous:

✕
FindAnomalies[AnomalyDetection[RandomSample[ResourceData["MNIST"][[All,1]],1000]], {(* twelve 28×28 grayscale images of handwritten digits *)}] |

We first introduced our symbolic framework for constructing, exploring and using neural networks back in 2016, as part of Version 11. And in every version since then we’ve added all sorts of state-of-the-art features. In June 2018 we introduced our Neural Net Repository to make it easy to access the latest neural net models from the Wolfram Language—and already there are nearly 100 curated models of many different types in the repository, with new ones being added all the time.

So if you need the latest BERT “transformer” neural network (that was added today!), you can get it from `NetModel`:

✕
NetModel["BERT Trained on BookCorpus and English Wikipedia Data"] |

You can open this up and see the network that’s involved (and, yes, we’ve updated the display of net graphs for Version 12.0):

And you can immediately use the network, here to produce some kind of “meaning features” array:

✕
NetModel["BERT Trained on BookCorpus and English Wikipedia Data"][ "What a wonderful network!"] // MatrixPlot |

In Version 12.0 we’ve introduced several new layer types—notably `AttentionLayer`, which lets one set up the latest “transformer” architectures—and we’ve enhanced our “neural net functional programming” capabilities, with things like `NetMapThreadOperator`, and multiple-sequence `NetFoldOperator`. In addition to these “inside-the-net” enhancements, Version 12.0 adds all sorts of new `NetEncoder` and `NetDecoder` cases, such as BPE tokenization for text in hundreds of languages, and the ability to include custom functions for getting data into and out of neural nets.

But some of the most important enhancements in Version 12.0 are more infrastructural. `NetTrain` now supports multi-GPU training, as well as dealing with mixed-precision arithmetic, and flexible early-stopping criteria. We’re continuing to use the popular MXNet low-level neural net framework (to which we’ve been major contributors)—so we can take advantage of the latest hardware optimizations. There are new options for seeing what’s happening during training, and there’s also `NetMeasurements` that allows you to make 33 different types of measurements on the performance of a network:

✕
NetMeasurements[NetModel["LeNet Trained on MNIST Data"], {(* digit image *) -> 1, (* digit image *) -> 9, (* digit image *) -> 5, (* digit image *) -> 2, (* digit image *) -> 7}, "Perplexity"] |
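As a variation on the call above, here is a sketch that asks for more than one measurement on a larger labeled sample. It assumes the MNIST resource exposes a `"TestData"` element and that `"Accuracy"` is among the supported measurement names; the sample size of 200 is arbitrary:

```wolfram
net = NetModel["LeNet Trained on MNIST Data"];

(* 200 labeled examples (image -> digit) drawn at random from the MNIST test set *)
testData = RandomSample[ResourceData["MNIST", "TestData"], 200];

(* Request two measurements at once; the results come back in the same order *)
measurements = NetMeasurements[net, testData, {"Accuracy", "Perplexity"}]
```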

Neural nets aren’t the only (or even always the best) way to do machine learning. But one thing that’s new in Version 12.0 is that we’re now able to use self-normalizing networks automatically in `Classify` and `Predict`, so they can easily take advantage of neural nets when it makes sense.

We introduced `ImageIdentify`, for identifying what an image is of, back in Version 10.1. In Version 12.0 we’ve managed to generalize this, to figure out not only what an image is of, but also what’s in an image. So, for example, `ImageCases` will show us cases of known kinds of objects in an image:

✕
ImageCases[CloudGet["https://wolfr.am/CMoUVVTH"]] |

For more details, `ImageContents` gives a dataset about what’s in an image:

✕
ImageContents[CloudGet["https://wolfr.am/CMoUVVTH"]] |

You can tell `ImageCases` to look for a particular kind of thing:

✕
ImageCases[CloudGet["https://wolfr.am/CMoUVVTH"], "zebra"] |

And you can also just test to see whether an image contains a particular kind of thing:

✕
ImageContainsQ[CloudGet["https://wolfr.am/CMoUVVTH"], "zebra"] |

In a sense, `ImageCases` is like a generalized version of `FindFaces`, for finding human faces in an image. Something new in Version 12.0 is that `FindFaces` and `FacialFeatures` have become more efficient and robust—with `FindFaces` now based on neural networks rather than classical image processing, and the network for `FacialFeatures` now being 10 MB rather than 500 MB:

✕
FacialFeatures[CloudGet["https://wolfr.am/CO20sk12"]] // Dataset |

Functions like `ImageCases` represent “new-style” image processing, of a type that didn’t seem conceivable only a few years ago. But while such functions let one do all sorts of new things, there’s still lots of value in more classical techniques. We’ve had fairly complete classical image processing in the Wolfram Language for a long time, but we continue to make incremental enhancements.

An example in Version 12.0 is the `ImagePyramid` framework, for doing multiscale image processing:

✕
ImagePyramid[CloudGet["https://wolfr.am/CTWBK9Em"]][All] |

There are several new functions in Version 12.0 concerned with color computation. A key idea is `ColorsNear`, which represents a neighborhood in perceptual color space, here around the color `Pink`:

✕
ChromaticityPlot3D[ColorsNear[Pink,.2]] |

The notion of color neighborhoods can be used, for example, in the new `ImageRecolor` function:

✕
ImageRecolor[CloudGet["https://wolfr.am/CT2rFF6e"], ColorsNear[RGBColor[ Rational[1186, 1275], Rational[871, 1275], Rational[1016, 1275]], .02] -> Orange] |

As I sit at my computer writing this, I’ll say something to my computer, and capture it:

Here’s a spectrogram of the audio I captured:

✕
Spectrogram[%] |

So far we could do this in Version 11.3 (though `Spectrogram` got 10 times faster in 12.0). But now here’s something new:

✕
SpeechRecognize[%%] |

We’re doing speech-to-text! We’re using state-of-the-art neural net technology, but I’m amazed at how well it works. It’s pretty streamlined, and we’re perfectly well able to handle even very long pieces of audio, say stored in files. And on a typical computer the transcription will run at about actual real-time speed, so that an hour of speech will take about an hour to transcribe.

Right now we consider `SpeechRecognize` experimental, and we’ll be continuing to enhance it. But it’s interesting to see another major computational task just become a single function in the Wolfram Language.

In Version 12.0, there are other enhancements too. `SpeechSynthesize` supports new languages and new voices (as listed by `VoiceStyleData[]`).

There’s now `WebAudioSearch`—analogous to `WebImageSearch`—that lets you search for audio on the web:

✕
WebAudioSearch["rooster"] |

You can retrieve actual `Audio` objects:

✕
WebAudioSearch["rooster","Samples",MaxItems->3] |

Then you can make spectrograms or other measurements:

✕
Spectrogram /@% |

And then—new in Version 12.0—you can use `AudioIdentify` to try to identify the category of sound (is that a talking rooster?):

✕
AudioIdentify/@%% |

We still consider `AudioIdentify` experimental. It’s an interesting start, but it definitely doesn’t, for example, work as well as `ImageIdentify`.

A more successful audio function is `PitchRecognize`, which tries to recognize the dominant frequency in an audio signal (it uses both “classical” and neural net methods). It can’t yet deal with “chords”, but it works pretty much perfectly for “single notes”.

When one deals with audio, one often wants not just to identify what’s in the audio, but to annotate it. Version 12.0 introduces the beginning of a large-scale audio framework. Right now `AudioAnnotate` can mark where there’s silence, or where there’s something loud. In the future, we’ll be adding speaker identification and word boundaries, and lots else. And to go along with these, we also have functions like `AudioAnnotationLookup`, for picking out parts of an audio object that have been annotated in particular ways.

Underneath all this high-level audio functionality there’s a whole infrastructure of low-level audio processing. Version 12.0 greatly enhances `AudioBlockMap` (for applying filters to audio signals), and introduces functions like `ShortTimeFourier`.

A spectrogram can be viewed a bit like a continuous analog of a musical score, in which pitches are plotted as a function of time. In Version 12.0 there’s now `InverseSpectrogram`—that goes from an array of spectrogram data to audio. Ever since Version 2 in 1991, we’ve had `Play` to generate sound from a function (like `Sin[100 t]`). Now with `InverseSpectrogram` we have a way to go from a “frequency-time bitmap” to a sound. (And, yes, there are tricky issues about best guesses for phases when one only has magnitude information.)
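Here is a rough round-trip sketch of that idea: take a simple tone, keep only the magnitudes of its short-time Fourier data, and ask `InverseSpectrogram` to rebuild audio from them. It assumes that the default partitioning of `SpectrogramArray` and `InverseSpectrogram` is compatible; the variable names are just for illustration:

```wolfram
(* One second of a 440 Hz sine tone as test input *)
tone = AudioGenerator[{"Sin", 440}, 1];

(* Complex short-time Fourier data; Abs discards the phases *)
magnitudes = Abs[SpectrogramArray[tone]];

(* Reconstruct audio from magnitudes alone; the phases have to be estimated *)
reconstructed = InverseSpectrogram[magnitudes];

(* Compare the two spectrograms side by side *)
Spectrogram /@ {tone, reconstructed}
```

Because the phases are guessed, the reconstructed signal need not match the original sample by sample, even when the two spectrograms look alike.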

Starting with Wolfram|Alpha, we’ve had exceptionally strong natural language understanding (NLU) capabilities for a long time. And this means that given a piece of natural language, we’re good at understanding it as Wolfram Language—that we can then go and compute from:

✕
EntityValue[ EntityClass[ "Country", {EntityProperty["Country", "EntityClasses"] -> EntityClass["Country", "Europe"], EntityProperty["Country", "Population"] -> TakeLargest[5]}], EntityProperty["Country", "Flag"]] |

But what about natural language processing (NLP)—where we’re taking potentially long passages of natural language, and not trying to completely understand them, but instead just find or process particular features of them? Functions like `TextSentences`, `TextStructure`, `TextCases` and `WordCounts` have given us basic capabilities in this area for a while. But in Version 12.0—by making use of the latest machine learning, as well as our longstanding NLU and knowledgebase capabilities—we’ve now jumped to having very strong NLP capabilities.

The centerpiece is the dramatically enhanced version of `TextCases`. The basic goal of `TextCases` is to find cases of different types of content in a piece of text. An example of this is the classic NLP task of “entity recognition”—with `TextCases` here finding what country names appear in the Wikipedia article about ocelots:

✕
TextCases[WikipediaData["ocelots"],"Country"->"Interpretation"] |

We could also ask what islands are mentioned, but now we won’t ask for a Wolfram Language interpretation:

✕
TextCases[WikipediaData["ocelots"],"Island"] |

`TextCases` isn’t perfect, but it does pretty well:

✕
TextCases[WikipediaData["ocelots"],"Date"] |

It supports a whole lot of different content types too:

You can ask it to find pronouns, or reduced relative clauses, or quantities, or email addresses, or occurrences of any of 150 kinds of entities (like companies or plants or movies). You can also ask it to pick out pieces of text that are in particular human or computer languages, or that are about particular topics (like travel or health), or that have positive or negative sentiment. And you can use constructs like `Containing` to ask for combinations of these things (like noun phrases that contain the name of a river):

✕
TextCases[WikipediaData["ocelots"],Containing["NounPhrase","River"]] |
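The same calls work on any string, which makes it easier to experiment than with a whole Wikipedia article. A small sketch, using a made-up sentence and two of the content types already shown above:

```wolfram
(* A made-up sentence, used here instead of a Wikipedia article *)
text = "On 4 July 2018 we flew from France to Brazil, and two days later on to Peru.";

(* Asking for several content types at once gives an association keyed by type *)
byType = TextCases[text, {"Country", "Date"}]

(* Same text, but with countries returned as interpreted objects rather than strings *)
TextCases[text, "Country" -> "Interpretation"]
```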

`TextContents` lets you see, for example, details of all the entities that were detected in a particular piece of text:

✕
TextContents[TextSentences[WikipediaData["ocelots"], 1]] |

And, yes, one can in principle use these capabilities through `FindTextualAnswer` to try to answer questions from text—but in a case like this, the results can be pretty wacky:

✕
FindTextualAnswer[WikipediaData["ocelots"],"weight of an ocelot",5] |

Of course, you can get a real answer from our actual built-in curated knowledgebase:

✕
Entity["Species", "Species:LeopardusPardalis"][ EntityProperty["Species", "Weight"]] |

By the way, in Version 12.0 we’ve added a variety of little “natural language convenience functions”, like `Synonyms` and `Antonyms`:

✕
Synonyms["magnificent"] |

One of the “surprise” new areas in Version 12.0 is computational chemistry. We’ve had data on explicit known chemicals in our knowledgebase for a long time. But in Version 12.0 we can compute with molecules that are specified simply as pure symbolic objects. Here’s how we can specify what turns out to be a water molecule:

✕
Molecule[{Atom["H"],Atom["H"],Atom["O"]},{Bond[{1,3}],Bond[{2,3}]}] |

And here’s how we can make a 3D rendering:

✕
MoleculePlot3D[%] |

We can deal with “known chemicals”:

✕
Molecule[Entity["Chemical", "Caffeine"]] |

We can use arbitrary IUPAC names:

✕
MoleculePlot3D[Molecule["2,4,6-trimethoxybenzaldehyde"]] |

Or we can “make up” chemicals, for example specifying them by their SMILES strings:

✕
MoleculePlot3D[Molecule["O1CNNONC(N(OOC)OO)CCNONOCONCCONNCOC1"]] |

But we’re not just generating pictures here. We can also compute things from the structure—like symmetries:

✕
Molecule["C1=CC=CC=C1"]["PointGroup"] |

Given a molecule, we can do things like highlight carbon-oxygen bonds:

✕
MoleculePlot[Molecule["C=C1[C@H](O)C2O[C@@]3(CC[C@H](/C=C/[C@@H](C)[C@@H]4CC(C)=C[C@@]5(O[C@H](C[C@@](C)(O)C(=O)O)CC[C@H]5O)O4)O3)CC[C@H]2O[C@H]1[C@@H](O)C[C@H](C)C1O[C@@]2(CCCCO2)CC[C@H]1C"], Bond[{"C", "O"}]] |

Or highlight structures, say specified by SMARTS strings (here any 5-member ring):

✕
MoleculePlot[Molecule["C=C1[C@H](O)C2O[C@@]3(CC[C@H](/C=C/[C@@H](C)[C@@H]4CC(C)=C[C@@]5(O[C@H](C[C@@](C)(O)C(=O)O)CC[C@H]5O)O4)O3)CC[C@H]2O[C@H]1[C@@H](O)C[C@H](C)C1O[C@@]2(CCCCO2)CC[C@H]1C"], MoleculePattern["[r5]"]] |

You can also do searches for “molecule patterns”; the results come out in terms of atom numbers:

✕
FindMoleculeSubstructure[Molecule["C=C1[C@H](O)C2O[C@@]3(CC[C@H](/C=C/[C@@H](C)[C@@H]4CC(C)=C[C@@]5(O[C@H](C[C@@](C)(O)C(=O)O)CC[C@H]5O)O4)O3)CC[C@H]2O[C@H]1[C@@H](O)C[C@H](C)C1O[C@@]2(CCCCO2)CC[C@H]1C"], MoleculePattern["[r5]"], All] |

The computational chemistry capabilities we’ve added in Version 12.0 are pretty general and pretty powerful (with the caveat that so far they only deal with organic molecules). At the lowest level they view molecules as labeled graphs with edges corresponding to bonds. But they also know about physics, and correctly account for atomic valences and bond configurations. Needless to say, there are lots of details (about stereochemistry, symmetry, aromaticity, isotopes, etc.). But the end result is that molecular structure and molecular computation have now successfully been added to the list of areas that are integrated into the Wolfram Language.
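That lowest-level view can be made concrete with ordinary graph functions. This is only a picture of the idea, built from `Graph` rather than from whatever representation `Molecule` uses internally, with the atoms and bonds of the water example from earlier in this section:

```wolfram
(* Atoms and bonds taken from the water Molecule specification earlier in this section *)
atoms = {"H", "H", "O"};
bonds = {{1, 3}, {2, 3}};

(* Vertices are atom indices, labeled by element; edges are bonds *)
g = Graph[Range[Length[atoms]], UndirectedEdge @@@ bonds,
   VertexLabels -> Thread[Range[Length[atoms]] -> atoms]]

(* Number of bonds at each atom *)
VertexDegree[g]   (* {1, 1, 2} *)
```

Since every bond here is a single bond, the vertex degrees coincide with the usual valences of hydrogen and oxygen; a bare graph like this knows nothing about bond orders or stereochemistry.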

The Wolfram Language already has strong capabilities for geographic computing, but Version 12.0 adds more functions, and enhances some of those that were already there.

For example, there’s now `RandomGeoPosition`, which generates a random lat-long location. One might think this would be trivial, but of course one has to worry about coordinate transformations—and what makes it much more nontrivial is that one can tell it to pick points only inside a certain region, here the country of France:

✕
GeoListPlot[RandomGeoPosition[Entity["Country", "France"],100]] |

A theme of new geographic capabilities in Version 12.0 is handling not just geographic points and regions, but also geographic vectors. Here’s the current wind vector, for example, at the position of the Eiffel Tower, represented as a `GeoVector`, with speed and direction (there’s also `GeoVectorENU`, which gives east, north and up components, as well as `GeoGridVector` and `GeoVectorXYZ`):

✕
WindVectorData[Entity["Building", "EiffelTower::5h9w8"],Now,"DownwindGeoVector"] |

Functions like `GeoGraphics` let you visualize discrete geo vectors. `GeoStreamPlot` is the geo analog of `StreamPlot` (or `ListStreamPlot`)—and shows streamlines formed from geo vectors (here from `WindDirectionData`):

✕
GeoStreamPlot[CloudGet["https://wolfr.am/CTZnxuQI"]] |

Geodesy is a mathematically sophisticated area, and we pride ourselves on doing it well in the Wolfram Language. In Version 12.0, we’ve added a few new functions to fill in some details. For example, we now have functions like `GeoGridUnitDistance` and `GeoGridUnitArea` which give the distortion (basically, eigenvalues of the Jacobian) associated with different geo projections at every position on Earth (or Moon, Mars, etc.).

One direction of visualization that we’ve been steadily developing is what one might call “meta-graphics”: the labeling and annotation of graphical things. We introduced `Callout` in Version 11.0; in Version 12.0 it’s been extended to things like 3D graphics:

✕
Plot3D[Callout[Exp[-(x^2+y^2)],"maximum",{0,0}],{x,-2,2},{y,-2,2}] |

It’s pretty good at figuring out where to label things, even when they get a bit complex:

✕
PolarPlot[Evaluate[Table[Callout[Sin[n θ],n],{n,4}]],{θ,0,π}] |

There are lots of details that matter in making graphics really look good. Something that’s been enhanced in Version 12.0 is ensuring that columns of graphics line up on their frames, regardless of the length of their tick labels. We’ve also added `LabelVisibility`, which allows you to specify the relative priorities with which different labels should be made visible.

Another new feature of Version 12.0 is multipanel plot layout, where different datasets are shown in different panels, but the panels share axes whenever they can:

✕
ListPlot[Table[RandomReal[10,50],6],PlotLayout->{"Column",3}] |

Our curated knowledgebase—that for example powers Wolfram|Alpha—is vast and continually growing. And with every version of the Wolfram Language we’re progressively tightening its integration into the core of the language.

In Version 12.0 one thing we’re doing is to expose hundreds of types of entities directly in the language.

Before Version 12.0, the Wolfram|Alpha Example pages served as a proxy for documenting many types of entities. But now there’s Wolfram Language documentation for all of them.

There are still functions like `SatelliteData`, `WeatherData` and `FinancialData` that handle entity types that routinely need complex selection or computation. But in Version 12.0, every entity type can be accessed in the same way, with natural language (“control + =”) input, and “yellow-boxed” entities and properties:

✕
Entity["Element", "Tungsten"][ EntityProperty["Element", "MeltingPoint"]] |

By the way, one can also use entities implicitly, like here asking for the 5 elements with the highest known melting points:

✕
Entity["Element", "MeltingPoint" -> TakeLargest[5]] // EntityList |

And one can use `Dated` to get a time series of values:

✕
Entity["University", "HarvardUniversity::cmp42"][ Dated[EntityProperty["University", "EstTotalUndergrad"], All]] |

We’ve made it really convenient to work with data that’s built into the Wolfram Knowledgebase. You have entities, and it’s very easy to ask about properties and so on:

✕
Entity["City", {"NewYork", "NewYork", "UnitedStates"}]["Population"] |

But what if you have your own data? Can you set it up so you can use it as easily as this? A major new feature of Version 11 was the addition of `EntityStore`, in which one can define one’s own entity types, then specify entities, properties and values.
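
As a rough sketch of what defining one’s own entity type looks like, here is a tiny entity store built by hand. The `"Gadget"` type, its two entities and their properties are all invented for this illustration:

```wolfram
(* A tiny entity store with a hypothetical "Gadget" entity type *)
store = EntityStore[
   "Gadget" -> <|
     "Entities" -> <|
       "Widget" -> <|"Price" -> 10, "Color" -> "red"|>,
       "Sprocket" -> <|"Price" -> 25, "Color" -> "blue"|>
      |>,
     "Properties" -> <|
       "Price" -> <|"Label" -> "price"|>,
       "Color" -> <|"Label" -> "color"|>
      |>
    |>];

(* Register the store so its entities can be used like built-in ones *)
EntityRegister[store];

(* Ask for a property of one entity *)
price = Entity["Gadget", "Widget"]["Price"]

(* Unregister when done, so the made-up type does not linger in the session *)
EntityUnregister["Gadget"];
```

Once registered, the made-up entities respond to the same property-lookup syntax as entities from the Wolfram Knowledgebase.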

The Wolfram Data Repository contains a bunch of examples of entity stores. Here’s one:

✕
ResourceData["Entity Store of Books in Stephen Wolfram's Library"] |

It describes a single entity type: an `"SWLibraryBook"`. To be able to use entities of this type just like built-in entities, we “register” the entity store:

✕
EntityRegister[%] |

Now we can do things like ask for 10 random entities of type `"SWLibraryBook"`:

✕
RandomEntity["SWLibraryBook",10] |

Each entity in the entity store has a variety of properties. Here’s a dataset of the values of properties for one particular entity:

✕
Entity["SWLibraryBook", "OL4258186M::mudwv"]["Dataset"] |

OK, but with this setup we’re basically reading the whole contents of an entity store into memory. This makes it very efficient to do whatever Wolfram Language operations one wants on it. But it’s not a good scalable solution for large amounts of data—for example, data that is too big to fit in memory.

But what’s a typical source of large data? Very often it’s a database, and usually a relational one that can be accessed using SQL. We’ve had our DatabaseLink package for low-level read-write access to SQL databases for well over a decade. But in Version 12.0 we’re adding some major built-in features that allow external relational databases to be handled in the Wolfram Language just like entity stores, or built-in parts of the Wolfram Knowledgebase.

Let’s start off with a toy example. Here’s a symbolic representation of a small relational database that happens to be stored in a file:

✕
RelationalDatabase[FindFile["ExampleData/ecommerce-database.sqlite"]] |

Immediately we get a box that summarizes what’s in the database, and tells us that this database has 8 tables. If we open up the box, we can start inspecting the structure of those tables.

We can then set this relational database up as an entity store in the Wolfram Language. It looks very much the same as the library book entity store above, but now the actual data isn’t pulled into memory; instead it’s still in the external relational database, and we’re just defining a (“ORM-like”) mapping to entities in the Wolfram Language:

✕
EntityStore[%] |

Now we can register this entity store, which sets up a bunch of entity types that (at least by default) are named after the names of the tables in the database:

✕
EntityRegister[%] |

And now we can do “entity computations” on these, just like we would on built-in entities in the Wolfram Knowledgebase. Each entity here corresponds to a row in the “employees” table in the database:

✕
EntityList["employees"] |

For a given entity type, we can ask what properties it has. These “properties” correspond to columns in the table in the underlying database:

✕
EntityProperties["employees"] |

Now we can ask for the value of a particular property of a particular entity:

✕
Entity["employees", 1076][EntityProperty["employees", "lastName"]] |

We can also pick out entities by giving criteria; here we’re asking for “payments” entities with the 4 largest values of the “amount” property:

✕
EntityList[EntityClass["payments","amount"->TakeLargest[4]]] |

We can equally ask for the values of these largest amounts:

✕
EntityValue[EntityClass["payments","amount"->TakeLargest[4]],"amount"] |

OK, but here’s where it gets more interesting: so far we’ve been looking at a little file-backed database. But we can do exactly the same thing with a giant database hosted on an external server.

As an example, let’s connect to the terabyte-sized OpenStreetMap PostgreSQL database that contains what is basically the street map of the world:

As before, let’s register the tables in this database as entity types. Like most in-the-wild databases there are little glitches in the structure, which are worked around, but generate warnings:

✕
EntityRegister[EntityStore[%]] |

But now we can ask questions about the database—like how many geo points or “nodes” there are in all the streets of the world (and, yes, it’s a big number, which is why the database is big):

✕
EntityValue["planet_osm_nodes", "EntityCount"] |

Here we’re asking for the names of the objects with the 10 largest (projected) areas in the (101 GB) planet_osm_polygon table (and, yes, it takes under a second):

✕
EntityValue[ EntityClass["planet_osm_polygon", "way_area" -> TakeLargest[10]], "name"] // Timing |

So how does all this work? Basically what’s happening is that our Wolfram Language representation is getting compiled into low-level SQL queries that are then sent to be executed directly on the database server.

Sometimes you’ll ask for results that are just final values (like, say, the “amounts” above). But in other cases you’ll want something intermediate—like a collection of entities that have been selected in a particular way. And of course this collection could have a billion entries. So a very important feature of what we’re introducing in Version 12.0 is that we can represent and manipulate such things purely symbolically, resolving them to something specific only at the end.

Going back to our toy database, here’s an example of how we’d specify a class of entities obtained by aggregating the total `creditLimit` for all `customers` with a given value of `country`:

✕
AggregatedEntityClass["customers", "creditLimit" -> Total, "country"] |

At first, this is just something symbolic. But if we ask for specific values, then actual database queries get done, and we get specific results:

✕
EntityValue[%, {"country", "creditLimit"}] |

There’s a family of new functions for setting up different kinds of queries. And the functions actually work not only for relational databases, but also for entity stores, and for the built-in Wolfram Knowledgebase. So, for example, we can ask for the average atomic mass for a given period in the periodic table of elements:

✕
AggregatedEntityClass["Element", "AtomicMass" -> Mean, "Period"]["AtomicMass"] |

An important new construct is `EntityFunction`. `EntityFunction` is like `Function`, except that its variables represent entities (or classes of entities) and it describes operations that can be performed directly on external databases. Here’s an example with built-in data, in which we’re defining a “filtered” entity class in which the filtering criterion is a function which tests population values. The `FilteredEntityClass` itself is just represented symbolically, but `EntityList` actually performs the query, and resolves an explicit list of (here, unsorted) entities:

✕
FilteredEntityClass["Country", EntityFunction[c, c["Population"] > Quantity[10^8, "People"]]] |

✕
EntityList[%] |

In addition to `EntityFunction`, `AggregatedEntityClass` and `SortedEntityClass`, Version 12.0 includes `SampledEntityClass` (to get a few entities from a class), `ExtendedEntityClass` (to add computed properties) and `CombinedEntityClass` (to combine properties from different classes). With these primitives, one can build up all the standard operations of “relational algebra”.
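
Here is a hedged sketch of how a few of these class constructors might be chained on the toy ecommerce database from above. The property names `"creditLimit"` and `"customerName"` are taken from the earlier examples; `"creditLimitThousands"` is a name made up for this sketch, and whether every nesting shown here is supported for a given database backend is an assumption:

```wolfram
(* Register the example database used above as entity types *)
EntityRegister[
  EntityStore[
   RelationalDatabase[FindFile["ExampleData/ecommerce-database.sqlite"]]]];

(* Customers ordered by credit limit, largest first; still purely symbolic *)
sorted = SortedEntityClass["customers", "creditLimit" -> "Descending"];

(* Keep 3 entities from that class *)
few = SampledEntityClass[sorted, 3];

(* Add a computed property; the property name is invented for this sketch *)
extended = ExtendedEntityClass[few,
   "creditLimitThousands" -> EntityFunction[c, c["creditLimit"]/1000]];

(* Only at this point is a query actually sent to the database *)
result = EntityValue[extended, {"customerName", "creditLimitThousands"}]
```

Each intermediate value is just a symbolic description of a class; nothing touches the database until `EntityValue` asks for concrete values.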

In standard database programming, one typically ends up with a whole jungle of “joins” and “foreign keys” and so on. Our Wolfram Language representation lets you operate at a higher level—where basically joins become function composition and foreign keys are just different entity types. (If you want to do explicit joins, though, you can—for example using `CombinedEntityClass`.)

What’s going on under the hood is that all those Wolfram Language constructs are getting compiled into SQL, or, more accurately, the specific dialect of SQL that’s suitable for the particular database you’re using (we currently support SQLite, MySQL, PostgreSQL and MS-SQL, with support for OracleSQL coming soon). When we do the compilation, we’re automatically checking types, to make sure you get a meaningful query. Even fairly simple Wolfram Language specifications can end up turning into many lines of SQL. For example,

✕
EntityFunction[c, c["employees"]["firstName"] <> " " <> c["employees"]["lastName"] <> " is the rep for " <> c["customerName"] <> ". Their manager is " <> c["employees"]["employees-reportsTo"]["firstName"] <> " " <> c["employees"]["employees-reportsTo"]["lastName"] <> "."][ Entity["customers", 103]] |

would produce the following intermediate SQL (here for querying the SQLite database):

The database integration system we have in Version 12.0 is pretty sophisticated—and we’ve been working on it for quite a few years. It’s an important step forward in allowing the Wolfram Language to directly handle a new level of “bigness” in big data—and to let the Wolfram Language directly do data science on terabyte-sized datasets and beyond. Like finding which street-like entities in the world have “Wolfram” in their name:

✕
FilteredEntityClass["planet_osm_line", EntityFunction[s, StringContainsQ[s["name"], "Wolfram"]]]["name"] |

What is the best way to represent knowledge about the world? It’s an issue that’s been debated by philosophers (and others) since antiquity. Sometimes people said logic was the key. Sometimes mathematics. Sometimes relational databases. But now we at least know one solid foundation (or at least, I’m pretty sure we do): everything can be represented by computation. This is a powerful idea—and in a sense that’s what makes everything we do with Wolfram Language possible.

But are there subsets of general computation that are useful for representing at least certain kinds of knowledge? One that we use extensively in the Wolfram Knowledgebase is the notion of entities (“New York City”), properties (“population”) and their values (“8.6 million people”). Of course such triples don’t represent all knowledge in the world (“what will the position of Mars be tomorrow?”). But they’re a decent start when it comes to certain kinds of “static” knowledge about distinct things.

So how can one formalize this kind of knowledge representation? One answer is through graph databases. And in Version 12.0—in alignment with many “semantic web” projects—we’re supporting graph databases using RDF, and queries against them using SPARQL. In RDF the central object is an IRI (“Internationalized Resource Identifier”), that can represent an entity or a property. A “triplestore” then consists of a collection of triples (“subject”, “predicate”, “object”), with each element in each triple being an IRI (or a literal, such as a number). The whole object can then be thought of as a graph database or graph store, or, mathematically, a hypergraph. (It’s a hypergraph because the predicate “edges” can also be vertices elsewhere.)

You can build your own `RDFStore` much like you build an `EntityStore`, and in fact you can query any Wolfram Language `EntityStore` using SPARQL just like you query an `RDFStore`. And since the entity-property part of the Wolfram Knowledgebase can be treated as an entity store, you can also query this. So here, finally, is an example. The country-city list `{Entity["Country"], Entity["City"]}` in effect represents an RDF store. Then `SPARQLSelect` is an operator acting on this store. What it does is to try to find a triple that matches what you’re asking for, with a particular value for the “SPARQL variable” `x`:

✕
Needs["GraphStore`"] |

✕
SPARQLSelect[RDFTriple[Entity["Country", "USA"], EntityProperty["Country", "CapitalCity"], SPARQLVariable["x"]]][{Entity["Country"], Entity["City"]}] |

Of course, there’s also a much simpler way to do this in the Wolfram Language:

✕
Entity["Country", "USA"][EntityProperty["Country", "CapitalCity"]] |

But with SPARQL you can do much more exotic things—like ask what properties relate the US to Mexico:

✕
SPARQLSelect[RDFTriple[Entity["Country", "USA"],SPARQLVariable["x"],Entity["Country", "Mexico"]]][{Entity["Country"]}] |

Or whether there is a path based on the bordering country relation from Portugal to Germany:

✕
SPARQLAsk[ SPARQLPropertyPath[ Entity["Country", "Portugal"], {EntityProperty["Country", "BorderingCountries"] ..}, Entity["Country", "Germany"]]][Entity["Country"]] |

In principle you can just write a SPARQL query as a string (a bit like you can write an SQL string). But what we’ve done in Version 12.0 is introduce a symbolic representation of SPARQL that allows computation on the representation itself, making it easy, for example, to automatically generate complex SPARQL queries. (And it’s particularly important to do this because, on their own, practical SPARQL queries have a habit of getting extremely long and ponderous.)
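
As a small sketch of what “computation on the representation itself” can look like, here is a hand-built `RDFStore` together with a function that generates one query per person. The `example.org` IRIs, the helper `ex` and the names in the triples are all hypothetical:

```wolfram
Needs["GraphStore`"]

(* Helper for building hypothetical IRIs under example.org *)
ex[name_String] := IRI["http://example.org/" <> name];

(* A small hand-built RDF store of (subject, predicate, object) triples *)
store = RDFStore[{
    RDFTriple[ex["alice"], ex["knows"], ex["bob"]],
    RDFTriple[ex["bob"], ex["knows"], ex["carol"]],
    RDFTriple[ex["alice"], ex["age"], 34]
   }];

(* Queries are symbolic expressions, so ordinary code can generate them *)
knowsQuery[person_String] :=
  SPARQLSelect[
   RDFTriple[ex[person], ex["knows"], SPARQLVariable["who"]]];

(* One generated query per person, each applied to the store *)
answers = AssociationMap[knowsQuery[#][store] &, {"alice", "bob", "carol"}]
```

Each answer is a list of variable bindings, and it is empty for anyone who has no `knows` triple in the store.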

OK, but are there RDF stores out in the wild? It’s been a long-running hope that a large part of the web will somehow eventually be tagged enough to “become semantic” and in effect be a giant RDF store. It’d be great if this happened, but so far it definitely hasn’t. Still, there are a few public RDF stores out there, and also some RDF stores within organizations, and with our new capabilities in Version 12.0 we’re in a unique position to do interesting things with them.

An incredibly common form of problem in industrial applications of mathematics is: “What configuration minimizes cost (or maximizes payoff) if certain constraints have to be satisfied?” More than half a century ago, the so-called simplex algorithm was invented for solving linear versions of this kind of problem, in which both the objective function (cost, payoff) and the constraints are linear functions of the variables in the problem. By the 1980s much more efficient (“interior point”) methods had been invented—and we’ve had these for doing “linear programming” in the Wolfram Language for a long time.

But what about nonlinear problems? Well, in the general case, one can use functions like `NMinimize`. And they do a state-of-the-art job. But it’s a hard problem. However, some years ago, it became clear that even among nonlinear optimization problems, there’s a class of so-called convex optimization problems that can actually be solved almost as efficiently as linear ones. (“Convex” means that both the objective and the constraints involve only convex functions—so that nothing can “wiggle” as one approaches an extremum, and there can’t be any local minima that aren’t global minima.)

In Version 12.0, we’ve now got strong implementations for all the various standard classes of convex optimization. Here’s a simple case, involving minimizing a quadratic form with a couple of linear constraints:

✕
QuadraticOptimization[2x^2+20y^2+6x y+5x,{-x+y>=2,y>=0},{x,y}] |

`NMinimize` could already do this particular problem in Version 11.3:

✕
NMinimize[{2x^2+20y^2+6x y+5x,{-x+y>=2,y>=0}},{x,y}] |

But if one had more variables, the old `NMinimize` would quickly bog down. In Version 12.0, however, `QuadraticOptimization` will continue to work just fine, up to more than 100,000 variables with more than 100,000 constraints (so long as they’re fairly sparse).

In Version 12.0 we’ve got “raw convex optimization” functions like `SemidefiniteOptimization` (that handles linear matrix inequalities) and `ConicOptimization` (that handles linear vector inequalities). But functions like `NMinimize` and `FindMinimum` will also automatically recognize when a problem can be solved efficiently by being transformed to a convex optimization form.

How does one set up convex optimization problems? Larger ones involve constraints on whole vectors or matrices of variables. And in Version 12.0 we now have functions like `VectorGreaterEqual` (input as ≥) that can immediately represent these.
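
Here is a minimal sketch of a vector constraint in use, with a small linear problem whose numbers are invented for illustration. `LinearOptimization` is assumed here as the natural function for a linear objective:

```wolfram
(* Constraint matrix and right-hand side; the numbers are illustrative *)
a = {{1, 2}, {3, 1}};
b = {3, 4};

(* Minimize x + y subject to a.{x, y} >= b and {x, y} >= 0, componentwise *)
sol = LinearOptimization[x + y,
  {VectorGreaterEqual[{a . {x, y}, b}],
   VectorGreaterEqual[{{x, y}, 0}]},
  {x, y}]
```

Working this one by hand, the two constraint lines x + 2y = 3 and 3x + y = 4 cross at x = 1, y = 1, where the objective equals 2; the other feasible corners, (3, 0) and (0, 4), give 3 and 4, so that crossing point is the minimizer.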

Partial differential equations are hard, and we’ve been working on more and more sophisticated and general ways to handle them for 30 years. We first introduced `NDSolve` (for ODEs) in Version 2, back in 1991. We had our first (1+1-dimensional) numerical PDEs by the mid-1990s. In 2003 we introduced our powerful, modular framework for handling numerical differential equations. But in terms of PDEs we were still basically only dealing with simple, rectangular regions. To go beyond that required building our whole computational geometry system, which we introduced in Version 10. And with this, we released our first finite element PDE solvers. In Version 11, we then generalized to eigen problems.

Now, in Version 12, we’re introducing another major generalization: nonlinear finite element analysis. Finite element analysis involves decomposing regions into little discrete triangles, tetrahedra, etc.—on which the original PDE can be approximated by a large number of coupled equations. When the original PDE is linear, these equations will also be linear—and that’s the typical case people consider when they talk about “finite element analysis”.

But there are many PDEs of practical importance that aren’t linear—and to tackle these one needs nonlinear finite element analysis, which is what we now have in Version 12.0.

As an example, here’s what it takes to solve the nastily nonlinear PDE that describes the height of a 2D minimal surface (say, an idealized soap film), here over an annulus, with (Dirichlet) boundary conditions that make it wiggle sinusoidally at the edges (as if the soap film were suspended from wires):

✕
NDSolveValue[{Inactive[Div][(1/Sqrt[1 + Grad[u[x, y], {x, y}].Grad[u[x, y], {x, y}]]) Inactive[Grad][u[x, y], {x, y}], {x, y}] == 0, DirichletCondition[u[x, y] == Sin[2 \[Pi] (x + y)], True]}, u, {x, y} \[Element] Region[Annulus[{0, 0}, {0.3, 1}]]] |
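
In conventional notation, the input above reads as the minimal surface equation on the annulus Ω (inner radius 0.3, outer radius 1), with the Dirichlet condition imposed on the whole boundary:

```latex
\nabla \cdot \left( \frac{\nabla u}{\sqrt{1 + \nabla u \cdot \nabla u}} \right) = 0
\quad \text{in } \Omega,
\qquad
u(x, y) = \sin\bigl(2\pi (x + y)\bigr)
\quad \text{on } \partial\Omega
```

The nonlinearity comes from the coefficient 1/√(1 + ∇u·∇u), which depends on the unknown function u itself.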

On my computer it takes just a quarter of a second to solve this equation, and get an interpolating function. Here’s a plot of the interpolating function representing the solution:

✕
Plot3D[%[x, y], {x, y} \[Element] Region[Annulus[{0, 0}, {0.3, 1}]] , MeshFunctions -> {#3 &}] |

We’ve put a lot of engineering into optimizing the execution of Wolfram Language programs over the years. Already in 1989 we started automatically compiling simple machine-precision numerical computations to instructions for an efficient virtual machine (and, as it happens, I wrote the original code for this). Over the years, we’ve extended the capabilities of this compiler, but it’s always been limited to fairly simple programs.

In Version 12.0 we’re taking a major step forward, and we’re releasing the first version of a new, much more powerful compiler that we’ve been working on for several years. This compiler is both able to handle a much broader range of programs (including complex functional constructs and elaborate control flows), and it’s also compiling not to a virtual machine but instead directly to optimized native machine code.

In Version 12.0 we still consider the new compiler experimental. But it’s advancing rapidly, and it’s going to have a dramatic effect on the efficiency of lots of things in the Wolfram Language. In Version 12.0, we’re just exposing a “kit form” of the new compiler, with specific compilation functions. But we’ll progressively be making the compiler operate more and more automatically—figuring out with machine learning and other methods when it’s worth taking the time to do what level of compilation.

At a technical level, the new Version 12.0 compiler is based on LLVM, and works by generating LLVM code—linking in the same low-level runtime library that the Wolfram Language kernel itself uses, and calling back to the full Wolfram Language kernel for functionality that isn’t in the runtime library.

Here’s the basic way one compiles a pure function in the current version of the new compiler:

✕
FunctionCompile[Function[Typed[x,"Integer64"],x^2]] |

The resulting compiled code function works just like the original function, though faster:

✕
%[12] |

A big part of what lets `FunctionCompile` produce a faster function is that you’re telling it to make assumptions about the type of argument it’s going to get. We’re supporting lots of basic types (like `"Integer32"` and `"Real64"`). But when you use `FunctionCompile`, you’re committing to particular argument types, so much more streamlined code can be produced.
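
As a slightly larger sketch than the squaring example, here is a compiled function containing an explicit loop. The function itself and the name `sumSquares` are made up for this illustration, and it assumes that `Module` and `Do` are among the constructs the new compiler handles:

```wolfram
(* Sum of squares 1^2 + ... + n^2, written as an explicit loop over machine integers *)
sumSquares = FunctionCompile[
   Function[{Typed[n, "MachineInteger"]},
    Module[{s = 0},
     Do[s = s + i*i, {i, 1, n}];
     s]]];

(* 1 + 4 + 9 + ... + 100 *)
sumSquares[10]
```

Only the argument type is declared; the types of `s` and `i` are left for the compiler to work out.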

A lot of the sophistication of the new compiler is associated with inferring what types of data will be generated in the execution of a program. (There are lots of graph theoretic and other algorithms involved, and needless to say, all the metaprogramming for the compiler is done with the Wolfram Language.)

Here’s an example that involves a bit of type inference (the type of `fib` is deduced to be `"Integer64" -> "Integer64"`: an integer function returning an integer):

✕
cf = FunctionCompile[Function[{Typed[n,"Integer64"]},Module[{fib},fib=Function[{x},If[x<=1,1,fib[x-1]+fib[x-2]]]; fib[n]]]] |

On my computer `cf[25]` runs about 300 times faster than the uncompiled function. (Of course, the compiled version fails when its output is no longer of type `"Integer64"`, but the standard Wolfram Language version continues to work just fine.)

Already the compiler can handle hundreds of Wolfram Language programming primitives, appropriately tracking what types are produced—and generating code that directly implements these primitives. Sometimes, however, one will want to use sophisticated functions in the Wolfram Language for which it doesn’t make sense to generate one’s own compiled code—and where what one really wants to do is just to call into the Wolfram Language kernel for these functions. In Version 12.0 `KernelFunction` lets one do this:

✕
FunctionCompile[Function[Typed[x,"Real64"],Typed[KernelFunction[AiryAi],{"Real64"}->"Real64"][x]]] |

OK, but let’s say one’s got a compiled code function. What can one do with it? Well, first of all one can just run it inside the Wolfram Language. One can store it too, and run it later. Any particular compilation is done for a specific processor architecture (e.g. 64-bit x86). But `CompiledCodeFunction` automatically keeps enough information to do additional compilation for a different architecture if it’s needed.

But given a `CompiledCodeFunction`, one of the interesting new possibilities is that one can directly generate code that can be run even outside the Wolfram Language environment. (Our old compiler had the `CCodeGenerate` package, which provided slightly similar capabilities in simple cases, though even then it relied on an elaborate toolchain of C compilers etc.)

Here’s how one can export raw LLVM code (notice that things like tail recursion optimization automatically get done—and notice also the symbolic function and compiler options at the end):

✕
FunctionCompileExportString[Function[{Typed[n,"Integer64"]},Module[{fib},fib=Function[{x},If[x<=1,1,fib[x-1]+fib[x-2]]]; fib[n]]]] |

If one uses `FunctionCompileExportLibrary`, then one gets a library file—.dylib on Mac, .dll on Windows and .so on Linux. One can use this in the Wolfram Language by doing `LibraryFunctionLoad`. But one can also use it in an external program.
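
Here is a rough sketch of that round trip: export a compiled function to a library file, then load it back. The file name `addOne` and the function are invented, and the single-argument use of `LibraryFunctionLoad` on the exported library is an assumption about how this experimental functionality is meant to be called:

```wolfram
(* Pick the library extension for the current platform *)
ext = Switch[$OperatingSystem,
   "MacOSX", ".dylib",
   "Windows", ".dll",
   _, ".so"];
file = FileNameJoin[{$TemporaryDirectory, "addOne" <> ext}];

(* Compile a function and write it out as a standalone library file *)
lib = FunctionCompileExportLibrary[file,
   Function[Typed[x, "MachineInteger"], x + 1]];

(* Load it back into the Wolfram Language and call it *)
addOne = LibraryFunctionLoad[lib];
addOne[41]
```

The same library file is what an external program would link against.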

One of the main things that determines the generality of the new compiler is the richness of its type system. Right now the compiler supports 14 atomic types (such as `"Boolean"`, `"Integer8"`, `"Complex64"`, etc.). It also supports type constructors like `"PackedArray"`—so that, for example, `TypeSpecifier["PackedArray"]["Real64", 2]` corresponds to a rank-2 packed array of 64-bit reals.

In the internal implementation of the Wolfram Language (which, by the way, is itself mostly in Wolfram Language) we’ve had an optimized way to store arrays for a long time. In Version 12.0 we’re exposing it as `NumericArray`. Unlike ordinary Wolfram Language constructs, you have to tell `NumericArray` in detail how it should store data. But then it works in a nice, optimized way:

✕
NumericArray[Range[10000], "UnsignedInteger16"] |

✕
ByteCount[%] |

✕
ByteCount[Range[10000]] |
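
Telling `NumericArray` “in detail” how to store data also covers what should happen to values that don’t fit the requested type. Here is a small sketch using the `"ClipAndRound"` conversion method; the input values are arbitrary:

```wolfram
(* Out-of-range values are clipped to 0..255 and non-integers are rounded on the way in *)
na = NumericArray[{1.4, 300, -5}, "UnsignedInteger8", "ClipAndRound"];

(* Convert back to an ordinary list to inspect the stored values *)
Normal[na]

(* The element type that was actually used for storage *)
NumericArrayType[na]
```

Here 1.4 is rounded to 1, 300 is clipped to the largest 8-bit value 255, and -5 is clipped to 0.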

In Version 11.2 we introduced `ExternalEvaluate`, that lets you do computations in languages like Python and JavaScript from within the Wolfram Language (in Python, “^” means `BitXor`):

✕
ExternalEvaluate["Python", "23424^2542"] |

In Version 11.3, we introduced external language cells, to make it easy to enter external-language programs or other input directly in a notebook:

✕
ExternalEvaluate["Python", "23424^2542"] |

In Version 12.0, we’re tightening the integration. For example, inside an external language string, you can use <* ... *> to give Wolfram Language code to evaluate:

✕
ExternalEvaluate["Python","<* Prime[1000] *> + 10"] |

This works in external language cells too:

✕
ExternalEvaluate["Python", "<* Prime[1000] *> + 10"] |

Of course, Python is not Wolfram Language, so many things don’t work:

✕
ExternalEvaluate["Python","2+ <* Range[10] *>"] |

But `ExternalEvaluate` can at least return many types of data from Python, including lists (as `List`), dictionaries (as `Association`), images (as `Image`), dates (as `DateObject`), NumPy arrays (as `NumericArray`) and pandas datasets (as `TimeSeries`, `Dataset`, etc.). (`ExternalEvaluate` can also return `ExternalObject` that’s basically a handle to an object that you can send back to Python.)
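
As a quick sketch of the type mapping, here is a Python dictionary containing a list, which should come back as an `Association` containing a `List`. The dictionary contents are invented, and a Python installation configured for `ExternalEvaluate` is assumed:

```wolfram
(* A Python dict holding a list and a string *)
py = ExternalEvaluate["Python",
   "{'squares': [n*n for n in range(5)], 'label': 'demo'}"]

(* The result is an ordinary Association, so the usual lookup syntax applies *)
py["squares"]
```

From this point on, the data is just a normal Wolfram Language expression.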

You can also directly use external functions (the slightly bizarrely named `ord` is basically the Python analog of `ToCharacterCode`):

✕
ExternalFunction["Python", "ord"]["a"] |

And here’s a Python pure function, represented symbolically in the Wolfram Language:

✕
ExternalFunction["Python", "lambda x:x+1"] |

✕
%[100] |

How should one access the Wolfram Language? There are many ways. One can use it directly in a notebook. One can call APIs that execute it in the cloud. Or one can use WolframScript in a command-line shell. WolframScript can run either against a local Wolfram Engine, or against a Wolfram Engine in the cloud. It lets you directly give code to execute, and it lets you do things like define functions, for example with code in a file.

Along with the release of Version 12.0, we’re also releasing our first new Wolfram Language Client Library—for Python. The basic idea of this library is to make it easy for Python programs to call the Wolfram Language. (It’s worth pointing out that we’ve effectively had a C Language Client Library for no less than 30 years—through what’s now called WSTP.)

The way a Language Client Library works is different for different languages. For Python, as an interpreted language (that was actually historically informed by early Wolfram Language), it’s particularly simple. After you set up the library, and start a session (locally or in the cloud), you can then just evaluate Wolfram Language code and get the results back in Python.

You can also directly access Wolfram Language functions (as a kind of inverse of `ExternalFunction`).

And you can directly interact with things like pandas structures, NumPy arrays, etc. In fact, you can in effect just treat the whole of the Wolfram Language like a giant library that can be accessed from Python. Or, of course, you can just use the nice, integrated Wolfram Language directly, perhaps creating external APIs if you need them.

One feature of using the Wolfram Language is that it lets you get away from having to think about the details of your computer system, and about things like files and processes. But sometimes one wants to work at a systems level. And for fairly simple operations, one can just use an operating system GUI. But what about for more complicated things? In the past I usually found myself using the Unix shell. But for a long time now, I’ve instead used Wolfram Language.

It’s certainly very convenient to have everything in a notebook, and it’s been great to be able to programmatically use functions like `FileNames` (ls), `FindList` (grep), `SystemProcessData` (ps), `RemoteRunProcess` (ssh) and `FileSystemScan`. But in Version 12.0 we’re adding a bunch of additional functions to support using the Wolfram Language as a “super shell”.

There’s `RemoteFile` for symbolically representing a remote file (with authentication if needed), which you can immediately use in functions like `CopyFile`. There’s `FileConvert` for directly converting files between different formats.

And if you really want to dive deep, here’s how you’d trace all the packets on ports 80 and 443 used in reading from wolfram.com:

✕
NetworkPacketTrace[URLRead["wolfram.com"], {80, 443}] |

Within the Wolfram Language, it’s been easy for a long time to interact with web servers, using functions like `URLExecute` and `HTTPRequest`, as well as `$Cookies`, etc. But in Version 12.0 we’re adding something new: the ability of the Wolfram Language to control a web browser, and programmatically make it do what we want. The most immediate thing we can do is just to get an image of what a website looks like to a web browser:

✕
WebImage["https://www.wolfram.com"] |

The result is an image that we can compute with:

✕
EdgeDetect[%] |

To do something more detailed, we have to start a browser session (we currently support Firefox and Chrome):

✕
session = StartWebSession["Chrome"] |

Immediately a blank browser window appears on our screen. Now we can use `WebExecute` to open a webpage:

✕
WebExecute["OpenPage" -> "http://www.wolfram.com"] |

Now that we’ve opened the page, there are lots of commands we can run. This clicks the first hyperlink containing the text “Programming Lab”:

✕
WebExecute[ "ClickElement" -> "PartialHyperlinkText" -> "Programming Lab"] |

This returns the title of the page we’ve reached:

✕
WebExecute["PageTitle"] |

You can type into fields, run JavaScript, and basically do programmatically anything you could do by hand with a web browser. Needless to say, we’ve been using a version of this technology for years inside our company to test all our various websites and web services. But now, in Version 12.0, we’re making a streamlined version generally available.

For every general-purpose computer in the world today, there are probably 10 times as many microcontrollers—running specific computations without any general operating system. A microcontroller might cost a few cents to a few dollars, and in something like a mid-range car, there might be 30 of them.

In Version 12.0 we’re introducing a Microcontroller Kit for the Wolfram Language, that lets you give symbolic specifications from which it automatically generates and deploys code to run autonomously in microcontrollers. In the typical setup, a microcontroller is continuously doing computations on data coming in from sensors, and in real time putting out signals to actuators. The most common types of computations are effectively ones in control theory and signal processing.

We’ve had extensive support for doing control theory and signal processing directly in the Wolfram Language for a long time. But now what’s possible with the Microcontroller Kit is to take what’s specified in the language and download it as embedded code in a standalone microcontroller that can be deployed anywhere (in devices, IoT, appliances, etc.).

As an example, here’s how one can generate a symbolic representation of an analog signal-processing filter:

✕
ButterworthFilterModel[{3,2}] |

We can use this filter directly in the Wolfram Language—say using `RecurrenceFilter` to apply it to an audio signal. We can also do things like plot its frequency response:

✕
BodePlot[%] |

To deploy the filter in a microcontroller, we first have to derive from this continuous-time representation a discrete-time approximation that can be run in a tight loop (here, every 0.1 seconds) in the microcontroller:

✕
filter=ToDiscreteTimeModel[ButterworthFilterModel[{3,2}],0.1]//Chop |
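To make the idea of a discrete-time approximation concrete, here is a minimal Python sketch. It is not the third-order Butterworth filter above: it discretizes a simpler first-order low-pass filter, H(s) = wc/(s + wc), with the same cutoff (2) and sampling period (0.1). It uses the bilinear (Tustin) transform, which is an assumption; the text does not say which method `ToDiscreteTimeModel` picked here.

```python
def bilinear_lowpass_coeffs(cutoff: float, period: float):
    """Discretize H(s) = wc / (s + wc) via s -> (2/T) (1 - z^-1) / (1 + z^-1).

    Returns (b, a) for the recurrence y[n] = a*y[n-1] + b*(x[n] + x[n-1]).
    """
    k = cutoff * period
    return k / (2.0 + k), (2.0 - k) / (2.0 + k)


def run_filter(samples, cutoff=2.0, period=0.1):
    """Apply the recurrence sample by sample, as a microcontroller loop would."""
    b, a = bilinear_lowpass_coeffs(cutoff, period)
    x_prev = 0.0
    y_prev = 0.0
    out = []
    for x in samples:
        y = a * y_prev + b * (x + x_prev)
        out.append(y)
        x_prev, y_prev = x, y
    return out


# Response to a unit step: rises smoothly toward the DC gain of 1.
step_response = run_filter([1.0] * 100)
```

The body of the `for` loop is the part that would run once per sampling period on the device: a couple of multiplications and additions on the previous input and output.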

Now we’re ready to use the Microcontroller Kit to actually deploy this to a microcontroller. The kit supports more than a hundred different types of microcontrollers. Here’s how we could deploy the filter to an Arduino Uno that we have connected to a serial port on our computer:

✕
Needs["MicrocontrollerKit`"] |

✕
MicrocontrollerEmbedCode[filter,<|"Target"->"ArduinoUno","Inputs"->"Serial","Outputs"->"Serial"|>,"/dev/cu.usbmodem141101"] |

`MicrocontrollerEmbedCode` works by generating appropriate C-like source code, compiling it for the microcontroller architecture you want, then actually deploying it to the microcontroller through its so-called programmer. Here’s the actual source code that was generated in this particular case:

✕
%["SourceCode"] |

So now we have a thing like this that runs our Butterworth filter, that we can use anywhere:

If we want to check what it’s doing, we can always connect it back into the Wolfram Language using `DeviceOpen` to open its serial port, and read and write from it.

What’s the relation between the Wolfram Language and video games? Over the years, the Wolfram Language has been used behind the scenes in many aspects of game development (simulating strategies, creating geometries, analyzing outcomes, etc.). But for some time now we’ve been working on a closer link between Wolfram Language and the Unity game environment, and in Version 12.0 we’re releasing a first version of this link.

The basic scheme is to have Unity running alongside the Wolfram Language, then to set up two-way communication, allowing both objects and commands to be exchanged. The under-the-hood plumbing is quite complex, but the result is a nice merger of the strengths of Wolfram Language and Unity.

This sets up the link, then starts a new project in Unity:

✕
Needs["UnityLink`"] |

✕
UnityOpen["NewProject"] |

Now create some complex shape:

✕
RevolutionPlot3D[{Sin[t] + Sin[5 t]/10, Cos[t] + Cos[5 t]/10}, {t, 0, Pi}, RegionFunction -> (Sin[5 (#4 + #5)] > 0 &), Boxed -> False, Axes -> None, PlotTheme -> "ThickSurface"] |

Then it takes just one command to put this into the Unity game as an object called `"thingoid"`:

✕
CreateUnityGameObject["thingoid", CloudGet["https://wolfr.am/COrZtVvA"], Properties -> { "SharedMaterial" -> UnityLink`CreateUnityMaterial[Orange]}] |

Within the Wolfram Language there’s a symbolic representation of the object, and UnityLink now provides hundreds of functions for manipulating such objects, always maintaining versions both in Unity and in the Wolfram Language.

It’s very powerful that one can take things from the Wolfram Language and immediately put them into Unity—whether they’re geometry, images, audio, geo terrain, molecular structures, 3D anatomy, or whatever. It’s also very powerful that such things can then be manipulated within the Unity game, either through things like game physics, or by user action. (Eventually, one can expect to have `Manipulate`-like functionality, in which the controls aren’t just sliders and things, but complex pieces of gameplay.)

We’ve done experiments with putting Wolfram Language–generated content into virtual reality since the early 1990s. But in modern times Unity has become something of a de facto standard for setting up VR/AR environments—and with UnityLink it’s now straightforward to routinely put things from Wolfram Language into any modern XR environment.

One can use the Wolfram Language to prepare material for Unity games, but within a Unity game UnityLink also basically lets one just insert Wolfram Language code that can be executed during a game either on a local machine or through an API in the Wolfram Cloud. And, among other things, this makes it straightforward to put hooks into a game so the game can send “telemetry” (say to the Wolfram Data Drop) for analysis in the Wolfram Language. (It’s also possible to script the playing of the game—which is, for example, very useful for game testing.)

Writing games is a complex matter. But UnityLink provides an interesting new approach that should make it easier to prototype all sorts of games, and to learn the ideas of game development. One reason for this is that it effectively lets one script a game at a higher level by using symbolic constructs in the Wolfram Language. But another reason is that it lets the development process be done incrementally in a notebook, and explained and documented every step of the way. For example, here’s what amounts to a computational essay describing the development of a “piano game”:

UnityLink isn’t a simple thing: it contains more than 600 functions. But with those functions it’s possible to access pretty much all the capabilities of Unity, and to set up pretty much any imaginable game.

For something like reinforcement learning it’s essential to have a manipulable external environment in the loop when one’s doing machine learning. Well, `ServiceExecute` lets you call APIs (what’s the effect of posting that tweet, or making that trade?), and `DeviceExecute` lets you actuate actual devices (turn the robot left) and get data from sensors (did the robot fall over?).

But for many purposes what one instead wants is to have a simulated external environment. And in a way, just the pure Wolfram Language already to some extent does that, for example providing access to a rich “computational universe” full of modifiable programs and equations (cellular automata, differential equations, …). And, yes, the things in that computational universe can be informed by the real world—say with the realistic properties of oceans, or chemicals or mountains.

But what about environments that are more like the ones we modern humans typically learn in—full of built engineering structures and so on? Conveniently enough, `SystemModel` gives access to lots of realistic engineering systems. And through UnityLink we can expect to have access to rich game-based simulations of the world.

But as a first step, in Version 12.0 we’re setting up connections to some simple games—in particular from the OpenAI “gym”. The interface is much as it would be for interacting with the real world, with the game accessed like a “device” (after appropriate sometimes-“open-source-painful” installation):

✕
env = DeviceOpen["OpenAIGym", "MontezumaRevenge-v0"] |

We can read the state of the game:

✕
DeviceRead[env] |

And we can show it as an image:

✕
Image[DeviceRead[env]["ObservedState"]] |

With a bit more effort, we can take 100 random actions in the game (always checking that we didn’t “die”), then show a feature space plot of the observed states of the game:

✕
FeatureSpacePlot[ Table[If[DeviceRead[env]["Ended"], Return[], Image[DeviceExecute[env, "Step", DeviceExecute[env, "RandomAction"]]["ObservedState"]]], 100]] |

In Version 11.3 we began our first connection to the blockchain. Version 12.0 adds a lot of new features and capabilities, perhaps most notably the ability to write to public blockchains, as well as read from them. (We also have our own Wolfram Blockchain for Wolfram Cloud users.) We’re currently supporting Bitcoin, Ethereum and ARK blockchains, both their mainnets and testnets (and, yes, we have our own nodes connecting directly to these blockchains).

In Version 11.3 we allowed raw reading of transactions from blockchains. In Version 12.0 we’ve added a layer of analysis, so that, for example, you can ask for a summary of “CK” tokens (AKA CryptoKitties) on the Ethereum blockchain:

✕
BlockchainTokenData["CK"] |

It’s quick to look at all token transactions in history, and make a word cloud of how active different tokens have been:

✕
WordCloud[SortBy[BlockchainTokenData[All,{"Name","TransfersCount"}],Last]] |

But what about doing our own transaction? Let’s say we want to use a Bitcoin ATM (like the one that, bizarrely, exists at a bagel store near me) to transfer cash to a Bitcoin address. Well, first we create our crypto keys (and we need to make sure we remember our private key!):

✕
keys=GenerateAsymmetricKeyPair["Bitcoin"] |

Next, we have to take our public key and generate a Bitcoin address from it:

✕
BlockchainKeyEncode[keys["PublicKey"],"Address",BlockchainBase->"Bitcoin"] |

Make a QR code from that and you’re ready to go to the ATM:

✕
BarcodeImage[%,"QR"] |

But what if we want to write to the blockchain ourselves? Here we’ll use the Bitcoin testnet (so we’re not spending real money). This shows an output from a transaction we did before—that includes 0.0002 bitcoin (i.e. 20,000 satoshi):

✕
$BlockchainBase={"Bitcoin", "Testnet"}; |

✕
First[BlockchainTransactionData["17a422eebfbf9cdee19b600740597bafea45cc4c703c67afcc8fb889f4cf7f28","Outputs"]] |

Now we can set up a transaction which takes this output, and, for example, sends 8000 satoshi to each of two addresses (that we defined just like for the ATM transaction):

✕
BlockchainTransaction[<| "Inputs" -> {<| "TransactionID" -> "17a422eebfbf9cdee19b600740597bafea45cc4c703c67afcc8fb889f4cf7f28", "Index" -> 0|>}, "Outputs" -> {<|"Amount" -> Quantity[8000, "Satoshi"], "Address" -> "munDTMqa9V9Uhi3P21FpkY8UfYzvQqpmoQ"|>, <| "Amount" -> Quantity[8000, "Satoshi"], "Address" -> "mo9QWLSJ1g1ENrTkhK9SSyw7cYJfJLU8QH"|>}, "BlockchainBase" -> {"Bitcoin", "Testnet"}|>] |

OK, so now we’ve got a blockchain transaction object—that would offer a fee (shown in red because it’s “actual money” you’ll spend) of all the leftover cryptocurrency (here 4000 satoshi) to a miner willing to put the transaction in the blockchain. But before we can submit this transaction (and “spend the money”) we have to sign it with our private key:

✕
BlockchainTransactionSign[%, keys["PrivateKey"]] |
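The fee in the transaction above is never stated explicitly: it is whatever is left of the input after the outputs are paid. A small Python sketch of that arithmetic, using the amounts from this example (the helper names are illustrative only):

```python
from decimal import Decimal

SATOSHI_PER_BITCOIN = 100_000_000


def to_satoshi(btc_text: str) -> int:
    """Convert a decimal bitcoin amount (given as text) to integer satoshi."""
    return int(Decimal(btc_text) * SATOSHI_PER_BITCOIN)


def implied_fee(input_amounts, output_amounts) -> int:
    """Fee offered to the miner: total inputs minus total outputs, in satoshi."""
    fee = sum(input_amounts) - sum(output_amounts)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee


inputs = [to_satoshi("0.0002")]   # the earlier output we are spending
outputs = [8000, 8000]            # two recipients, 8000 satoshi each
fee = implied_fee(inputs, outputs)
```

Working in integer satoshi, rather than floating-point bitcoin, avoids rounding surprises.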

Finally, we just apply `BlockchainTransactionSubmit` and we’ve submitted our transaction to be put on the blockchain:

✕
BlockchainTransactionSubmit[%] |

Here’s its transaction ID:

✕
txid=%["TransactionID"] |

If we immediately ask about this transaction, we’ll get a message saying it isn’t in the blockchain:

✕
BlockchainTransactionData[txid] |

But after we wait a few minutes, there it is—and it’ll soon spread to every copy of the Bitcoin testnet blockchain:

✕
BlockchainTransactionData[txid] |

If you’re prepared to spend real money, you can use exactly the same functions to do a transaction on a mainnet. You can also do things like buy CryptoKitties. Functions like `BlockchainContractValue` can be used for any (for now, only Ethereum) smart contract, and are set up to immediately understand things like ERC-20 and ERC-721 tokens.

Dealing with blockchains involves lots of cryptography, some of which is new in Version 12.0 (notably, handling elliptic curves). But in Version 12.0 we’re also extending our non-blockchain cryptographic functions. For example, we’ve now got functions for directly dealing with digital signatures. This creates a digital signature using the private key from above:

✕
message="This is my genuine message"; |

✕
signature=GenerateDigitalSignature[message,keys["PrivateKey"]] |

Now anyone can verify the message using the corresponding public key:

✕
VerifyDigitalSignature[{message,signature},keys["PublicKey"]] |

In Version 12.0, we added several new types of hashes for the `Hash` function, particularly to support various cryptocurrencies. We also added ways to generate and verify derived keys. Start from any password, and `GenerateDerivedKey` will “puff it out” to something longer (to be more secure you should add “salt”):

✕
GenerateDerivedKey["meow"] |

Here’s a version of the derived key, suitable for use in various authentication schemes:

✕
GenerateDerivedKey["meow"]["PHCString"] |
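For readers who want to see the general idea of salted key derivation outside the Wolfram Language, here is a hedged Python sketch using PBKDF2 from the standard library. PBKDF2 is a stand-in chosen because it ships with Python; it is not claimed to be the algorithm `GenerateDerivedKey` uses, and the iteration count and key length are illustrative assumptions.

```python
import hashlib
import hmac
import os


def derive_key(password: str, salt: bytes,
               iterations: int = 100_000, length: int = 32) -> bytes:
    """Stretch a password into a fixed-length key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )


def verify_password(password: str, salt: bytes, expected_key: bytes,
                    iterations: int = 100_000) -> bool:
    """Re-derive the key and compare in constant time."""
    candidate = derive_key(password, salt, iterations, len(expected_key))
    return hmac.compare_digest(candidate, expected_key)


salt = os.urandom(16)            # random salt, stored alongside the key
key = derive_key("meow", salt)   # 32 bytes derived from a short password
```

The salt is what makes two users with the same password end up with different derived keys, which is why it matters even for a short password like this one.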

The Wolfram Knowledgebase contains all sorts of financial data. Typically there’s a financial entity (like a stock), then there’s a property (like price). Here’s the complete daily history of Apple’s stock price (it’s very impressive that it looks best on a log scale):

✕
DateListLogPlot[Entity["Financial", "NASDAQ:AAPL"][Dated["Price",All]]] |

But while the financial data in the Wolfram Knowledgebase, and standardly available in the Wolfram Language, is continuously updated, it’s not real time (mostly it’s 15-minute delayed), and it doesn’t have all the detail that many financial traders use. For serious finance use, however, we’ve developed Wolfram Finance Platform. And now, in Version 12.0, it’s got direct access to Bloomberg and Reuters financial data feeds.

The way we architect the Wolfram Language, the framework for the connections to Bloomberg and Reuters is always available in the language—but it’s only activated if you have Wolfram Finance Platform, as well as the appropriate Bloomberg or Reuters subscriptions. But assuming you have these, here’s what it looks like to connect to the Bloomberg Terminal service:

✕
ServiceConnect["BloombergTerminal"] |

All the financial instruments handled by the Bloomberg Terminal now become available as entities in the Wolfram Language:

✕
Entity["BloombergTerminal","AAPL US Equity"] |

Now we can ask for properties of this entity:

✕
%["PX_LAST"] |

Altogether there are more than 60,000 properties accessible from the Bloomberg Terminal:

✕
Length[EntityProperties["BloombergTerminal"]] |

Here are 5 random examples (yes, they’re pretty detailed; those are Bloomberg names, not ours):

✕
RandomSample[EntityProperties["BloombergTerminal"],5] |

We support the Bloomberg Terminal service, the Bloomberg Data License service, and the Reuters Elektron service. One sophisticated thing one can now do is to set up a continuous task to asynchronously receive data, and call a “handler function” every time a new piece of data comes in:

✕
ServiceSubmit[ ContinuousTask[ServiceRequest[ServiceConnect["Reuters"], "MarketData",{"Instrument"-> "AAPL.O","TriggerFields"->{"BID","ASK"}}]], HandlerFunctions-><|"MarketDataEvents"->(action[#Result]&)|>] |

I’ve talked about lots of new functions and new functionality in the Wolfram Language. But what about the underlying infrastructure of the Wolfram Language? Well, we’ve been working hard on that too. For example, between Version 11.3 and Version 12.0 we’ve managed to fix nearly 8000 reported bugs. We’ve also made lots of things faster and more robust. And in general we’ve been tightening the software engineering of the system, for example reducing the initial download size by nearly 10% (despite all the functionality that’s been added). (We’ve also done things like improve the predictive prefetching of knowledgebase elements from the cloud—so when you need similar data it’s more likely to be already cached on your computer.)

It’s a longstanding feature of the computing landscape that operating systems are continually getting updated—and to take advantage of their latest features, applications have to get updated too. We’ve been working for several years on a major update to our Mac notebook interface—which is finally ready in Version 12.0. As part of the update, we’ve rewritten and restructured large amounts of code that have been developed and polished over more than 20 years, but the result is that in Version 12.0, everything about our system on the Mac is fully 64-bit, and makes use of the latest Cocoa APIs. This means that the notebook front end is significantly faster—and can also go beyond the previous 2 GB memory limit.

There’s also a platform update on Linux, where now the notebook interface fully supports Qt 5, which allows all rendering operations to take place “headlessly”, without any X server, greatly streamlining deployment of the Wolfram Engine in the cloud. (Version 12.0 doesn’t yet have high-DPI support for Windows, but that’s coming very soon.)

The development of the Wolfram Cloud is in some ways separate from the development of the Wolfram Language, and Wolfram Desktop applications (though for internal compatibility we’re releasing Version 12.0 at the same time in both environments). But in the past year since Version 11.3 was released, there’s been dramatic progress in the Wolfram Cloud.

Especially notable are the advances in cloud notebooks—supporting more interface elements (including some, like embedded websites and videos, that aren’t even yet available in desktop notebooks), as well as greatly increased robustness and speed. (Making our whole notebook interface work in a web browser is no small feat of software engineering, and in Version 12.0 there are some pretty sophisticated strategies for things like maintaining consistent fast-to-load caches, along with full symbolic DOM representations.)

In Version 12.0 there’s now just a simple menu item (File > Publish to Cloud …) to publish any notebook to the cloud. And once the notebook is published, anyone in the world can interact with it—as well as make their own copy so they can edit it.

It’s interesting to see how broadly the cloud has entered what can be done in the Wolfram Language. In addition to all the seamless integration of the cloud knowledgebase, and the ability to reach out to things like blockchains, there are also conveniences like Send To… sending any notebook through email, using the cloud if there’s no direct email server connection available.

Even though this has been a long piece, it’s not even close to telling the whole story of what’s new in Version 12.0. Along with the rest of our team, I’ve been working very hard on Version 12.0 for a long time now—but it’s still exciting to see just how much is actually in it.

But what’s critical (and a lot of work to achieve!) is that everything we’ve added is carefully designed to fit coherently with what’s already there. From the very first version more than 30 years ago of what’s now the Wolfram Language, we’ve been following the same core principles—and this is part of what’s allowed us to so dramatically grow the system while maintaining long-term compatibility.

It’s always difficult to decide exactly what to prioritize developing for each new version, but I’m very pleased with the choices we made for Version 12.0. I’ve given many talks over the past year, and I’ve been very struck with how often I’ve been able to say about things that come up: “Well, it so happens that that’s going to be part of Version 12.0!”

I’ve personally been using internal preliminary builds of Version 12.0 for nearly a year, and I’ve come to take for granted many of its new capabilities—and to use and enjoy them a lot. So it’s a great pleasure that today we have the final Version 12.0—with all these new capabilities officially in it, ready to be used by anyone and everyone…


Every year, the U.S. Department of State sponsors a worldwide competition called Fishackathon. Its goal is to protect life in our waters by creating technological solutions to help solve problems related to fishing.

The first global competition was held in 2014, and the event has grown every year since. In 2018 the winning entry came from a five-person team from Boston, who competed against 3,500 people in 65 other cities spread across 5 continents. The participants comprised programmers, web and graphic designers, oceanographers and biologists, mathematicians, engineers and students who all worked tirelessly over the course of two days.

To find out more about the winning entry for Fishackathon in 2018 and how the Wolfram Language has helped make the seas safer, we sat down with Michael Sollami to talk about him and his team’s solution to that year’s challenge.

I became involved with Wolfram technologies during college and graduate school. I used Mathematica extensively for research in mathematics and computer science, and I ended up working with Stephen Wolfram in his Cambridge-based advanced research group. Today, I’m Lead Data Scientist at Salesforce Einstein, where I work with an amazing team of engineers and researchers to build new machine learning systems to enhance our prediction and search platforms. From experimental deep neural architectures to prototyping ever-more-efficient information retrieval methods, we are designing next-generation search, recommendation and predictive analytics technologies.

Growing up near the Long Island Sound, where my family had a small sailboat, we spent a lot of time on or near the water. I love scuba diving (random trivia: Saba is my favorite dive site), and over the years I have seen firsthand the unnerving levels of aquatic devastation. The data coming from oceanographic research paints quite a bleak future. In fact, according to the best oceanographic computer models, there won’t be any fish larger than minnows by the 2040s.

In just 50 years, we’ve reduced the populations of large fish, such as bluefin tuna and cod, by over 90 percent. Industrial fishing uses nets that are 20 miles long, and trawlers drag something the size of a tractor trailer along the ocean floor. In just a few short decades, we clear-cut our seabed floors, once vital nurseries of sponges and corals, into millions of square miles of lifeless mud. According to coral reef ecologist Jeremy Jackson, “The total area of underwater habitat destruction is larger than the sum of all forests that have ever been cut down in the history of humanity.”

Many factors contribute to the threat of sea-life extinction, and they are not independent: biological and chemical pollution, acidification, deoxygenation, plastics, the climate crisis—it can be overwhelming. So we thought illegal, unreported and unregulated (IUU) overfishing was a good place to brainstorm possible solutions. It was incredibly exciting to compete simultaneously with over 3,500 other people around the world, and in the space of a weekend produce code with the potential for positive environmental impact.

Each year Fishackathon offers multiple challenge statements that competing teams can choose from. The topics are developed by industry professionals with real-world needs and made available to participants nearer to the time of the event. In 2018, we chose challenge #10: passive illegal fishing detection.

**Challenge Statement:** Protecting restricted fishing zones (e.g. marine reserves, remote areas) from illegal fishing is a huge challenge. A passive tool (maybe using sonar?) that helps identify fishing activity in restricted areas would help agencies monitor, track and enforce laws more effectively.

You might recall from watching *The Hunt for Red October* that the United States Navy operates a chain of underwater listening posts located around the world, a sound surveillance system called SOSUS. This collection of bottom-mounted hydrophone arrays was originally constructed in the 1950s to track and fingerprint Soviet submarines by their acoustic signals. Almost 70 years later, there is no reason why we can’t apply (newer and cheaper versions of) this technology to track and identify fishing vessels.

We knew that the heart of our solution was an accurate fishing acoustics model, so we started with the software side of the problem. Once we had proof of life in the core detection algorithm, we iterated on designing the web app/interface and hardware components.

We needed to design the submersible listening device to be outfitted with a hydrophone while making it as inexpensive as possible. The deployed devices would also need to be able to transmit network data back to our servers, for triangulating and tracking the positions of any detected poaching vessels operating in protected waters.

Our eventual submission, which we called PoachStopper, would not only recognize sounds associated with fishing, but also compute a unique signature for each boat that passes within a detection radius of 50 kilometers.

This is where Mathematica shined. Mathematica’s ability to import and manipulate audio files made the data preprocessing pipeline dead simple. `NetEncoders` for `"Audio"` that handle varying dimensionalities made preprocessing inputs much easier than in many other frameworks at the time.

✕
net=ResourceData["HiddenAudioIdentifyMobileNetDepth1-0"]; ne=NetExtract[net,"Input"]; encoder=NetEncoder[{"Function",Function[feat,First@Partition[If[Length[feat]<96,ArrayPad[feat,{{0,96-Length[feat]}},Padding->"Fixed"],feat],96,64]]/@Normal[NetEncoder[{"AudioMelSpectrogram","NumberOfFilters"->64,"WindowSize"->600}][#1]]&,{96,64},"Pattern"->None,"Batched"->True}]; Image3D[ne@Normal@ExampleData[#],ImageSize->Tiny,ColorFunction->"RainbowOpacity"]&/@ExampleData["Audio"][[;;20]]//Multicolumn |
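One detail of the encoder above is easy to miss: every clip is forced to exactly 96 spectrogram frames. Shorter clips are padded by repeating the last frame (the `"Fixed"` padding), and longer clips keep only their first 96 frames. A small Python sketch of that one step, on plain lists and with an illustrative function name:

```python
def pad_or_crop(frames, target=96):
    """Return exactly `target` frames.

    Short inputs are extended by repeating the last frame;
    long inputs are truncated to their first `target` frames.
    """
    if not frames:
        raise ValueError("need at least one frame")
    if len(frames) < target:
        padding = [list(frames[-1]) for _ in range(target - len(frames))]
        return [list(f) for f in frames] + padding
    return [list(f) for f in frames[:target]]


# A short clip: 10 frames of 64 mel bands each (dummy values).
short_clip = [[float(i)] * 64 for i in range(10)]
fixed = pad_or_crop(short_clip)
```

Fixing the shape this way is what lets the network accept audio of any duration.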

At the time Fishackathon was held, the Neural Net Repository did not yet exist, so I couldn’t just pull some state-of-the-art model and perform some transfer learning. So we needed to train the network from scratch with the additional requirement of maintaining a very low energy-consumption profile. For these reasons, I decided to start working with a variant of Google’s MobileNet architecture, which features depth-wise separable convolutions as efficient building blocks. To the basic skeleton I added some linear bottlenecks between the layers with skip connections, which led to higher accuracies and accelerated convergence. After processing the raw audio of our ocean sounds dataset into normalized spectrograms, I trained on AWS overnight on a GPU instance, and by Sunday morning had a working detector.
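To see why depth-wise separable convolutions help with a low-energy budget, it is enough to count weights. The sketch below compares a standard convolution with its separable counterpart, ignoring biases; the layer sizes (3×3 kernel, 64 input channels, 128 output channels) are hypothetical and not taken from the PoachStopper network.

```python
def standard_conv_weights(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def separable_conv_weights(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depth-wise convolution followed by a 1x1 point-wise one."""
    depthwise = kernel * kernel * c_in   # one spatial filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution mixes the channels
    return depthwise + pointwise


std = standard_conv_weights(3, 64, 128)
sep = separable_conv_weights(3, 64, 128)
ratio = std / sep
```

For these sizes the separable layer needs roughly an eighth of the weights, and the number of multiply-adds shrinks by the same factor.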

This was before Version 11.3 came out last year, but it would have been nice to use tools like `NetEncoder["AudioMFCC"]` and `WebAudioSearch`. The network we designed could delineate with 98% accuracy the difference between the normal sounds of the ocean, non-fishing boats and actively fishing trawlers operating in their different modes. We also built a separate recognition module that learned a nicely invariant mapping between the sounds of individual vessels’ engines and propellers—essentially a perceptual hash—for tracking specific vessels.

Indeed! Thanks to Mathematica being packaged with the Raspbian OS, that could be a good route for people to take. Raspberry Pis can come with GPUs; however, they aren’t from NVIDIA, and so are not supported for `TargetDevice -> "GPU"`. For production, we ended up porting the network into PyTorch, but I eventually ported the model to TensorFlow Mobile for testing with IoT and mobile devices, which gave a better real-time and power-usage profile.

After we won the finals, I handed the project off to my very capable teammates, who are pursuing various partnerships and funding opportunities. PoachStopper and other eco-startups like it have the potential to make very measurable impacts in the fight for the future of oceanic life. However, the hard truth is that governments are the only entities that can prevent the end of fish. And it is easy to do, with just a few simple legislative steps:

- Create sufficiently large un-fishable areas for populations to begin regenerating
- Impose quotas on the amount of fish caught in any given year
- End government subsidies for the fishing-industrial complex

Companies need to pay for the privilege of fishing, and governments need to ensure that our oceans will not become a vast toxic desert, as we predict they will by 2050.

This year’s Fishackathon has a large focus on Fishcoin, and will center on developing solutions for data capture and sharing using open platforms. This essentially means building apps that can capture and process sensor data in the field and then publish it with decentralized ledger technology. With all its new blockchain-related features, Mathematica could be the weapon of choice in 2019!

I consider Mathematica to be a “killer app” for any hackathon. Python and machine learning frameworks are my primary tools for machine learning engineering, but Mathematica remains the fastest language to prototype small-scale things. With Version 12 on the horizon, I’m very excited for the ability to compile everything down to machine code for running at C++ speed—not to mention the ability to export networks to ONNX for use in production-grade servers.


By providing easy-to-follow, step-by-step tutorials that result in a finished, functioning piece of software, Wolfram aims to lower the barrier of entry for those who wish to get immediately started programming, building and making. Projects can be completely built on the Raspberry Pi or within a web browser in the Wolfram Cloud.

Since 2013, the Wolfram Language and Mathematica have been freely available on the Raspberry Pi system as part of NOOBS. Stephen Wolfram wrote in his announcement of the collaboration with the Raspberry Pi Foundation, “I’m a great believer in the importance of programming as a central component of education.” And over five years later, there is indeed increasing demand in the labor force for technical programming skills—part of why Wolfram continues to push computational thinking as a primary means, method and framework for preparing individuals for success in the future of work.

The Wolfram Language is particularly well suited for this mission, as its high-level symbolic nature and linguistic capabilities not only tell machines precisely what to do, but can also be easily read by nontechnical people—the world’s first and only true computational communication language understandable by both humans and AI.

“These projects provide a fantastic opportunity for code clubs around the world to step into the power of using the Wolfram Language to springboard their computational thinking skills’ development,” says Jon McLoone, cofounder of Computer-Based Math™ (CBM).

The first of these project materials will be available this week, with more planned throughout the year. It will be interesting and exciting to see what people build with the Wolfram Language and Raspberry Pi.
