Wolfram Computation Meets Knowledge

New in 14: LLM & AI

Two years ago we released Version 13.0 of Wolfram Language. Here are the updates in LLM and AI technology since then, including the latest features in 14.0. The contents of this post are compiled from Stephen Wolfram’s Release Announcements for 13.1, 13.2, 13.3 and 14.0.

LLM Tech

The LLMs Have Landed (January 2024)

The machine learning superfunctions Classify and Predict first appeared in Wolfram Language in 2014 (Version 10). By the next year there were starting to be functions like ImageIdentify and LanguageIdentify, and within a couple of years we’d introduced our whole neural net framework and Neural Net Repository. Included in that were a variety of neural nets for language modeling, that allowed us to build out functions like SpeechRecognize and an experimental version of FindTextualAnswer. But—like everyone else—we were taken by surprise at the end of 2022 by ChatGPT and its remarkable capabilities.

Very quickly we realized that a major new use case—and market—had arrived for Wolfram|Alpha and Wolfram Language. For now it was not only humans who’d need the tools we’d built; it was also AIs. By March 2023 we’d worked with OpenAI to use our Wolfram Cloud technology to deliver a plugin to ChatGPT that allows it to call Wolfram|Alpha and Wolfram Language. LLMs like ChatGPT provide remarkable new capabilities in reproducing human language, basic human thinking and general commonsense knowledge. But—like unaided humans—they’re not set up to deal with detailed computation or precise knowledge. For that, like humans, they have to use formalism and tools. And the remarkable thing is that the formalism and tools we’ve built in Wolfram Language (and Wolfram|Alpha) are basically a broad, perfect fit for what they need.

We created the Wolfram Language to provide a bridge from what humans think about to what computation can express and implement. And now that’s what the AIs can use as well. The Wolfram Language provides a medium not only for humans to “think computationally” but also for AIs to do so. And we’ve been steadily doing the engineering to let AIs call on Wolfram Language as easily as possible.

But in addition to LLMs using Wolfram Language, there’s also now the possibility of Wolfram Language using LLMs. And already in June 2023 (Version 13.3) we released a major collection of LLM-based capabilities in Wolfram Language. One category is LLM functions, which effectively use LLMs as “internal algorithms” for operations in Wolfram Language:

In typical Wolfram Language fashion, we have a symbolic representation for LLMs: LLMConfiguration[] represents an LLM with its various parameters, promptings, etc. And in the past few months we’ve been steadily adding connections to the full range of popular LLMs, making Wolfram Language a unique hub not only for LLM usage, but also for studying the performance—and science—of LLMs.
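As a rough sketch of what such a symbolic configuration can look like (the model name and temperature here are assumptions chosen for illustration, and any of this needs a connected LLM service to run):

```wolfram
(* Hypothetical settings: the model name and temperature are illustrative only *)
config = LLMConfiguration[<|"Model" -> "gpt-4", "Temperature" -> 0.2|>];

(* Use this configuration for one call, leaving the global default untouched *)
LLMSynthesize["Name three prime numbers.", LLMEvaluator -> config]
```

Because the configuration is just a symbolic object, it can be stored, passed around, and compared like any other Wolfram Language expression.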

You can define your own LLM functions in Wolfram Language. But there’s also the Wolfram Prompt Repository that plays a similar role for LLM functions as the Wolfram Function Repository does for ordinary Wolfram Language functions. There’s a public Prompt Repository that so far has several hundred curated prompts. But it’s also possible for anyone to post their prompts in the Wolfram Cloud and make them publicly (or privately) accessible. The prompts can define personas (“talk like a [stereotypical] pirate”). They can define AI-oriented functions (“write it with emoji”). And they can define modifiers that affect the form of output (“haiku style”).
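Here is a minimal sketch of pulling repository prompts into code with LLMPrompt. The prompt names "Yoda" and "HaikuStyled" are assumed to be present in the public Prompt Repository, and the results will vary from run to run:

```wolfram
(* A persona prompt, followed by the actual request *)
LLMSynthesize[{LLMPrompt["Yoda"], "Explain what a prime number is."}]

(* A modifier prompt that affects the form of the output *)
LLMSynthesize[{"Describe a crocodile on the moon.", LLMPrompt["HaikuStyled"]}]
```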

Wolfram Prompt Repository

In addition to calling LLMs “programmatically” within Wolfram Language, there’s the new concept (first introduced in Version 13.3) of “Chat Notebooks”. Chat Notebooks represent a new kind of user interface, that combines the graphical, computational and document features of traditional Wolfram Notebooks with the new linguistic interface capabilities brought to us by LLMs.

The basic idea of a Chat Notebook, as introduced in Version 13.3 and now extended in Version 14.0, is that you can have “chat cells” (requested by typing ') whose content gets sent not to the Wolfram kernel, but instead to an LLM:

Write a haiku about a crocodile on the moon

You can use “function prompts”—say from the Wolfram Prompt Repository—directly in a Chat Notebook:

A cat ate my lunch

And as of Version 14.0 you can also knit Wolfram Language computations directly into your “conversation” with the LLM:

Make a haiku from RandomWord

(You type \ to insert Wolfram Language, very much like the way you can use <* ... *> to insert Wolfram Language into external evaluation cells.)

One thing about Chat Notebooks is that—as their name suggests—they really are centered around “chatting”, and around having a sequential interaction with an LLM. In an ordinary notebook, it doesn’t matter where in the notebook each Wolfram Language evaluation is requested; all that’s relevant is the order in which the Wolfram kernel does the evaluations. But in a Chat Notebook the “LLM evaluations” are always part of a “chat” that’s explicitly laid out in the notebook.

A key part of Chat Notebooks is the concept of a chat block: type ~ and you get a separator in the notebook that “starts a new chat”:

My name is Stephen

Chat Notebooks—with all their typical Wolfram Notebook editing, structuring, automation, etc. capabilities—are very powerful just as “LLM interfaces”. But there’s another dimension as well, enabled by LLMs being able to call Wolfram Language as a tool.

At one level, Chat Notebooks provide an “on ramp” for using Wolfram Language. Wolfram|Alpha—and even more so, Wolfram|Alpha Notebook Edition—let you ask questions in natural language, then have the questions translated into Wolfram Language, and answers computed. But in Chat Notebooks you can go beyond asking specific questions. Instead, through the LLM, you can just “start chatting” about what you want to do, then have Wolfram Language code generated, and executed:

How do you make a rosette with 5 lobes?

The workflow is typically as follows. First, you have to conceptualize in computational terms what you want. (And, yes, that step requires computational thinking—which is a very important skill that too few people have so far learned.) Then you tell the LLM what you want, and it’ll try to write Wolfram Language code to achieve it. It’ll typically run the code for you (but you can also always do it yourself)—and you can see whether you got what you wanted. But what’s crucial is that Wolfram Language is intended to be read not only by computers but also by humans. And particularly since LLMs actually usually seem to manage to write pretty good Wolfram Language code, you can expect to read what they wrote, and see if it’s what you wanted. If it is, you can take that code, and use it as a “solid building block” for whatever larger system you might be trying to set up. Otherwise, you can either fix it yourself, or try chatting with the LLM to get it to do it.

One of the things we see in the example above is the LLM—within the Chat Notebook—making a “tool call”, here to a Wolfram Language evaluator. In the Wolfram Language there’s now a whole mechanism for defining tools for LLMs—with each tool being represented by an LLMTool symbolic object. In Version 14.0 there’s an experimental version of the new Wolfram LLM Tool Repository with some predefined tools:
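To give a sense of what defining a tool might look like, here is a small sketch. The tool name, description and parameter name are all made up for illustration, and the final call needs a connected LLM service that supports tool calls:

```wolfram
(* A hypothetical tool: name, description and parameter are illustrative *)
primeTool = LLMTool[
   {"nth_prime", "returns the nth prime number"},
   {"n" -> "Integer"},
   Prime[#n] &   (* the function receives the parameters as an association *)
   ];

(* Make the tool available to the LLM for one request *)
LLMSynthesize["What is the 1000th prime?",
 LLMEvaluator -> LLMConfiguration[<|"Tools" -> {primeTool}|>]]
```

The point of the design is that the LLM decides when to call the tool, but the computation itself is done by ordinary Wolfram Language code.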

Wolfram LLM Tool Repository

In a default Chat Notebook, the LLM has access to some default tools, which include not only the Wolfram Language evaluator, but also things like Wolfram documentation search and Wolfram|Alpha query. And it’s common to see the LLM go back and forth trying to write “code that works”, and for example sometimes having to “resort” (much like humans do) to reading the documentation.

Something that’s new in Version 14.0 is experimental access to multimodal LLMs that can take images as well as text as input. And when this capability is enabled, it allows the LLM to “look at pictures from the code it generated”, see if they’re what was asked for, and potentially correct itself:

Create graphics with a randomly colored disc

The deep integration of images into Wolfram Language—and Wolfram Notebooks—yields all sorts of possibilities for multimodal LLMs. Here we’re giving a plot as an image and asking the LLM how to reproduce it:

Create a similar plot

Another direction for multimodal LLMs is to take data (in the hundreds of formats accepted by Wolfram Language) and use the LLM to guide its visualization and analysis in the Wolfram Language. Here’s an example that starts from a file data.csv in the current directory on your computer:

Look at the file data.csv

One thing that’s very nice about using Wolfram Language directly is that everything you do (well, unless you use RandomInteger, etc.) is completely reproducible; do the same computation twice and you’ll get the same result. That’s not true with LLMs (at least right now). And so when one uses LLMs it feels like something more ephemeral and fleeting than using Wolfram Language. One has to grab any good results one gets—because one might never be able to reproduce them. Yes, it’s very helpful that one can store everything in a Chat Notebook, even if one can’t rerun it and get the same results. But the more “permanent” use of LLM results tends to be “offline”. Use an LLM “up front” to figure something out, then just use the result it gave.

One unexpected application of LLMs for us has been in suggesting names of functions. With the LLM’s “experience” of what people talk about, it’s in a good position to suggest functions that people might find useful. And, yes, when it writes code it has a habit of hallucinating such functions. But in Version 14.0 we’ve actually added one function—DigitSum—that was suggested to us by LLMs. And in a similar vein, we can expect LLMs to be useful in making connections to external databases, functions, etc. The LLM “reads the documentation”, and tries to write Wolfram Language “glue” code—which then can be reviewed, checked, etc., and if it’s right, can be used henceforth.

Then there’s data curation, which is a field that—through Wolfram|Alpha and many of our other efforts—we’ve become extremely expert at over the past couple of decades. How much can LLMs help with that? They certainly don’t “solve the whole problem”, but integrating them with the tools we already have has allowed us over the past year to speed up some of our data curation pipelines by factors of two or more.

If we look at the whole stack of technology and content that’s in the modern Wolfram Language, the overwhelming majority of it isn’t helped by LLMs, and isn’t likely to be. But there are many—sometimes unexpected—corners where LLMs can dramatically improve heuristics or otherwise solve problems. And in Version 14.0 there are starting to be a wide variety of “LLM inside” functions.

An example is TextSummarize, which is a function we’ve considered adding for many versions—but now, thanks to LLMs, can finally implement to a useful level:
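A minimal sketch of how one might call it (the choice of source text is just an illustration, and since an LLM is involved the summary can differ between runs):

```wolfram
(* Fetch some long text to summarize; any string would do *)
text = WikipediaData["Crocodile"];

(* Ask for a summary; requires a connected LLM service *)
TextSummarize[text]
```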

The main LLMs that we’re using right now are based on external services. But we’re building capabilities to allow us to run LLMs in local Wolfram Language installations as soon as that’s technically feasible. And one capability that’s actually part of our mainline machine learning effort is NetExternalObject—a way of representing symbolically an externally defined neural net that can be run inside Wolfram Language. NetExternalObject allows you, for example, to take any network in ONNX form and effectively treat it as a component in a Wolfram Language neural net. Here’s a network for image depth estimation—that we’re here importing from an external repository (though in this case there’s actually a similar network already in the Wolfram Neural Net Repository):

Now we can apply this imported network to an image that’s been encoded with our built-in image encoder—then we’re taking the result and visualizing it:

It’s often very convenient to be able to run networks locally, but it can sometimes take quite high-end hardware to do so. For example, there’s now a function in the Wolfram Function Repository that does image synthesis entirely locally—but to run it, you do need a GPU with at least 8 GB of VRAM:

By the way, based on LLM principles (and ideas like transformers) there’ve been other related advances in machine learning that have been strengthening a whole range of Wolfram Language areas—with one example being image segmentation, where ImageSegmentationComponents now provides robust “content-sensitive” segmentation:

LLM Tech Comes to Wolfram Language (June 2023)

LLMs make possible many important new things in the Wolfram Language. And since I’ve been discussing these in a series of recent posts, I’ll give only a fairly short summary here. More details are in the other posts, both ones that have appeared, and ones that will appear soon.

To ensure you have the latest Chat Notebook functionality installed and available, use:

PacletInstall["Wolfram/Chatbook" -> "1.0.0", UpdatePacletSites -> True]

The most immediately visible LLM tech in Version 13.3 is Chat Notebooks. Go to File > New > Chat-Enabled Notebook and you’ll get a Chat Notebook that supports “chat cells” that let you “talk to” an LLM. Press ' (quote) to get a new chat cell:

Plot two sine curves

You might not like some details of what got done (do you really want those boldface labels?) but I consider this pretty impressive. And it’s a great example of using an LLM as a “linguistic interface” with common sense, that can generate precise computational language, which can then be run to get a result.

This is all very new technology, so we don’t yet know what patterns of usage will work best. But I think it’s going to go like this. First, you have to think computationally about whatever you’re trying to do. Then you tell it to the LLM, and it’ll produce Wolfram Language code that represents what it thinks you want to do. You might just run that code (or the Chat Notebook will do it for you), and see if it produces what you want. Or you might read the code, and see if it’s what you want. But either way, you’ll be using computational language—Wolfram Language—as the medium to formalize and express what you’re trying to do.

When you’re doing something you’re familiar with, it’ll almost always be faster and better to think directly in Wolfram Language, and just enter the computational language code you want. But if you’re exploring something new, or just getting started on something, the LLM is likely to be a really valuable way to “get you to first code”, and to start the process of crispening up what you want in computational terms.

If the LLM doesn’t do exactly what you want, then you can tell it what it did wrong, and it’ll try to correct it—though sometimes you can end up doing a lot of explaining and having quite a long dialog (and, yes, it’s often vastly easier just to type Wolfram Language code yourself):

Draw red and green semicircles

Redraw red and green semicircles

Sometimes the LLM will notice for itself that something went wrong, and try changing its code, and rerunning it:

Make table of primes

And even if it didn’t write a piece of code itself, it’s pretty good at piping up to explain what’s going on when an error is generated:

Error report

And actually it’s got a big advantage here, because “under the hood” it can look at lots of details (like stack trace, error documentation, etc.) that humans usually don’t bother with.

To support all this interaction with LLMs, there’s all kinds of new structure in the Wolfram Language. In Chat Notebooks there are chat cells, and there are chatblocks (indicated by gray bars, and generated with ~) that delimit the range of chat cells that will be fed to the LLM when you press Shift+Enter on a new chat cell. And, by the way, the whole mechanism of cells, cell groups, etc. that we invented 36 years ago now turns out to be extremely powerful as a foundation for Chat Notebooks.

One can think of the LLM as a kind of “alternate evaluator” in the notebook. And there are various ways to set up and control it. The most immediate is in the menu associated with every chat cell and every chatblock (and also available in the notebook toolbar):

Chat cell and chatblock menu

The first items here let you define the “persona” for the LLM. Is it going to act as a Code Assistant that writes code and comments on it? Or is it just going to be a Code Writer, that writes code without being wordy about it? Then there are some “fun” personas—like Wolfie and Birdnardo—that respond “with an attitude”. The Advanced Settings let you do things like set the underlying LLM model you want to use—and also what tools (like Wolfram Language code evaluation) you want to connect to it.

Ultimately personas are mostly just special prompts for the LLM (together, sometimes, with tools, etc.). And one of the new things we’ve recently launched to support LLMs is the Wolfram Prompt Repository:

Wolfram Prompt Repository

The Prompt Repository contains several kinds of prompts. The first are personas, which are used to “style” and otherwise inform chat interactions. But then there are two other types of prompts: function prompts, and modifier prompts.

Function prompts are for getting the LLM to do something specific, like summarize a piece of text, or suggest a joke (it’s not terribly good at that). Modifier prompts are for determining how the LLM should modify its output, for example translating into a different human language, or keeping it to a certain length.

You can pull in function prompts from the repository into a Chat Notebook by using !, and modifier prompts using #. There’s also a ^ notation for saying that you want the “input” to the function prompt to be the cell above:


This is how you can access LLM functionality from within a Chat Notebook. But there’s also a whole symbolic programmatic way to access LLMs that we’ve added to the Wolfram Language. Central to this is LLMFunction, which acts very much like a Wolfram Language pure function, except that it gets “evaluated” not by the Wolfram Language kernel, but by an LLM:
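As a small illustrative sketch (the prompt wording is my own choice, and an LLM service has to be connected for it to run), an LLMFunction is built from a template with slots and then applied like any other function:

```wolfram
(* `` marks the slot that the argument gets filled into *)
capital = LLMFunction["What is the capital of ``? Answer with the city name only."];

(* Applied like an ordinary function; the "evaluation" happens in the LLM *)
capital["France"]
```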

You can access a function prompt from the Prompt Repository using LLMResourceFunction:
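For instance, assuming a function prompt named "Emojify" is available in the repository, a sketch would be:

```wolfram
(* Retrieve a function prompt by name; "Emojify" is assumed to exist in the repository *)
emojify = LLMResourceFunction["Emojify"];

(* Apply it to some text, just like an LLMFunction *)
emojify["A cat ate my lunch"]
```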

There’s also a symbolic representation for chats. Here’s an empty chat:

And here now we “say something”, and the LLM responds:
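In code, a sketch of these two steps could look like the following. Each ChatEvaluate call returns a new chat object that includes the LLM's reply, so the conversation state is carried along explicitly:

```wolfram
(* Start with an empty chat *)
chat = ChatObject[];

(* Add a message and get back the chat extended with the LLM's response *)
chat = ChatEvaluate[chat, "My name is Stephen."];

(* Later messages are answered in the context of the earlier ones *)
chat = ChatEvaluate[chat, "What is my name?"]
```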

There’s lots of depth to both Chat Notebooks and LLM functions, as I’ve described elsewhere. There’s LLMExampleFunction for getting an LLM to follow examples you give. There’s LLMTool for giving an LLM a way to call functions in the Wolfram Language as “tools”. And there’s LLMSynthesize, which provides raw access to the LLM and its text completion and other capabilities. (And controlling all of this is $LLMEvaluator, which defines the default LLM configuration to use, as specified by an LLMConfiguration object.)
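Here is a brief sketch of two of these pieces. The example pairs are my own, and what comes back depends on the LLM service that is connected:

```wolfram
(* Teach by example: the LLM is asked to continue the input -> output pattern *)
plural = LLMExampleFunction[{"cat" -> "cats", "mouse" -> "mice", "child" -> "children"}];
plural["goose"]

(* Inspect the default LLM configuration currently in effect *)
$LLMEvaluator
```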

I consider it rather impressive that we’ve been able to get to the level of support for LLMs that we have in Version 13.3 in less than six months (along with building things like the Wolfram Plugin for ChatGPT, and the Wolfram ChatGPT Plugin Kit). But there’s going to be more to come, with LLM functionality increasingly integrated into Wolfram Language and Notebooks, and, yes, Wolfram Language functionality increasingly integrated as a tool into LLMs.

Pictures from Words: Generative AI for Images (June 2023)

One of the remarkable things that’s emerged as a possibility from recent advances in AI and neural nets is the generation of images from textual descriptions. It’s not yet realistic to do this at all well on anything but a high-end (and typically server) GPU-enabled machine. But in Version 13.3 there’s now a built-in function ImageSynthesize that can get images synthesized, for now through an external API.

You give text, and ImageSynthesize will try to generate images for which that text is a description:

Sometimes these images will be directly useful in their own right, perhaps as “theming images” for documents or user interfaces. Sometimes they will provide raw material that can be developed into icons or other art. And sometimes they are most useful as inputs to tests or other algorithms.

And one of the important things about ImageSynthesize is that it can immediately be used as part of any Wolfram Language workflow. Pick a random sentence from Alice in Wonderland:

Now ImageSynthesize can “illustrate” it:

Or we can get AI to feed AI:

ImageSynthesize is set up to automatically be able to synthesize images of different sizes:

You can take the output of ImageSynthesize and immediately process it:
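Putting the last few steps together, a sketch of such a workflow might be as follows. The choice of EdgeDetect as the processing step is just an illustration, and ImageSynthesize needs access to an external image generation service:

```wolfram
(* Pick a random sentence from Alice in Wonderland *)
sentence = RandomChoice[TextSentences[ExampleData[{"Text", "AliceInWonderland"}]]];

(* Ask for an image that the sentence describes *)
img = ImageSynthesize[sentence];

(* The result is an ordinary image, so any image function applies *)
EdgeDetect[img]
```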

ImageSynthesize can not only produce complete images, but can also fill in transparent parts of “incomplete” images:

In addition to ImageSynthesize and all its new LLM functionality, Version 13.3 also includes a number of advances in the core machine learning system for Wolfram Language. Probably the most notable are speedups of up to 10x and beyond for neural net training and evaluation on x86-compatible systems, as well as better models for ImageIdentify. There are also a variety of new networks in the Wolfram Neural Net Repository, particularly ones based on transformers.

Machine Learning

Interpretable Machine Learning (June 2022)

Let’s say you have trained a machine learning model and you apply it to a particular input. It gives you some result. But why? What were the important features in the input that led it to that result? In Version 13.1 we’re introducing several functions that try to answer such questions.

Here’s some simple “training data”:

data = Flatten


We can use machine learning to make a predictor for this data:

pf = Predict

Applying the predictor to a particular input gives us a prediction:


What was important in making this prediction? The "SHAPValues" property introduced in Version 12.3 tells us what contribution each feature made to the result; in this case v was more important than u in determining the value of the prediction:
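Here is a self-contained sketch of these steps. The training data is a stand-in I made up for illustration (it is not the data used above), so the numbers it gives will differ:

```wolfram
(* Hypothetical training data: inputs {u, v} mapped to a made-up function of u and v *)
data = Flatten[Table[{u, v} -> Sin[u] + v^2, {u, 0., 3., 0.25}, {v, 0., 3., 0.25}]];

(* Train a predictor, letting the method be chosen automatically *)
pf = Predict[data];

(* Prediction for one particular input *)
pf[{1., 2.}]

(* Contribution of each feature to that particular prediction *)
pf[{1., 2.}, "SHAPValues"]
```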


But what about in general, for all inputs? The new function FeatureImpactPlot gives a visual representation of the contribution or “impact” of each feature in each input on the output of the predictor:


What does this plot mean? It’s basically showing how often there are what contributions from values of the two input features. And with this particular predictor we see that there’s a wide range of contributions from both features.

If we use a different method to create the predictor, the results can be quite different. Here we’re using linear regression, and it turns out that with this method v never has much impact on predictions:


If we make a predictor using a decision tree, the feature impact plot shows the splitting of impact corresponding to different branches of the tree:


FeatureImpactPlot gives a kind of bird’s-eye view of the impact of different features. FeatureValueImpactPlot gives more detail, showing as a function of the actual values of input features the impact points with those values would have on the final prediction (and, yes, the actual points plotted here are based on data simulated on the basis of the distribution inferred by the predictor; the actual data is usually too big to want to carry around, at least by default):


CumulativeFeatureImpactPlot gives a visual representation of how “successive” features affect the final value for each (simulated) data point:
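The three plot functions can be tried with a sketch like the following. The data is made up for illustration, so the plots will not match the ones described above:

```wolfram
(* Hypothetical training data with two input features u and v *)
data = Flatten[Table[{u, v} -> Sin[u] + v^2, {u, 0., 3., 0.25}, {v, 0., 3., 0.25}]];
pf = Predict[data];

(* Bird's-eye view of the impact of each feature *)
FeatureImpactPlot[pf]

(* Impact as a function of the actual feature values *)
FeatureValueImpactPlot[pf]

(* How successive features move each simulated point to its final value *)
CumulativeFeatureImpactPlot[pf]

(* The same view for predictors built with other methods *)
FeatureImpactPlot[Predict[data, Method -> "LinearRegression"]]
FeatureImpactPlot[Predict[data, Method -> "DecisionTree"]]
```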


For predictors, feature impact plots show impact on predicted values. For classifiers, they show impact on (log) probabilities for particular outcomes.

Integrating External Neural Nets (December 2022)

The Wolfram Language has had integrated neural net technology since 2015. Sometimes this is automatically used inside other Wolfram Language functions, like ImageIdentify, SpeechRecognize or Classify. But you can also build your own neural nets using the symbolic specification language with functions like NetChain and NetGraph—and the Wolfram Neural Net Repository provides a continually updated source of neural nets that you can immediately use, and modify, in the Wolfram Language.

But what if there’s a neural net out there that you just want to run from within the Wolfram Language, but don’t need to have represented in modifiable (or trainable) symbolic Wolfram Language form—like you might run an external program executable? In Version 13.2 there’s a new construct NetExternalObject that allows you to run trained neural nets “from the wild” in the same integrated framework used for actual Wolfram-Language-specified neural nets.

NetExternalObject so far supports neural nets that have been defined in the ONNX neural net exchange format, which can easily be generated from frameworks like PyTorch, TensorFlow, Keras, etc. (as well as from Wolfram Language). One can get a NetExternalObject just by importing an .onnx file. Here’s an example from the web:

If we “open up” the summary for this object we see what basic tensor structure of input and output it deals with:

But to actually use this network we have to set up encoders and decoders suitable for the actual operation of this particular network—with the particular encoding of images that it expects:

Now we just have to run the encoder, the external network and the decoder—to get (in this case) a cartoonized Mount Rushmore:

Often the “wrapper code” for the NetExternalObject will be a bit more complicated than in this case. But the built-in NetEncoder and NetDecoder functions typically provide a very good start, and in general the symbolic structure of the Wolfram Language (and its integrated ability to represent images, video, audio, etc.) makes the process of importing typical neural nets “from the wild” surprisingly straightforward. And once imported, such neural nets can be used directly, or as components of other functions, anywhere in the Wolfram Language.

Analysis of Cluster Analysis (December 2022)

The Wolfram Language has had basic built-in support for cluster analysis since the mid-2000s. But in more recent times, with increased sophistication from machine learning, we’ve been adding more and more sophisticated forms of cluster analysis. But it’s one thing to do cluster analysis; it’s another to analyze the cluster analysis one’s done, to try to better understand what it means, how to optimize it, etc. In Version 13.2 we’re adding the function ClusteringMeasurements to do this, as well as adding more options for cluster analysis and enhancing the automation we have for method and parameter selection.

Let’s say we do cluster analysis on some data, asking for a sequence of different numbers of clusters:

Which is the “best” number of clusters? One measure of this is to compute the “silhouette score” for each possible clustering, and that’s something that ClusteringMeasurements can now do:
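To make the idea of a silhouette score concrete, here is a hand-written sketch of the standard definition: for each point, compare its mean distance a to the rest of its own cluster with its mean distance b to the nearest other cluster, and average (b - a)/Max[a, b] over all points. This is not how ClusteringMeasurements is implemented or called; the helper silhouetteScore and the test data are my own, for illustration only:

```wolfram
(* Mean silhouette score for a clustering given as a list of clusters of points *)
silhouetteScore[clusters_List] := Module[{meanDist, scores},
  (* mean Euclidean distance from point p to the points in pts *)
  meanDist[p_, pts_] := Mean[EuclideanDistance[p, #] & /@ pts];
  scores = Table[
    With[{own = clusters[[i]], others = Delete[clusters, i]},
     Table[
      If[Length[own] == 1,
       0., (* convention: a singleton cluster scores 0 *)
       With[{
         (* a: mean distance to the other points of the same cluster *)
         a = Total[EuclideanDistance[p, #] & /@ own]/(Length[own] - 1),
         (* b: smallest mean distance to any other cluster *)
         b = Min[meanDist[p, #] & /@ others]},
        (b - a)/Max[a, b]]],
      {p, own}]],
    {i, Length[clusters]}];
  Mean[Flatten[scores]]]

(* Made-up data: two blobs of 2D points *)
SeedRandom[1];
data = Join[
   RandomVariate[NormalDistribution[0, 0.3], {50, 2}],
   RandomVariate[NormalDistribution[3, 0.3], {50, 2}]];

(* Score the clusterings found for different requested numbers of clusters *)
Table[n -> silhouetteScore[FindClusters[data, n]], {n, 2, 5}]
```

Scores closer to 1 indicate tighter, better-separated clusters, so comparing the values across n gives one way to pick a "best" number of clusters.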

As is fairly typical in statistics-related areas, there are lots of different scores and criteria one can use—ClusteringMeasurements supports a wide variety of them.