## Pattern Matching Your Source Code—How Wolfram *Workbench* Integrates *Mathematica* Development Tools

March 19, 2009 — Werner Schuster, Kernel Developer

Inside and outside of Wolfram Research, teams are working on large *Mathematica* projects. Working with large code bases requires powerful tools; it is even better if these tools are integrated. With Wolfram *Workbench*, we brought an integrated development environment (IDE) to our users.

What does “integrated” mean? Well, let’s look at just one example of how *Workbench* integrates *Mathematica*’s central language features, pattern matching, editors, and source management tools.

Let’s start with a specific problem: with our *Mathematica* 6.0 release, we overhauled many of our libraries and APIs (our recent Version 7.0 release builds on the improvements in Version 6.0). Some groups of functions were deprecated or their APIs changed. We had collected a long list of these changes… but how would users apply them to their source code? Go through them one by one and line by line in their code? Definitely not.

To solve the problem, we employed one of the basic mechanisms of *Mathematica*: pattern matching.

The changes were collected as patterns and replacements. For instance, the `Random` function is now available in the form of various functions. It can now, for example, return a list of random numbers. So, where previously we had to do this:

`{Random[Real, 10], Random[Real, 10]}`

we can now do this:

`RandomReal[10, 2]`

which returns a list of the same form as the code above it: two random real numbers between 0 and 10. (Note: functions like `RandomReal`, `RandomComplex`, and so on are now preferred to the old, parameterized version of `Random`.)

Our migration tool can handle changes of that kind. This pattern matches the above example:

`{Random[Real, x_], Random[Real, x_]}`

The correct replacement is this:

`RandomReal[x, 2]`
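As a rough illustration, the same pattern and replacement can be tried in an ordinary *Mathematica* session. This is only a sketch: `migrationRule`, `oldCode`, and `newCode` are names made up for this example, and `Hold` is used so that the old code is rewritten rather than evaluated. The Migration Assistant itself works on source text, not on held expressions.

```wolfram
(* Hypothetical name; not the Migration Assistant's internal representation *)
migrationRule =
  HoldPattern[{Random[Real, x_], Random[Real, x_]}] :> RandomReal[x, 2];

(* Hold keeps the legacy code from being evaluated before it is rewritten *)
oldCode = Hold[{Random[Real, 10], Random[Real, 10]}];

newCode = oldCode /. migrationRule
(* Hold[RandomReal[10, 2]] *)
```

Because both list elements use the same pattern variable `x_`, the rule only applies when the two ranges are identical.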

It’s possible to run these transformations on code in an open notebook. But what about users with large code bases: *Mathematica* projects consisting of dozens or hundreds of packages, with thousands of lines of code stored in `.m` text files?

Fortunately, we had made the *Mathematica* pattern matcher and a special transformation engine available in Wolfram *Workbench*, and we used them in the Migration Assistant. The pattern language works as it does in *Mathematica* itself, but instead of in-memory expressions, it operates on the source representation of the *Mathematica* code.
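To get a feel for the difference between in-memory expressions and source text, here is a kernel-side approximation. It is a sketch under stated assumptions, not a description of how *Workbench* is implemented: a line of source is parsed into a held expression with `ToExpression`, rewritten with the pattern and replacement shown earlier, and turned back into text. The names `source`, `held`, and `rewritten` are made up for illustration.

```wolfram
(* A line of source code, as it might appear in a .m file *)
source = "{Random[Real, 10], Random[Real, 10]}";

(* Parse the text, wrapping the result in HoldComplete before it can evaluate *)
held = ToExpression[source, InputForm, HoldComplete];

(* Apply the migration pattern to the held expression *)
rewritten = held /.
   HoldPattern[{Random[Real, x_], Random[Real, x_]}] :> RandomReal[x, 2];

(* Convert the result back to source text *)
ToString[rewritten, InputForm]
(* "HoldComplete[RandomReal[10, 2]]" *)
```

A round trip like this discards comments and layout, which is one reason a tool that works directly on the source representation is useful.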

What you see in this screen shot is the collection of library changes, each expressed as a *Mathematica* pattern to find and its corresponding replacement (using pattern variables, as in *Mathematica* rules for `Replace`, `ReplaceAll`, and so on).

The Migration Assistant allows you to migrate large source bases automatically. It searches for the patterns in the source of whole *Mathematica* projects in Wolfram *Workbench* and performs the replacements. These changes are then available for preview—each change and modification can be seen side by side with the original code.

The Migration Assistant is but one example of the integration of the pattern matching and transformation features in Wolfram *Workbench*. *Mathematica* allows pattern matching for many purposes—for instance, when analyzing XML data.

So we thought—why not use it for analyzing *Mathematica* source files?

For instance: static analysis tools. What are those? You may have heard of or used tools like lint (for C), PMD (for Java), or others. If you’ve used a modern Java IDE (*e.g.* Eclipse), it’ll also have integrated static analysis tools. These tools check the source code for “bad” code patterns, such as code that is buggy, or uses APIs and functions in an odd and potentially buggy way. In short: a few static analysis runs a day can keep the debugger away (and if not, Wolfram *Workbench* comes with an integrated debugger as well).

Wolfram *Workbench* also does static analysis—every time you edit a file, the code is checked incrementally for a host of programming problems.

A part of the static analysis in Wolfram *Workbench* is implemented with pattern matching: warnings, custom warnings, and quick fixes. Sounds abstract—let’s look at an example.

Have you ever made this mistake:

    Foo[x_] := Module[
      {y = {1, 2, 3}},
      Append[y, x];
      Length[y]
    ]

There’s a bug in there, or at least dead code. It’s an easy mistake to make, particularly for programmers familiar with imperative programming languages. The problem: `Append[y, 42]` does not modify *y*; it returns a new list created by appending 42 to the list in *y*.

And now the problem in the code is clear: the return value of the `Append` call is thrown away, and the variable *y* still holds the original list. What we want here is a different function: `AppendTo[list, newItem]`, which appends the item and assigns the resulting list back to the variable.
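The difference is easy to check in a session. In this small sketch, `fooBuggy` and `fooFixed` are illustrative names for the two variants of the function above:

```wolfram
(* Buggy: the result of Append is discarded, so y is unchanged *)
fooBuggy[x_] := Module[{y = {1, 2, 3}},
  Append[y, x];
  Length[y]
]

(* Fixed: AppendTo stores the extended list back in y *)
fooFixed[x_] := Module[{y = {1, 2, 3}},
  AppendTo[y, x];
  Length[y]
]

fooBuggy[42]  (* 3 *)
fooFixed[42]  (* 4 *)
```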

*Workbench* has a tool that warns about mistakes like this, and, surprise, it’s based on *Mathematica* patterns. Once you find a potentially bad use of a function, you can make sure that you (and everyone on your team, if you want) are warned, as long as you can figure out a pattern that matches it.

The tool is available in the preferences:

Let’s find a pattern that will match the `Append` problem:

`CompoundExpression[l___, x : Append[list_, rest__], r__]`

That’s the pattern that matches the case I described above. It’s not as simple as just looking for any old `Append` function—there are obvious cases where `Append` is right. What we want to do is warn when the result of `Append` is thrown away without ever being seen.

And this replacement is a fix:

`l; AppendTo[list, rest]; r`
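The warning pattern and its fix can also be tried on held code in a regular *Mathematica* session. This is a sketch under assumptions: `warningPattern`, `quickFix`, and `code` are hypothetical names, the unused label `x :` from the pattern is dropped, and `Hold` stands in for the source-level matching that *Workbench* performs.

```wolfram
(* The warning pattern from the text, without the unused label x *)
warningPattern =
  HoldPattern[CompoundExpression[l___, Append[list_, rest__], r__]];

(* The quick fix, written as a delayed rule *)
quickFix = warningPattern :>
   CompoundExpression[l, AppendTo[list, rest], r];

(* The buggy function body, held so that nothing is evaluated *)
code = Hold[Module[{y = {1, 2, 3}}, Append[y, x]; Length[y]]];

(* Does the code contain the suspicious pattern? *)
! FreeQ[code, warningPattern]
(* True *)

(* Apply the fix *)
code /. quickFix
(* Hold[Module[{y = {1, 2, 3}}, AppendTo[y, x]; Length[y]]] *)
```

Note that `r__` requires at least one expression after the `Append` call, so an `Append` in the last position, whose value is returned, does not trigger the warning.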

*Workbench* integrates the static analysis tool in two ways. If a warning pattern is found in *Mathematica* code, *Workbench* will show a warning in the editor and also in Eclipse’s Problems view, which lists all problems, errors, and warnings in the workspace. If a problem can be solved with a simple transformation, *Workbench* will offer it when a quick assist is requested.

These patterns are checked whenever code is edited in Wolfram *Workbench*.

Another way of searching for bad code with patterns is available under the *Mathematica* tab in the Search > Search dialog…, where you can enter *Mathematica* patterns.

But *Workbench* also lets you make your own bulk changes: right-click in a *Mathematica* source file and choose Expression Find/Replace, and you’ll get a dialog like this:

The pattern and replacement will be applied to either the current file (in the editor) or to all selected files or contents of a project. As with the Migration Assistant, you’ll get a preview dialog that shows all the modifications so you can opt out if a transformation isn’t as desired.

While these replacements operate on expressions, *Workbench* adds another dimension: the expressions are stored as text, so their formatting (indentation, white space, and so on) can be controlled as well. The transformation actually lets you reformat code in a custom way, for example, if you use a pattern like this:

`If[cond_, i_, e_]`

And replace it with this:

    If[
      cond,
      i,
      e
    ]

The result of transforming the input `If[a==b, 42, 43]` will be this:

    If[
      a == b,
      42,
      43
    ]
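In *Workbench* the layout is simply part of the replacement text. A kernel session has no notion of source layout, so the following is only an approximation of the idea: the pattern matches the held expression, and the replacement builds the formatted text as a string. The helper `src` and the names `held` and `layout` are hypothetical, and the two-space indentation is an arbitrary choice.

```wolfram
(* Hypothetical helper: render an argument as source text without evaluating it *)
SetAttributes[src, HoldAll];
src[arg_] := ToString[Unevaluated[arg], InputForm];

(* Parse the input line into a held expression *)
held = ToExpression["If[a==b, 42, 43]", InputForm, HoldComplete];

(* Match the pattern and emit the custom layout as text *)
layout = held /. HoldComplete[If[cond_, i_, e_]] :>
    StringJoin["If[\n  ", src[cond], ",\n  ", src[i], ",\n  ", src[e], "\n]"];

Print[layout]
```

Printing `layout` shows the `If` expression with one argument per line.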

While *Workbench* has a formatter (in the editor’s context menu, Source > Format) that formats general *Mathematica* code in a readable way, this approach allows you to customize the layout of source code. It allows you, for instance, to format code (or data) in a way that makes it more readable. The transformation also supports comments—if you want to annotate code with comments, just match for the code and include the comment in the replacement.

*Mathematica*’s consistent design (everything’s an expression) combined with pattern matching is just as helpful when working with and modifying large source bases.

It’s good that the simple yet powerful principles of *Mathematica* allow users to do many different tasks without having to learn new concepts. Every *Mathematica* user already understands the concepts we used to implement code transformation and static analysis—and Eclipse and Wolfram *Workbench* integrate them seamlessly.

The code transformation and static analysis tools are but one small area of the integrated tooling Wolfram *Workbench* provides. Modern software development requires many other tools.

Version management is tightly integrated into Eclipse, and Wolfram *Workbench* benefits from it. Gone are the days of doing version management by copying a notebook and using a naming scheme like “MyCode.nb.bak1”, “MyCode.nb.bak_Thursday27th”, or similar. The local history system is an undo feature on steroids, which can be a first step towards version management. But once a project grows, it’s also possible to move to systems like CVS, Subversion, or Git, or to many other SCMs that are integrated into Eclipse.

There are other concerns in software development, such as performance and correctness. That’s why Wolfram *Workbench* comes with a powerful, integrated debugger (written with *Mathematica*) and profiling support, as well as testing and unit testing tools. Again, the integration includes running tests, inspecting the test results and reports, editing test source, and much more.

The Wolfram *Workbench* website contains more resources for learning about its capabilities, including screencasts demonstrating many of the features.