Pattern Matching Your Source Code—How Wolfram Workbench Integrates Mathematica Development Tools

Inside and outside of Wolfram Research, teams are working on large Mathematica projects. Working with large code bases requires powerful tools; it is even better if these tools are integrated. With Wolfram Workbench, we brought an integrated development environment (IDE) to our users.

What does “integrated” mean? Let’s look at just one example: how Workbench integrates one of Mathematica’s central language features, pattern matching, with its editors and source management tools.

Let’s start with a specific problem: with our Mathematica 6.0 release, we overhauled many of our libraries and APIs (our recent Version 7.0 release builds on the improvements in Version 6.0). Some groups of functions were deprecated or their APIs changed. We had collected a long list of these changes… but how would users apply them to their source code? Go through them one by one and line by line in their code? Definitely not.

To solve the problem, we employed one of the basic mechanisms of Mathematica: pattern matching.

The changes were collected as patterns and replacements. For instance, the Random function is now available in the form of various functions. It can now, for example, return a list of random numbers. So, where previously we had to do this:

{Random[Real, 10], Random[Real, 10]}

we can now do this:

RandomReal[10, 2]

which returns a list of the same form as the code above it: two random reals between 0 and 10. (Note: functions like RandomReal, RandomComplex, and so on are now preferred to the old, parameterized version of Random.)
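To try the newer functions in a Mathematica session, a minimal sketch might look like this (the variable names pair, rolls, and z are only for this example):

```mathematica
(* Two random reals between 0 and 10, produced by a single call. *)
pair = RandomReal[10, 2];

(* The values differ from run to run, but the shape stays the same. *)
MatchQ[pair, {_Real, _Real}]

(* Other members of the family follow the same calling convention. *)
rolls = RandomInteger[{1, 6}, 5];   (* five integers from 1 to 6 *)
z = RandomComplex[];                (* real and imaginary parts between 0 and 1 *)
```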

Our migration tool can handle changes of that kind. This pattern matches the above example:

{Random[Real, x_], Random[Real, x_]}

The correct replacement is this:

RandomReal[x, 2]
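The same pattern and replacement can be tried as an ordinary rule in a Mathematica session. This is only a sketch of the idea, not how the Migration Assistant is implemented: Hold and HoldPattern are additions for this example, so that neither the old code nor the pattern evaluates before the rule is applied, and the names oldCode, migrationRule, and newCode are hypothetical.

```mathematica
(* Hold keeps the old code from evaluating while it is rewritten. *)
oldCode = Hold[{Random[Real, 10], Random[Real, 10]}];

(* HoldPattern keeps Random[Real, x_] from evaluating when the rule is built;
   RuleDelayed (:>) keeps the replacement unevaluated until it is inserted. *)
migrationRule =
  HoldPattern[{Random[Real, x_], Random[Real, x_]}] :> RandomReal[x, 2];

newCode = oldCode /. migrationRule
(* Hold[RandomReal[10, 2]] *)
```

Because x_ appears twice in the pattern, both calls must use the same upper bound for the rule to fire; {Random[Real, 10], Random[Real, 5]} would be left untouched.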

It’s possible to run these transformations on code in an open notebook. But what about users with large code bases: Mathematica projects consisting of dozens or hundreds of packages, with thousands of lines of code stored in .m text files?

Fortunately, we had made the Mathematica pattern matcher and a special transformation engine available in Wolfram Workbench, and we used them in the Migration Assistant. The pattern language works as it does in Mathematica itself, but instead of in-memory expressions, it operates on the source representation of the Mathematica code.

pattern1.jpg

What you see in this screenshot is the collection of library changes, each expressed as a Mathematica pattern to find and its corresponding replacement (which uses the pattern variables, just as in Mathematica rules for Replace, ReplaceAll, and so on).

The Migration Assistant allows you to migrate large source bases automatically. It searches for the patterns in the source of whole Mathematica projects in Wolfram Workbench and performs the replacements. These changes are then available for preview—each change and modification can be seen side by side with the original code.
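The search, replace, and preview cycle can be imitated in a Mathematica session. The sketch below is an analogy only: it works on parsed, held expressions rather than on source text as Workbench does, the second rule is an illustrative guess rather than an entry from the actual list of changes, and the source fragments are made up.

```mathematica
(* Illustrative rules; only the first one comes from the text above. *)
rules = {
   HoldPattern[{Random[Real, x_], Random[Real, x_]}] :> RandomReal[x, 2],
   HoldPattern[Random[Integer, {min_, max_}]] :> RandomInteger[{min, max}]
   };

(* Made-up source fragments standing in for the contents of .m files. *)
sources = {
   "{Random[Real, 10], Random[Real, 10]}",
   "f[Random[Integer, {1, 6}]]",
   "g[1, 2]"
   };

(* Parse each fragment without evaluating it. *)
held = ToExpression[#, InputForm, HoldComplete] & /@ sources;

(* Pair each original with its rewritten form; keep only actual changes. *)
preview = Select[Transpose[{held, held /. rules}], First[#] =!= Last[#] &]
```

Here preview contains two before/after pairs; the third fragment matches no rule and is dropped from the list.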

The Migration Assistant is but one example of the integration of the pattern matching and transformation features in Wolfram Workbench. Mathematica allows pattern matching for many purposes—for instance, when analyzing XML data.

So we thought—why not use it for analyzing Mathematica source files?

For instance: static analysis tools. What are those? You may have heard of or used tools like lint (for C), PMD (for Java), or others. If you’ve used a modern Java IDE (e.g. Eclipse), it’ll also have integrated static analysis tools. These tools check the source code for “bad” code patterns, such as code that is buggy, or uses APIs and functions in an odd and potentially buggy way. In short: a few static analysis runs a day can keep the debugger away (and if not, Wolfram Workbench comes with an integrated debugger as well).

Wolfram Workbench also does static analysis—every time you edit a file, the code is checked incrementally for a host of programming problems.

A part of the static analysis in Wolfram Workbench is implemented with pattern matching: warnings, custom warnings, and quick fixes. Sounds abstract—let’s look at an example.

Have you ever made this mistake:

Foo[x_] := Module[
  {y = {1, 2, 3}},
  Append[y, x];
  Length[y]
 ]

There’s a bug in there, or at least dead code. It’s an easy mistake to make, particularly for programmers familiar with imperative programming languages. The problem: Append[y, x] actually returns a new list that is created by appending x to the list in y; it does not modify y.

And now the problem in the code is clear: the return value of the Append call is thrown away, and variable y still holds the original list. What we want here is a different function: AppendTo[list, newItem], which appends the item and assigns the resulting list back to the variable.
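The difference is easy to check in a session. In this sketch, fooBuggy and fooFixed are hypothetical names for the two variants of the function above:

```mathematica
(* Buggy version: the list returned by Append is discarded. *)
fooBuggy[x_] := Module[{y = {1, 2, 3}},
  Append[y, x];
  Length[y]
 ]

(* Fixed version: AppendTo stores the longer list back into y. *)
fooFixed[x_] := Module[{y = {1, 2, 3}},
  AppendTo[y, x];
  Length[y]
 ]

fooBuggy[42]  (* 3 *)
fooFixed[42]  (* 4 *)
```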

In Workbench, we have a tool that warns about mistakes like this, and (surprise) it’s based on Mathematica patterns. Once you find a potentially bad use of a function, you can make sure you (and everyone on your team, if you want) are warned, as long as you can come up with a pattern that matches it.

The tool is available in the preferences:

pattern2.jpg

Let’s find a pattern that will match the Append problem:

CompoundExpression[l___, x : Append[list_, rest__], r__]

That’s the pattern that matches the case I described above. It’s not as simple as just looking for any old Append function—there are obvious cases where Append is right. What we want to do is warn when the result of Append is thrown away without ever being seen.

And this replacement is a fix:

l; AppendTo[list, rest]; r
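The pattern and its fix can be tried as a plain rule in a Mathematica session. This sketch is an approximation of what Workbench does on source text: HoldPattern and Hold are added so that nothing evaluates, the unused pattern name x is dropped, and the names quickFix, buggy, fixed, assigned, and returned are hypothetical.

```mathematica
(* The warning pattern and its fix as one delayed rule. HoldPattern stops
   CompoundExpression from evaluating while the rule is being built. *)
quickFix =
  HoldPattern[CompoundExpression[l___, Append[list_, rest__], r__]] :>
   CompoundExpression[l, AppendTo[list, rest], r];

(* The buggy function body, held so that it is treated as data. *)
buggy = Hold[Module[{y = {1, 2, 3}}, Append[y, x]; Length[y]]];
fixed = buggy /. quickFix
(* Hold[Module[{y = {1, 2, 3}}, AppendTo[y, x]; Length[y]]] *)

(* Legitimate uses are left alone: here the result of Append is assigned... *)
assigned = Hold[z = Append[y, x]; Length[z]];
(assigned /. quickFix) === assigned

(* ...and here it is the last statement, so its value is returned. *)
returned = Hold[Print[y]; Append[y, x]];
(returned /. quickFix) === returned
```

The trailing r__ (two underscores, so at least one expression) is what keeps the last case from matching: an Append in final position is the value of the compound expression, not a discarded result.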

Workbench integrates the static analysis tool in two ways. If a warning pattern is found in Mathematica code, Workbench will show a warning in the editor and also in Eclipse’s Problems view, which lists all problems, errors, and warnings in the workspace. If a problem can be solved with a simple transformation, Workbench will offer it when a quick assist is requested.

These patterns are checked whenever code is edited in Wolfram Workbench.

Another way of searching for bad code with patterns is available under the Mathematica tab in the Search > Search dialog…, where you can enter Mathematica patterns.

Workbench also lets you make your own bulk changes: right-click in a Mathematica source file and choose Expression Find/Replace, and you’ll get a dialog like this:

pattern3.jpg

The pattern and replacement will be applied to either the current file (in the editor) or to all selected files or contents of a project. As with the Migration Assistant, you’ll get a preview dialog that shows all the modifications so you can opt out if a transformation isn’t as desired.

While these replacements operate on expressions, Workbench adds another dimension: the expressions are stored as text, so they carry formatting such as indentation and white space. A transformation can therefore reformat code in a custom way. Suppose you use a pattern like this:

If[cond_, i_, e_]

And replace it with this:

If[
  cond,
  i,
  e
 ]

The result of transforming the input If[a==b, 42, 43] will be this:

If[
  a == b,
  42,
  43
 ]

While Workbench has a formatter (in the editor’s context menu, Source > Format) that formats general Mathematica code in a readable way, this approach allows you to customize the layout of source code. It allows you, for instance, to format code (or data) in a way that makes it more readable. The transformation also supports comments: if you want to annotate code with comments, just match the code and include the comment in the replacement.

Mathematica’s consistent design (everything’s an expression) combined with pattern matching is just as helpful when working with and modifying large source bases.

It’s good that the simple yet powerful principles of Mathematica allow users to do many different tasks without having to learn new concepts. Every Mathematica user already understands the concepts we used to implement code transformation and static analysis—and Eclipse and Wolfram Workbench integrate them seamlessly.

The code transformation and static analysis tools are but one small area of the integrated tooling Wolfram Workbench provides. Modern software development requires many other tools.

Version management is tightly integrated into Eclipse—and Wolfram Workbench benefits from it. Gone are the days of doing version management by copying a notebook and using a naming scheme like “MyCode.nb.bak1”, “MyCode.nb.bak_Thursday27th”, or similar. The local history system is an undo feature on steroids, which can be a first step towards version management. But once a project grows, it’s also possible to move to systems like CVS, Subversion, or Git—or many other SCMs that are integrated into Eclipse.

There are other concerns in software development, such as performance and correctness. That’s why Wolfram Workbench comes with a powerful, integrated debugger written with Mathematica, profiling support, and testing and unit testing tools. Again, the integration includes running tests, inspecting the test results and reports, editing test source, and much more.

The Wolfram Workbench website contains more resources for learning about its capabilities, including screencasts demonstrating many of the features.