Plug-and-Play Mathematica with Wolfram Lightweight Grid System
March 31, 2009 — Joel Klein, Distributed Systems Engineering Manager
For some people, parallel computing and the need for a cluster is a way of life. For others, the need sneaks up on them. Most clusters and grids are planned and organized from the start, and that can take time and effort, to say nothing of configuration. Other times there’s no budget for new hardware, but there are computer labs or desktop computers that sit unused for much of the day: a cluster waiting to be harnessed, if only you can get the Macs to talk to the Windows boxes and keep all the hostnames straight. For situations like these, I helped develop Wolfram Lightweight Grid System, which is designed from the ground up to let you assemble existing hardware into a self-organized network, accessible from Mathematica with almost no configuration.
Parallel computing snuck up on me when I was taking a Machine Learning course from Dan Roth at the University of Illinois. For my graduate student project, Dr. Roth had me look at a problem in identifying a word’s grammatical part of speech based on what other words occur with it in a sentence. I studied the tree-augmented naive Bayes algorithm and coded it up, testing with my own mini data set. When I started running my code on the real training data, I noticed it was running slowly. Then I realized I had pointed an n² algorithm at a model where n was about 250,000, the number of distinct English words (counting, for instance, “dog” separately from “dogs”) that might appear in the test corpus. With less than two weeks left before the project due date, I calculated that most of the time would be spent running the algorithm once on a single workstation, with little time left to analyze the results and try refinements. I found a way to manually process the training data in batches, remotely logging in to several lab computers and running them for hours at a time. While it was an interesting exercise, it was an unwanted added burden at the end of the semester.
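To get a feel for the scale involved, here is a rough back-of-the-envelope sketch in Python. The processing rate is a made-up placeholder (the story above does not state one), and the function names are purely illustrative; the only figures taken from the story are the quadratic cost and the vocabulary size of about 250,000 words. It also assumes the work splits evenly across machines, which manual batching only approximates.

```python
SECONDS_PER_DAY = 86_400


def pair_count(n: int) -> int:
    """Number of ordered word pairs a quadratic (n^2) algorithm touches."""
    return n * n


def estimated_days(n: int, pairs_per_second: float, machines: int = 1) -> float:
    """Estimated wall-clock days, assuming the pairs split evenly across machines."""
    if pairs_per_second <= 0:
        raise ValueError("pairs_per_second must be positive")
    if machines < 1:
        raise ValueError("machines must be at least 1")
    total_seconds = pair_count(n) / (pairs_per_second * machines)
    return total_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    vocabulary_size = 250_000     # approximate figure from the story
    hypothetical_rate = 100_000.0  # pairs per second; an assumption, not a measurement

    print(f"{pair_count(vocabulary_size):,} pairs to process")
    for machines in (1, 4, 8):
        days = estimated_days(vocabulary_size, hypothetical_rate, machines)
        print(f"{machines} machine(s): about {days:.1f} days")
```

Whatever the real rate was, the shape of the result is the same: tens of billions of pairs on one workstation, and a running time that shrinks roughly in proportion to the number of machines you can bring to bear.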
The point of that story is that I had better things to do with my time, and so do you. That’s where Mathematica comes in.
Mathematica is easily recognized for things like its collection of mathematical algorithms, but what makes it shine as a professional tool is its support for software technologies. Just as you have better things to do than research and program file formats, you have better things to do than fiddle with remote shell configurations and lists of hostnames.
Wolfram Lightweight Grid System brings automation to the problem of connecting separate computers for parallel Mathematica programming. And while the core function of Lightweight Grid is to remotely control Mathematica kernel processes over the network, it also lets Mathematica discover the available servers on the network automatically.
What is Wolfram Lightweight Grid System anyway? It has two parts, client and server. If you have Mathematica 7, you have Wolfram Lightweight Grid Client. To get the corresponding server, you need Wolfram Lightweight Grid Manager, which comes with gridMathematica Server.
Using Lightweight Grid works something like this:
- Pick some computers.
- Install gridMathematica Server on them. Say yes when the installer asks if you want to enable Wolfram Lightweight Grid Manager, and answer the few questions that follow.
- Use the web interface to set up your Mathematica license. You can come back to the web interface to manage kernels or tweak your configuration, but you may never need it again.
- Open the Evaluation > Parallel Kernel Configuration window, enable Lightweight Grid, and watch as Mathematica discovers the gridMathematica Server installations you just created.
- Tell Mathematica 7 how many kernels you want to run on each computer.
There is potentially one more step, and that happens when the Mathematica installation you interact with is on a different subnet from your gridMathematica Servers. In that case, you tell Mathematica the name of just one computer on the other subnet, and Mathematica pulls in the names of all the others automatically.
In our office environment, I’ve harnessed 8 computers running 5 different operating systems (Mac OS X, Linux, Solaris, Windows Server 2003, and Windows XP) to make a 26-kernel configuration.
There: I’ve assembled a handful of multicore desktop computers into a serviceable cluster, made possible with the plug-and-play technology of Wolfram Lightweight Grid System. Now I can point an appropriate parallel Mathematica program to my new cluster, and it will run unmodified, potentially running over 6 times faster than on a 4-core machine. For some, this approach may be the only cluster they need or can afford. For others, it’s a way to reclaim wasted CPU cycles until the next budget cycle lets them assemble their dream machine.
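The "over 6 times" figure follows from comparing 26 kernels with 4 cores. The Python sketch below works through that ratio and, as an added assumption not made in the text above, uses Amdahl's law to show how the gain shrinks when part of a program cannot be parallelized. It treats all kernels as equally fast and ignores network overhead, neither of which holds exactly for a mixed set of office machines, so read it as an upper-bound estimate rather than a prediction. The function names are illustrative only.

```python
def amdahl_speedup(workers: int, parallel_fraction: float) -> float:
    """Speedup over one worker under Amdahl's law for a given parallel fraction."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def relative_speedup(cluster_kernels: int, baseline_cores: int,
                     parallel_fraction: float) -> float:
    """How much faster the cluster is than the baseline machine."""
    return (amdahl_speedup(cluster_kernels, parallel_fraction)
            / amdahl_speedup(baseline_cores, parallel_fraction))


if __name__ == "__main__":
    kernels, cores = 26, 4  # figures from the article
    # The parallel fractions below are illustrative assumptions.
    for fraction in (1.0, 0.99, 0.95):
        gain = relative_speedup(kernels, cores, fraction)
        print(f"parallel fraction {fraction:.2f}: about {gain:.2f}x a {cores}-core machine")
```

With a fully parallel workload the ratio is 26/4, or about 6.5; any serial portion pulls the result below that, which is why the claim is worded as "potentially."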