Wolfram Computation Meets Knowledge

Code Length Measured in 14 Languages

Update: See our latest post on How the Wolfram Language Measures Up.

I stumbled upon a nice project called Rosetta Code. Their stated aim is “to present solutions to the same task in as many different languages as possible, to demonstrate how languages are similar and different, and to aid a person with a grounding in one approach to a problem in learning another.”

After amusing myself by contributing a few solutions (Flood filling, Mean angle, and Sum digits of an integer being some of mine), I realized that the data hidden in the site provided an opportunity to quantify a claim that I have often made over the years—that Mathematica code tends to be shorter than equivalent code in other languages. This is due to both its high-level nature and built-in computational knowledge.

Here is what I found.

Large tasks - Line count ratio

Mathematica code is typically less than a third of the length of the same tasks written in other languages, and often much shorter than that.

Before the comments section fills up with objections, I should state that there are many sources of bias in this approach, not least of which are bias in the creation of tasks, bias in the kinds of people who provide solutions, and selectivity in which tasks have been solved. But if we worry about such problems too much, we never do anything!

It should also be said that short code is not the same thing as good code. But short good code is better than long good code, and short bad code is a lot better than long bad code!

Naturally, I used Mathematica to gather the data needed to analyze the code length. This is mostly an exercise in web scraping, so the first step is to read the “Terms of Use” of the website, which seem to allow this data aggregation. Second, I want to be responsible about the server load I create (and for this reason, I am not providing a download of the code—if you want it, you will have to contact me, or copy it by hand from this blog post). So I started by creating a version of Import that will store data for future use rather than request it from the server again. I could do this by copying the web pages to local storage, but their whole website is small enough to hold in memory. Preventing repeat web accesses is also very important for the performance of this code.

Creating a version of Import that will store data for future use
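In outline, this is just the standard self-memoizing definition idiom: a function that rewrites itself into a stored value the first time it is called. A minimal sketch (the full version above may handle a little more):

(* cache each Import result so no URL is requested from the server twice *)
importOnce[url_, elements___] := importOnce[url, elements] = Import[url, elements]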

Now, I start importing some key web pages. The first lists all the languages supported by the project. I use the special “Hyperlinks” element of HTML Import, and then string match away links of the wrong type.

Importing some key web pages
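In sketch form it looks something like this (the category URL is the site's published one; the helper name getLanguageList and the exact filtering patterns are assumptions):

(* all language names, read off the master category page *)
getLanguageList[] := getLanguageList[] =
 Union[StringReplace[
   Select[
    importOnce["http://rosettacode.org/wiki/Category:Programming_Languages", "Hyperlinks"],
    StringMatchQ[#, "*/wiki/Category:*"] &],
   __ ~~ "Category:" ~~ name__ :> name]]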

There is a special page for each language that lists completed tasks, so I do something similar to that…

Continuing to import web pages
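A sketch of the single-language version (the pattern that excludes non-task links is an assumption):

(* tasks completed for one language: task links on its category page *)
getCompletedPageList[language_String] := getCompletedPageList[language] =
 Union[Select[
   importOnce["http://rosettacode.org/wiki/Category:" <> language, "Hyperlinks"],
   StringMatchQ[#, "*/wiki/*"] && ! StringMatchQ[#, "*Category:*"] &]]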

…and extend the command to take a list of languages and return tasks that have been completed for all of them.

Extending the command to take a list of languages and return tasks that have been completed
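That extension is a one-liner: intersect the per-language task lists.

(* tasks completed in every one of a list of languages *)
getCompletedPageList[languages_List] := Intersection @@ (getCompletedPageList /@ languages)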

The next step isn’t necessary, but you have to process all that slow internet access at some point, and I prefer to get it out of the way at the start by systematically calling every import that I will need to do. I will also dump the data to disk in a compact binary .mx file, so that I can come back to it without having to re-scrape the website. This is a good point to break for some lunch while it works!

Systematically calling every import needed
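Schematically, the warm-up pass just visits every page once so that everything ends up in the cache (chosenLanguages here is a hypothetical stand-in for whatever list of languages is being analyzed):

(* touch every task page for every language of interest, populating the importOnce cache *)
Scan[
 Function[language,
  Scan[importOnce[#, "XMLObject"] &, getCompletedPageList[language]]],
 chosenLanguages]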

DumpSave["RosettaCodeData.mx", importOnce];

Now that all the data gathering is done, we can start analyzing it. First, how many tasks have been completed by the Mathematica community?

Length[getCompletedPageList["Mathematica"]]

446

That’s a good number; the most complete language on the site is Tcl, with 694 tasks. More importantly, there are plenty of tasks that have been completed in both Mathematica and other key languages. This is vital for the like-for-like comparison that I want to do. For example, there are 440 tasks that have a solution in both Mathematica and C.

Length[getCompletedPageList[{"Mathematica", "C"}]]

440

The thorny part of this problem is extracting the right information from crowdsourced, handwritten wiki pages. Correctly written pages wrap the code in a <lang> tag, with a rather inconsistent argument for the language type. But some of them are not correctly tagged, and for those I have to look at the position of code blocks relative to the appearance of language names in section headings. All of that results in this ugly bit of XML pattern matching. I’m sure I could do it better, but it seems to work.

XML pattern matching
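To give the flavor of it, here is a greatly simplified sketch that handles only the correctly tagged case (the fallback that locates untagged code blocks via section headings is omitted; langTag is a hypothetical helper, sketched below, that maps a language name to its tag):

(* pull the text out of <pre> blocks whose class attribute carries the language's tag *)
extractCode[url_String, language_String] := StringJoin[
  Cases[importOnce[url, "XMLObject"],
   XMLElement["pre", {___, "class" -> cls_String, ___}, content_] /;
     StringMatchQ[cls, "*" <> langTag[language] <> "*"] :>
    Cases[content, _String, Infinity],
   Infinity]]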

The <lang> tag, when it has been used, is usually the language name in lowercase, without spaces. But not always! So I have to map some of the special cases.

Mapping some of the special cases
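For example (the handful of exceptions shown here is illustrative; the full mapping is in the image above):

(* a few special cases first; Mathematica tries these before the general rule *)
langTag["C++"] = "cpp";
langTag["C#"] = "csharp";
langTag["F#"] = "fsharp";
(* the general rule: lowercase the name and drop spaces *)
langTag[language_String] := ToLowerCase[StringReplace[language, " " -> ""]]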

For completely un-marked-up code, or where the solution is descriptive or is an image rather than code, this will return an empty string, and we will treat these as if no solution was provided. With the exception of LabVIEW (where all solutions are images), I suspect that this is fairly unbiased by language, but probably biased toward excluding very small problems.

Here is the code in action, extracting my solution for “flood filling”:

example = extractCode["http://rosettacode.org/wiki/Bitmap/Flood_fill", "Mathematica"]

Solution for "flood filling"

The next thing we need is some metrics for code length. The industry norm is “lines of code”:

Finding the metric for "lines of code"
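In its simplest form (a sketch; note that StringSplit drops empty strings, so blank lines are not counted):

lineCount[code_String] := Length[StringSplit[code, "\n"]]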

But that is as much a measure of code layout as of length (at least for languages like Mathematica that can put more than one statement on a line), so a non-whitespace character count might be better.

Finding non-whitespace character counts
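Something like this (a sketch):

(* count every character that is not whitespace *)
characterCount[code_String] := StringLength[StringReplace[code, WhitespaceCharacter -> ""]]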

That disadvantages Mathematica a bit, with its long, descriptive command names (a good thing), so I will also implement a “token” count metric—where a token is a word separated by any non-letter characters.

Implementing a "token" count metric
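A sketch of that metric: count the maximal runs of letters.

(* each unbroken run of letters is one token *)
tokenCount[code_String] := Length[StringCases[code, LetterCharacter ..]]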

Here is that piece of code measured by each of the metrics.

Through[{characterCount, lineCount, tokenCount}[example]]

{330, 9, 45}

The line count doesn’t match what you see above because it is counting lines in the original website, and the narrow page design of the Wolfram Blog is causing additional line wrapping.

Now, to generate comparison data for two languages, we just extract the code for each, measure it, and repeat for every task the two languages have in common.

Extracting the code for each language and measuring it
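A sketch consistent with the calls shown below (the real definition in the image may differ in details):

(* for each shared task, measure the solution in each of the two languages *)
compareLanguages[languages : {_String, _String}, metric_] :=
 Table[metric[extractCode[task, language]],
  {task, getCompletedPageList[languages]}, {language, languages}]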

If we look at the first three tasks that Mathematica and C have in common, we see that the Mathematica solution has fewer characters in each case.

Take[compareLanguages[{"Mathematica", "C"}, characterCount], 3]

{{588, 811}, {572, 3749}, {563, 2187}}

Here is all the Mathematica versus C data.

Mathematica versus C data

There is a lot of noise, but one thing is clear—nearly every Mathematica solution is shorter than the C solution. Some of the outliers are caused by multiple solutions being given for the same language, which my code will just add together.

The best way to deal with such outliers is to do all our smoothing and averaging using Median.

This shows an interesting trend. As the tasks get longer in C, they get longer in Mathematica, but not in a linear way. It looks like the formula for estimating Mathematica code length is 5.5√c, where c is the number of characters in the C solution.
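That coefficient comes out of fitting a square-root model to the (C length, Mathematica length) pairs; for instance (a sketch of one way to do it; the plot above was made from median-smoothed data):

data = N[compareLanguages[{"Mathematica", "C"}, characterCount]];
(* model the Mathematica length as a multiple of Sqrt[c], where c is the C length *)
fit = NonlinearModelFit[Reverse /@ data, a Sqrt[c], {a}, c]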

Comparing Mathematica to C

Mathematica versus C

You see similar behavior in comparisons with other languages.

Comparing Mathematica to C++, Python, Java, and MATLAB

Mathematica versus C++, Python, Java, and MATLAB

This is perhaps not surprising, since some tasks are extremely simple. There is little difference between one language and another for assigning a variable, or accessing an array. But there is more opportunity to benefit from Mathematica's high-level abstractions in larger tasks like “Implement the game Minesweeper.” This trend is unlikely to continue indefinitely, though; for very large projects, code lengths should start to scale more linearly, at the ratio reached for the typical size of individual code modules within the project.

There are 474 languages listed on the website: too many for this kind of analysis, and quite a lot have too few solutions to analyze. So I am going to look at a list of popular languages, plus some computation-oriented languages. My somewhat arbitrary choices are:

Choosing which languages to analyze

To make a nice table, I need to reduce the data down to a single number. I have two approaches. One is to reduce all comparisons to a ratio (length of code in language A) / (length of code in language B) and find the median of these values over all tasks. The other approach is to argue that code length only matters for longer problems, and to do the same, but only for the top 50% of tasks by average code length.

Reducing the data down to a single number
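As a sketch, the two reductions might look like this (medianRatio and largeTaskMedianRatio are hypothetical names; TakeLargestBy picks out the top 50% of tasks by mean solution length):

(* median of per-task length ratios over all shared tasks *)
medianRatio[languages_, metric_] :=
 Median[Divide @@@ N[compareLanguages[languages, metric]]]

(* the same, restricted to the larger half of the tasks *)
largeTaskMedianRatio[languages_, metric_] :=
 With[{data = N[compareLanguages[languages, metric]]},
  Median[Divide @@@ TakeLargestBy[data, Mean, Ceiling[Length[data]/2]]]]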

And finally, here are the tables looking at all the permutations of code-length metric and averaging method.

In all cases, the number represents how many times longer the code in the language at the top of the chart is than the code in the language on the left. That is, big numbers mean the language on the left is better!

All tasks - Character count ratio

All tasks - Line count ratio

All tasks - Token count ratio

Large tasks - Character count ratio

Large tasks - Line count ratio

Large tasks - Token count ratio

Despite the many possible issues with the data, it is an independent source (apart from the handful of solutions that I provided) with code that was not contrived to be short above all other considerations (as happens in code golf comparisons). It is perhaps as close to a fair comparison as we are likely to get. If you want to contribute a program to Rosetta Code, take a look at the unsolved tasks in Mathematica, or improve one of the existing solutions.

While the “Large tasks – Line count ratio” gives the most impressive result for Mathematica, I think that the “Large tasks – Character count ratio” is really the fairest comparison. But however you slice it, Mathematica code is, on average, shorter than that of these other languages: five to ten times shorter than the equivalent in C or C++. And that should mean shorter development time, lower code complexity, and easier maintenance.

Comments


30 comments

  1. This leans toward the notion of Kolmogorov complexity, the length of the shortest code that, when executed, yields a given sequence x — K(x). This is usually done on a universal Turing machine. It would be interesting to see what Mma offers with regard to KC and information theory.

    Reply
  2. Concision is clearly important, but it’s equally important that whatever code is present is actually readable. As much as I personally like Mathematica, this is where it repeatedly falls down in the eyes of many. It’s just not worth compressing code if you have to battle to understand seven or eight levels of nested functions, which is all too common in Mathematica code.

    Reply
    • Quite. It is certainly possible to write unreadable code in Mathematica, but it is certainly not intrinsic to the language. It would be good to have a blog on good Mathematica code style, but looking again at the code in this blog, I am not sure I should be the one to write it!

      Reply
      • a) Unreadable code in M is LONG code, not SHORT code. b) Several nesting levels are due to the superior functional programming paradigm that M supports, which actually makes the code better (usually, not always; you can, of course, abuse the functional programming paradigm as well). If the multiple nesting levels are too hard for you, then you should *study* them, because you have most likely learnt something from that experience and enhanced your arsenal of M skills. c) “Battle with”… you don’t battle with concise code, you battle with LONG code. I shouldn’t have to read 10 screens of code to understand what the code is doing. Have you ever seen an air traffic controller reading reams of text? No, they look at ONE screen with *key data* for every flight, and have movement and other visual aids to *instantly* comprehend both the big picture and the small picture. Conciseness increases comprehension. It also reduces the chance of introducing bugs in the first place, and the bugs that do exist are easier to find in short code. And I highly recommend the study “Computer Code as a Medium for Human Communication: Are Programming Languages Improving?” by Gilles Dubochet, done at EPFL Lausanne, in which eye-tracking devices were used to show that code comprehension depends on the amount of code visible and understandable with a few glimpses at the screen.

        Comprehension is more important than reading. Efficient communication means a *large* volume of information/comprehension is conveyed using a *small* amount of resources.

        Too bad Scala is missing; it would beat Java and C# and others substantially.

        Reply
  3. A third-party syntax highlighter like highlight makes it easier to separate comments from code for myriad programming languages — http://www.andre-simon.de/doku/highlight/en/highlight.html

    Simple compression also alleviates the variability from “descriptive command name” (and descriptive function names and variable names) — http://shootout.alioth.debian.org/more.php#gzbytes

    I’d be interested in how simple compression compares to your token counts.

    Reply
  4. I am an optimisation-oriented programmer, so the thing I feel is necessary to consider here is the /actual/ results of the various implementations as bits and bytes. C allows you to achieve things from that perspective that you just can’t with Mathematica… also, on the opposite side, a /true/ recreation of a blob of Mathematica code in C would be much, /much/ more verbose even than the examples used here, I feel.

    If your problem is just a problem to solve, then Mathematica wins every time, because it’s done a lot of the work for you.

    If you have to solve the problem at 6000 Hz on phone hardware (for example), then Mathematica isn’t even an option…

    Reply
  5. Horses were harder to care for and needed oats, but once the padded collar and horseshoes were invented, they were soon capable of working 50% faster than oxen. Power trumps unfamiliarity.

    Reply
  6. I would like to see comparisons with a computational language known for having very short programs – the J language – http://rosettacode.org/wiki/J

    Reply
    • This was also mentioned in a Hacker News thread about this blog. I quickly ran the code on J and got ratios of lines: 0.5, characters: 0.74, tokens: 0.5, over 432 comparisons, i.e., J is about half the length of Mathematica code.

      The Hacker News discussion also asked about APL, which has ratios of between 1 and 2 (i.e., longer than Mathematica, but over only 32 comparisons, which is too few to be reliable).

      Julia was also asked about, but there are only 9 comparisons.

      Reply
  7. Nice analysis! It would be interesting to pool these with code from

    http://en.literateprograms.org

    Reply
  8. I just scrolled up and noticed several other people said the same thing, but also consider using a metric like this: http://shootout.alioth.debian.org/more.php#gzbytes to compare code complexity.

    Reply
  9. I would be interested in seeing a line count of the assembly for the different languages. As an embedded systems developer, it doesn’t mean much to me to say that, e.g., Java can do a linked list implementation in fewer lines of code than C, since the reality of executed instructions will differ greatly.

    Reply
  10. Right, this compactness goes towards the vision of Stephen Wolfram. Ultimately, the ratio will converge to infinity.

    For example, the software needed to evaluate all the results from the LHC will be

    == ProfessionallyEvaluate[CollisionsFrom[LHC]]

    while competing computer languages have sources like:

    [gigabytes of text were omitted here]

    Of course, part of the compactness is that Mathematica is hiding lots of tools that programmers using other systems have to develop for themselves. However, it’s not an explanation of everything. They would be led to develop their routines less cleverly, avoiding the omnipresent arrays and the XML-like structure within Mathematica, so even if they develop all the routines, they must still use them more awkwardly.

    Reply
  11. Python is compact; it is very compact. You can’t go much more compact than Python.

    But:
    1)
    The sum of the digits of an integer can be written in Python as:

    sum([int(i) for i in str(1234556)])

    while the code on http://rosettacode.org is written to be long… so it is long.

    2)
    On the other hand, the comparison seems to be comparing the length of function bodies with the length of function calls. If you use image processing functionality in M, then use it in Python as well.

    In conclusion, the above comparisons make little sense.

    Marcin

    Reply
  12. It cannot be valid to use the “Rosetta Code” project to compare the number of lines in code. The code there is _example_as_it_can_be_written_in_some_language_; for example, the M solution (the shortest code ever, ha) simply doesn’t support the requirements given:
    “1_10 sums to 1;
    1234_10 sums to 10;
    fe_16 sums to 29;
    f0e_16 sums to 29.”
    Where the hell is all of that there? How can it be compared?

    Reply
  13. Have you considered code comments affecting your data?

    Reply
    • The code above will count code comments (which are embedded in the code) as code, though it will ignore comments in the wiki text. Removing them would require coding each language’s syntax for comments, and that was too much to consider for this blog. It could be argued that comments should be included, since, if a language is very cryptic, then writing comments is a necessary part of the process of writing code. But note my comments about short code not being the same as good code — that is particularly true when it comes to comments!

      Reply
  14. This seems quite interesting. We measure code in terms of the number of lines it’s written in. We don’t consider how complex the logic is. Based on the number of lines, we give estimates. We know that this is not the right way to do it. However, this way we are able to complete the task estimates faster.

    Next time I will try your method. Thanks for sharing this article.

    Reply
  15. Hi, McLoone.

    I’m a researcher from USTC (University of Science and Technology of China); we are doing an investigation on programming languages and Kolmogorov complexity with ASU. We have now decided to use code from Rosetta Code. So, can we get the Mathematica code from this article? Sorry to interrupt you, but I cannot find the code for some of the functions used in this article (like posGreater[]).

    Thank you! Waiting for your reply!

    Reply
  16. Nice idea, but your images make it impossible for a visually impaired person like myself to use. Please consider using alt tags, or better yet, use tables to show the data. Thank you.

    Reply
  17. The set of programs comparing Mathematica and Clojure is not the same set as used to compare Mathematica and Ruby, which is not the same as that comparing Clojure and Ruby. Each comparison uses only programs that have been completed in BOTH systems. This is why I needed to present a big table of comparisons rather than a single relative value.

    If all tasks were completed for all languages, then the ratios would be entirely consistent.

    Reply