Comparisons

Recently I had to spend some time trying to adapt my imperative/OO background to a piece of code I need to write in functional paradigm. The problem is quite simple and can be described briefly. You have a collection of pairs containing an id and a quantity. When a new pair arrives you have to add the quantity to the pair with the same id in the collection, if exists, otherwise you have to add the pair to the collection.

Functional programming is based on three principles (as seen from an OO programmer) – variables are never modified once assigned, no side-effects (at least, avoid them as much as possible), no loops – work on collections with collective functions. Well maybe I missed something like monad composition, but that’s enough for this post.

Thanks to a coworker I wrote my Scala code that follows all the aforementioned principles and is quite elegant as well. It relies on the “partition” function that transforms (in a functional fashion) a collection into two collections containing the elements of the first one partitioned according to a given criteria. The criteria is the equality of the id so that I find the element with the same id if it exists, or just an empty collection if it doesn’t.

Here’s the code:

Yes, I could have written more concisely, but that would have been too much write-only for me to be comfortable with.

Once the pleasant feeling of elegance wore off a bit I wondered what is the cost of this approach. Each time you invoke merge the collection is rebuilt and, unless the compile optimizer be very clever, also each list item is cloned and the old one goes to garbage recycling.

Partitioning scans and rebuild, but since I’m using an immutable collection, also adding an item to an existing list causes a new list to be generated.

Performance matters in some 20% of your code, so it could acceptable to sacrifice performance in order to get a higher abstraction level and thus a higher coding speed. But then I wonder what about premature pessimization? Premature pessimization, at least in context where I read the them, means the widespread adoption of idioms that lead to worse performances (the case was for C++ use of pre or post increment operator). Premature pessimization may cause the application to run generally slower and makes more difficult to spot and optimize the cause.

This triggered the question – how is language idiomatic approach impacts on performances?

To answer the question I started coding the same problem in different languages.

I started from my language of choice – C++. In this language it is likely you approach a similar problem by using std::vector. This is the preferred collection and the recommended one. Source is like this:

Code is slightly longer (consider that in C++ I prefer opening brace on a line alone, while in Scala “they” forced me to have opening braces at the end of the statement line). Having mutable collections doesn’t require to warp your mind around data to find which aggregate function could transform your input data into the desired output – just find what you are looking for and change it. Seems simpler to explain to a child.

Then I turned to Java. I’m not so fond of this language, but it is quite popular and has a comprehensive set of standard libraries that really allow you to tackle every problem confidently. Not sure what a Java programmer would consider idiomatic, so I staid traditional and went for a generic List. The code follows:

I’m not sure why the inner class Data needs to be declared static, but it seems that otherwise the instance has a reference to the outer class instance. Anyway – code is decidedly more complex. There is no function similar to C++ find_if nor to Scala partition. The loop is simple, but it offers some chances to add bugs to your code. Anyway explaining the code is straightforward once the iterator concept is clear.

Eventually I wrote a version in C. This language is hampered by the lack of basic standard library – beside some functions on strings and files you have nothing. This could have been fine in the 70s, but today is a serious problem. Yes there are non-standard libraries providing all the needed functionalities, you have plenty of them, gazillions of them, all incompatible. Once you chose one you are locked in… Well clearly C shows the signs of age. So I write my own single linked list implementation:

Note that once cleaned of braces, merge function is shorter in C than in Java! This is a hint that Java is possibly verbose.

I just wrote here the merge function. The rest of the sources is not relevant for this post, but it basically consists in parsing the command line (string to int conversion), getting some random numbers and getting the time. The simplest frameworks for this operation are those based on the JVM. The most complex is C++ – it allows a wide range of configuration (random and time), but I had to look up on internet how to do it and… I am afraid I wouldn’t know how to exploit the greater range of options. Well, in my career as a programmer (some 30+ years counting since when I started coding) I think I never had the need to use a specific random number generator, or some clock different from a “SystemTimeMillis” or Wall Clock Time. I don’t mean that because of this no one should ask for greater flexibility, but that I find daunting that every programmer should pay this price because there is case for using a non default RNG.

Anyway back to my test. In the following table I reported the results.

C++ Scala Java C C++
vector list
time (ms) 170,75 11562,45 2230,05 718,75 710,9
lines 81 35 69 112 81

Times have been taken performing 100000 insertions with max id 10000. The test has been repeated 20 times and the results have been averaged in the table. The difference in timing between C++ and Scala is dramatic – with the first faster about 70 times the latter. Wildly extrapolating you can say that if you code in C++ you need 1/70 of the hardware you need to run Scala… there’s no surprise (still guessing wildly) that IBM backs this one.

Java is about 5 times faster than Scala. I’m told this is more or less expected and possibly it is something you may be willing to pay for higher level.

In the last column I reported the results for a version of the C++ code employing std::list for a more fair comparison (all the other implementations use a list after all). What I didn’t expected was that C++ is faster (even if slightly) than C despite using the same data structure. It is likely because of some template magic.

The other interesting value I wrote in the table is the number of lines (total, not just the merge function) of each implementation. From my studies (that now are quite aged) I remember that some researches reported that the speed of software development (writing, testing and debugging), stated as lines of code per unit of time, is the same regardless of the language. I’m starting having some doubt because my productivity in Scala is quite low if compared with other languages, but … ipse dixit.

Let’s say that you spend 1 for the Scala program, then you would pay 2.31 for C++, 1.97 for Java and 3.20 for C.

Wildly extrapolating again you could draw a formula to decide whether it is better to code in C++ or in Scala. Be H the cost of the CPU and hardware to comfortably run the C++ program. Be C the cost of writing the program in Scala. So the total cost of the project is:

(C++) H+C×2.31

(Scala) 68×H+C

(C++) > (Scala) ⇒ H+C×2.31 > 68×H+C ⇒ C×1.31 >67×H ⇒ C > 51.14×H

That is, you’d better using Scala when the cost of the hardware you want to use will not exceed the cost of Scala development by a factor of 50. If hardware is going to cost more, then you’d better use C++.

Beside of being a wild guess, this also assumes that there is no hardware constraint and that you can easily scale the hardware of the platform.

(Thanks to Andrea for pointing out my mistake in inequality)

6 thoughts on “Comparisons

  1. Your comparison among Scala and the other languages is not fair.
    You are comparing an implementation using immutable structures with implementations that use mutable structures. It’s obvious that the performance is 2 orders of magnitude worse, both in terms of speed and memory!

    You should reimplement the Scala version with a mutable collection (and update items in place) and redo the tests.

    This specific problem screams for a mutable structure (and I’d rather use a map, maybe a hashmap, rather than a list)
    Scala let’s you implement in an imperative way, when you need to. It’s perfectly legit, and you should not be ashamed of it.

    Also, comparing C++ and Scala based on the raw performance doesn’t make sense. Obviously C++ is more performant then Scala (given a good C++ programmer as you are). But the choice of a programming language depends on many different things.

    • It is likely I had to state it better – this comparison didn’t want to be between raw performances. In fact it makes little sense to compare a compiled language vs an interpreted one. Instead my goal was to compare how much “premature pessimization” you get when you tackle a problem with a functional approach.
      In other words my experiment was intended to be – give the same assignment to one Scala programmer and to one C++ programmer and compare the outcome. In this sense I don’t find it unfair – you pick up the tool you consider more apt, but in the end it only matters how much it costs to create it and how much it costs to run it.
      I agree that the problem is fictitious – I just took a piece of code for which an elegant solution has been found and overcharged it. I mean, in the real world that code does just a few iterations and I tested it with a number of iteration 1000 times. No one of the two programmers was concerned with performances, as it was reasonable. But this doens’t mean that you don’t pay for the functional approach. The overall code will be slower, maybe this will go unnoticed in the 80% of the code, but then you will need more hardware to run multiple instances, you will need more energy.
      I am not saying this is bad and you should avoid Scala in favor of C++, but that it is nice to take in account the extra you are paying and will be paying at run-time.
      I did other tests on the same problem – using mutable hash table Scala is only 10 times slower than C++. Changing data representation to a more suitable one (both in C++ and Scala) you can have the fastest implementation and Scala is only 2 times slower than C++.
      But this is just out of the scope – the first approach for C++ programmers is to use a std::vector, while for Scala programmers is to use an immutable List.

      • Again I have to disagree on the statement FP = “premature pessimization”. I’d rather say “bad design choice” = “premature pessimization”.

        In this specific example, as stated, you have a growing list of data that have a state (the quantity) that is frequently changing. You should not implement it with immutable structures that at every call reconstruct completely.

        For example (as I explained in another reply) you can compromise by using an immutable list of (partly) mutable objects. The list grows by adding new elements at the head, and this is very cheap.
        A functional programmer will focus on the “what” he has to do, rather than “how”, and he won’t be bothered by the usual boilerplate of iterating, updating, etc. This can improve productivity a lot.

        Then again, performance compared against C++ is in general worse. But the choice of a language should not be done only on raw performance (you’re saying that you’re not comparing raw performance, but indeed it is this you are saying again in your reply).

        Btw, Scala is not an interpreted language, it is compiled and executed in a VM.

        So, Scala is better? C++ is better? Both are better… The important thing is being able to choose the right programming language for the right problem, given many factors. among them also the expertise of developers.

        And don’t have a prejudice on Functional Programming Paradigm. It can take a while to get used to it after many years of Imperative paradigm, but in the end it can help describe problems in a more logical and clear way.

        • I think we are coming about to the same conclusion – pick the tool which better deals with the problem you are trying to solve. The business doesn’t care for FP / OO / Scala / Java / whatever. And we agree raw speed is not everything because it is just a part of the goal.
          I tried to take in account both speed of execution and speed of code production. I already know that trying to find a good Scala programmer is exceedingly hard. And I wondered what is the extra CPU you are paying to do FP.
          Maybe I have been exposed to too much bad Scala code, but I keep seeing a strong aversion for “var” and mutable. So, I focused on this aspect. I tried also with mutable data structures with a solution close to the one you proposed. I don’t have data here, but IIRC the code was five times slower wrt C++. As you said and I fully agree this is not enough to dismiss Scala in favor of C++. My final disequation – taken with a lot of grains of salt – pretended to be a rule of thumb to select the language according to the cost of running the application.
          Mutable or not mutable there is a price to pay for functional, this is why we needed to wait some 50 years since the formulation of lambda calculus and Lisp, for a functional language to start becoming mainstream. I’ll keep an eye for other cases to benchmark 🙂

  2. If you accept that “quantity” in Data object is mutable (and you should, since it is in the Java, C++ and C versions) you can write:

    case class Data(val id: String, var quantity: Int)

    type Collection = List[Data]

    def merge( collection : Collection, data : Data ) : Collection = {
    collection.find(_.id == data.id).map { d =>
    d.quantity += data.quantity
    collection
    }.getOrElse( data +: collection )
    }

    In this case if a Data object is already present, its quantity is updated, otherwise a new Data object is added at the head of the (immutable) list. The list is not duplicated (not wasted memory), and “quantity” is the only mutable state.
    I think performance should improve.

  3. With an implicit (and this can be an answer to your post on implicits) you can write (sorry that indentation is lost in this editor):

    case class Data(val id: String, var quantity: Int)

    type Collection = List[Data]

    implicit class mergedCollection(val collection: Collection) extends AnyVal {
    def merge(data: Data): Collection = collection.find(_.id == data.id).map { d =>
    d.quantity += data.quantity
    collection
    }.getOrElse( data +: collection )
    }

    val c0: Collection = List.empty

    c0.merge(Data("a", 1)).
    merge(Data("b", 1)).
    merge(Data("a", 1))

    With an implicit class we have easily extended our Collection with a method merge(). That’s nice!

Leave a Comment