Optimization Time – Scala (and C++)

The Case

It is quite a feeling when you learn that your commit, a couple of months ago, broke the build is such a subtle way that it took so long to be detected. Possibly a more thorough testing and validation of the software would have caught it earlier, nonetheless it’s there and you and your coworkers are working hard to delimit the offending code and better understanding what caused the mess.

It turned out that some perfectly working code was taking too much time in one of the hot spot of our codebase. More precisely that code operated a conversion from an incoming data packet into a format suitable for data processing… about 2500 times a second on a modest hardware.

The code resulting in such poor performances that disrupted the device functionality seemed like a good idea in a very functional idiom –

Basically the incoming packet holds six 10-bits samples in 8 bytes and some bits-shifting’n’pasting is needed to rebuild the six sample array for later processing.

Once found that the problem was here I decided rewrite the code to get rid of the functional idiom and take an imperative/iterative approach:

(the comment is a colorful note I decide to leave for the posterity :-)).

The idea is to get rid of the read-only constraint imposed by functional programming and save all the temporaries.

Optimizing is a subtle art, and even if intuition may guide you, only measurement can tell whether the optimization succeeded. So I set up some speed test (see “methodology” below) to objectively assess my results.

The speed improvement was astounding: more than 7 times faster (7.6 to be precise).

Energized by this victory I decided to switch to heavy optimization weapon. Not being sure if scala/java compiler optimization really does its optimization homework, I opted for manually unrolling the loop in the code:

I was somewhat disappointed to learn that this version is only marginally faster (x1.6) than the original functional version. This didn’t really make sense to me, since the loop is completely unrolled and just trivial computation needed to be performed. So I decompiled the .class file:

Instead of the Java array I expected, some Scala internals are used to create the array and then converting it into the Java thing. Possibly this was the cause of the slowness. By looking around at the decompiled code I found another function that just used a Java int array. So I rewrote the unrolled version accordingly –

The main difference is that here the array is first declared and allocated, then filled. In the above code the array was created, initialized and returned all in the same statement.

The speed improvement was good, but not much better than my imperative version: x7.77 times faster.

Then a colleague pointed out that I was using the “old” Scala 2.10 compiler and that I should try the latest 2.12 that benefits from a better interoperability with the underlying JVM 1.8.

In the following table you get the performance comparison:

 functionalimperativeunrolled0unrolled1
scala 2.10 times (lower is better)26.153.4016.603.36
scala 2.10 improvements over functional (higher is better)1x7.68x1.58x7.77
scala 2.12 times (lower is better)24.951.1713.30.021
scala 2.12 improvements over scala 2.10 functional
x1.05x22.32x1.97x1245

Functional and unrolled0 are quite unsurprising, just what you expect from the next version of the compiler. Imperative approach yields quite a boost – 22 times faster than the functional version on the previous compiler. The shocking surprise is in the last column – the unrolled1 version of the code runs in excess of 1200 times faster than the original code!

Before jumping to conclusions, I performed the same tests on the code translated into C++. Here is the equivalent code:

I’m positive that functional purist will scream in horror at the sight of C++ functional version of the code, but it is the closest thing I could do using the standard library. Should you have any idea on how to improve the functional idiom using the standard library, let me know and I’ll update my test set.

Here are the C++ results –

 FunctionalImperativeUnrolled 0Unrolled 1
C++ times (lower is better)0.320.0740.0530.049
Improvements over functional (higher is better)x1x4.36x6.1x6.6
Improvements over Scala 2.10 (higher is better)
x81x46x313x69
Improvements over Scala 2.12 (higher is better)x77x16x250x0.43

This is mostly unsurprising but for the the lower right column which seems to indicate that Scala version is faster than C++. I guess that the reason is because I am using timing values that are rebased considering a standard price you pay for executing the code under test and that I don’t want to consider in my measurements (see below about the methodology I used). My interpretation is that the overhead in Scala hides the cost for a trivial expression so that the base time is much more close to the execution time than it is in C++.

Conclusions

Drawing general conclusions from a specific case is all but easy and trivial. Anyway according to data and to intuition, if you want to code using the functional idiom you pay a price that sometimes can be hefty. Paying an execution price, be it memory or cpu cycles, to raise the abstraction level is something that has been with computer programming from the dawn of this craft. I remember echoes from the ranting against high level languages in favor of assembly to avoid paying a 30% penalty. Now those times are gone, but facing a x4-x7 overhead can be something not always you can afford.

(Raising the level of abstraction by employing the functional idiom is something that is not undisputed, but I will spare this for another post).

Writing efficient code is not always intuitive. Constructs that seem innocent enough turns out to be compiled into time consuming lower level code. Peeking at compiled code is always useful. While languages as C++ are accompanied by detailed information about performances of various constructs and library components, Scala (as Java) lacks of these information. Also the huge amount of syntactic sugar employed by Scala does not simplify the life of the programmer that has to optimize code.

Eventually I cannot save to remarks the performances of interpreted languages (and even more so functional interpreted languages), despite of the progress in the JVM technologies, are still way behind native language ones. Even employing a functional approach C++ is some 80 times faster than Scala. You can argue that the C++ code above is not really functional, but there is nothing preventing me to use a mostly functional library for C++, with roughly the same performances of the functional version of the C++ code. It could be more tedious to write because of the lesser grade of syntactic glucose in C++, yet it is still functional.

I’m curious to see how scala-native compares, but this again is a topic for another post.

The last thought I want to report here is that you can make Scala code much faster, by avoiding the functional paradigm and peeking at compiled code. It is something you are not usually expected to do, but and improvement of 1200 times is a worth gain for the most executed code.

A very last thought for those who think that after all nowadays we have such powerful computers that analyzing performances and optimizing is a waste of time. I would like to remember that data processing requires energy and today energy production mostly relies on processes that produce CO2, something that the world doesn’t need. A data processing that runs just twice the speed of another will produce half of the carbon dioxide emissions. The environment benefits for switching to C and C++ should be clear to everyone.

Methodology

A couple of words on methodology. The compilers were Scala 2.10.6, Scala 2.12.2 and GNU c++ 7.3.1. All the test were performed on Linux (Fedora Core 27) using the command /bin/time --format="%U %S"  command to print the User and Kernel times of the command and summing the two values together. These are times that CPU spent executing the process, not wall clock. The decoding function is executed 0x7FFFFFFF (some 2 billions) times per run and each program is tested for 12 runs. The fastest and the slowest run are discarded and the other are used to compute an average that is considered the measure of the test.

The base time is computed by employing a dummy decoder that does nothing but filling the result array with constants.

This setup should take in account the JIT effect and measure just the code I listed above.

Lambda World 2017 – Lambda Core, hardcore

Despite of the title which is a bit obscure, this talk has been very clear and entertaining. The speaker Jarek Ratajski (@jarek000000) who defines himself a functional programmer, wizard and anarchitect, definitively knows his art and how to make an engaging speech.

Btw, an anarchitect is the person that breaks the rules so that something can be eventually done.

As usual, mistakes, errors, inaccuracies are all my faults. A last note: you need Scala 2.12 to compile the code presented here, older versions are likely to fail.

I really recommend to watch the presentation video on youtube.


Let’s start with a question for the attendees – you are on a desert island and you can take just one thing from the civilized world. What would you like to have to survive?

[someone from the audience shouts “wifi”]

Well, you need Math. Math is the base for everything, but it is huge! So, maybe we can find a small subset, a core, that allows us to recreate the whole Math.

Lambda is a building block of math. Using lambda calculus we can build – one step at time – the whole Math. Historically Math has been built using sets, unfortunately sets had some problems in terms of paradoxes, so mathematicians resorted to this simpler mechanism.

So lambda (λ) is a function that takes some lambda as parameter and produces another lambda. A lambda is written as:

λ x. some expression

A λ x is applied to λ y with the following notation –

x(y)

to a scala developer:

Now suppose we on a deserted island with just our Scala compiler. Just the compiler – no jars, no collections, no scala utils, no Boolean, Int, Float, Double, String… anything at all.

The only exception to this nothingness is the badlam package in order to simplify and help the presentation of concepts.

Let’s define the identity, that is a function that returns the function accepted as an argument:

λ x. x

In Scala it turns out to be:

Now we can define function that accepts multiple arguments (via the curry mechanism) and always returns the first one:

Note that names are not important, but the structure is. The “funny” function structure can be used to implement true and false concepts:

Note that aFalse is complementary with respect to aTrue. Now that we have true and false symbols we can define boolean operators:

You can read this like: and is a function that via currying accepts two arguments p and q. Consider p: it can be either true or false. If it is true its lambda evaluates to its first argument, otherwise to the second. So, when p is true, the and expression evaluates to q, so and is true when both arguments are true. If p is false, then p is returned – no need to consider q.

Let’s check with Scala:

This evaluate to aFalse, while

This evaluate to aTrue.

Or function is based on the same scheme used by and:

To define not is a matter of reversing the true/false definition:

So long, so far, we have defined all the tools we need for boolean calculus. Let’s see if we can express numbers. One approach could be:

So numbers are functions (of course) that accepts a function and an argument, and returns zero or more applications of the function to the argument.

Zero is the identity. One is the single application of lambda f to the identity. Two is the single application of lambda f to one, that is that application of lambda f to the application of lambda f to the identity, and so on…

This may work, but it is boring to encode all numbers in this way. A function that computes the successor of a given number could be a good tool to lessen the burden:

[NdM – well this is mind boggling and I have a hard time to decode. Conceptually is simple – just remember that an integer n is represented by a lambda that accepts a function f and an argument x and returns a function composing n nested applications of f over x. So replace x with f(x) and you are done.]

Now we can define addition and multiplication

[NdM -It takes a while for these two as well. Sum is intuitively easy, take 3+2=5, in lambda is: x => f(f(f(x))) + x => f(f(x)) = x => f(f(f(f(f(x))))). You may read plus as: (m,n,f) => (x) => m(f)(n(f)(x)) , that is a function that takes two operands and one function. Remember that m represents and integer by applying a function m times to an operand. So swap in f as operand and you have m times the application of f, now applies this function to an argument that is n times the application of f to x.

Multiplication is similar, x is omitted to keep the code terse, but you can read it as:

Keeping the sum approach in mind, this is similar – applies m times a function that is composed by n application of f to x. I needed quite a neuron effort to figure this out.]

Predecessor can be defined as:

[NdM. I tried hard to understand this, but simply I couldn’t wrap my mind on it. I don’t have even the slightest idea of how this is expected to work. If you figure it out, let me know, please… well indeed, I think that the last part (u) => u, being an identity, is used to skip an application in the application list, therefore reducing by 1 the application list…]

[NdM. I found this thoroughly explanation of the predecessor on stackoverflow]

Now we can do something even more interesting – conditionals:

Recursion would be a nice addition to the computing arsenal of lambda calculus since it allows to express iterative algorithms. But how are we supposed to call a function when lambda functions are anonymous?

First let’s define a function that makes its argument call itself:

Then we need a helper function that makes something call itself

And finally we define a function containing the body of the recursive function:

Now we have everything to recursively compute a factorial:

[NdM: Once more I have a hard time in trying to understand this part. It makes sense intuitively, but I’m lost at details such as function Y. The overall idea is pretty well laid out]

Turing Equivalent?
Lambda calculus has been proved to be Turing- equivalent, meaning that every algorithm that can be implemented on a Turing Machine, can also be implemented using Lambda Calculus. Therefore, you have no excuse, everything can be done purely functional!

Equivalence

In mathematics there are lot of problems, an interesting one is about theorems.

For a mathematician a theorem is something like

if ( condition(args)) then statement(args)

That is, if something holds, then something else is true. It would be nice to build a machine that automatically checks this. This is what about a century ago, mathematicians were looking for – a machine that renders us useless by proving theorems by itself.

In the same way we used lambda calculus to represent numbers, booleans and statements, we could rewrite the theorem as a lambda calculus expression and then execute it to let it show true or false to determine whether the theorem holds or not.

This could be such a machine:

[NdM: also after having watched the youtube video of the conference, I can’t tell where this function comes from. I believe Jarek, but I have no explanation of why this function should prove theorems.]

That would be awesome, regrettably it doesn’t work. In fact autocall function is invoked over autocall itself causing a stack overflow. This is generally what happens when you try to analyze a lambda function for equivalence.

This fact has been proved in 1936 by Alonzo Church: “No computable function can decide the equivalence of two different lambda functions”.

Despite of this there are two guys on stack overflow that are trying to do exactly this in C#.

Lambda is convenient, but it undergoes the incompleteness theorem by Kurt Gödel – For any consistent, effectively generated formal theory that proves certain basic arithmetic truths, there is an arithmetical statement that is true, but not provable in the theory.

In other words it doesn’t matter how cool your formalism is, you still can’t fully automate through an algorithm to prove properties of another algorithm.

So this is what nearly a century ago a group of mathematicians, led by Alonzo Church, devised in search for a more elegant mathematics.

What I presented today is called Untyped Lambda Calculus. Take plus and try to add true and factorial together. It doesn’t make sense. So, by using this notation you can also write correct expressions that don’t make sense.

Since not every expression you can write makes sense, it is a programmer’s duty to write and use only those that make sense.

Typed lambda calculus check correctness for you. Untyped lambda is very much like javascript in which you can write everything and is not checked for type correctness.

I haven’t talked about Lazy/Eager evaluations in lambda calculus. This is a very complicated issue, even if very fun. If you do lambda calculus on paper you will notice that sometime you need to evaluate lazy and other times you need to evaluate eager.

Usually when you read about lambda calculus in Scala, you don’t find what I have shown you, you find something else – lambda calculus done on type system of Scala:

This is not performed at run-time, but at compile time. This is awesome because the compiler does all the calculations and produce such fine errors you can’t even figure out where they end.

Sources

  • wikipedia
  • blog: (C# Version) Dixin’s blog – this is the post where I took inspiration for this talk.
  • Book: Roger Penrose: The Emperor’s New Mind – this is not about only lambda calculus, but about computation. Reading this book will help you better grasping the subject.
  • Lambda Visualization – Badlam this is nothing special just a package to help the presentation. It is known to work only in presentation and not elsewhere.
  • This same presentation in Java – this is the original work. This presentation has been done first in Java (for fun and drinking beer of course).

NdM: that’s the end of presentation. Here I add my fragment of code I used to check the code. If you prefer you can use Badlam, the code below is in Scala.

Function to convert from lambda numbers to decimal –

The trick is to give function f of lambda the semantic of incrementing its argument by 1. If the argument has type int and start counting from zero the conversion is done.

The following code is just a pretty printer for lambdas that are not integers:

(where aTrue, aFalse and all the other symbols are defined as in the Jarek’s presentation.

Lambda World 2017 – Workshop – Don’t fear the Optics

This talk by Jesús Lopez Gonzales has been quite clear (at least to my challenged functional understanding). As strange as it may sound the whole idea of optics (in Functional Programming) is to solve a problem that exists only because of the functional paradigm. Aside from cheap humor, it makes sense – in structured programming you do the same by forbidding the use of the goto statement, and you need other tools (e.g. break, continue, structured statements) to do the same job in a safe, sound and controlled way.

You can find sources for the running example here: https://github.com/hablapps/dontfeartheoptics.git

But I don’t want to steal the stage. As usual all mistakes and false predicates are mine (my only defense is that the talk was performed without a mike and loudspeaker system).

Just one last not, before starting – the original slides of the presentation used a compact syntax, relying on Scala high sucrose diet. I opted for a more verbose syntax that makes clear to those less fluent in Scala what’s happening behind the sugar curtain.

Functional Programming is a programming paradigm […] that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data.

(from Wikipedia)

In functional programming we decide to not change state of variables once assigned. That means that when we want to change something we have to create a new instance from the existing one.

Consider the position class:

In order to change to position to a new one, the “changing” method just takes the existing pos1 and creates a new instance with the new state:

The running example for this talk is a CandyCrush clone. Here are the main classes:

Modules are defined as follows:

  • Candy REPL – IO
  • Candy Business logic – state program
  • Candy data layer – data structurs & optics

To face problems posed by state immutabile we resort to the Half Life narration – who better than Gordon Freeman – the man with a big lambda on his breast – could help us in the process?

The talk uses an explorative approach – you may want to explore the area to locate the problem (the Alien), then try to solve it using some techniques (equipping new weapons) and then refine the solution until you find an elegant way to fix the problem.

The first enemy to defeat is how to keep state unchanged.

CandyState.scala is the source file where “getter and setters” are located). There are several points where you need to update the state as the game progresses.

[NdM: note that specific to this paradigm these methods accepts a function that transforms score (or playfield) into a new score (or playfield) and return a function that change the Level accordingly. I found this revealing and somewhat mind boggling.]

How would you do in traditional way?
In order to modify you need copy:

This is not straightforward, at least not in general, because you need to copy through several indirection levels. Functional programming is about elegance and modularity, not this.

(Alien identified!)

Lens come to the rescue (or, as Jesus put it, Lens – the crowbar in the half-life analogy – is the weapon to equip). Lens is a parametric class defined over two types: S – the whole and A – the part:

get  method accepts a whole and returns a part. set  accepts a part and tell you how to change the whole to incorporate that part.

The code works, but it is lengthy and boring to write. So we can take advantage of the Lens defined for us by the @Lenses annotation.

[NdM: I’m going to expand the talk a little bit here because I lost some passages and I reconstructed them thanks to my Scala speaking friends]

This annotation instructs the compiler to create one lens method in the companion object for each case class field. [NdM: Oddly enough, for us coming from traditional programming languages, the lens has the same name of the case class field].

[NdM: original example exploited import to inject in the current scope the companion object’s fields, creating a bit of confusion in my mind. In the following examples I will avoid this shortcut in favor of readability].

What if we want to extract the matrix from a level? Operationally we have to navigate through the board (level->board->matrix). This can be done via composition, using the verb composeLens :

[NdM: my Scala speaking friend also told me that def  has been used without a real advantage over val . Having used val would have avoided an unneeded function call.]

The same can be applied for modify:

This syntax is slick, but still more verbose than say Haskell where you write just a dot instead of the composeLens verb.

# 2nd Enemy – Threading State Zombie (State Monads)

Consider the function

Its purpose is to crash a given position. This is accomplished by updating the map and updating the score. Additionally we want the function to return a pair composed by the level and the new score. The first implementation you may want to try is to navigate through the level to change the matrix, then navigate through the updated version of the level to update the score and then prepare the pair with the updated level and the score.

This works, but it is error prone because the programmer must ensure to properly pipe all the changes through the transformations.

The new weapon is the State:

This class defines a mechanism to execute a given action on an object and produce the updated object and a value. And it can be used like:

This maybe more elegant, but can be hardly defined as better, and nonetheless still requires the programmer to properly set up the execution pipe. [NdM: also note that the first part (from val  to run( lv0 )  may be replaced by a more compact  val lv1 = (board composeLens matrix).modify(mx => mx.updated(pos, None))( lv0 )  ]

This can be improved by using an implicit MonadState, that is a class implicitly built from a State class that can be bound together using the >> operator. In code:

Our code becomes:

[NdM: be careful in placing the >> operator! First IntelliJ is not able to recognize it and marks it as an unknown operator; second thanks to Scala forgiveness on syntax punctuation, you need to place >> on the same line of the closing bracket. I couldn’t figure out the right way to write this until I mailed the author of the talk for help. He responded quickly and kindly and set me in the right direction. Thanks Jesús]

[NdM: a quick word on binding. Bind is the same as flatMap, that is the way monads transform their content. In this case the binding allows you to compose the two run action into one. Since the computation accept one value and produces two, you may wonder what happens to the side value (in our case the score) of the first run. Answer – it gets discarded and only the last one is produced in the final result]

MonadState can be composed so they pipe the result one through the other.

[NdM: now, this is a bit more complex to digest – where does the .mod come out? And more importantly how does .mod know what to return? .mod is a method of the StateLens object (well, nearly true, but true enough for the sake of this analysis). Always remember that you are not dealing with actual values, but you are forging functions that will need to be called/applied on actual values. Note that the lenses in the expression are both on the Level class, so the state generated by mod operates on the Level class. The additional type is derived by the right operand of the >> operator.]

# Enemy 3: optional antlion

So far so good, but there are still other entities that cannot fit properly in the picture. What about getting and modifying the current score from the Game? The problem we face is that level is an option in Game. Lenses can’t be used with a plain-vanilla approach.

Let’s try a first attempt to the solution:

extract  is a method of the State

Nice, but cumbersome. The abstraction we can use now is the Prism (which is defined in monocle, roughly in the following way):

The first method takes an object and produces an option, the second method rebuilds the object given a part. So, let’s define our prism:

This prism deals  with an Option[A] , but monocle already provides you with this tool and it’s called some :

# Final Enemy: Multiple Fast Zombies

Now we want to crush an entire column of the board.

We can combine lenses and prisms into something else:

 

As for the Prims we have a getOption method that exposes the Option, but, instead of the reverseGet, there is a set that transforms S into another S provided an A.

Optionals  can be created by composing prisms and lenses as follows:

So that we can write our first iteration of the crushColumn method as –

This function operates on the game matrix and removes the candy when the column of the position is the same as j .

It is not bad per se, but that we are doing this in a manual way. The solution could be improved by using a filterIndex:

Let’s see how to automatize it. Let’s introduce the abstract metaclass Traversal:

Now it is possible to compose the Traversal with other lenses such as:

FilterIndex is a monocle function, that along with the implict mapFilterIndex allows the lens to apply over the map collection.

Since compose syntax may tend to be a bit verbose, you can also use the following operators:

  • ^<-?  compose with prism
  • ^|->  compose with a Lens
  • ^|->> compose traversal

Conclusions: Optics are abstractions for changing parts of wholes. These abstractions are composable to access complex data. Monacle library provides hybrid of concrete e Van Laarhoveen optics [NdM: sorry I missed the explanation entirely]. State monads encapsulate state threading and produce output values.


Max’s comment – The talk has been very helpful in improving my understanding of this aspect of the functional programming. I still find Scala syntax to be a bit on the verge of cryptic and dealing with new concepts doesn’t help as well. Composing stuff in the way functional programming does is a really powerful mechanism that enables the programmer to recycle code in an effective way.

I find that the use of symbols to further reduce the characters count is really dangerous. But this is a topic for another post. Let’s just say that Scala is endangered of write-only code 🙂 Looking forward to attending next Lambda World!

Comparisons

Recently I had to spend some time trying to adapt my imperative/OO background to a piece of code I need to write in functional paradigm. The problem is quite simple and can be described briefly. You have a collection of pairs containing an id and a quantity. When a new pair arrives you have to add the quantity to the pair with the same id in the collection, if exists, otherwise you have to add the pair to the collection.

Functional programming is based on three principles (as seen from an OO programmer) – variables are never modified once assigned, no side-effects (at least, avoid them as much as possible), no loops – work on collections with collective functions. Well maybe I missed something like monad composition, but that’s enough for this post.

Thanks to a coworker I wrote my Scala code that follows all the aforementioned principles and is quite elegant as well. It relies on the “partition” function that transforms (in a functional fashion) a collection into two collections containing the elements of the first one partitioned according to a given criteria. The criteria is the equality of the id so that I find the element with the same id if it exists, or just an empty collection if it doesn’t.

Here’s the code:

Yes, I could have written more concisely, but that would have been too much write-only for me to be comfortable with.

Once the pleasant feeling of elegance wore off a bit I wondered what is the cost of this approach. Each time you invoke merge the collection is rebuilt and, unless the compile optimizer be very clever, also each list item is cloned and the old one goes to garbage recycling.

Partitioning scans and rebuild, but since I’m using an immutable collection, also adding an item to an existing list causes a new list to be generated.

Performance matters in some 20% of your code, so it could acceptable to sacrifice performance in order to get a higher abstraction level and thus a higher coding speed. But then I wonder what about premature pessimization? Premature pessimization, at least in context where I read the them, means the widespread adoption of idioms that lead to worse performances (the case was for C++ use of pre or post increment operator). Premature pessimization may cause the application to run generally slower and makes more difficult to spot and optimize the cause.

This triggered the question – how is language idiomatic approach impacts on performances?

To answer the question I started coding the same problem in different languages.

I started from my language of choice – C++. In this language it is likely you approach a similar problem by using std::vector. This is the preferred collection and the recommended one. Source is like this:

Code is slightly longer (consider that in C++ I prefer opening brace on a line alone, while in Scala “they” forced me to have opening braces at the end of the statement line). Having mutable collections doesn’t require to warp your mind around data to find which aggregate function could transform your input data into the desired output – just find what you are looking for and change it. Seems simpler to explain to a child.

Then I turned to Java. I’m not so fond of this language, but it is quite popular and has a comprehensive set of standard libraries that really allow you to tackle every problem confidently. Not sure what a Java programmer would consider idiomatic, so I staid traditional and went for a generic List. The code follows:

I’m not sure why the inner class Data needs to be declared static, but it seems that otherwise the instance has a reference to the outer class instance. Anyway – code is decidedly more complex. There is no function similar to C++ find_if nor to Scala partition. The loop is simple, but it offers some chances to add bugs to your code. Anyway explaining the code is straightforward once the iterator concept is clear.

Eventually I wrote a version in C. This language is hampered by the lack of basic standard library – beside some functions on strings and files you have nothing. This could have been fine in the 70s, but today is a serious problem. Yes there are non-standard libraries providing all the needed functionalities, you have plenty of them, gazillions of them, all incompatible. Once you chose one you are locked in… Well clearly C shows the signs of age. So I write my own single linked list implementation:

Note that once cleaned of braces, merge function is shorter in C than in Java! This is a hint that Java is possibly verbose.

I just wrote here the merge function. The rest of the sources is not relevant for this post, but it basically consists in parsing the command line (string to int conversion), getting some random numbers and getting the time. The simplest frameworks for this operation are those based on the JVM. The most complex is C++ – it allows a wide range of configuration (random and time), but I had to look up on internet how to do it and… I am afraid I wouldn’t know how to exploit the greater range of options. Well, in my career as a programmer (some 30+ years counting since when I started coding) I think I never had the need to use a specific random number generator, or some clock different from a “SystemTimeMillis” or Wall Clock Time. I don’t mean that because of this no one should ask for greater flexibility, but that I find daunting that every programmer should pay this price because there is case for using a non default RNG.

Anyway back to my test. In the following table I reported the results.

C++ Scala Java C C++
vector list
time (ms) 170,75 11562,45 2230,05 718,75 710,9
lines 81 35 69 112 81

Times have been taken performing 100000 insertions with max id 10000. The test has been repeated 20 times and the results have been averaged in the table. The difference in timing between C++ and Scala is dramatic – with the first faster about 70 times the latter. Wildly extrapolating you can say that if you code in C++ you need 1/70 of the hardware you need to run Scala… there’s no surprise (still guessing wildly) that IBM backs this one.

Java is about 5 times faster than Scala. I’m told this is more or less expected and possibly it is something you may be willing to pay for higher level.

In the last column I reported the results for a version of the C++ code employing std::list for a more fair comparison (all the other implementations use a list after all). What I didn’t expected was that C++ is faster (even if slightly) than C despite using the same data structure. It is likely because of some template magic.

The other interesting value I wrote in the table is the number of lines (total, not just the merge function) of each implementation. From my studies (that now are quite aged) I remember that some researches reported that the speed of software development (writing, testing and debugging), stated as lines of code per unit of time, is the same regardless of the language. I’m starting having some doubt because my productivity in Scala is quite low if compared with other languages, but … ipse dixit.

Let’s say that you spend 1 for the Scala program, then you would pay 2.31 for C++, 1.97 for Java and 3.20 for C.

Wildly extrapolating again you could draw a formula to decide whether it is better to code in C++ or in Scala. Be H the cost of the CPU and hardware to comfortably run the C++ program. Be C the cost of writing the program in Scala. So the total cost of the project is:

(C++) H+C×2.31

(Scala) 68×H+C

(C++) > (Scala) ⇒ H+C×2.31 > 68×H+C ⇒ C×1.31 >67×H ⇒ C > 51.14×H

That is, you’d better using Scala when the cost of the hardware you want to use will not exceed the cost of Scala development by a factor of 50. If hardware is going to cost more, then you’d better use C++.

Beside of being a wild guess, this also assumes that there is no hardware constraint and that you can easily scale the hardware of the platform.

(Thanks to Andrea for pointing out my mistake in inequality)

Scala Days 2016 my thoughts

Scala Days 2016 is over. I’m sorry I didn’t make it to take notes of all the talks I attended, but some speakers spoke very fluently and my phone is not the best way to write down notes. English doesn’t help as well, and sometimes my understanding of the matter was lagging behind.
So what are my impressions? That’s a good question and I’m not sure about the answer. I think Scala is a very fitting solution for a specific niche. Then as every other programming language can be used for everything (I just heard that Dropbox employs 2.7 million lines of Python code).
The niche I have in mind is made of large, distributed applications, handling zettabytes of streaming data, performing math functions over them. A part of the niche could also be composed by high traffic, highly reliable web services (where service is just a general term and refers to any kind of service, including serving web pages).
In this niche using Scala, with actors and possibly Spark makes a lot of sense.
In other contexts you risk to pay the extra run for something you don’t really need – not every server software needs to scale up, not every process needs to be modeled using events. Although functional paradigm eases writing code that copes with these contexts, you still pay an extra cost.
It is hard to quantify how much. The naïve experiment of having two programmers of comparable proficiency in two different languages working at the same task is very hard to setup.
According to an old research the productivity of a programmer, intended as line per unit of time, is more or less the same regardless of the language. That means, for example, that it is cheaper to write programs in C than in assembly, because C is more expressive and higher level than assembly. I’d like to know if the same holds for Scala, where lines tend to be long concatenation of function applications. For sure this is at a more abstract level, but it requires quite an effort to write, lot of effort to understand and it is close to impossible to debug, at least with today tools.
Well back to my impressions on the state of Scala. Scala is comparatively young and suffers from its youth. There are pitfalls and shortcomings in its design (just as naturally is for every other language) that are starting to be acknowledged by the language owners. The solution they seem to prefer is to rewrite and go for a next incompatible version. This is a dangerous move as Python would teach. Also is somewhat that does not acknowledge the industry. It makes sense for a teaching tool, for a research language, but it impacts badly on industry investments.
In several talks the tenet was that although Scala allows the programmer to chose any of the supported paradigms, only functional programming is the proper way to code. Surprisingly, at least to me, Martin Odersky, the language father, doesn’t agree, when attending a talk, he claimed that multi paradigm was a bless when programming the Scala compiler.
Industry needs pragmatism, but I see it only partially. Enthusiasts may crosses the border to become zealots. And crusades are not something that could bring stability and reciprocal understanding. When I hear about functional programming revolution I am somewhat scared, I prefer evolution, acknowledging the goods of existing stuff and building over them. In revolutions many heads roll, included those of innocent people.
The most widespread background for programmers here was Java. I understand that Java is not the most exciting language. Coming from C++ I find Java quite boring. And, in fact, some, if not many, of the advantages of Scala over Java can be found in C++ as well. Unfortunately C++ falls short of open source industry standard libraries. You won’t find anything in C++ that comes close to what Spark or Akka are for Scala. Also Play – even if it doesn’t encounters a unanimous consensus – is the de facto companion library for web services and web development.
Back to Scala days (again) – it has been a positive experience, some talks needed some preliminary study even if they were marked as beginner (everything pretending to explain implicits). Other talks were quite marketing advertisements in disguise. And some were genuinely fun.
I think I got closer to this language and had great opportunities to change my point of view. My wish is for an ecosystem more attentive to the industry and that values back-compatibility  rather than see anything that breaks with the past a way to make easy money by selling technical support.

Implicits inspected and explained

Talk by Tim Soethout. The average age among attendees seem to be higher than other talks, maybe we elders are looking for understanding finally what the hell are implicits (then we would need something similar for monads…). Here we go.
Implicit enables you to use values without explicit reference. Implicit enables the relationship “is viewable as”.
As a parameter the value is taken from the calling context (eg akka sender).
Incomprehensible examples follow.
Implicits are used for DSL, type evidence (whatever this be), reduce verbosity… Other stuff I didn’t make to copy.
Caveats – resolution can be difficult to understand. Automatic conversions. For these reasons better to not overuse.
Implicit parameter just takes the only variable with a matching type as argument.
If two or more variables with a matching types are available an error is reported by the compiler.
Implicit conversion (please avoid them) allows to automatically convert from a type A to a type B every time B is required. This comes handy to convert from and to Java collection types, but it is deprecated. The new approach is to add the asScala method to the Java types.
Implicit view is not understood by yours truly, sorry.
Implicit class. This is used when a class is missing a method. You can write a wrapper to add what you need. If this is declared implicit then when the compiler finds an invocation to the new method the wrapper class is constructed and the method is invoked.
Scoping. Some explanation I didn’t understand. Interactive moment turned quite to an epic failure with most of the attendees having no idea on what was going to happen in the example shown.
Type classes. This concept is used for ad-hoc polymorphism and to extend libraries. A naive approach to serialize to json uses subclasses. All the classes need to inherit from a JsonSerializable and to define the serialization abstract method.
With type class the class to serialize do not need to derive from the same base. A generic trait is defined for the serialization method. By defining implicit variables and assigning them the class which defines the serialization for the specific type then type serializers will be uses when needed.
Other examples just make things worse for me, I though I got it, but now I’m no longer sure (also, now, rereading what I wrote, I’m unsure I understood anything at all).
Wrap up time. I still don’t get this stuff, at least not completely and only slightly more than “danger – do not use”. Type class seems to be really powerful in the sense that they lift Java from its pond to the static polymorphism of C++. But the mechanics that power this specific kind of implicit are still obscure to me. Likely nothing that a good book and some months of hard study couldn’t solve. Just it seems not to be some knowledge you can transfer in a conference.

Transactional event sourcing using slick

Talk by Adam Warski. Event sourcing means that all the changes in the system are captured as a sequence of events.
Reasons for event sourcing: keeping information (not losing info), auditing the information in the system. But it is also useful to recreate the system state.
Hibernate was a technology developed by Adam for Java.
The goal is to have event sourcing with rdbms. There are other approaches (event store, akka persistence, eventuate).
Event is immutable, is the primary source of truth. The payload is an arbitrary json, the name is the case class name. Transaction ID, event ID and user ID.
Events are stored in a dedicated table. A read model is created on the event in the crud. This gives consistency.
DBIOAction[T] gets a description of actions to be done.
User data arrives in the system, within a command (a Scala method). The command uses read to validate the data, if data is valid the command generates a series of actions to perform that updates the model. Event listeners are triggered at this time. Each event listener may trigger further events.
Demo follows for ordering a Tesla model 3.
Summary – no library, just a small adapter. Event listener performs side effects.

image

The system could have been written with free monads. That would have improved testability. But free monads don’t come for free – you have to pay a price to decode them from the code and figure out what the code is trying to do. As Adam said, maybe it’s just me.
And saying this here while presenting a relational db approach takes really a brave heart.

Connecting reactive applications with fast data using reactive streams.

Talk by Luc Bourlier. As in the previous posts these are my quick notes.
Who doesn’t know what a reactive application is? Responsive, elastic, resilient and message driven – this is what reactive apps are.
Big data means that there are too much data to be handled by traditional means on a single machine.
Fast Data are big data that comes in big volume and you want up to the second information with continuous process.
Spark streaming is the technology by light bend that does the trick.
Spark is an evolution of map reduce model. A driver program (spark context) talks to the cluster manager to get worker nodes to do the job.
Spark can be used on streams by using mini-batching. A mini batch is the work executed on data received in a unit of time.
Spark streaming deals with all kind of failures (hardware, software and network). It also handles recovery for continuous processing and deals with excess of data volume.
A demo is presented with a raspberry pi cluster. (On raspberry pi you don’t need to push the system to the limit, because you are already at the limit).
Demo ran fine, but it broke, that makes me wonder how stable is this technology. The demo model seemed quite simple.

Back pressure is the mechanism implemented by akka streaming to slow the data producer if the consumer is not able to consume data fast enough.
Congestion in spark was handled by static limit on the input rate. In spark 1.5 the limit has been changed into dynamic rate limit. There is a rate limit estimator based on PID that sets the rate limit.
There are some limitations to this method based on the assumptions used in the design – all records require about the same time to process, the process is linear (a 3rd assumption was there, but it got lost my my note taking)

Scala Days – key note

It may not be a great surprise, but the opening key note is held by Martin Odersky. I don’t feel much expectation, more or less everyone expects he’s going to repeat the opening speech of the last Scala conference in New York.
In this post I’ll just summarize the content, my considerations will be in a next one.
Scala days are going to be attended by some 1000 people. Conf app is, of course written in Scala and swift (and it’s open source) courtesy of 47 degrees.

Odersky enters with son et lumier effect, halfway between a disco and an alien abduction… Maybe both.
First he shows a steady growth of the language. Scala jobs get a little over the line. There is no comparison with Java, of course, but there is no drop in popularity.
He will talk about the future, next, mid and distant futures.
What’s next? Scala center, Scala 2.12 , Scala libraries.
Scala center is a vendor neutral initiative supported by several partners that promotes Scala, and undertakes projects that benefit all the Scala community.
Scala 2.12 is the next release of the language, about to be completed. Optimized for Java 8. Shorter code and faster execution.
This release will arrive mid year. Older Java versions will be supported by Scala 2.11 that, in turn, will be supported for quite a while.
Not many new features: 33. Main contributors are from the community.
Martin’s book “Programming in Scala” will be updated to 3rd edition to include release 2.12 of the language.

In the farther future: 2.13 will focus on libraries – Scala collections, simpler, lazy, integrated with Spark. Backward compatible.

Split Scala stdlib into core and platform. Stdlib was much prototypal with the idea that wouldn’t have lasted long.

Scala js, and Scala native.

The dot is the foundation of Scala. a mini language, small enough that programs written in this language can be machine proof. Much of the language can be encoded in this language. 8 years in the working. Language work can be done with much more confidence.

image

Type soundness, properties of code can be demonstrated (the examples say that an expression of type T produces a value of type T).
Dotty a language close to Scala compiler that produces dot code. Generics are processed with dot. Not ready for industry, but if you want to try something cool…
Faster
Goal: Best language Martin knows how to make.

Dropped: procedure syntax. Rewrite tool that will take care of translating. DelayedInit.
Macros – of was just a long run example. There will be an alternative.
Early initializers – for stuff that needs to be initialized before the base trait.
Existential Types forSome.
General Type Projection T#x.

Added: intersection and union types- types T&U just the common properties of the two types. T|U will have either the properties of T or the properties of U.

Function arity adaptation. Pairs.map ((a,b) =>a+b)

Static method for object.

Non blocking lazy Vals: locking time is much shorter now. Avoiding deadlocks.
@volatile for thread shared lazy vals.

Multiversal equality type safe equality and inequality operators. Named type parameters – partial type parametrization.

Motivations better foundation, safer,…

SBT integration. Repl with syntax highlighting. Intellij. Doc generation. Linker.

Future.
Scala meta – will replace macros and meta programming. Inline and meta. Executed by the compiler.
Implicit function types . used to compose … Just more mess.

Effect checking a->b pure function, a=>b impure. Checked by the compiler.

Nullable types T? =T | Null. Types coming from Java will have a ? Because they have side effect. It is an alternative to monads.

Generic programming.

Guard rails – how to prevent the programmers to misuse or abuse the language. Strategic Scala style: principle of least power. Strategic Scala style.

Libraries that inject bad behavior (eg implicit conversion). Implicit conversions will make a style error if public.

Syntax flexibily. Even Martin regrets the space syntax. Add @infix annotation if the author intends it to be used as infix and give a style error in other cases.

Operators are regretted as well . @infix will have the option to give names to such operators.

Scala center

Notes from “Scala center” by Heather Miller. Scala Center is a non profit organization established at EPFL. It is not lightbend. Same growth chart of yesterday, source are not cited (indeed?). Stack overflow survey reports Scala in the top 5 most loved languages.
The organization will take the burden of evolving and keeping organized libraries and language environment, educating and managing the community rather than the language itself.
Coursera Scala class is very popular (400k) with a high completion rate. There will be 2 new courses on the new coursera platform. Unverified courses are free, verified and certified courses are paid.
Functional programming in Scala – 6 weeks.
Functional program design in Scala – 4 weeks.
Parallel programming – 4 weeks.
Big data analysis in Scala and spark – 3 weeks.

My (somewhat cynic) impression – lot of work and desperate needs for workforce, they are looking to get for free by grooming the community.
EPFL funds for 2 ppl for moocs . donations from the industry and revenues from moocs.
Lightbend? Will continue to maintain the stable Scala.
Package index is not yet available for Scala. Aka people should be able to publish their projects and get them to be used without the need of being a salesperson.
Scala library index. Index.scala-lang.org
It is an indexing engine.

Just wondering – is this a language for the academia or for the industry? Keep changing things and the investments made by the industry will be lost: language is going to change, base libraries are going to change as well… Which warranties do I have that my code will still compile 5 years ahead in the future?
Changing things is good for the academia since it allows to do research and to better teach new concepts. It doesn’t harm the community where workforce is free and there is no lack of people to redo the same things with new tech.