Lambda World 2017 – Workshop – Don’t fear the Optics

This talk by Jesús López González was quite clear (at least to my challenged functional understanding). As strange as it may sound, the whole idea of optics (in Functional Programming) is to solve a problem that exists only because of the functional paradigm. Cheap humor aside, it makes sense – in structured programming you forbid the goto statement, and then you need other tools (e.g. break, continue, structured statements) to do the same job in a safe, sound and controlled way.

You can find sources for the running example here: https://github.com/hablapps/dontfeartheoptics.git

But I don’t want to steal the stage. As usual all mistakes and false predicates are mine (my only defense is that the talk was given without a microphone or loudspeakers).

Just one last note before starting – the original slides of the presentation used a compact syntax, relying on Scala’s high-sucrose diet. I opted for a more verbose syntax that makes clear to those less fluent in Scala what’s happening behind the sugar curtain.

Functional Programming is a programming paradigm […] that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data.

(from Wikipedia)

In functional programming we choose not to change the state of a variable once it has been assigned. That means that when we want to change something, we have to create a new instance from the existing one.

Consider the position class:

To change the position to a new one, the “changing” method just takes the existing pos1 and creates a new instance with the new state:
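The original listing is not reproduced here, so what follows is only a minimal sketch of the idea. The field names i and j and the moveRight helper are my assumptions, not the talk's code.

```scala
object PositionSketch {
  // Hypothetical position type; the field names i (row) and j (column) are assumptions.
  final case class Pos(i: Int, j: Int)

  // "Changing" a position: build a new instance and leave the original untouched.
  def moveRight(p: Pos): Pos = p.copy(j = p.j + 1)

  val pos1: Pos = Pos(1, 1)
  val pos2: Pos = moveRight(pos1) // pos1 is still Pos(1, 1)
}
```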

The running example for this talk is a CandyCrush clone. Here are the main classes:

Modules are defined as follows:

  • Candy REPL – IO
  • Candy Business logic – state program
  • Candy data layer – data structures & optics

To face the problems posed by immutable state we resort to the Half-Life narration – who better than Gordon Freeman, the man with a big lambda on his chest, could help us in the process?

The talk uses an explorative approach – you may want to explore the area to locate the problem (the Alien), then try to solve it using some techniques (equipping new weapons) and then refine the solution until you find an elegant way to fix the problem.

The first enemy to defeat is how to keep state unchanged.

CandyState.scala is the source file where the “getters and setters” are located. There are several points where you need to update the state as the game progresses.

[NdM: note that, specific to this paradigm, these methods accept a function that transforms a score (or playfield) into a new score (or playfield) and return a function that changes the Level accordingly. I found this revealing and somewhat mind-boggling.]

How would you do this in the traditional way?
To modify a nested field you need to copy:

This is not straightforward, at least not in general, because you need to copy through several indirection levels. Functional programming is about elegance and modularity, not this.
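To make the pain concrete, here is a rough sketch of such a nested copy. The case classes are assumed shapes that follow the level -> board -> matrix path; candies are plain strings just to keep the example short.

```scala
object NestedCopySketch {
  // Assumed shapes, not the actual classes of the running example.
  final case class Pos(i: Int, j: Int)
  final case class Board(matrix: Map[Pos, Option[String]])
  final case class Level(currentScore: Long, board: Board)

  // Emptying one cell means re-copying every layer on the way down.
  def crushAt(lv: Level, pos: Pos): Level =
    lv.copy(board = lv.board.copy(matrix = lv.board.matrix.updated(pos, None)))

  val matrix0: Map[Pos, Option[String]] =
    Map(Pos(0, 0) -> Some("red"), Pos(0, 1) -> Some("blue"))
  val lv0: Level = Level(0L, Board(matrix0))
  val lv1: Level = crushAt(lv0, Pos(0, 0))
}
```

With only two levels of nesting the copy chain is already hard to read; every extra level adds another copy.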

(Alien identified!)

Lenses come to the rescue (or, as Jesús put it, the Lens – the crowbar in the Half-Life analogy – is the weapon to equip). Lens is a parametric class defined over two types: S – the whole and A – the part:

The get method accepts a whole and returns a part. The set method accepts a part and tells you how to change the whole to incorporate that part.
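A hand-rolled sketch of such a class could look like the following. It is not Monocle's actual definition, and the Level fields are assumptions made for the example.

```scala
object LensSketch {
  // S is the whole, A is the part.
  final case class Lens[S, A](get: S => A, set: A => S => S) {
    // modify is derived: read the part, transform it, write it back.
    def modify(f: A => A): S => S = s => set(f(get(s)))(s)
  }

  final case class Level(currentScore: Long, targetScore: Long) // assumed fields

  // A lens written by hand for one field.
  val currentScore: Lens[Level, Long] =
    Lens[Level, Long](_.currentScore, sc => lv => lv.copy(currentScore = sc))

  val lv0: Level = Level(10L, 100L)
  val lv1: Level = currentScore.set(25L)(lv0)
  val lv2: Level = currentScore.modify(_ + 5L)(lv1)
}
```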

The code works, but it is lengthy and boring to write. So we can take advantage of the Lens defined for us by the @Lenses annotation.

[NdM: I’m going to expand the talk a little bit here because I lost some passages and I reconstructed them thanks to my Scala speaking friends]

This annotation instructs the compiler to create one lens method in the companion object for each case class field. [NdM: Oddly enough, for those of us coming from traditional programming languages, the lens has the same name as the case class field].

[NdM: original example exploited import to inject in the current scope the companion object’s fields, creating a bit of confusion in my mind. In the following examples I will avoid this shortcut in favor of readability].

What if we want to extract the matrix from a level? Operationally we have to navigate through the board (level->board->matrix). This can be done via composition, using the verb composeLens:

[NdM: my Scala-speaking friend also told me that def has been used without a real advantage over val. Using val would have avoided an unneeded function call.]

The same applies to modify:
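Here is a self-contained sketch of both steps, using a hand-written Lens with a composeLens method. The method name mirrors the verb used in the talk, but the implementation and the case classes are my own assumptions.

```scala
object ComposeLensSketch {
  final case class Lens[S, A](get: S => A, set: A => S => S) {
    def modify(f: A => A): S => S = s => set(f(get(s)))(s)
    // Focus deeper: a lens from S to A followed by a lens from A to B.
    def composeLens[B](other: Lens[A, B]): Lens[S, B] =
      Lens[S, B](
        s => other.get(get(s)),
        b => s => set(other.set(b)(get(s)))(s)
      )
  }

  // Assumed shapes; candies are plain strings for brevity.
  final case class Pos(i: Int, j: Int)
  final case class Board(matrix: Map[Pos, Option[String]])
  final case class Level(currentScore: Long, board: Board)

  val board: Lens[Level, Board] =
    Lens[Level, Board](_.board, b => lv => lv.copy(board = b))
  val matrix: Lens[Board, Map[Pos, Option[String]]] =
    Lens[Board, Map[Pos, Option[String]]](_.matrix, m => b => b.copy(matrix = m))

  // level -> board -> matrix in one composed lens.
  val levelMatrix: Lens[Level, Map[Pos, Option[String]]] = board composeLens matrix

  val lv0: Level = Level(0L, Board(Map[Pos, Option[String]](Pos(0, 0) -> Some("red"))))
  // modify through the composed lens: empty the cell at (0, 0).
  val lv1: Level = levelMatrix.modify(_.updated(Pos(0, 0), None))(lv0)
}
```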

This syntax is slick, but still more verbose than say Haskell where you write just a dot instead of the composeLens verb.

# 2nd Enemy – Threading State Zombie (State Monads)

Consider the function

Its purpose is to crush a given position. This is accomplished by updating the map and updating the score. Additionally, we want the function to return a pair composed of the level and the new score. The first implementation you may want to try is to navigate through the level to change the matrix, then navigate through the updated version of the level to update the score, and then prepare the pair with the updated level and the score.

This works, but it is error-prone because the programmer must make sure that every change is properly piped through the transformations.

The new weapon is the State:

This class defines a mechanism to execute a given action on an object and produce the updated object and a value. And it can be used like:
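As a rough illustration, the mechanism can be sketched by hand as below. This is not the definition used in the talk or in any library; the modify and gets helpers and the Level fields are assumptions.

```scala
object StateSketch {
  // A State wraps a function from a state S to an updated S plus a value A.
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State[S, B] { s => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State[S, B] { s => val (s1, a) = run(s); f(a).run(s1) }
  }

  // Hypothetical helpers: change the state, or read something from it.
  def modify[S](f: S => S): State[S, Unit] = State[S, Unit](s => (f(s), ()))
  def gets[S, A](f: S => A): State[S, A]   = State[S, A](s => (s, f(s)))

  final case class Level(currentScore: Long, candies: Int) // assumed fields

  // Remove a candy, bump the score, then return the new score as the value.
  val crush: State[Level, Long] =
    for {
      _     <- modify[Level](lv => lv.copy(candies = lv.candies - 1))
      _     <- modify[Level](lv => lv.copy(currentScore = lv.currentScore + 1L))
      score <- gets[Level, Long](_.currentScore)
    } yield score

  val (lv1, score) = crush.run(Level(0L, 3))
}
```

Nothing happens until run is applied to an actual Level; before that, crush is only a description of the steps.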

This may be more elegant, but it can hardly be called better, and it still requires the programmer to properly set up the execution pipe. [NdM: also note that the first part (from val to run(lv0)) may be replaced by the more compact val lv1 = (board composeLens matrix).modify(mx => mx.updated(pos, None))(lv0)]

This can be improved by using an implicit MonadState, that is a class implicitly built from a State class that can be bound together using the >> operator. In code:

Our code becomes:

[NdM: be careful in placing the >> operator! First, IntelliJ is not able to recognize it and marks it as an unknown operator; second, because Scala is forgiving about punctuation, you need to place >> on the same line as the closing bracket. I couldn’t figure out the right way to write this until I mailed the author of the talk for help. He responded quickly and kindly and set me in the right direction. Thanks Jesús]

[NdM: a quick word on binding. Bind is the same as flatMap, that is, the way monads transform their content. In this case the binding allows you to compose the two run actions into one. Since each computation accepts one value and produces two, you may wonder what happens to the side value (in our case the score) of the first run. Answer – it gets discarded and only the last one is produced in the final result]
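The discarding behaviour can be seen in a small hand-written sketch. The >> below is defined by me on a toy State, as an assumption about how such an operator behaves, not as the library's implementation.

```scala
object BindSketch {
  final case class State[S, A](run: S => (S, A)) {
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State[S, B] { s => val (s1, a) = run(s); f(a).run(s1) }
    // Sequence two actions, ignoring the value produced by the first one.
    def >>[B](next: State[S, B]): State[S, B] = flatMap(_ => next)
  }

  // Two toy actions on an Int state.
  val bump: State[Int, String] = State[Int, String](n => (n + 1, "bumped"))
  val twice: State[Int, Int]   = State[Int, Int](n => (n * 2, n * 2))

  // The state is threaded through both; the "bumped" value is dropped.
  val (finalState, value) = (bump >> twice).run(3)
}
```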

MonadState can be composed so they pipe the result one through the other.

[NdM: now, this is a bit more complex to digest – where does the .mod come out? And more importantly how does .mod know what to return? .mod is a method of the StateLens object (well, nearly true, but true enough for the sake of this analysis). Always remember that you are not dealing with actual values, but you are forging functions that will need to be called/applied on actual values. Note that the lenses in the expression are both on the Level class, so the state generated by mod operates on the Level class. The additional type is derived by the right operand of the >> operator.]

# Enemy 3: optional antlion

So far so good, but there are still other entities that cannot fit properly in the picture. What about getting and modifying the current score from the Game? The problem we face is that level is an option in Game. Lenses can’t be used with a plain-vanilla approach.

Let’s try a first attempt at a solution:

extract is a method of the State.

Nice, but cumbersome. The abstraction we can use now is the Prism (which is defined in monocle, roughly in the following way):

The first method takes an object and produces an option, the second method rebuilds the object given a part. So, let’s define our prism:
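As an illustration only, a prism over the content of an Option could be hand-written as follows. The Prism class and the optionPrism name are my assumptions, not Monocle's code.

```scala
object PrismSketch {
  // getOption may fail to find the part; reverseGet rebuilds the whole from a part.
  final case class Prism[S, A](getOption: S => Option[A], reverseGet: A => S) {
    // Change the part if it is there, otherwise leave the whole as it is.
    def modify(f: A => A): S => S =
      s => getOption(s).map(a => reverseGet(f(a))).getOrElse(s)
  }

  // Hypothetical prism focusing on the value inside an Option.
  def optionPrism[A]: Prism[Option[A], A] =
    Prism[Option[A], A](opt => opt, a => Some(a))

  val bumped: Option[Int]    = optionPrism[Int].modify(_ + 1)(Some(41))
  val untouched: Option[Int] = optionPrism[Int].modify(_ + 1)(None)
}
```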

This prism deals with an Option[A], but Monocle already provides you with this tool and it’s called some:

# Final Enemy: Multiple Fast Zombies

Now we want to crush an entire column of the board.

We can combine lenses and prisms into something else, the Optional:

As with the Prism, we have a getOption method that exposes the Option but, instead of reverseGet, there is a set that transforms an S into another S given an A.

Optionals can be created by composing prisms and lenses as follows:
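A hand-rolled sketch of that composition is shown below. The lensThenPrism helper, the Game and Level fields and the three optic classes are all assumptions made for the example; Monocle has its own compose methods.

```scala
object OptionalSketch {
  final case class Lens[S, A](get: S => A, set: A => S => S)
  final case class Prism[S, A](getOption: S => Option[A], reverseGet: A => S)
  final case class Optional[S, A](getOption: S => Option[A], set: A => S => S) {
    def modify(f: A => A): S => S =
      s => getOption(s).map(a => set(f(a))(s)).getOrElse(s)
  }

  // Lens followed by prism: the part may be missing, so the result is an Optional.
  def lensThenPrism[S, A, B](l: Lens[S, A], p: Prism[A, B]): Optional[S, B] =
    Optional[S, B](
      s => p.getOption(l.get(s)),
      // set only when there is something to replace
      b => s => p.getOption(l.get(s)).fold(s)(_ => l.set(p.reverseGet(b))(s))
    )

  final case class Level(currentScore: Long)                // assumed fields
  final case class Game(user: String, level: Option[Level]) // assumed fields

  val level: Lens[Game, Option[Level]] =
    Lens[Game, Option[Level]](_.level, l => g => g.copy(level = l))
  val some: Prism[Option[Level], Level] =
    Prism[Option[Level], Level](opt => opt, lv => Some(lv))

  val currentLevel: Optional[Game, Level] = lensThenPrism(level, some)

  val playing: Game = Game("gordon", Some(Level(10L)))
  val idle: Game    = Game("gordon", None)
  val scored: Game  = currentLevel.modify(lv => lv.copy(currentScore = lv.currentScore + 5L))(playing)
  val stillIdle: Game = currentLevel.modify(lv => lv.copy(currentScore = 99L))(idle)
}
```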

So that we can write our first iteration of the crushColumn method as:

This function operates on the game matrix and removes the candy when the column of the position is the same as j.

It is not bad per se, but we are doing the filtering manually. The solution could be improved by using a filterIndex:

Let’s see how to automate it. Let’s introduce the Traversal abstraction:

Now it is possible to compose the Traversal with other lenses such as:

FilterIndex is a Monocle function that, along with the implicit mapFilterIndex, allows the optic to be applied over the Map collection.
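The idea behind filtering by index can be sketched without the library. The Traversal class and the filterIndex function below are simplified stand-ins written for this post, not Monocle's definitions.

```scala
object TraversalSketch {
  // A traversal modifies zero or more parts of type A inside a whole S.
  final case class Traversal[S, A](modify: (A => A) => S => S)

  final case class Pos(i: Int, j: Int) // assumed: i is the row, j the column
  type Matrix = Map[Pos, Option[String]]

  // Focus on every value whose key satisfies the predicate.
  def filterIndex[K, V](p: K => Boolean): Traversal[Map[K, V], V] =
    Traversal[Map[K, V], V](f => m => m.map { case (k, v) => if (p(k)) (k, f(v)) else (k, v) })

  // Empty every cell of column j.
  def crushColumn(j: Int): Matrix => Matrix =
    filterIndex[Pos, Option[String]](_.j == j).modify(_ => None)

  val before: Matrix =
    Map(Pos(0, 0) -> Some("red"), Pos(0, 1) -> Some("blue"), Pos(1, 0) -> Some("green"))
  val after: Matrix = crushColumn(0)(before)
}
```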

Since compose syntax may tend to be a bit verbose, you can also use the following operators:

  • ^<-?  compose with prism
  • ^|->  compose with a Lens
  • ^|->> compose traversal

Conclusions: Optics are abstractions for changing parts of wholes. These abstractions are composable to access complex data. The Monocle library provides a hybrid of concrete and van Laarhoven optics [NdM: sorry I missed the explanation entirely]. State monads encapsulate state threading and produce output values.


Max’s comment – The talk was very helpful in improving my understanding of this aspect of functional programming. I still find Scala syntax to be on the verge of cryptic, and dealing with new concepts doesn’t help either. Composing things the way functional programming does is a really powerful mechanism that lets the programmer reuse code effectively.

I find that the use of symbols to further reduce the character count is really dangerous. But this is a topic for another post. Let’s just say that Scala is in danger of becoming write-only code 🙂 Looking forward to attending the next Lambda World!

Lambda World 2017 – Unconference – Category Theory Crash Course

Here I am in Cadiz to attend the 2017 edition of Lambda World. Cadiz is a really nice setting for this conference, and the weather is so good that it feels like vacation time rather than mid-autumn.

Anyway, this morning I attended the “unconference”, a sort of informal prelude to the real conference. After the first, planned talk, people could register to present their own talk and the other attendees could vote for their preferred one. This yielded two interesting talks – one on declarative testing and the other on category theory. Here is what I learned from the latter, by Rúnar Bjarnason.

First, this is algebra stuff, really. Not the kind of ‘a+b’ you do in middle school, but the higher-level stuff you do later when you study groups, monoids and the like.

How these concepts apply to practical programming still eludes me, but I sense that something is there, waiting for my brain to grasp it.

Basically, an abstract algebra of functions wants to answer the following questions – what do functions do? And what can we say about them? Here we go; all mistakes, errors and inconsistencies are mine alone. Blame me, not Rúnar.

A Category is defined by objects and arrows and how we can compose those arrows.

If you consider language types as objects, then functions can be thought of as arrows from one object to another:

Some types: A, B, C.
Some functions: f, g.

A composite function is formed by invoking a function on the result of another one:

As long as the types line up we can keep chaining function composition.

Every object has an arrow that links it to itself and changes nothing (in this example, a function from a type to itself that returns its argument); this is the identity function:

Sometimes composition is written with a dot. An interesting property is that composition is associative:

(so no need for parentheses).
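These properties can be checked on ordinary Scala functions. The three functions below are arbitrary examples chosen for this sketch, not the ones from the talk.

```scala
object CompositionSketch {
  val f: Int => Int    = _ + 1
  val g: Int => String = _.toString
  val h: String => Int = _.length
  val id: Int => Int   = x => x

  // Associativity: both groupings denote the same function.
  val left: Int => Int  = (h compose g) compose f
  val right: Int => Int = h compose (g compose f)

  // Identity: composing with the identity arrow changes nothing.
  val fAfterId: Int => Int = f compose id
  val idAfterF: Int => Int = id compose f
}
```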

In Scala you have a category for functions and types, but also for types and inheritance (subtyping). This category is defined as follows:

Objects: Scala Types: A, B, C…
Arrows: Subtype relations: A <: B, B <: C…
Composition: [ A <: B, B <: C ] => A <: C
Associativity: A <: B <: C (no need for parentheses)
Identity: A <: A

Another example of a category is a preorder (<=). Objects are the values of a type on which the binary relation <= is defined:

Objects: a, b, c…
Arrows: a <= b, b <= c
Composition: transitivity of <=
Identity: a <= a

E.g. the natural numbers with <= form such a category.

Monoids are categories with one object. This object, by convention, is written as ‘*’. All arrows link * to itself. In a change of notation I’ll use the symbol |+| for composition and ‘mzero’ for identity.

Objects: *
Arrows: f, g, h: * -> *
associativity
identity: * -> *
mzero |+| a = a
a |+| mzero = a

The object is just an anchor to hold the arrows.
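In code, a monoid can be sketched as a small trait. The names Monoid, mzero and combine (standing in for |+|) are my choice for this example rather than a specific library's API.

```scala
object MonoidSketch {
  trait Monoid[A] {
    def mzero: A
    def combine(x: A, y: A): A // written |+| in the notes above
  }

  val intAddition: Monoid[Int] = new Monoid[Int] {
    def mzero: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  val stringConcat: Monoid[String] = new Monoid[String] {
    def mzero: String = ""
    def combine(x: String, y: String): String = x + y
  }

  // The laws listed above, as checks on sample values.
  def identityLaw[A](m: Monoid[A], a: A): Boolean =
    m.combine(m.mzero, a) == a && m.combine(a, m.mzero) == a
  def associativityLaw[A](m: Monoid[A], a: A, b: A, c: A): Boolean =
    m.combine(m.combine(a, b), c) == m.combine(a, m.combine(b, c))
}
```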

Category of preorders:
A preorder is a set with <=
Objects: all the preorders (Scala integers)
arrows: Homomorphisms on preorders

[NdM: at this point things start to be complicated and I have likely lost the speaker. What follows are likely to be glimpses of his talk… the parts I got]

Monotonic functions.
e.g. string length: _.length is a monotonic function.
It preserves the relationship:
if x <= y then h(x) <= h(y)

Category of monoids:
Objects: all the monoids
Arrows: monoid homomorphism
h( x |+| y) == h(x) |+| h(y)
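A familiar instance, given here as my own example: string length maps the monoid of strings under concatenation to the monoid of integers under addition.

```scala
object HomomorphismSketch {
  val h: String => Int = _.length

  // h(x |+| y) == h(x) |+| h(y), with + as the |+| of each monoid.
  def preservesCombine(x: String, y: String): Boolean =
    h(x + y) == h(x) + h(y)

  // The identity of one monoid is mapped to the identity of the other.
  val preservesIdentity: Boolean = h("") == 0
}
```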

Category of categories
Objects: all the categories
Arrows: category homomorphism F
F(f . g) = F(f) . F(g), e.g. _.map(f compose g) = _.map(f) compose _.map(g)
these are functors.
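The law can be tried out on List. The two functions are arbitrary picks for this sketch.

```scala
object FunctorLawSketch {
  val g: Int => Int    = _ * 2
  val f: Int => String = n => s"#$n"

  val xs: List[Int] = List(1, 2, 3)

  // Mapping the composite once...
  val fused: List[String] = xs.map(f compose g)
  // ...gives the same result as mapping g and then f.
  val twoStep: List[String] = xs.map(g).map(f)
}
```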

Given a category C, there is a category of endofunctors on C
endofunctor F[_]
Objects: functors
arrows: functor homomorphism
h: F[A] => G[A]
h compose _.map(f) = _.map(f) compose h

trait ~>[F[_], G[_]] { def apply[A](a: F[A]): G[A] }
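As a concrete case (my example, not one from the talk), headOption is a transformation from List to Option, and the commuting condition above can be checked on sample data.

```scala
object NaturalTransformationSketch {
  trait ~>[F[_], G[_]] { def apply[A](fa: F[A]): G[A] }

  // List ~> Option: keep the first element, if any.
  val headOption: List ~> Option = new (List ~> Option) {
    def apply[A](fa: List[A]): Option[A] = fa.headOption
  }

  val f: Int => String = _.toString
  val xs: List[Int]    = List(1, 2, 3)

  // h compose _.map(f)  versus  _.map(f) compose h
  val mapThenH: Option[String] = headOption(xs.map(f))
  val hThenMap: Option[String] = headOption(xs).map(f)
}
```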

A monad on C is an endofunctor M[_] together with two natural transformations:

pure[A] : A=>M[A]
join[A] : M[M[A]] => M[A]

A monad is a kind of category:
M[M[M[A]]]: the layers can be joined in any order, preserving the result.
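Taking List as M, a small sketch shows both operations and the "any order" property. The pure and join functions are written here for List only, as an illustration.

```scala
object MonadSketch {
  def pure[A](a: A): List[A] = List(a)
  def join[A](mma: List[List[A]]): List[A] = mma.flatten

  // Three layers of List.
  val nested: List[List[List[Int]]] = List(List(List(1), List(2, 3)), List(List(4)))

  // Join the two inner layers first, then the outer one...
  val innerFirst: List[Int] = join(nested.map(inner => join(inner)))
  // ...or the two outer layers first, then the inner one.
  val outerFirst: List[Int] = join(join(nested))

  // pure followed by join gives back the original value.
  val roundTrip: List[Int] = join(pure(List(1, 2)))
}
```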

A monad is a monoid in an endofunctor category.

[And with this truth the brief talk is over]

Danger Level of Step Choreography

(original article)

Step choreography is one of my favorite sports activities. As I have already written, I like step because it requires both coordination and memory, and it puts your muscle memory to the test at the rhythm of dance music. Over the past year and a half I have been lucky enough to have a stellar instructor, able to propose new, exciting and sometimes crazy choreographies at every lesson. Her choreographies are a lot of fun, but sometimes they call for moves that carry some level of danger. If you get distracted, or you are not fast enough at retraining your muscle memory, you can miss the step or collide with someone else.
I prepared a table (similar to my “Weekend Effectiveness Degree”) that assigns a number of skulls to each dangerous feature of a choreography. Simply add one for each feature in the choreography to get the danger level. Today we reached 3 skulls.

  • Step across the step – you have to extend your leg beyond the step, with the risk of hitting the step with your heel and losing your balance.
  • Backward turn on the floor ending with one foot on the step – you cannot see the step while you turn, but you need a fairly precise idea of where it is, otherwise you can hit it with your foot or place your foot only partially on it and lose your balance.
  • Two steps each – usually the steps are quite close to each other and the choreography normally includes moves with one foot between the two steps. You can trip over a step or place your foot badly. If the steps are very close you can trip over your own legs.
  • Close steps – when your step is next to another person’s, good synchronization becomes necessary, otherwise you can collide. Telling left from right usually helps.
  • Shared steps – at certain points of the choreography two people share the same step. Usually the choreography leaves enough room for both, but here too you need good timing and good synchronization to move in the right direction.
  • Step swap – everyone has their own step, but at some point everybody moves to another step at the same time. Usually the choreography creates a timed corridor that lets you reach your destination. “Timed” is the key.
  • Precise synchronization – two people are quite close, likely on the same step, but perform different moves. The moves combine with the right timing so that there is no collision. This is one of the most dangerous situations because you cannot rely on symmetry or take a cue from your partner’s moves; any mistake, even a small one, causes a collision.

Step Choreography Danger Level

Choreographic step is one of my favorite fitness activities. As I wrote in the past, I like step because it requires coordination and memory, and it challenges your muscle memory at the rhythm of dance music. In the past year and a half I was lucky enough to have a stellar step instructor. She is able to propose new, exciting and sometimes crazy choreographies at each lesson. Her choreographies are usually a lot of fun, but sometimes they require you to perform moves that carry some level of danger. If you get distracted, or you are not fast enough to retrain your muscle memory, you may miss the step or smash into another stepper. Here is a table (similar to my Week-end Effectiveness Degree table) that defines a number of skulls for each dangerous feature of the choreography. Just add one for each feature in the choreography and you get the danger level. Today we reached 3 skulls.

  • Step across the step – you need to extend your leg across the step, risking hitting the step edge with your heel and losing your balance.
  • Turn backward on the floor and end with a foot on the step – you can’t see the step, yet you need a fairly good idea of where it is, otherwise you may hit it with your foot or put your foot half on the step, losing your balance.
  • Two steps per person – usually the steps are quite close to each other and the choreography expects you to move placing a foot between the two steps. You can stumble into a step, or place a foot half on a step. If the steps are really close you may stumble over your own legs.
  • Close steps – when your step is adjacent to another person’s step, you need to synchronize properly, otherwise you may smash into each other. Knowing left from right is usually helpful.
  • Shared steps – at some point of the choreography two people share the same step. Usually the choreography manages to give each one enough space to move, but again you need good timing and good sync to move the right way.
  • Exchanging steps – everyone has their own step, but at a given time everybody moves to another step all together. Usually the choreography manages to create a timed corridor to reach your destination without smashing into anyone. “Timed” is the key.
  • Close synchro – two people are close enough, likely on the same step, and perform different moves in turn. The moves combine at the right time so that there is no collision. This is one of the most dangerous, since you can’t rely on symmetry or get any hint from your pal; any slight error will cause a collision.

Snowshoe hike at Rhêmes-Notre-Dame

Seiser Alm – pictures

From our latest vacation in the South Tyrol Alps

Reacting to the movies: reactive systems and the lessons of narrative.

Another Reactive Summit 2016 talk. The speaker was Steve Moore – IBM story strategist. My comments are at the end. (When writing about heroes I used “she” as the pronoun; if you are sensitive to this, use sed s/she/she\/he/g).

Thinking differently helps you gain new perspectives on your work. That’s the job of a story strategist, and it is the idea behind this talk: finding analogies between reactive systems and movies.

Stories behave like reactive systems; you can see three analogies:

  1. hero as scalability,
  2. plot as resilience,
  3. genre as elasticity.

Scalability: heroes are under constant pressure to scale. Scaling is usually achieved via some resource. The best resource a hero has at her disposal is a new understanding of the world.

There are three forms of escape in the scaling up of the hero.

Escape the skin. The hero is trying to escape the fake reality she’s living in to reach the real and more complex reality behind it. Paranoia and questioning the foundations of the system are the means to achieve this escape. This can be summed up by the question: “Are you being paranoid enough?”

Escape stoicism. The hero lives in a simple world that avoids emotional entanglement. The hero takes on more responsibility than she thinks she can bear. Scalability and generosity go hand in hand. Question: “Are you being generous enough?”

Escape the story. The hero lives in a world she wants to escape. This is different from the deception of escaping the skin (first form). The question is “Are you being patient enough?” Sometimes you need to take the time to observe and understand.

Plot as resilience

The plot directs the hero where the story needs her. Stories are all about failures – the plot is just a sequence of failures that get the hero further and further away from her goal.

Heroes never ever give up. Likeability is not required of heroes.

3 ways the hero fails:

Lure of the invisible. This is a kind of accumulation of something that is somewhat hidden. It can be a log file that grows without anyone noticing, or some peril that builds up unnoticed over time. “What is the system accumulating?” This is not an issue until it is. This kind of failure also includes non-technical accumulation such as management attention, customer complaints…

Lure of inclination. The hero would like something, but she has to put it aside to pursue her goal. Following inclinations may get the hero into hot water, but in the end it brings benefits. “Can you make the system fragile?” [NdM sorry, it is likely I missed something in the translation]

Lure of independence. The story has a lot of moving parts, apparently unrelated. Using microservices the system may be better managed and written by smaller teams. “Could coordination be fun?”

Coordination among the teams is needed.

Genre as elasticity.

Elasticity means that response does not change when workload changes or accidents happen. What defines the genre is a set of rules. Genres are categories of antagonists. An antagonist puts strain on the hero.

Three different kinds of genres:

Human v. Human. The analogy in systems is the threat posed by hostile human actions. “Can you change the game?”

Human v. Nature. The natural world is challenging the hero. Natural antagonists are simple, unbeatable and unchangeable. These stories are about survival. If you live, you deserve to live. “Do you deserve to live?” If the system were in danger, would these people do something to protect it?

Human v. self. The hero is also the antagonist. The hero doesn’t want to push away something that is abundant in her life. This kind of abundance forces the hero to question reality: “Can you endure abundance?” We need to transform ourselves and the system at the same time.

This is the only way the system has to survive.

This talk was very interesting, even if not very practical or even theoretical. I like the idea that adding words to the vocabulary, and concepts to the programmer’s resources, may help to devise better systems or to enhance the understanding of a system. It makes food for lateral thinking (lateral food?).

One of the first people we met at the luminaries barbecue the day we arrived had the job title “Chief Storyteller”. Maybe this is a sign of the times.

A Journey to Modern Apps Through Containers and Microservices

Here I am, warped through jet lag to sunny Austin, Texas. The reason is to attend the Reactive Summit 2016: two days crammed with interesting speeches by luminaries of the reactive community. I’ll try my best to provide a brief summary of each talk I attend, faithfully reporting what the speaker said and adding my thoughts. If you know about reactive systems and you read something unreasonable, it is likely I missed something in the process.

This talk is by Edward Hwu – Mesosphere.

(I’ll leave out the quote about Justin Bieber being the reason for having reactive systems… in part because I missed something).

If you compare the number of Internet users today with the same number a few years ago, you will notice that it has increased manyfold. Almost everyone owns a smartphone and almost everyone uses it to interact with the Internet. Companies that do business over the Internet need to be sure they can cope with this large number of interactions in order not to lose customers and business opportunities.

Because of the increase in the number of Internet users an architecture shift was needed: distributed systems are the only way for industry to cope with this growth.

Modern applications that run websites are made from open source software; VMs have been replaced by containers, which have shorter startup times and are lighter to run.

Additionally, containers provide the means for elasticity – they can be used just for the time needed to run the service – and allow for fast and frequent releases.

Data services provide connection and persistence.

Mesos is the core component of DC/OS, whose purpose is to manage and abstract the data center in the same way a traditional OS abstracts the hardware. Where the traditional OS operates on a single computer, DC/OS operates on the whole data center.

Operations such as installing Kafka or Spark on a cluster, with the proper configuration and replica count, are handled with a single click. You can find services ready to install and use on the open Universe app store. As of today you can find 40+ apps.

Here’s my thought. DC/OS seems to be the real shift in distributed systems. My impression is that the era of custom-made software to handle the specific aspects of distributed systems (such as load balancing or provisioning) is going the way of the dodo, since structured and standard solutions are being developed.

Being able to handle the whole datacenter with simple commands is possibly going to revolutionize the datacenter in the same way Docker revolutionized application deployment. The latter spares the developer from manually keeping track of all the dependencies and from providing a sensible installer; the former spares manual installation on multiple nodes with configuration twiddling.

Maybe this technology is a bit ahead of its time – I’m saying this because I found no trace of the Universe app store on the Internet (maybe I didn’t search enough) – but it is certainly very promising, and being open source and free for all should earn it widespread adoption.

Comparisons

Recently I had to spend some time trying to adapt my imperative/OO background to a piece of code I needed to write in the functional paradigm. The problem is quite simple and can be described briefly. You have a collection of pairs, each containing an id and a quantity. When a new pair arrives you have to add its quantity to the pair with the same id in the collection, if it exists; otherwise you have to add the pair to the collection.

Functional programming is based on three principles (as seen from an OO programmer) – variables are never modified once assigned, no side effects (at least, avoid them as much as possible), no loops – work on collections with collective functions. Well, maybe I missed something like monad composition, but that’s enough for this post.

Thanks to a coworker I wrote Scala code that follows all the aforementioned principles and is quite elegant as well. It relies on the “partition” function, which transforms (in a functional fashion) a collection into two collections containing the elements of the first one partitioned according to a given criterion. The criterion is the equality of the id, so that I find the element with the same id if it exists, or just an empty collection if it doesn’t.

Here’s the code:
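The original listing is not reproduced in this text. A minimal sketch of a partition-based merge, assuming a Data(id, quantity) pair and an immutable List, might look like this:

```scala
object MergeSketch {
  // Assumed shape of the (id, quantity) pair.
  final case class Data(id: Int, quantity: Int)

  def merge(collection: List[Data], incoming: Data): List[Data] = {
    // Split the collection: elements with the same id, and all the others.
    val (sameId, others) = collection.partition(_.id == incoming.id)
    sameId match {
      // Found: add the quantities and put the merged pair back.
      case existing :: _ => Data(existing.id, existing.quantity + incoming.quantity) :: others
      // Not found: the incoming pair joins the collection.
      case Nil           => incoming :: others
    }
  }
}
```

Every call walks the whole list and builds new ones, which is exactly the cost discussed in the rest of the post.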

Yes, I could have written it more concisely, but that would have been too write-only for me to be comfortable with.

Once the pleasant feeling of elegance wore off a bit, I wondered what the cost of this approach is. Each time you invoke merge the collection is rebuilt and, unless the compiler’s optimizer is very clever, each list item is also cloned and the old one goes to garbage collection.

Partitioning scans and rebuilds, but since I’m using an immutable collection, adding an item to an existing list also causes a new list to be generated.

Performance matters in some 20% of your code, so it could be acceptable to sacrifice performance in order to get a higher abstraction level and thus a higher coding speed. But then I wonder, what about premature pessimization? Premature pessimization, at least in the context where I read the term, means the widespread adoption of idioms that lead to worse performance (the case was the C++ use of the pre- or post-increment operator). Premature pessimization may cause the application to run generally slower and makes it more difficult to spot and optimize the cause.

This triggered the question – how does a language’s idiomatic approach impact performance?

To answer the question I started coding the same problem in different languages.

I started from my language of choice – C++. In this language you would likely approach a problem like this with std::vector. This is the preferred collection and the recommended one. The source looks like this:

The code is slightly longer (consider that in C++ I prefer the opening brace on a line of its own, while in Scala “they” forced me to put opening braces at the end of the statement line). Having mutable collections means you don’t have to wrap your mind around the data to find which aggregate function could transform your input into the desired output – just find what you are looking for and change it. It seems simpler to explain to a child.

Then I turned to Java. I’m not so fond of this language, but it is quite popular and has a comprehensive set of standard libraries that really allow you to tackle every problem confidently. I’m not sure what a Java programmer would consider idiomatic, so I stayed traditional and went for a generic List. The code follows:

I’m not sure why the inner class Data needs to be declared static, but it seems that otherwise the instance has a reference to the outer class instance. Anyway – code is decidedly more complex. There is no function similar to C++ find_if nor to Scala partition. The loop is simple, but it offers some chances to add bugs to your code. Anyway explaining the code is straightforward once the iterator concept is clear.

Eventually I wrote a version in C. This language is hampered by the lack of a basic standard library – besides some functions on strings and files you have nothing. This could have been fine in the 70s, but today it is a serious problem. Yes, there are non-standard libraries providing all the needed functionality; you have plenty of them, gazillions of them, all incompatible. Once you choose one you are locked in… Well, clearly C shows the signs of age. So I wrote my own singly linked list implementation:

Note that, once cleaned of braces, the merge function is shorter in C than in Java! This is a hint that Java is possibly verbose.

Here I covered only the merge function. The rest of the sources is not relevant for this post, but it basically consists of parsing the command line (string to int conversion), getting some random numbers and getting the time. The simplest frameworks for these operations are those based on the JVM. The most complex is C++ – it allows a wide range of configuration (random and time), but I had to look up on the Internet how to do it and… I am afraid I wouldn’t know how to exploit the greater range of options. Well, in my career as a programmer (some 30+ years, counting from when I started coding) I think I never needed a specific random number generator, or a clock different from a “SystemTimeMillis” or wall clock time. I don’t mean that no one should ask for greater flexibility because of this, but I find it daunting that every programmer should pay this price because there is a case for using a non-default RNG.

Anyway, back to my test. The following table reports the results.

             C++ (vector)   Scala      Java      C        C++ (list)
time (ms)    170.75         11562.45   2230.05   718.75   710.9
lines        81             35         69        112      81

Times have been taken performing 100000 insertions with max id 10000. The test has been repeated 20 times and the results have been averaged in the table. The difference in timing between C++ and Scala is dramatic, with the former about 68 times faster than the latter. Wildly extrapolating, you can say that if you code in C++ you need 1/68 of the hardware you need to run Scala… there’s no surprise (still guessing wildly) that IBM backs this one.

Java is about 5 times faster than Scala. I’m told this is more or less expected and possibly it is a price you may be willing to pay for a higher-level language.

In the last column I reported the results for a version of the C++ code employing std::list, for a fairer comparison (all the other implementations use a list after all). What I didn’t expect was that C++ is faster (even if slightly) than C despite using the same data structure. It is likely because of some template magic.

The other interesting value I wrote in the table is the number of lines (total, not just the merge function) of each implementation. From my studies (which are now quite aged) I remember that some research reported that the speed of software development (writing, testing and debugging), stated as lines of code per unit of time, is the same regardless of the language. I’m starting to have some doubts because my productivity in Scala is quite low compared with other languages, but … ipse dixit.

Taking the line count as a proxy for development cost, let’s say that you spend 1 for the Scala program; then you would pay 2.31 for C++, 1.97 for Java and 3.20 for C.

Wildly extrapolating again, you could draw a formula to decide whether it is better to code in C++ or in Scala. Let H be the cost of the CPU and hardware needed to comfortably run the C++ program. Let C be the cost of writing the program in Scala. Then the total cost of the project is:

(C++) H+C×2.31

(Scala) 68×H+C

(C++) > (Scala) ⇒ H+C×2.31 > 68×H+C ⇒ C×1.31 > 67×H ⇒ C > 51.14×H

That is, C++ is the more expensive choice when the cost of Scala development is more than about 51 times the cost of the hardware. In other words, you’d better use Scala as long as the hardware costs less than about 1/51 of the Scala development. If hardware is going to cost more, then you’d better use C++.

Besides being a wild guess, this also assumes that there is no hardware constraint and that you can easily scale the hardware of the platform.

(Thanks to Andrea for pointing out my mistake in the inequality)

Scala Days 2016 – my thoughts

Scala Days 2016 is over. I’m sorry I didn’t manage to take notes of all the talks I attended, but some speakers spoke very fast and my phone is not the best tool for writing down notes. English doesn’t help either, and sometimes my understanding of the matter was lagging behind.
So what are my impressions? That’s a good question and I’m not sure about the answer. I think Scala is a very fitting solution for a specific niche. Then, like every other programming language, it can be used for everything (I just heard that Dropbox employs 2.7 million lines of Python code).
The niche I have in mind is made of large, distributed applications, handling zettabytes of streaming data and performing math functions over them. A part of the niche could also be made up of high-traffic, highly reliable web services (where service is just a general term and refers to any kind of service, including serving web pages).
In this niche, using Scala with actors and possibly Spark makes a lot of sense.
In other contexts you risk paying extra for something you don’t really need: not every server software needs to scale up, not every process needs to be modeled using events. Although the functional paradigm eases writing code that copes with these contexts, you still pay an extra cost.
It is hard to quantify how much. The naïve experiment of having two programmers of comparable proficiency in two different languages working at the same task is very hard to set up.
According to some old research, the productivity of a programmer, intended as lines per unit of time, is more or less the same regardless of the language. That means, for example, that it is cheaper to write programs in C than in assembly, because C is more expressive and higher level than assembly. I’d like to know if the same holds for Scala, where lines tend to be long concatenations of function applications. For sure this is at a more abstract level, but it requires quite an effort to write, a lot of effort to understand, and it is close to impossible to debug, at least with today’s tools.
Well, back to my impressions on the state of Scala. Scala is comparatively young and suffers from its youth. There are pitfalls and shortcomings in its design (as is natural for every other language) that are starting to be acknowledged by the language owners. The solution they seem to prefer is to rewrite and go for a next, incompatible version. This is a dangerous move, as Python teaches. It is also a move that does not acknowledge the industry. It makes sense for a teaching tool or for a research language, but it impacts badly on industry investments.
In several talks the tenet was that although Scala allows the programmer to choose any of the supported paradigms, only functional programming is the proper way to code. Surprisingly, at least to me, Martin Odersky, the father of the language, doesn’t agree: while attending a talk, he claimed that being multi-paradigm was a blessing when programming the Scala compiler.
Industry needs pragmatism, but I see it only partially. Enthusiasts may cross the border and become zealots. And crusades are not something that could bring stability and reciprocal understanding. When I hear about a functional programming revolution I am somewhat scared; I prefer evolution, acknowledging the good in existing stuff and building on it. In revolutions many heads roll, including those of innocent people.
The most widespread background for programmers here was Java. I understand that Java is not the most exciting language. Coming from C++, I find Java quite boring. And, in fact, some, if not many, of the advantages of Scala over Java can be found in C++ as well. Unfortunately C++ falls short of open source, industry-standard libraries. You won’t find anything in C++ that comes close to what Spark or Akka are for Scala. Also Play, even if it doesn’t meet with unanimous consensus, is the de facto companion library for web services and web development.
Back to Scala Days (again) – it has been a positive experience. Some talks needed some preliminary study even if they were marked as beginner (everything pretending to explain implicits). Other talks were marketing advertisements in disguise. And some were genuinely fun.
I think I got closer to this language and had great opportunities to change my point of view. My wish is for an ecosystem more attentive to the industry, one that values backward compatibility rather than seeing anything that breaks with the past as a way to make easy money by selling technical support.