Scala Days 2016 my thoughts

Scala Days 2016 is over. I’m sorry I didn’t make it to take notes of all the talks I attended, but some speakers spoke very fluently and my phone is not the best way to write down notes. English doesn’t help as well, and sometimes my understanding of the matter was lagging behind.
So what are my impressions? That’s a good question and I’m not sure about the answer. I think Scala is a very fitting solution for a specific niche. Then as every other programming language can be used for everything (I just heard that Dropbox employs 2.7 million lines of Python code).
The niche I have in mind is made of large, distributed applications, handling zettabytes of streaming data, performing math functions over them. A part of the niche could also be composed by high traffic, highly reliable web services (where service is just a general term and refers to any kind of service, including serving web pages).
In this niche using Scala, with actors and possibly Spark makes a lot of sense.
In other contexts you risk to pay the extra run for something you don’t really need – not every server software needs to scale up, not every process needs to be modeled using events. Although functional paradigm eases writing code that copes with these contexts, you still pay an extra cost.
It is hard to quantify how much. The naïve experiment of having two programmers of comparable proficiency in two different languages working at the same task is very hard to setup.
According to an old research the productivity of a programmer, intended as line per unit of time, is more or less the same regardless of the language. That means, for example, that it is cheaper to write programs in C than in assembly, because C is more expressive and higher level than assembly. I’d like to know if the same holds for Scala, where lines tend to be long concatenation of function applications. For sure this is at a more abstract level, but it requires quite an effort to write, lot of effort to understand and it is close to impossible to debug, at least with today tools.
Well back to my impressions on the state of Scala. Scala is comparatively young and suffers from its youth. There are pitfalls and shortcomings in its design (just as naturally is for every other language) that are starting to be acknowledged by the language owners. The solution they seem to prefer is to rewrite and go for a next incompatible version. This is a dangerous move as Python would teach. Also is somewhat that does not acknowledge the industry. It makes sense for a teaching tool, for a research language, but it impacts badly on industry investments.
In several talks the tenet was that although Scala allows the programmer to chose any of the supported paradigms, only functional programming is the proper way to code. Surprisingly, at least to me, Martin Odersky, the language father, doesn’t agree, when attending a talk, he claimed that multi paradigm was a bless when programming the Scala compiler.
Industry needs pragmatism, but I see it only partially. Enthusiasts may crosses the border to become zealots. And crusades are not something that could bring stability and reciprocal understanding. When I hear about functional programming revolution I am somewhat scared, I prefer evolution, acknowledging the goods of existing stuff and building over them. In revolutions many heads roll, included those of innocent people.
The most widespread background for programmers here was Java. I understand that Java is not the most exciting language. Coming from C++ I find Java quite boring. And, in fact, some, if not many, of the advantages of Scala over Java can be found in C++ as well. Unfortunately C++ falls short of open source industry standard libraries. You won’t find anything in C++ that comes close to what Spark or Akka are for Scala. Also Play – even if it doesn’t encounters a unanimous consensus – is the de facto companion library for web services and web development.
Back to Scala days (again) – it has been a positive experience, some talks needed some preliminary study even if they were marked as beginner (everything pretending to explain implicits). Other talks were quite marketing advertisements in disguise. And some were genuinely fun.
I think I got closer to this language and had great opportunities to change my point of view. My wish is for an ecosystem more attentive to the industry and that values back-compatibility  rather than see anything that breaks with the past a way to make easy money by selling technical support.

Implicits inspected and explained

Talk by Tim Soethout. The average age among attendees seem to be higher than other talks, maybe we elders are looking for understanding finally what the hell are implicits (then we would need something similar for monads…). Here we go.
Implicit enables you to use values without explicit reference. Implicit enables the relationship “is viewable as”.
As a parameter the value is taken from the calling context (eg akka sender).
Incomprehensible examples follow.
Implicits are used for DSL, type evidence (whatever this be), reduce verbosity… Other stuff I didn’t make to copy.
Caveats – resolution can be difficult to understand. Automatic conversions. For these reasons better to not overuse.
Implicit parameter just takes the only variable with a matching type as argument.
If two or more variables with a matching types are available an error is reported by the compiler.
Implicit conversion (please avoid them) allows to automatically convert from a type A to a type B every time B is required. This comes handy to convert from and to Java collection types, but it is deprecated. The new approach is to add the asScala method to the Java types.
Implicit view is not understood by yours truly, sorry.
Implicit class. This is used when a class is missing a method. You can write a wrapper to add what you need. If this is declared implicit then when the compiler finds an invocation to the new method the wrapper class is constructed and the method is invoked.
Scoping. Some explanation I didn’t understand. Interactive moment turned quite to an epic failure with most of the attendees having no idea on what was going to happen in the example shown.
Type classes. This concept is used for ad-hoc polymorphism and to extend libraries. A naive approach to serialize to json uses subclasses. All the classes need to inherit from a JsonSerializable and to define the serialization abstract method.
With type class the class to serialize do not need to derive from the same base. A generic trait is defined for the serialization method. By defining implicit variables and assigning them the class which defines the serialization for the specific type then type serializers will be uses when needed.
Other examples just make things worse for me, I though I got it, but now I’m no longer sure (also, now, rereading what I wrote, I’m unsure I understood anything at all).
Wrap up time. I still don’t get this stuff, at least not completely and only slightly more than “danger – do not use”. Type class seems to be really powerful in the sense that they lift Java from its pond to the static polymorphism of C++. But the mechanics that power this specific kind of implicit are still obscure to me. Likely nothing that a good book and some months of hard study couldn’t solve. Just it seems not to be some knowledge you can transfer in a conference.

Transactional event sourcing using slick

Talk by Adam Warski. Event sourcing means that all the changes in the system are captured as a sequence of events.
Reasons for event sourcing: keeping information (not losing info), auditing the information in the system. But it is also useful to recreate the system state.
Hibernate was a technology developed by Adam for Java.
The goal is to have event sourcing with rdbms. There are other approaches (event store, akka persistence, eventuate).
Event is immutable, is the primary source of truth. The payload is an arbitrary json, the name is the case class name. Transaction ID, event ID and user ID.
Events are stored in a dedicated table. A read model is created on the event in the crud. This gives consistency.
DBIOAction[T] gets a description of actions to be done.
User data arrives in the system, within a command (a Scala method). The command uses read to validate the data, if data is valid the command generates a series of actions to perform that updates the model. Event listeners are triggered at this time. Each event listener may trigger further events.
Demo follows for ordering a Tesla model 3.
Summary – no library, just a small adapter. Event listener performs side effects.

image

The system could have been written with free monads. That would have improved testability. But free monads don’t come for free – you have to pay a price to decode them from the code and figure out what the code is trying to do. As Adam said, maybe it’s just me.
And saying this here while presenting a relational db approach takes really a brave heart.

Connecting reactive applications with fast data using reactive streams.

Talk by Luc Bourlier. As in the previous posts these are my quick notes.
Who doesn’t know what a reactive application is? Responsive, elastic, resilient and message driven – this is what reactive apps are.
Big data means that there are too much data to be handled by traditional means on a single machine.
Fast Data are big data that comes in big volume and you want up to the second information with continuous process.
Spark streaming is the technology by light bend that does the trick.
Spark is an evolution of map reduce model. A driver program (spark context) talks to the cluster manager to get worker nodes to do the job.
Spark can be used on streams by using mini-batching. A mini batch is the work executed on data received in a unit of time.
Spark streaming deals with all kind of failures (hardware, software and network). It also handles recovery for continuous processing and deals with excess of data volume.
A demo is presented with a raspberry pi cluster. (On raspberry pi you don’t need to push the system to the limit, because you are already at the limit).
Demo ran fine, but it broke, that makes me wonder how stable is this technology. The demo model seemed quite simple.

Back pressure is the mechanism implemented by akka streaming to slow the data producer if the consumer is not able to consume data fast enough.
Congestion in spark was handled by static limit on the input rate. In spark 1.5 the limit has been changed into dynamic rate limit. There is a rate limit estimator based on PID that sets the rate limit.
There are some limitations to this method based on the assumptions used in the design – all records require about the same time to process, the process is linear (a 3rd assumption was there, but it got lost my my note taking)

Scala Days – key note

It may not be a great surprise, but the opening key note is held by Martin Odersky. I don’t feel much expectation, more or less everyone expects he’s going to repeat the opening speech of the last Scala conference in New York.
In this post I’ll just summarize the content, my considerations will be in a next one.
Scala days are going to be attended by some 1000 people. Conf app is, of course written in Scala and swift (and it’s open source) courtesy of 47 degrees.

Odersky enters with son et lumier effect, halfway between a disco and an alien abduction… Maybe both.
First he shows a steady growth of the language. Scala jobs get a little over the line. There is no comparison with Java, of course, but there is no drop in popularity.
He will talk about the future, next, mid and distant futures.
What’s next? Scala center, Scala 2.12 , Scala libraries.
Scala center is a vendor neutral initiative supported by several partners that promotes Scala, and undertakes projects that benefit all the Scala community.
Scala 2.12 is the next release of the language, about to be completed. Optimized for Java 8. Shorter code and faster execution.
This release will arrive mid year. Older Java versions will be supported by Scala 2.11 that, in turn, will be supported for quite a while.
Not many new features: 33. Main contributors are from the community.
Martin’s book “Programming in Scala” will be updated to 3rd edition to include release 2.12 of the language.

In the farther future: 2.13 will focus on libraries – Scala collections, simpler, lazy, integrated with Spark. Backward compatible.

Split Scala stdlib into core and platform. Stdlib was much prototypal with the idea that wouldn’t have lasted long.

Scala js, and Scala native.

The dot is the foundation of Scala. a mini language, small enough that programs written in this language can be machine proof. Much of the language can be encoded in this language. 8 years in the working. Language work can be done with much more confidence.

image

Type soundness, properties of code can be demonstrated (the examples say that an expression of type T produces a value of type T).
Dotty a language close to Scala compiler that produces dot code. Generics are processed with dot. Not ready for industry, but if you want to try something cool…
Faster
Goal: Best language Martin knows how to make.

Dropped: procedure syntax. Rewrite tool that will take care of translating. DelayedInit.
Macros – of was just a long run example. There will be an alternative.
Early initializers – for stuff that needs to be initialized before the base trait.
Existential Types forSome.
General Type Projection T#x.

Added: intersection and union types- types T&U just the common properties of the two types. T|U will have either the properties of T or the properties of U.

Function arity adaptation. Pairs.map ((a,b) =>a+b)

Static method for object.

Non blocking lazy Vals: locking time is much shorter now. Avoiding deadlocks.
@volatile for thread shared lazy vals.

Multiversal equality type safe equality and inequality operators. Named type parameters – partial type parametrization.

Motivations better foundation, safer,…

SBT integration. Repl with syntax highlighting. Intellij. Doc generation. Linker.

Future.
Scala meta – will replace macros and meta programming. Inline and meta. Executed by the compiler.
Implicit function types . used to compose … Just more mess.

Effect checking a->b pure function, a=>b impure. Checked by the compiler.

Nullable types T? =T | Null. Types coming from Java will have a ? Because they have side effect. It is an alternative to monads.

Generic programming.

Guard rails – how to prevent the programmers to misuse or abuse the language. Strategic Scala style: principle of least power. Strategic Scala style.

Libraries that inject bad behavior (eg implicit conversion). Implicit conversions will make a style error if public.

Syntax flexibily. Even Martin regrets the space syntax. Add @infix annotation if the author intends it to be used as infix and give a style error in other cases.

Operators are regretted as well . @infix will have the option to give names to such operators.

Scala center

Notes from “Scala center” by Heather Miller. Scala Center is a non profit organization established at EPFL. It is not lightbend. Same growth chart of yesterday, source are not cited (indeed?). Stack overflow survey reports Scala in the top 5 most loved languages.
The organization will take the burden of evolving and keeping organized libraries and language environment, educating and managing the community rather than the language itself.
Coursera Scala class is very popular (400k) with a high completion rate. There will be 2 new courses on the new coursera platform. Unverified courses are free, verified and certified courses are paid.
Functional programming in Scala – 6 weeks.
Functional program design in Scala – 4 weeks.
Parallel programming – 4 weeks.
Big data analysis in Scala and spark – 3 weeks.

My (somewhat cynic) impression – lot of work and desperate needs for workforce, they are looking to get for free by grooming the community.
EPFL funds for 2 ppl for moocs . donations from the industry and revenues from moocs.
Lightbend? Will continue to maintain the stable Scala.
Package index is not yet available for Scala. Aka people should be able to publish their projects and get them to be used without the need of being a salesperson.
Scala library index. Index.scala-lang.org
It is an indexing engine.

Just wondering – is this a language for the academia or for the industry? Keep changing things and the investments made by the industry will be lost: language is going to change, base libraries are going to change as well… Which warranties do I have that my code will still compile 5 years ahead in the future?
Changing things is good for the academia since it allows to do research and to better teach new concepts. It doesn’t harm the community where workforce is free and there is no lack of people to redo the same things with new tech.

Scala Days – Berlin

Scala… Where to start? Well I’m going to the Scala Days 2016 in Berlin. This is the first conference of this kind I will attend. Maybe I would have preferred a conference about C++, such as c++con, but this is what I got. And, as a second thought, Scala has nothing to envy to C++, at least in terms of unfriendliness and unreadability.
In Scala I like that I am not forced to use Java when programming for the JVM, but I find that by looking at the pro and weakness of the existing languages, EPFL could have designed the language somewhere more … Industrial. Instead it felt in several pitfalls, first of which is the misconception that the speed of developing software is capped by the speed of typing the code.
Anyway, here I am waiting for the conference to begin. Keynote speech will be from Martin Odersky voice. Martin is, of course the inventor of Scala language and, if I understood correctly, a cofounder of lightbend (was Typesafe) the for-profit company that backs up Scala and its environment.
image

First Sprint

The first sprint is over and it has been though. As expected we are encountering some difficulties. Mainly role keys without enough time to invest in scrum.

Scrum IMO tends to be well balanced – the team has lot of power having the last word on what can’t be done, what can be done and how long it will take (well, it’s not really the time, because you estimate effort and derive duration, but basically you can do the inverse function and play the estimation rules to get what you want).

This great power is balanced by another great power – the Product Owner (PO) who defines what has to be done and in what order.

Talking, negotiating and bargaining over the tasks is the way the product progress from stage to stage into its shippable form.

In this scenario it is up to the PO to write User Stories (brief use cases that can be implemented in a sprint) and define their priority. In the first scrum planning meeting, the team esitimate one story at time to set the Story Points value.

This is a straightforward process, just draw a Fibonacci Series along a wall and then set a reference duration. We opted for a single person in a week is capable of working for 8 story points. This is just arbitrary. You can set whatever value you consider sound. Having set two week sprint and being more or less 5 people we estimated an initial velocity somewhere between 80 points per sprint and (having the value) 40 points. The Scrum Master (SM) reads the User Story aloud and the team decide where to put it on the wall. Since the relationship between story points is very evident then it is quite easy and fast to find where to put new stories.

In that meeting, that went well beyond the boxed 2 hours, all the team was there, SM included, but the PO who was ill (the very first day in years). We could have delayed the meeting but the team would have been idling for some days…. not exactly doing nothing, you know there’s always something to refactor, some shiny new technology to try, but for sure we wouldn’t have gone the way our PO would have want us to go.

So we started and in a few moments it became clear that we were too much. Being a project that ranges from firmware to the cloud when talking about a specific story many people were uninterested and became bored and started doing their personal stuff. At the end we were about three being quite involved in the estimation game.

The lack of the PO in that specific moment was critical – we missed the chance of setting proper priority on tasks since we had to rely on a mail and we can’t ask our questions and get the proper feedback. At the end we discovered that the top most prioritaire User Story remained out of the sprint.

The other error we made was to avoid decomposing User Stories into Tasks. This may seem redundant at first, but it is really needed because an User Story may involve different skills and thus different people to be implemented.

The sprint started and we managed to get the daily scrum. This is a short meeting scheduled each day early in the morning where each member of the team says what she/he has done the day before, what is going to do that day and if she/he sees any obstacle in reaching the goal. This meeting is partly a daily assessment and partly an intent statement where, at least ideally, everyone sets the goal for the day. Everyone could assist, but only the team may speak (in fact has to speak).

Daily scrum were fine, we managed to host one/two programmers that were not co-located most of the time via a skype video call. The SM updated the planned, doing, done walls. The PO attended most of the times.

On the other hand the team interacted sparingly with the PO during the sprint, but just in part because his presence was often needed elsewhere. Also I need to say that our PO is usually available via a phone all even if he’s not in the office.

This paired with the team underestimating the integration testing and definition of done. Many times user stories were considered done even if not tested with real physical systems, but just with mock ups. Deploy process failed silently just before the sprint review leaving many implemented features not demonstrable. Also our definition of done requires that the PO approves the user story implementation. This wasn’t done for any task before the sprint end, but we relied on the sprint review for getting the approval.

The starting plan included 70 Story Points and we collect nearly another 70 points in new User Stories during the sprint. These new Stories appeared both because we found stuff that could be done together with the current activities or the needed to be done to make sense of the current activities.

Without considering the Definition of Done, we managed to crunch about 70 points that were nearly halved by the application of Definition of Done (in the worst possible moment, i.e. during the Sprint Review).

Thinking about improving the process, probably working from remote is not quite efficient, I noticed that a great deal of communication (mainly via slack, but also via email and Skype) happened the day that those two programmers were off site.

The sprint end was adjusted to allow for a even split of sprints before a delivery, therefore we had something less time than was we planned for.

End/Start sprint meetings (i.e. Sprint Review, Sprint Planning and Sprint Retrospective), didn’t fit quite well in the schedule, mostly because of PO being too busy in other meetings and activities.

Am I satisfied about the achievements? Quite. I find that at least starting to implement the process exposes the workload and facilitates the communication. The team pace is clear for everybody.

Is the company satisfied about the achievements? This is harder to say and should be the PO to say this. I fear that other factors affecting the team speed in implementing the requests of the PO may be considered with the scrum and dumped altogether. Any process poses some overhead and the illusion that a process is not really needed to get the things done is lurking closely.

Today we planned for another sprint, but this is for another post.

Ready? Set! Scrum

So, it has started. After taking a Construx on line SCRUM boot-camp, I started the scrum process at my workplace. This needs a bit of background though. The company I work for it is quite an unusual environment where developing software. The company started nearly 70 years ago and always manufactured plastic and metal goods. Maybe one of the reasons it is lasting so long is that avoided electronics first and software then for a very long period.

Nowadays softwareless goods are becoming increasingly difficult to sell, and beside everything, adding software to laboratory equipments may turn a great product into a fantastic one… and since “fantastic” means in the fantasy, you see the risk involved.

Developing software in a traditional mechanical manufacturing company has its own share of challenges and woes. If I manage to keep up with my proposition, I’ll report about my attempt to introduce a process that is as far as you can think of from the classical waterfall approach.