Scala Italy 2018

At the end of September, my employer sent me to the awesome city of Florence to attend the Scala Italy conference. Thanks to this patronage, this is the third year I’m able to be at the conference of the Italian Scala Community.

For the first time this year, the conference lasted two days with two distinct tracks – talks and workshops. The choice of one working and one non-working day can be a bit odd and likely a symptom of some ambivalence about the target audience. To keep the conference for Scala enthusiasts attending in their spare time or to move it to the professional realm, that’s the question. Given that the ticket price wasn’t very light on pockets, maybe half of the decision has already been made.

I dutifully took notes during all the talks so I’d like to prepare some posts with my notes. I noted that after I did so for ++it 2018 (Channels are Useful, not only for Water, Zero Allocation and No Type Erasure Futures, Time Travel Debug, ++it 2018 – Coroutines – coming soon… ?, ++it 2018 – Key Notes) conference organization put all the conference videos online. I’m pretty sure the same is going to happen for Scala Italy 2018 as well. So I’ll wait for your feedbacks before investing the time needed for writing all the blogs.

Read moreScala Italy 2018

Optimization Time – Scala (and C++)

The Case

It is quite a feeling when you learn that your commit, a couple of months ago, broke the build is such a subtle way that it took so long to be detected. Possibly a more thorough testing and validation of the software would have caught it earlier, nonetheless it’s there and you and your coworkers are working hard to delimit the offending code and better understanding what caused the mess.

It turned out that some perfectly working code was taking too much time in one of the hot spot of our codebase. More precisely that code operated a conversion from an incoming data packet into a format suitable for data processing… about 2500 times a second on a modest hardware.

The code resulting in such poor performances that disrupted the device functionality seemed like a good idea in a very functional idiom –

@inline final val SAMPLES_PER_BLOCK = 6
@inline final val BLOCK_LENGTH = 8
@inline final val BLOCK_HEADER = 2

private def decodeFunctional(packet: Array[Byte]): Array[Int] = {
    val lowSamples: Array[Int] = packet.drop(BLOCK_HEADER).map(x => x.toUint)

    val msbs: Int = ((packet(0).toUint << 8) | packet(1).toUint) & 0xFFF
    Array.tabulate(SAMPLES_PER_BLOCK)(
        x => lowSamples(x) | (((msbs << 2 * x) & 0xC00) >> 2)
    )
}

Basically the incoming packet holds six 10-bits samples in 8 bytes and some bits-shifting’n’pasting is needed to rebuild the six sample array for later processing.

Once found that the problem was here I decided rewrite the code to get rid of the functional idiom and take an imperative/iterative approach:

private def decodeImperative(packet: Array[Byte]): Array[Int] = {
    // the code below is ugly, but it needs to be fast, don't touch,
    // unless you know what you are doing
    val samples: Array[Int] = new Array[Int](SAMPLES_PER_BLOCK)
    val msbs: Int = ((packet(0).toUint << 8) | packet(1).toUint) & 0xFFF
    var index: Int = 0
    while (index < SAMPLES_PER_BLOCK) {
        samples(index) = packet(BLOCK_HEADER + index).toUint |
                         (((msbs << 2 * index) & 0xC00) >> 2)
        index += 1
    }
    samples
}

(the comment is a colorful note I decide to leave for the posterity :-)).

The idea is to get rid of the read-only constraint imposed by functional programming and save all the temporaries.

Optimizing is a subtle art, and even if intuition may guide you, only measurement can tell whether the optimization succeeded. So I set up some speed test (see “methodology” below) to objectively assess my results.

The speed improvement was astounding: more than 7 times faster (7.6 to be precise).

Energized by this victory I decided to switch to heavy optimization weapon. Not being sure if scala/java compiler optimization really does its optimization homework, I opted for manually unrolling the loop in the code:

private def decodeUnrolled0( packet: Array[Byte] ) : Array[Int] = {
    val PACKET0 : Int = packet(0).toUint
    val PACKET1 : Int = packet(1).toUint
    Array[Int](
        packet(BLOCK_HEADER+0).toUint | ((PACKET0 << 6) & 0x300),
        packet(BLOCK_HEADER+1).toUint | ((PACKET0 << 8) & 0x300),
        packet(BLOCK_HEADER+2).toUint | ((PACKET1 << 2) & 0x300),
        packet(BLOCK_HEADER+3).toUint | ((PACKET1 << 4) & 0x300),
        packet(BLOCK_HEADER+4).toUint | ((PACKET1 << 6) & 0x300),
        packet(BLOCK_HEADER+5).toUint | ((PACKET1 << 8) & 0x300)
    )
}

I was somewhat disappointed to learn that this version is only marginally faster (x1.6) than the original functional version. This didn’t really make sense to me, since the loop is completely unrolled and just trivial computation needed to be performed. So I decompiled the .class file:

public int[] CheckPerf$$decodeUnrolled0(byte[] packet)
{
  int PACKET0 = UnsignedByteOps(packet[0]).toUint();
  int PACKET1 = UnsignedByteOps(packet[1]).toUint();
  return (int[])Array..MODULE$.apply(Predef..MODULE$.wrapIntArray(new int[] {
    UnsignedByteOps(packet[2]).toUint() | PACKET0 << 6 & 0x300, 
    UnsignedByteOps(packet[3]).toUint() | PACKET0 << 8 & 0x300, 
    UnsignedByteOps(packet[4]).toUint() | PACKET1 << 2 & 0x300, 
    UnsignedByteOps(packet[5]).toUint() | PACKET1 << 4 & 0x300, 
    UnsignedByteOps(packet[6]).toUint() | PACKET1 << 6 & 0x300, 
    UnsignedByteOps(packet[7]).toUint() | PACKET1 << 8 & 0x300 }), ClassTag..MODULE$.Int());
}

Instead of the Java array I expected, some Scala internals are used to create the array and then converting it into the Java thing. Possibly this was the cause of the slowness. By looking around at the decompiled code I found another function that just used a Java int array. So I rewrote the unrolled version accordingly –

private def decodeUnrolled1( packet: Array[Byte] ) : Array[Int] = {
    val samples = new Array[Int](SAMPLES_PER_BLOCK)
    val PACKET0 = packet(0).toUint
    val PACKET1 = packet(1).toUint
    samples(0) = packet(BLOCK_HEADER + 0).toUint | ((PACKET0 << 6) & 0x300)
    samples(1) = packet(BLOCK_HEADER + 1).toUint | ((PACKET0 << 8) & 0x300)
    samples(2) = packet(BLOCK_HEADER + 2).toUint | ((PACKET1 << 2) & 0x300)
    samples(3) = packet(BLOCK_HEADER + 3).toUint | ((PACKET1 << 4) & 0x300)
    samples(4) = packet(BLOCK_HEADER + 4).toUint | ((PACKET1 << 6) & 0x300)
    samples(5) = packet(BLOCK_HEADER + 5).toUint | ((PACKET1 << 8) & 0x300)
    samples
}

The main difference is that here the array is first declared and allocated, then filled. In the above code the array was created, initialized and returned all in the same statement.

The speed improvement was good, but not much better than my imperative version: x7.77 times faster.

Then a colleague pointed out that I was using the “old” Scala 2.10 compiler and that I should try the latest 2.12 that benefits from a better interoperability with the underlying JVM 1.8.

In the following table you get the performance comparison:

 functionalimperativeunrolled0unrolled1
scala 2.10 times (lower is better)26.153.4016.603.36
scala 2.10 improvements over functional (higher is better)1x7.68x1.58x7.77
scala 2.12 times (lower is better)24.951.1713.30.021
scala 2.12 improvements over scala 2.10 functional
x1.05x22.32x1.97x1245

Functional and unrolled0 are quite unsurprising, just what you expect from the next version of the compiler. Imperative approach yields quite a boost – 22 times faster than the functional version on the previous compiler. The shocking surprise is in the last column – the unrolled1 version of the code runs in excess of 1200 times faster than the original code!

Before jumping to conclusions, I performed the same tests on the code translated into C++. Here is the equivalent code:

typedef std::array<int,SAMPLES_PER_BLOCK> OutputArray;
typedef std::array<uint8_t,BLOCK_LENGTH> InputArray;

OutputArray decodeImperative( InputArray const& packet )
{
    OutputArray samples;
    uint32_t msbs = ((packet[0] << 8) | packet[1]) & 0xFFF;
    for( int index=0; index< SAMPLES_PER_BLOCK; ++index )
    {
        samples[index] = packet[BLOCK_HEADER + index] |
                         (((msbs << 2 * index) & 0xC00) >> 2);
    }
    return samples;
}

OutputArray decodeFunctional( InputArray const& packet )
{
    OutputArray lowSamples;
    std::copy( packet.begin()+2, packet.end(), lowSamples.begin() );
    uint32_t msbs = ((packet[0] << 8) | packet[1] ) & 0xFFF;
    OutputArray samples;

    int index=0;
    OutputArray::const_iterator scan = lowSamples.begin();
    std::generate(
        samples.begin(),
        samples.end(),
        [&index,&scan,msbs]()
        {
            return *scan++ | (((msbs << 2*index++) & 0xC00) >> 2);
        }
    );
    return samples;
}

OutputArray decodeUnrolled0( InputArray const& packet )
{
    uint8_t PACKET0 = packet[0];
    uint8_t PACKET1 = packet[1];

    return OutputArray{
        packet[BLOCK_HEADER+0] | ((PACKET0 << 6) & 0x300),
        packet[BLOCK_HEADER+1] | ((PACKET0 << 8) & 0x300),
        packet[BLOCK_HEADER+2] | ((PACKET1 << 2) & 0x300),
        packet[BLOCK_HEADER+3] | ((PACKET1 << 4) & 0x300),
        packet[BLOCK_HEADER+4] | ((PACKET1 << 6) & 0x300),
        packet[BLOCK_HEADER+5] | ((PACKET1 << 8) & 0x300)
    };
}

OutputArray decodeUnrolled1( InputArray const& packet )
{
    OutputArray samples;
    uint8_t PACKET0 = packet[0];
    uint8_t PACKET1 = packet[1];

    samples[0] = packet[BLOCK_HEADER+0] | ((PACKET0 << 6) & 0x300);
    samples[1] = packet[BLOCK_HEADER+1] | ((PACKET0 << 8) & 0x300);
    samples[2] = packet[BLOCK_HEADER+2] | ((PACKET1 << 2) & 0x300);
    samples[3] = packet[BLOCK_HEADER+3] | ((PACKET1 << 4) & 0x300);
    samples[4] = packet[BLOCK_HEADER+4] | ((PACKET1 << 6) & 0x300);
    samples[5] = packet[BLOCK_HEADER+5] | ((PACKET1 << 8) & 0x300);
    return samples;
}

I’m positive that functional purist will scream in horror at the sight of C++ functional version of the code, but it is the closest thing I could do using the standard library. Should you have any idea on how to improve the functional idiom using the standard library, let me know and I’ll update my test set.

Here are the C++ results –

 FunctionalImperativeUnrolled 0Unrolled 1
C++ times (lower is better)0.320.0740.0530.049
Improvements over functional (higher is better)x1x4.36x6.1x6.6
Improvements over Scala 2.10 (higher is better)
x81x46x313x69
Improvements over Scala 2.12 (higher is better)x77x16x250x0.43

This is mostly unsurprising but for the the lower right column which seems to indicate that Scala version is faster than C++. I guess that the reason is because I am using timing values that are rebased considering a standard price you pay for executing the code under test and that I don’t want to consider in my measurements (see below about the methodology I used). My interpretation is that the overhead in Scala hides the cost for a trivial expression so that the base time is much more close to the execution time than it is in C++.

Conclusions

Drawing general conclusions from a specific case is all but easy and trivial. Anyway according to data and to intuition, if you want to code using the functional idiom you pay a price that sometimes can be hefty. Paying an execution price, be it memory or cpu cycles, to raise the abstraction level is something that has been with computer programming from the dawn of this craft. I remember echoes from the ranting against high level languages in favor of assembly to avoid paying a 30% penalty. Now those times are gone, but facing a x4-x7 overhead can be something not always you can afford.

(Raising the level of abstraction by employing the functional idiom is something that is not undisputed, but I will spare this for another post).

Writing efficient code is not always intuitive. Constructs that seem innocent enough turns out to be compiled into time consuming lower level code. Peeking at compiled code is always useful. While languages as C++ are accompanied by detailed information about performances of various constructs and library components, Scala (as Java) lacks of these information. Also the huge amount of syntactic sugar employed by Scala does not simplify the life of the programmer that has to optimize code.

Eventually I cannot save to remarks the performances of interpreted languages (and even more so functional interpreted languages), despite of the progress in the JVM technologies, are still way behind native language ones. Even employing a functional approach C++ is some 80 times faster than Scala. You can argue that the C++ code above is not really functional, but there is nothing preventing me to use a mostly functional library for C++, with roughly the same performances of the functional version of the C++ code. It could be more tedious to write because of the lesser grade of syntactic glucose in C++, yet it is still functional.

I’m curious to see how scala-native compares, but this again is a topic for another post.

The last thought I want to report here is that you can make Scala code much faster, by avoiding the functional paradigm and peeking at compiled code. It is something you are not usually expected to do, but and improvement of 1200 times is a worth gain for the most executed code.

A very last thought for those who think that after all nowadays we have such powerful computers that analyzing performances and optimizing is a waste of time. I would like to remember that data processing requires energy and today energy production mostly relies on processes that produce CO2, something that the world doesn’t need. A data processing that runs just twice the speed of another will produce half of the carbon dioxide emissions. The environment benefits for switching to C and C++ should be clear to everyone.

Methodology

A couple of words on methodology. The compilers were Scala 2.10.6, Scala 2.12.2 and GNU c++ 7.3.1. All the test were performed on Linux (Fedora Core 27) using the command /bin/time –format=”%U %S”  command to print the User and Kernel times of the command and summing the two values together. These are times that CPU spent executing the process, not wall clock. The decoding function is executed 0x7FFFFFFF (some 2 billions) times per run and each program is tested for 12 runs. The fastest and the slowest run are discarded and the other are used to compute an average that is considered the measure of the test.

The base time is computed by employing a dummy decoder that does nothing but filling the result array with constants.

This setup should take in account the JIT effect and measure just the code I listed above.

Lambda World 2017 – Workshop – Don’t fear the Optics

This talk by Jesús Lopez Gonzales has been quite clear (at least to my challenged functional understanding). As strange as it may sound the whole idea of optics (in Functional Programming) is to solve a problem that exists only because of the functional paradigm. Aside from cheap humor, it makes sense – in structured programming you do the same by forbidding the use of the goto statement, and you need other tools (e.g. break, continue, structured statements) to do the same job in a safe, sound and controlled way.

You can find sources for the running example here: https://github.com/hablapps/dontfeartheoptics.git

But I don’t want to steal the stage. As usual all mistakes and false predicates are mine (my only defense is that the talk was performed without a mike and loudspeaker system).

Just one last not, before starting – the original slides of the presentation used a compact syntax, relying on Scala high sucrose diet. I opted for a more verbose syntax that makes clear to those less fluent in Scala what’s happening behind the sugar curtain.

Functional Programming is a programming paradigm […] that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data.

(from Wikipedia)

In functional programming we decide to not change state of variables once assigned. That means that when we want to change something we have to create a new instance from the existing one.

Consider the position class:

case class Pos(i: Int, j: Int )
pos1 = Pos(1,1)

In order to change to position to a new one, the “changing” method just takes the existing pos1 and creates a new instance with the new state:

pos2 = pos1.move(1,2)

The running example for this talk is a CandyCrush clone. Here are the main classes:

case class Game( ups: Int, level: option[Level], )
case class Level( targetScore: Long,
                  targetMoves: Long,
                  board: Board,
                  currentScore: Long,
                  currentMoves: Long )
case class Board( height: Int, width: Int, matrix: Map[Pos,Option[Candy]])

Modules are defined as follows:

  • Candy REPL – IO
  • Candy Business logic – state program
  • Candy data layer – data structurs & optics

To face problems posed by state immutabile we resort to the Half Life narration – who better than Gordon Freeman – the man with a big lambda on his breast – could help us in the process?

The talk uses an explorative approach – you may want to explore the area to locate the problem (the Alien), then try to solve it using some techniques (equipping new weapons) and then refine the solution until you find an elegant way to fix the problem.

The first enemy to defeat is how to keep state unchanged.

CandyState.scala is the source file where “getter and setters” are located). There are several points where you need to update the state as the game progresses.

def modifyScore( f: Long => Long) : Level => Level..
def modifyMatrix( f: CandyMatrix => CandyMatrix ) : Level => Level

[NdM: note that specific to this paradigm these methods accepts a function that transforms score (or playfield) into a new score (or playfield) and return a function that change the Level accordingly. I found this revealing and somewhat mind boggling.]

How would you do in traditional way?
In order to modify you need copy:

def modifyScore( f: Long => Long ) : Level => Level =
  lv => lv.copy(currentScore = f(lv.currentScore))

def modifyMatrix( f: CandyMatrix => CandyMatrix ) : Level => Level
lv => lv.copy( board = lv.board.copy( matrix = f(lv.board.matrix)))

This is not straightforward, at least not in general, because you need to copy through several indirection levels. Functional programming is about elegance and modularity, not this.

(Alien identified!)

Lens come to the rescue (or, as Jesus put it, Lens – the crowbar in the half-life analogy – is the weapon to equip). Lens is a parametric class defined over two types: S – the whole and A – the part:

abstract class Lens[S,A] {
    def get( s: S): A
    def set(a: A): S => S
}

get  method accepts a whole and returns a part. set  accepts a part and tell you how to change the whole to incorporate that part.

val _currentScore: Lens[Level,Long] =
    Lens[Level,Long]( 
        lv => lv.currentScore
    )(
        cs => lv => lv.copy(currentScore=cs)
    )
val _board : Lens[Level,Board] =
    Lens[Level,Board](
        lv => lv.board
    )(
        br=>lv=>lv.copy(board=br)
    )
val _matrix : Lens[Board,CandyMatrix] =
    Lens[Board,CandyMatrix](
        bd => bd.matrix
    )(
        mx => bd => bd.copy(matrix=mx)
    )

modifyScore2( f: Long => Long) : Level => Level = {
    level : Level =>
        Level.currentScore.set( f( level.currentScore ))(level)
}

The code works, but it is lengthy and boring to write. So we can take advantage of the Lens defined for us by the @Lenses annotation.

[NdM: I’m going to expand the talk a little bit here because I lost some passages and I reconstructed them thanks to my Scala speaking friends]

This annotation instructs the compiler to create one lens method in the companion object for each case class field. [NdM: Oddly enough, for us coming from traditional programming languages, the lens has the same name of the case class field].

[NdM: original example exploited import to inject in the current scope the companion object’s fields, creating a bit of confusion in my mind. In the following examples I will avoid this shortcut in favor of readability].

What if we want to extract the matrix from a level? Operationally we have to navigate through the board (level->board->matrix). This can be done via composition, using the verb composeLens :

def getMatrix2 : Level => CandyMatrix = (Level.board composeLens Board.matrix).get

[NdM: my Scala speaking friend also told me that def  has been used without a real advantage over val . Having used val would have avoided an unneeded function call.]

The same can be applied for modify:

def modifyMatrix2( f: CandyMatrix => CandyMatrix ) : Level => Level = 
    (Level.board composeLens Board.matrix).modify(f)

This syntax is slick, but still more verbose than say Haskell where you write just a dot instead of the composeLens verb.

# 2nd Enemy – Threading State Zombie (State Monads)

Consider the function

crushPos( pos: Pos): Level => (Level, Long)

Its purpose is to crash a given position. This is accomplished by updating the map and updating the score. Additionally we want the function to return a pair composed by the level and the new score. The first implementation you may want to try is to navigate through the level to change the matrix, then navigate through the updated version of the level to update the score and then prepare the pair with the updated level and the score.

def crushPos( pos: Pos) : Level => (Level, Long) = {
    lv0 =>
        val lv1 = (board composeLens matrix).modify(mx => mx.updated(pos, None))(lv0)
        val lv2 = currentScore.modify( cs => cs +1)(lv1)
        (lv2, currentScore.get(lv2))
}

This works, but it is error prone because the programmer must ensure to properly pipe all the changes through the transformations.

The new weapon is the State:

abstract class State[S,A](run : S => (S,A))

This class defines a mechanism to execute a given action on an object and produce the updated object and a value. And it can be used like:

def crushPos2(pos: Pos): Level => (Level , Long) = {
    lv0 =>
        val (lv1, _ ) = State[Level,Unit] (
            lv => {
                (board composeLens matrix).modify(mx => mx.updated(pos, None))
                ((lv), ())
            }
        ).run( lv0 )
        State[Level,Long] (
            lv => {
                val nlv = currentScore.modify(cs => cs + 1)(lv)
                (nlv, currentScore.get(nlv))
            }
        ).run(lv1)
}

This maybe more elegant, but can be hardly defined as better, and nonetheless still requires the programmer to properly set up the execution pipe. [NdM: also note that the first part (from val  to run( lv0 )  may be replaced by a more compact val lv1 = (board composeLens matrix).modify(mx => mx.updated(pos, None))( lv0 )  ]

This can be improved by using an implicit MonadState, that is a class implicitly built from a State class that can be bound together using the >> operator. In code:

case class State[S,A](run : S => (S,A))

implict def monadState[S]: monad[State[S,?]] = ...

Our code becomes:

def crushPos3(pos: Pos): State[Level, Long] = {
    State[Level, Unit] {
        lv =>
            val lv1 = (board composeLens matrix).modify(mx => mx.updated(pos, None))(lv)
            (lv),()
    } >>
    State[Level,Long] {
        lv =>
            val nlv = currentScore.modify( cs => cs +1)(lv)
            (nlv,currentScore.get(nlv))
    }
}

[NdM: be careful in placing the >> operator! First IntelliJ is not able to recognize it and marks it as an unknown operator; second thanks to Scala forgiveness on syntax punctuation, you need to place >> on the same line of the closing bracket. I couldn’t figure out the right way to write this until I mailed the author of the talk for help. He responded quickly and kindly and set me in the right direction. Thanks Jesús]

[NdM: a quick word on binding. Bind is the same as flatMap, that is the way monads transform their content. In this case the binding allows you to compose the two run action into one. Since the computation accept one value and produces two, you may wonder what happens to the side value (in our case the score) of the first run. Answer – it gets discarded and only the last one is produced in the final result]

MonadState can be composed so they pipe the result one through the other.

def crushPos4( pos: Pos ) : State[Level, Long] =
    (board composeLens matrix).mod(mx => mx.updated( pos, None)) >> currentScore.mod(_ + 5)

[NdM: now, this is a bit more complex to digest – where does the .mod come out? And more importantly how does .mod know what to return? .mod is a method of the StateLens object (well, nearly true, but true enough for the sake of this analysis). Always remember that you are not dealing with actual values, but you are forging functions that will need to be called/applied on actual values. Note that the lenses in the expression are both on the Level class, so the state generated by mod operates on the Level class. The additional type is derived by the right operand of the >> operator.]

# Enemy 3: optional antlion

So far so good, but there are still other entities that cannot fit properly in the picture. What about getting and modifying the current score from the Game? The problem we face is that level is an option in Game. Lenses can’t be used with a plain-vanilla approach.

Let’s try a first attempt to the solution:

def getScore: State[Game, Option[Long]] =
    level.extract.map( olv => olv.map( lv => currentScore.get(lv)))

def modifyScore(f : Long => Long) : State[Game,Unit] =
    level.mod_(olv. => olv.map(lv => currentScore.modify(f)(lv)))

extract  is a method of the State

Nice, but cumbersome. The abstraction we can use now is the Prism (which is defined in monocle, roughly in the following way):

abstract class Prism[S,A] {
    def getOption : S => Option[A]
    def reverseGet : A => S
}

The first method takes an object and produces an option, the second method rebuilds the object given a part. So, let’s define our prism:

import monocle.Prism

def mySome[A] : Prism[Option[A], A] = Prism[Option[A],A]( s => s)(a=>Some(a))

This prism deals  with an Option[A] , but monocle already provides you with this tool and it’s called some :

import monocle.std.option.some

def getScore2: State[Game, Option[Long]] =
    (level composePrism some composeLens currentScore).extract

def modifyScore2( f : Long => Long) : State[Game,Unit] = 
    (level composePrism some composeLens currentScore).mod_(f)

# Final Enemy: Multiple Fast Zombies

Now we want to crush an entire column of the board.

We can combine lenses and prisms into something else:

 

abstract class Optional[S,A] {
    def getOption(s: S): Option[A]
    def set(a: A): S => S
}

As for the Prims we have a getOption method that exposes the Option, but, instead of the reverseGet, there is a set that transforms S into another S provided an A.

Optionals  can be created by composing prisms and lenses as follows:

val op: Optional[Game,CandyMtrix] =
    level composePrism some composeLens board composeLens matrix

So that we can write our first iteration of the crushColumn method as –

def crushColumn(j: Int): State[Game, Unit] =
    op.mod_(
        mx => mx.map {
            case (p,_) if p.j == j => (p,None)
            case x => x
        }
    )

This function operates on the game matrix and removes the candy when the column of the position is the same as j .

It is not bad per se, but that we are doing this in a manual way. The solution could be improved by using a filterIndex:

def crushColumn2( j : Int ) : State[Game,Unit] =
    op.mod_(
        mx => filterIndex[CandiMatrix,Pos,Option[Candy]]( p => p.j == j ).set(None)(mx)
    )

Let’s see how to automatize it. Let’s introduce the abstract metaclass Traversal:

abstract class Traversal[S,A] {
    def modifyF[F[_]: Applicative](f:A => F[A])(s:S): F[S]
}

Now it is possible to compose the Traversal with other lenses such as:

def crushColumn3( j: Int ) : State[Game,Unit] =
    (op composeTraversal filterIndex((p: Pos)=> p.j == j )).assign_(None)

FilterIndex is a monocle function, that along with the implict mapFilterIndex allows the lens to apply over the map collection.

Since compose syntax may tend to be a bit verbose, you can also use the following operators:

  • ^<-?  compose with prism
  • ^|->  compose with a Lens
  • ^|->> compose traversal

Conclusions: Optics are abstractions for changing parts of wholes. These abstractions are composable to access complex data. Monacle library provides hybrid of concrete e Van Laarhoveen optics [NdM: sorry I missed the explanation entirely]. State monads encapsulate state threading and produce output values.


Max’s comment – The talk has been very helpful in improving my understanding of this aspect of the functional programming. I still find Scala syntax to be a bit on the verge of cryptic and dealing with new concepts doesn’t help as well. Composing stuff in the way functional programming does is a really powerful mechanism that enables the programmer to recycle code in an effective way.

I find that the use of symbols to further reduce the characters count is really dangerous. But this is a topic for another post. Let’s just say that Scala is endangered of write-only code 🙂 Looking forward to attending next Lambda World!

Lambda World 2017 – Unconference – Category Theory Crash Course

Here I am in Cadiz to attend 2017 edition of Lambda World. Cadiz is really a nice frame for this conference and the weather is so good that it seems to be back at vacation time instead then in mid-Autumn.

Anyway, on this morning I attended the “unconference”, a sort of informal prelude to the real conference. After the first, planned talk, people could register to present their talk and other attendants could vote their preferred talk. This yield two interesting talk – one on declarative testing and the other one on category theory. Here is what I learned from the last one, by Rùnar Bjarnason.

First, this is algebra stuff, really. Not the kind of ‘a+b’ you do in mid schools, but the high order stuff you do later when you study groups, monoids and the likes.

How does these concepts applies to practical programming still eludes me, but I sense that something is there waiting for my brain to grasp it.

Basically an abstract algebra of function wants to answer to the following questions – what functions do? And what can we say about them? Here we go, all mistakes, errors, inconsistencies are just mine, blame me not Rùnar.

A Category is defined by objects and arrows and how we can compose those arrows.

If you consider language types as objects then functions can be thought as arrows from an object to another object:

Some types: A, B, C.
Some functions f, g.

f: A => B
g: B => C

a composite functions is formed by invoking a function on the result of another one:

g compose f = (x => g(f(x)))

As long as types line up we can concatenate function composition.

If a function links an object to itself (in the example a type to itself), this is the identity function:

identity function: identity[A] : A=>A
identity compose f = f
f compose identity = f

Sometime composition is expressed with a dot. An interesting property is that composition is associative:

h.g.f = (h.g).f = h.(g.f)

(so no need for parenthesis).

In Scala you have a category for functions and types, but also for types and inheritance (subtyping). This category is defined as follows:

Objects: Scala Types: A, B, C…
Arrows: Subtype relations: A <: B, B <: C…
Composition: [ A <: B, B <: C ] => A <: C
Associativity: A <: B <: C (no need for parenthesis)
Identity: A <: A

Another example of category is the the pre-order (<=). Objects are instances of a Type on which <= binary relationship is defined:

Objects a, b, c…
Arrows: a <= b, b <= c
Transitivity: <=
Identity: a <= a

E.g. natural numbers are a category for preorder.

Monoids are categories with one object. This object, by convention, is written as ‘*’. All arrows links * to itself. In a turn of notation I’ll use the symbol |+| for composition and ‘mzero’ for identity.

Objects: *
Arrows: f, g, h: * -> *
associativity
identity: * -> *
mzero |+| a = a
a |+| mzero = a

Object is just an anchor, to hold arrows.

Category of preorders:
A preorder is a set with <=
Objects: all the preorders (Scala integers)
arrows: Homomorphisms on preorders

[NdM: at this point things start to be complicated and I have likely lost the speaker. What follows are likely to be glimpses of his talk… the parts I got]

Monotonic functions.
e.g. string length : _.length is a monotonic function.
Preserves relationship:
x <= y then h(x) <= h(y)

Category of monoids:
Objects: all the monoids
Arrows: monoid homomorphism
h( x |+| y) == h(x) |+| h(y)

Category of categories
Objects: all the categories
Arrows: category homomorphism F
F( f . g ) = F(f) . F(g) e.g. _map( f compose g ) = _.map(f) compose _.map(g)
these are functors.

Given a category C, there is a category of endofunctors on C
endofunctor F[_]
Objects: functors
arrows: functor homomorphism
h: F[A] => G[A]
h compose _.map(f) = _.map(f) compose h

trait ~>[F[_].G[_]] { def apply[A](a: F[A]) : G[A] }

A monad on C is an endofunctor M[_] together with two natural transformations:

pure[A] : A=>M[A]
join[A] : M[M[A]] => M[A]

A monad is a kind of category:
M[M[M[A]]] : can be associated as we want, preserving the result.

A monad is a monoid in an endofunctor category.

[And with this truth the brief talk is over]

Scala Days – Berlin

Scala… Where to start? Well I’m going to the Scala Days 2016 in Berlin. This is the first conference of this kind I will attend. Maybe I would have preferred a conference about C++, such as c++con, but this is what I got. And, as a second thought, Scala has nothing to envy to C++, at least in terms of unfriendliness and unreadability.
In Scala I like that I am not forced to use Java when programming for the JVM, but I find that by looking at the pro and weakness of the existing languages, EPFL could have designed the language somewhere more … Industrial. Instead it felt in several pitfalls, first of which is the misconception that the speed of developing software is capped by the speed of typing the code.
Anyway, here I am waiting for the conference to begin. Keynote speech will be from Martin Odersky voice. Martin is, of course the inventor of Scala language and, if I understood correctly, a cofounder of lightbend (was Typesafe) the for-profit company that backs up Scala and its environment.
image