Scala Job Interview – Reactive Programming


Here we are with the fourth installment in this Scala Job Interview series. The first was on general questions, the second on the Scala language, and the third on functional programming.

Explain the actor model

In this case, the Actor Model has nothing to do with the Star System 🙂 I’m not sure where the term actor in “actor model” comes from, but it almost certainly has nothing to do with Hollywood and more to do with the idea of something that “acts”, i.e. performs actions.

In fact, the Actor in the Actor Model is a processing entity that receives and reacts to messages. More precisely, an Actor can only perform computation on receipt of a message; when message processing is done, the actor just idly waits for the next incoming message.

There is no method call; everything is performed via messages. Do you want the actor to perform something? You send it a command message. Do you want to retrieve information from an actor? You ask for it via a message and wait for a reply message.

Actor behavior can be closely modeled after the State Machine abstraction. Each received message is an input symbol that may trigger other actions or cause the machine to change its own state, so that the same input symbol may be handled differently.
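To make the message-driven, state-machine idea concrete, here is a minimal sketch of a toy actor built only on the JDK. It is not Akka: `ToyActor`, `Toggle`, `Greet` and `Stop` are names I made up for illustration, and a real actor library adds supervision, scheduling and much more.

```scala
import java.util.concurrent.LinkedBlockingQueue

// Messages the toy actor understands (hypothetical names, not an Akka API).
sealed trait Msg
case object Toggle extends Msg
final case class Greet(name: String, reply: String => Unit) extends Msg
case object Stop extends Msg

// A toy actor: a mailbox drained by one dedicated thread.
final class ToyActor {
  private val mailbox = new LinkedBlockingQueue[Msg]()

  // Internal state: only the actor's own thread ever touches it.
  private var polite = true

  private val worker = new Thread(() => {
    var running = true
    while (running) {
      // take() blocks, so the actor idles until a message arrives.
      mailbox.take() match {
        case Toggle             => polite = !polite // state change
        case Greet(name, reply) => reply(if (polite) s"Good day, $name" else s"Yo, $name")
        case Stop               => running = false
      }
    }
  })
  worker.start()

  // The only way to interact with the actor: enqueue a message.
  def !(msg: Msg): Unit = mailbox.put(msg)

  // Waits for the actor thread to finish after a Stop message.
  def join(): Unit = worker.join()
}
```

The same `Greet` message gets a different answer before and after a `Toggle`, which is the state machine at work. Since the mailbox is drained by a single thread, messages are handled one at a time and in arrival order.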

Actors closely resemble objects from the Object-Oriented Paradigm: Smalltalk, one of the earliest OOP languages, used the term message where later languages talk about method calls.

The actor model sports the advantages of an OO class, hiding information inside the processing entity while allowing for safe concurrency by having multiple actors process messages at the same time. In fact, if you stick with the model and let the interaction among actors happen only via messages, there is no concurrent access to data.

Actors are instantiated at startup time, or they are spawned from other actors at run time when needed. This forms a relationship – spawned actors are children to the spawning actor. This relationship could be leveraged for error handling and default behaviors.

What are the benefits of non-blocking (asynchronous I/O) over blocking (synchronous I/O)?

The non-blocking approach frees system resources for other tasks while waiting for the operation to complete, whereas the blocking approach freezes the running thread, along with its currently allocated resources, until the operation is complete.

The CPU is still multiplexed among processes and threads, but non-blocking operations allow the current task to better exploit the CPU by starting another operation while waiting for the previous one to complete.

Asynchronous I/O is more complex to handle because you have to somehow model the concurrency involved, designing the code so that a result is not readily available right after it is requested, but arrives when ready, asynchronously with respect to the execution flow.
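As a rough illustration of "the result arrives when ready", here is a small sketch with Scala Futures. `slowLookup` is a hypothetical stand-in for a real I/O call (it just sleeps, which does tie up a pool thread, so take it as a placeholder only); the point is that `combined` returns immediately and describes what to do with the results once they show up.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object AsyncSketch {
  // Hypothetical slow operation standing in for real I/O.
  def slowLookup(key: String): Future[Int] = Future {
    Thread.sleep(50) // placeholder for the waiting time of an I/O call
    key.length
  }

  // Starts both lookups right away, then combines the results when both are ready.
  // The caller gets a Future back immediately; nothing here waits for the results.
  def combined(a: String, b: String): Future[Int] = {
    val fa = slowLookup(a) // already running
    val fb = slowLookup(b) // runs alongside fa
    for {
      x <- fa
      y <- fb
    } yield x + y
  }
}
```

A caller would typically attach a callback (for instance with `onComplete`) or keep composing with `map` and `flatMap`; blocking with `Await.result` is best left to tests and program edges.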

Do you think that Scala has the same async spirit as Node.js?

Not knowing Node.js, I find it a bit difficult to answer this. IIRC browser JavaScript has just one thread and that’s all, but let me have a look on the web (something that’s unlikely to be possible during a job interview).

After a quick peek on the web, I can confirm that JavaScript code runs on a single thread by default, so async is just a matter of not wasting time while waiting for a result if something else could be done in the meantime.

Scala is fully multithreaded, and the ExecutionContext manages the thread pool so that Futures (and async/await-style code built on top of them) can leverage all CPU cores.

Explain the difference between concurrency and parallelism, and name some constructs you can use in Scala to leverage both.

These two terms are often used interchangeably, but their strict definitions are different. Two parallel activities are performed at the same time, on two independent processing units (e.g. two cores of the same CPU, or two distinct CPUs, or two distinct nodes in a network).

Two concurrent activities are carried along together, but they compete for the same processing unit (e.g. two threads or processes on the same CPU core).

Say you have two CPU-bound tasks, i.e. they just perform computations with no I/O waiting. The first task takes t0 time to execute, while the second task takes t1.

When you execute these tasks in parallel, the total execution time is max(t0, t1). The same two tasks executed concurrently on a single processing unit take t0 + t1, plus whatever the switching overhead adds.
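One way to play with these two figures is to run the same pair of CPU-bound tasks on a pool with one worker thread and then on a pool with two. The sketch below is only an approximation of the reasoning above: `busySum` and `runBoth` are made-up names, with a single worker the tasks run back to back rather than interleaved (the total is still about t0 + t1), and the measured times depend on your machine, so I give no expected timings.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PoolSizeSketch {
  // CPU-bound stand-in: sums 1..n with no I/O involved.
  def busySum(n: Long): Long = {
    var acc = 0L
    var i = 1L
    while (i <= n) { acc += i; i += 1 }
    acc
  }

  // Runs two tasks on a fixed pool of `threads` workers.
  // Returns both results together with the elapsed time in milliseconds.
  def runBoth(threads: Int, n: Long): ((Long, Long), Long) = {
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val start = System.nanoTime()
      val f0 = Future(busySum(n))
      val f1 = Future(busySum(n))
      val results = Await.result(f0.zip(f1), 1.minute)
      (results, (System.nanoTime() - start) / 1000000)
    } finally pool.shutdown()
  }
}
```

Calling `PoolSizeSketch.runBoth(1, n)` and `PoolSizeSketch.runBoth(2, n)` with a large `n` on a multi-core machine should show the second elapsed time noticeably below the first, though JIT warm-up and other load can blur the picture.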

By configuring the dispatcher Execution Context in the Akka actor system you actually tune, among other parameters, how much concurrency and parallelism you want in your actor system.

What is the global ExecutionContext?

The global ExecutionContext can be thought of, approximately, as the default thread pool. Asynchronous operations, such as Futures and onComplete callbacks, are executed by asking the provided ExecutionContext for a processing “engine”.

The global ExecutionContext can be used as the default choice, but your program may have specific patterns for concurrent or parallel execution that may benefit from a specific ExecutionContext.

Also note that an ExecutionContext does not need to be a thread pool; in fact, it may be fully synchronous (useful for testing and debugging purposes), or rely on distributed processors.
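The fully synchronous case can be sketched in a few lines by implementing the two abstract methods of `ExecutionContext`. The names `SyncEcSketch`, `sameThread` and `doubled` are mine; treat this as a testing aid under those assumptions, not as something to use in production code.

```scala
import scala.concurrent.{ExecutionContext, Future}

object SyncEcSketch {
  // An ExecutionContext with no thread pool at all:
  // every task runs immediately on the thread that submits it.
  val sameThread: ExecutionContext = new ExecutionContext {
    def execute(runnable: Runnable): Unit = runnable.run()
    def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
  }

  // The ExecutionContext is passed explicitly instead of importing the global one.
  def doubled(x: Int): Future[Int] = Future(x * 2)(sameThread)
}
```

With this context the Future is already completed when `doubled` returns, which makes test assertions deterministic at the price of losing any concurrency.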

What does the global ExecutionContext underlay?

The question is not very clear to me, so, in a real job interview, I would try to get some more information on what the interviewer is asking.

I read that Java (starting from Java 7) has a ForkJoinPool class in the java.util.concurrent package. It implements the Executor interface, whose execute method mirrors the one in ExecutionContext.

After more net surfing I stumbled upon this article, which states that the global ExecutionContext, which is backed by a ForkJoinPool, takes the burden of wrapping the code/function to be executed in a ForkJoinTask away from the programmer. Since I never used thread pools in Java, but only ExecutionContext in Scala, I never had such a burden and I considered the Scala way the standard way of doing asynchronous programming.
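To get a feeling for the wrapping that is taken off your hands, here is a small side-by-side sketch. The object and method names are hypothetical; the first method talks to a `ForkJoinPool` the Java way, the second puts the same pool behind an `ExecutionContext` and lets `Future` do the packaging.

```scala
import java.util.concurrent.{Callable, ForkJoinPool}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object UnderlyingPoolSketch {
  // Java style: wrap the work in a Callable and submit it to the pool yourself.
  def javaStyle(pool: ForkJoinPool): Int = {
    val task = pool.submit(new Callable[Int] { def call(): Int = 21 * 2 })
    task.get() // blocks until the pool has run the task
  }

  // Scala style: the pool hides behind an ExecutionContext and Future wraps the work.
  def scalaStyle(pool: ForkJoinPool): Int = {
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    Await.result(Future(21 * 2), 5.seconds) // blocking only to show the result
  }
}
```

Both methods compute the same value on the same pool; the difference is only in how much plumbing shows up in your code.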

Akka: Which are the 3 main components in a Stream?

Akka Streams is a dataflow model that lets you design and code transformation pipelines. A stream must have an origin (Source) that produces data and feeds it into the stream. At the other end, data is pushed out of the stream through a Sink. The third component is a data processor, named Flow, that gets data from its input, processes it, and emits the results on its output.

Much like an actual pipeline, these components can be connected together to implement complex behaviors.
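The way the three pieces snap together can be mimicked with plain Scala functions. To be clear, this is a toy model with made-up types (`ToySource`, `ToyFlow`, `ToySink`), not Akka's API: it pulls elements through an `Iterator` and has no backpressure, materialization or asynchronous boundaries.

```scala
object ToyStreamSketch {
  // Hypothetical stand-ins for the three roles (NOT Akka's types).
  type ToySource[A]  = () => Iterator[A]         // produces elements
  type ToyFlow[A, B] = Iterator[A] => Iterator[B] // transforms elements
  type ToySink[A, R] = Iterator[A] => R           // consumes elements, yields a result

  val numbers: ToySource[Int]        = () => Iterator(1, 2, 3, 4, 5)
  val doubleEvens: ToyFlow[Int, Int] = _.filter(_ % 2 == 0).map(_ * 2)
  val sum: ToySink[Int, Int]         = _.sum

  // Connects source, flow and sink into one pipeline and runs it.
  def run[A, B, R](source: ToySource[A], flow: ToyFlow[A, B], sink: ToySink[B, R]): R =
    sink(flow(source()))
}
```

Here `ToyStreamSketch.run(numbers, doubleEvens, sum)` keeps 2 and 4, doubles them and adds them up. In Akka Streams the same shape is expressed by connecting a Source to a Flow and then to a Sink before running the resulting graph.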

Conclusions and wrap up

This was the least fun section to answer, possibly because things are getting very specific and there is not much space for discussion (the Actor Model aside). I think that the actor topic could be better explored by discussing its strengths and pitfalls.

Stream processing, as well, got very little attention, with just a generic question that you could answer even with a very sketchy knowledge of the matter.

Nonetheless, I enjoyed testing my knowledge of the Scala universe over these four posts, pretending to be interviewed for a job. If you find this useful, or spot any error, inconsistency, or fault, I would really appreciate it if you left me a comment (otherwise I would be lured into thinking that my knowledge is no less than perfect).
