In this second installment, I’ll look closer to the core of functional programming – functions. When I started programming, BASIC was the language of choice… the only available choice. And no, it wasn’t the quite convoluted Visual Basic, but the old plain, primitive BASIC with line numbers and poor syntax.
In those days I just used GOTO
s and looked with suspicion at the GOSUB
/RETURN
wondering why in the world I would need to automatically return when I could GOTO
everywhere I needed… But I’m digressing. It is just for saying that (besides yours truly could be not the best authority on this matter), programming languages have come a long way and the poor BASIC programmer had to spin the evolution wheel for quite a while to reach to concept of function as presented in this post.
Functions, when citizenship matters
When functional programmers talk about “programming with functions”, they really mean it – using functions not only as something that can be called but also as something that can be operated in some ways, much like other (traditional) data types.
This is what is usually referred to as “first-class citizenship”. A first-class citizen has the same rights as the other citizens. Consider integers – in most languages you can create integers, assign them to variables, passing them as arguments to functions, receive them as results from functions.
If you can do the same with functions, then functions have first-class citizenship, otherwise, they may be citizens of second, third, or even inferior classes, being limited in how the programmer can employ them.
How well does C++ fare here? Well, not that much. I can hear some of you saying – “What? Wait… C++ has function pointers, we can pass them around, how could you claim that C++ is not doing fine with functions first-class-citizenship?”
In fact, a function pointer is not a function, in much the same way a pointer to an int is not an int. The difference between ints and functions is that I can instantiate an int
and pass it by value or by pointer (or reference), but I cannot instantiate a function, I can only take its address and pass it around as a pointer.
The limitation is pretty clear when you try to compose functions – say you have a function that transforms an integer and you create a new function that adds a number to the original function.
In a functional-programming versed language you would write something like:
def makeFn( f: Int => Int, toAdd: Int ) : Int => Int = f(_)+toAdd
(where f(_)+toAdd
is syntactic sugar for (x:Int) => f(x)+toAdd
).
Using function pointers, you would write something with the signature:
int (*)( int ) makeFn( int(*f)(int), int toAdd )
{
...
}
(I hope I got the syntax right, not an easy task, even after nearly 35 years of C).
But… how would you implement makeFn
?
You can’t use function pointers, because they can be taken only from functions that already exist. On the other hand, you can’t use a lambda (more on this below), since each lambda has its own type not compatible with function pointers (not entirely true, but a good approximation; also ignoring how to deal with the lambda object lifetime).
Is there a way out? No, and yes. No, because there is no solution in the language, but yes because there is a solution in the standard library.
The solution is – std::function
. This is a template capable of storing functions, methods, and lambdas and then call them on demand. std::function is quite a piece of template machinery that stores a function object along with its type so as to be able to call it when needed.
The std::function
is quite a brilliant solution, but has two major drawbacks, first, it may allocate dynamic memory, and then it is not part of the language.
Dynamic memory may be required to host the environment needed to call the function (this may be more clear when we’ll talk about lambdas). But this may prevent this technique from being used in embedded systems or other constrained contexts where memory or determinism is at stake.
Being not part of the language means that it is up to the optimizer to figure out the intent of the programmer and take proper actions. Godbolt is your friend for checking your hypothesis. See also this explanation on how std::function works and why it is not so efficient.
The λ-word
When you type 3
, or even 3+2
, you are producing an integer, but it has no use until you assign it to a variable or pass it, as an argument, to a function. To do the same with functions you need lambdas.
Lambdas (from the Greek letter λ), or anonymous functions, are another step-stone of functional programming. The idea is that you can define an unnamed function, that will be called sometime in the future. This function can be stored in a variable or passed as an argument to a function.
The lambda also captures its environment so that it can perform properly even when called from another place. You may think of the capture environment as a set of arguments that are fixed when the lambda function is defined, as opposed to lambda arguments that are passed when the function is called.
C++ has lambdas since C++11. The first thing about C++ lambdas that hits (pretty badly) the eye, is that their syntax is a triumph of clumsy ugliness. They are so ugly that remind me of C++ casts. C++ casts are a rationalization and a specialization of the traditional C cast, and ugliness serves a purpose.
C++ casts were made with an ugly syntax for the very reason to discourage their use. Given this precedent, the suspect that C++ lambda syntax is so terribly ugly for the same reason comes quickly to mind.
Here is a simple C++ lambda:
auto f = []( int x ){ return x+3; }
;
The initial square brackets are meant to be for capture specification, i.e. what values and how to capture them from the lambda declaration context. In capturing, C++ shows a bit more flexibility with respect to other languages. In fact, C++ can capture by reference or by copying. Copying is a good way to avoid sharing the state across the lambda boundaries (e.g. you may capture values whose life is going to end as the execution returns from the function where the lambda is declared).
Next, parenthesis enclose the parameter list. In the same way, a standard function accepts a sequence of arguments, the lambda function has a sequence of arguments.
Lambda arguments need to be listed with their type, but you can use the generic type auto
. A lambda with auto
the argument is basically a template and therefore it can be applied to any type that satisfies the lambda implementation body.
I see two shortcomings in C++ lambdas. First, they are ugly as sin, and I mean it. in other languages, the lambda in the example above can be expressed as _+3
.
This makes the use of lambdas in C++ very verbose, cumbersome, and error-prone. An uncountable number of times I just forgot either the return keyword or the semicolon at its end.
This verbosity gets in the way when you need to use lambdas as arguments for other functions. E.g. consider how much more verbose and less readable the C++ version:
std::find_if( c.begin(), c.end(), [n]( auto x ){return x%n == 0;} )
with respect to the Scala version:
c.find( _ % n == 0 )
(well the begin()
/end()
pair doesn’t help as well).
It is not a matter of the number of key presses, it is a matter of readability, a matter of clearly communicating the programmer’s intent to the compiler and to other human readers.
A proposal for simpler lambdas has been rejected by the C++ ISO committee. Although the committee is looking forward to other similar proposals, it may take years or even never happen (just look at concepts).
Nonetheless, this casts a gloomy light on the matter, in the reasonable hypothesis that the proposer is smart and accurate and that the committee is clever and knowledgeable, the refusal means that C++ has grown so complex and articulate that even a seemingly simple addition has obscure and complex implications far from being evident nor easily fixable.
The other problem with lambda is that each one of them has its own different type. This ties up with the lack of a function type in C++ and requires you to either use std::function
or templates to accept functions as arguments, and/or to return functions. E.g.:
template<typename F>
std::function<int(int)> makeFn( F f, int n )
{
return [n,f]( int x ) {
return f(x)+n;
};
}
(and this incidentally solves the opening problem in the functions, when citizenship matters section above).
Now, template usage is likely to cause a domino effect – since you don’t know type T, you need to write the function handling T as a template, but in turn, functions calling this function, or being called by it, need to be a template as well.
Templates are not bad per se, but they could bloat your binaries and/or make compilation time explode. And what leaves me a bit uneasy is that there is not a real need for templates, but for the lack of a function type.
The only way to avoid the template domino effect is to use the aforementioned std::function
.
It’s no surprise that both lambdas and 1st class citizenship for functions are an afterthought. And there’s nothing bad with this. The real problem is the bad fit, that separates C++ from other languages that have been “functionalized” on the way. Here is a comprehensive list of lambdas in various languages, but a look at C# and Java gives an idea of how much simpler things get in the neighborhood.