Surprisingly Exceptional

It all starts in schools. When they teach you how to program a computer. You get plenty of code that just works in the happy path scenario. So happy path that there are no other paths at all.

And then it is easy to grow, line of code after line of code, with the idea that error handling is not really part of the code. Elegant code has nothing to do with errors. Could be a sort of brainwashing.

And then language designers present you with the ultimate error-handling solution – lo and behold, exceptions!

You can still write the code the way you were taught and then when something bad happens, throw an exception. A flare fired in the sky, in the hope that some alert patrol on guard could spot and come to the rescue.

The reality is that sh*t happens and your code must be fully equipped for the worst. Failing to design sound and robust error handling is cursing our software with unreliability and instability (and possibly every kind of security trouble).

I’m all against exceptions in C++ since they are generally used to sweep the dirt under the carpet and make it harder to reason about the program. Other, more elegant and versatile error-handling techniques are available and I’m for their integration into C++ code (and eventually into the C++ language, but I am afraid this could be close to impossible).

So, when I saw that the exception topic was still free at the biweekly technical self-training meeting we hold at Schindler Informatica Milano, I rushed to sign up to talk to my colleagues about exceptions and why they should be avoided.

Here I’m writing up my presentation, adding as many details as make sense for a post. My intention is not to describe the basics (throwing, catching), which are probably known, if not familiar, to my readers, but to shed some light on the least intuitive parts and inner workings.

The Basics

Briefly, when an exception is thrown, the normal execution flow is interrupted, and the executed path is searched backward to the first catch statement that matches the thrown exception.

During this backward walk, all the objects constructed on the stack are destroyed and functions return immediately to the caller. This is called stack unwinding.

If no matching catch statement is found, then the run-time function std::terminate() is called. This function, by default, aborts the execution of the entire program. You can redefine the behavior by calling std::set_terminate( f ), so that f is called from std::terminate(). Nonetheless, the standard prescribes that f must not return; if it does, the behavior is undefined.
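As a minimal sketch of the paragraph above, here is a terminate handler that logs something and then aborts, so that it never returns. The names logAndAbort, installTerminateHandler and demo are made up for this post:

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical handler: report the problem, then abort.
// A terminate handler must never return to its caller.
[[noreturn]] void logAndAbort()
{
    std::fputs("fatal: unhandled exception, aborting\n", stderr);
    std::abort();
}

// Installs the handler and returns the one that was active before.
std::terminate_handler installTerminateHandler()
{
    return std::set_terminate(logAndAbort);
}

// Usage: after the installation, our function is the current handler.
void demo()
{
    installTerminateHandler();
    assert(std::get_terminate() == &logAndAbort);
}
```

Installing the handler early in main() is the usual choice, so that it is in place before anything can throw.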

Also, not everyone knows that you can actually throw any type, even built-in types – int, pointers, and booleans. Of course, the information you can convey is way too limited and generic for a real application, but it may come in handy for debugging and testing when you don’t have a more elaborate type nearby.

But beware:

#include <iostream>

int main() {
    try {
        throw 3;
    }
    catch( long n ) {
        std::cout << "n=" << n << '\n';
    }
    catch( ... ) {
        std::cout << "ops\n";
    }
    return 0;
}

This will print “ops” since no implicit integer conversion or promotion is applied when matching a handler.

Catching a superclass of the thrown object instead works as expected, so that you can catch entire families of exceptions with a single catch statement.
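A small sketch of catching by base class, using the standard hierarchy from <stdexcept>. The function classify and its kind parameter are invented for the illustration:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: throws a different exception per 'kind' and reports
// which handler caught it.
std::string classify(int kind)
{
    try {
        if (kind == 0) throw std::out_of_range("index");
        if (kind == 1) throw std::invalid_argument("argument");
        if (kind == 2) throw std::runtime_error("io");
        return "nothing thrown";
    }
    catch (const std::logic_error& e) {   // out_of_range and invalid_argument
        return std::string("logic: ") + e.what();
    }
    catch (const std::exception& e) {     // any other standard exception
        return std::string("other: ") + e.what();
    }
}

void demo()
{
    assert(classify(0) == "logic: index");
    assert(classify(1) == "logic: argument");
    assert(classify(2) == "other: io");
    assert(classify(3) == "nothing thrown");
}
```

Handlers are tried in order, so the more derived class has to come first, otherwise the std::exception handler would catch everything.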

Note that if, during the stack unwinding, another exception is thrown (directly or indirectly by one of the called destructors) and escapes the destructor, std::terminate is called. This is the main reason for avoiding throwing in destructors.

Lifecycle

To be precise, throw accepts an expression, whose result is used to copy-initialize (by copy or by move) the temporary object that is the actual exception.

Speaking of temporary, the first question that comes to mind is, where is it allocated? Immediately followed by the question when is it destroyed?

Considering that the exception needs to survive a dramatic stack unwinding, the question is not trivial at all. According to the standard, the location that accommodates the exception is implementation-specific [14.4(4)]. I found the description of the strategy of a given implementation that makes sense and I guess could be pretty general –

  1. the run-time tries to allocate the space for the exception instance. If the allocation fails then
  2. the run-time uses an “emergency” static space. If the space is not enough then
  3. std::terminate is called.

Given these hidden constraints, it makes sense to keep the exception class as small as possible.

The exception is destroyed when the catch scope is exited. If no suitable catch is found, I guess the exact timing of the destruction is not relevant since the program is going to terminate.

There’s an exception though: when you get a handle to the exception object via the std::current_exception() function. This call returns a std::exception_ptr, a pointer-like object that refers to the exception. It cannot be dereferenced directly; to access the exception you rethrow it with std::rethrow_exception() and catch it again.

std::exception_ptr is reference counted, so the exception is kept alive as long as there is at least one std::exception_ptr referring to it.
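Here is a minimal sketch of how a stored exception can be inspected later. Since std::exception_ptr has no dereference operator, the exception is rethrown with std::rethrow_exception and caught again. The helpers messageOf and capture are hypothetical names of mine:

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: extracts the message from a stored exception.
std::string messageOf(const std::exception_ptr& p)
{
    if (!p) return "no exception";
    try {
        std::rethrow_exception(p);
    }
    catch (const std::exception& e) {
        return e.what();
    }
    catch (...) {
        return "unknown exception type";
    }
}

// Captures the exception in flight; it outlives the catch scope
// for as long as the returned exception_ptr is around.
std::exception_ptr capture()
{
    try {
        throw std::runtime_error("disk full");
    }
    catch (...) {
        return std::current_exception();
    }
}

void demo()
{
    assert(messageOf(capture()) == "disk full");
    assert(messageOf(std::exception_ptr{}) == "no exception");
    assert(messageOf(std::make_exception_ptr(42)) == "unknown exception type");
}
```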

Now suppose you want to make a wrapper around the exception class so that you can catch an exception and return it as an error value. That would face the problem of slicing, that is, keeping just the base class (the one that is named in the interface) and slicing the specialization (i.e. the subclass-specific part) away.

The only way to make this work is by using pointers to the exception objects, and this can be done via std::exception_ptr:

#include <exception>
#include <optional>

std::optional<std::exception_ptr> f()
{
    try {
       // possibly throwing code
       return std::nullopt;
    }
    catch( ... ) {
        return std::current_exception();
    }
}

And if you set up this kind of error handling you may want to be able to create a std::exception_ptr without actually throwing and catching an exception. Is it possible?

You can actually use the std::make_exception_ptr function:

std::exception_ptr f()
{
  ...
  if( fail ) {
    return std::make_exception_ptr( std::logic_error{ "Heck!" });
  }
  ... 
}

Multithreading

Since C++11, multithreading is a way not only to have multiple crashes at the same time but also to throw multiple exceptions at the same time.

What happens when an exception thrown in a thread is not caught?

Alas, neither std::thread nor std::jthread pays any special attention to this – the uncaught exception just triggers std::terminate(), bringing the whole program to termination.

Therefore the following code –

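#include <chrono>
#include <iostream>
#include <stdexcept>
#include <thread>

int main()
{
    try
    {
        std::thread([]{
            throw std::runtime_error("HELLO");
        }).join();

        std::this_thread::sleep_for(std::chrono::seconds(5));
    }
    catch( ... )
    {
        std::cout << "yes\n";
    }
}

never prints “yes”: std::terminate() is called from the thread and the whole program is aborted.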

If you want to relay the exception from a thread to its parent you have to build a promise/future contraption:

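#include <exception>
#include <future>
#include <iostream>
#include <stdexcept>
#include <thread>

int main()
{
    std::promise<int> p;
    std::future<int> f = p.get_future();
    std::thread t([&p]{
        try {
            throw std::runtime_error("HELLO");
        }
        catch( ... ) {
            // hand the exception over to the thread that owns the future
            p.set_exception(std::current_exception());
        }
    });

    try {
        int x = f.get();    // rethrows the exception stored in the promise
    }
    catch( std::exception& e ) {
        std::cout << "what " << e.what() << '\n';
    }
    t.join();
}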

This works, but it is not automatic: it takes quite some diligence to apply it every time you run a thread, and it only fits threads that are expected to do some work and return.

If your threads fit this pattern, then possibly the best solution is to use std::async, which automatically catches the exceptions and hands you a std::future:

#include <future>
#include <iostream>
#include <stdexcept>

int main()
{
    auto f = std::async( []() -> int { throw std::runtime_error("HELLO");} );
    try {
        auto x = f.get();
    }
    catch( std::exception& e ) {
        std::cout << "what " << e.what() << '\n';
    }
}

The code is also more compact and clear.

BTW, I don’t much like that a future cannot be probed to find out whether it holds a value or an exception, so that get() has to throw if the async task was terminated by an exception. Combining std::future with std::expected (which only arrives in C++23) would yield a more consistent and safe design.

Function Signatures

Before C++17 (and deprecated since C++11), the language had dynamic exception specifications, its own take on checked exceptions. This construct allowed the programmer to list all the possible exceptions that a function could throw:

void f() throw( std::runtime_error );

The line above means the function f may throw a std::runtime_error or one of the classes that derive from std::runtime_error. If f, either directly or indirectly, tries to throw something else, then std::unexpected() is called, which by default calls std::terminate().

This solution had a number of disadvantages, mainly:

  • Some code is added to perform the check (which may involve RTTI), and when an exception is thrown this code is executed, causing overhead;
  • Being part of the signature, the exception list becomes part of the interface contract, making it harder to change the implementation;
  • The standard required only a run-time check, meaning that the compiler wouldn’t even warn for the following code –
void f() throw() // this means that nothing is thrown from f
{
  throw 0xdead;
}

This leaves a surprising termination in store at run time, as soon as f() gets called.

For this reason, the standard committee decided to “simplify” and just mark whether a function may or may not throw:

void troubleless() noexcept;
void troublesome();

As with most new additions to the language, the default is the wrong one, being the most permissive and least safe. Anyway, the noexcept specifier after troubleless() states that the caller can safely expect no exception to escape from this call.

Conversely, troublesome() is a function that may or may not throw and the caller should take special care.

What happens if a noexcept function attempts to throw an exception? You guessed it right – std::terminate is called.

Interestingly, noexcept can accept a boolean expression. The last example code can be rewritten as

void troubleless() noexcept(true);
void troublesome() noexcept(false);

This is not really useful unless you are defining a function template. Consider a template function that makes a copy of the templated type. When this template is instantiated on a type, let’s say an int, for which the copy doesn’t throw then the function itself doesn’t throw. On the other hand, if you have a string, then the copy may actually throw (e.g. when out of memory) therefore you want to expose that the function may throw.

These two different behaviors may be implemented using type traits:

#include <type_traits>

template<typename T>
T f( T const& t ) noexcept(std::is_nothrow_copy_constructible_v<T>)
{
    return t; // copy t 
}
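To check what the conditional specifier actually produces, here is a small sketch that instantiates the same kind of template on int and on std::string and probes the result with the noexcept operator. I renamed the function to duplicate just to keep the block self-contained:

```cpp
#include <string>
#include <type_traits>

template<typename T>
T duplicate(T const& t) noexcept(std::is_nothrow_copy_constructible_v<T>)
{
    return t; // copy t
}

// Copying an int cannot throw, so this instantiation is noexcept.
static_assert(noexcept(duplicate(42)));

// Copying a std::string may allocate, so this one is potentially throwing.
static_assert(!noexcept(duplicate(std::string{})));
```

Both checks happen at compile time; nothing is copied when the assertions are evaluated.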

You may wonder why you should take on the burden of making the distinction. There are two reasons –

  1. The caller should take proper action if a called function may throw, and more precisely must ensure that all the acquired resources are properly disposed of (and not leaked) in case of exception;
  2. the compiler may take advantage of a noexcept function since stack unwinding is not required from that call.

The keyword noexcept is used both as a function specifier and as an operator. In the latter case, it returns false if the argument may throw an exception. The noexcept argument is not evaluated.

In the following line –

void f() noexcept( noexcept( T{} ));

the inner noexcept examines the T{} expression (a call to T’s default constructor) without executing it. If the default constructor may throw, then the inner noexcept yields false, and the outer noexcept marks the function as potentially throwing.
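A sketch of the nested form, written as a template so that T is an actual parameter. Quiet, Loud and makeDefault are hypothetical names used only for this illustration:

```cpp
struct Quiet { Quiet() noexcept {} };   // hypothetical: constructor cannot throw
struct Loud  { Loud() {} };             // hypothetical: constructor may throw

// The inner noexcept is the operator (a compile-time question about T{}),
// the outer one is the specifier that receives the answer.
template<typename T>
T makeDefault() noexcept(noexcept(T{}))
{
    return T{};
}

static_assert(noexcept(makeDefault<Quiet>()));
static_assert(!noexcept(makeDefault<Loud>()));
```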

Again, these mechanisms are mostly useful when designing libraries. In user code, the noexcept function specifier is enough.

Exception Cost

One of the core design decisions for the C++ evolution, from the beginning, has been – “you don’t pay for what you don’t use” (zero-overhead principle), meaning that language features should not come with a diffuse overhead that exerts a cost even when they are not used in the code.

On the other hand, before using something it is usually worth looking at the price and understanding whether you can or want to afford it.

Time and Space

Exception cost comes in two flavors –

  1. During the stack unwinding, the code generated by the compiler must select which destructors are to be called among all the possible objects available in the stack frame. This selection is defined via address tables. These tables use memory for the sole purpose of exception handling.
  2. The function code needs to be properly arranged and sorted so that the stack can be unwound.

As you may notice these two points contradict the design principle – meaning that even if your code is not using exceptions, but is compiled with exception support enabled then it’ll pay these costs.

Moreover, exceptions may introduce unpredictable latency in error handling: the code that throws has no indication of where the handling code is or how long it will take to get there. Conversely, the handling code has no clue where the exception was thrown from or how long it took to arrive.

In general, latency is not a problem, except when you are designing a real-time system.

When talking about cost, time and space are not the only metric you should look at. Exceptions also touch another kind of cost –

Resource Leakage

If resources are not properly guarded with the RAII idiom, then they may leak. Consider the following code:

void g() {
    FILE* fp = fopen( "file.txt", "r" );
    T* p = f();
    if( fp != nullptr ) {
        doRead( p, fp );
        fclose(fp);
    }
    delete p;
}

In this code both f and doRead may throw, causing the loss of an open file and/or the loss of the dynamic memory returned by f.

This kind of code is indeed becoming rare – naked pointers are replaced by smart pointers that take care of deallocation, and resources are wrapped in RAII classes – but you may still need to use an old library or maintain code that works this way.
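For comparison, this is roughly how the leaking function could look once both resources are owned by RAII objects. It is only a sketch: Record, makeRecord, doRead and readFirstByte are stand-ins I invented for the undefined pieces of the example above.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <stdexcept>

// Deleter that lets std::unique_ptr own a C FILE handle.
// unique_ptr never calls it with a null pointer.
struct FileCloser {
    void operator()(std::FILE* fp) const noexcept { std::fclose(fp); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

struct Record { int firstByte = -1; };   // hypothetical payload type

// Hypothetical stand-ins for the allocation and the read of the example.
std::unique_ptr<Record> makeRecord() { return std::make_unique<Record>(); }

void doRead(Record& r, std::FILE* fp)
{
    r.firstByte = std::fgetc(fp);
    if (r.firstByte == EOF) throw std::runtime_error("empty file");
}

// Returns false if the file cannot be opened; may throw while reading.
// On every path the file is closed and the Record is freed.
bool readFirstByte(const char* path)
{
    FilePtr fp{ std::fopen(path, "r") };
    auto record = makeRecord();
    if (!fp) return false;
    doRead(*record, fp.get());
    return true;
}

void demo()
{
    // A missing file is reported through the return value; nothing leaks.
    assert(!readFirstByte("no-such-file-for-this-sketch.txt"));
}
```

There is no fclose and no delete left in the function body, so there is nothing to forget on the exceptional path.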

Code Readability

Another cost is the cost of development and maintenance. Being a sort of side-channel to the normal execution flow, exceptions make code harder to reason about. You can’t be sure which functions may throw and under which circumstances, which in turn means that you can’t be sure whether a given line is executed on every execution path. Consider the following code:

void foo()
{
    bar();
    baz();
}

You may be tempted to assert that if bar() is executed, then baz() is executed as well. But this is not guaranteed if bar() may throw. What if you need baz() to be always executed?

Considering that there is no finally concept in C++, you may need something like:

void foo()
{
    try {
        bar();
    }
    catch( ... ) {
        baz();
        throw;
    }
    baz();
}

This, for sure, is not as trivial and readable as the former code. And what if baz() may throw as well? If both bar() and baz() throw an exception, which one has to be propagated?
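A common workaround for the missing finally is a scope guard, i.e. an object whose destructor runs the cleanup. The following is a minimal sketch, not a library-grade implementation; Finally, trace and demo are names of mine, and the guarded action is assumed not to throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Minimal scope guard: runs the stored callable when the scope is left,
// whether normally or because an exception is propagating.
template<typename F>
class Finally {
public:
    explicit Finally(F f) : f_(std::move(f)) {}
    ~Finally() { f_(); }    // assumption: f_ does not throw
    Finally(const Finally&) = delete;
    Finally& operator=(const Finally&) = delete;
private:
    F f_;
};

std::vector<std::string> trace;    // records the calls, for illustration only

void bar() { trace.push_back("bar"); throw std::runtime_error("bar failed"); }
void baz() { trace.push_back("baz"); }

void foo()
{
    Finally guard{ []{ baz(); } };  // baz() runs on every exit path
    bar();
}

void demo()
{
    try { foo(); }
    catch (const std::runtime_error&) { trace.push_back("caught"); }
    assert((trace == std::vector<std::string>{"bar", "baz", "caught"}));
}
```

The guard does not answer the question of what to do when baz() throws too: a throwing destructor during unwinding ends in std::terminate, so the guarded action has to be non-throwing.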

Alternatives

I hope I gave you enough context to deem exceptions a bad choice for error handling. What are the alternatives?

Terrible Alternatives

Mentioned just for completeness –

  1. setjmp/longjmp. This is a C mechanism that has all the disadvantages of exceptions without any of the advantages. I won’t spend a word more than this; if you are interested, look it up.
  2. assert. Assert causes the termination of the program and is a proper option to give up when invariants are broken, but this condition is beyond the exceptional condition that an exception is expected to report.
  3. Zombie objects are objects with a valid/invalid flag. When the flag is set to invalid all operations are no-op. You can check the flag, but having an object in a zombie state causes no (direct) harm. This is bad since it may give a false sense of progress when actually nothing has happened and the error may manifest quite far away from the cause. Also, this violates the principle “Make invalid states unrepresentable in your code”.

Bad Alternatives

Let’s talk about these just to avoid them.

  1. Global error variable. This is the traditional Unix approach (see errno). It behaves badly with multithreading unless the variable is made thread-local (as errno is in modern implementations), it is not clear who has to clear the error variable, and you may ignore it without even a warning;
  2. Error code returned from a function. This is another classic – but you can ignore the code (you may enable a warning by using the [[nodiscard]] attribute) and the code eats up the only value you can actually return from a function, requiring you to return the function result via an “out” parameter;
  3. Null pointers. An evergreen, always a cause of great joy. The null pointer is particularly evil because it is a pointer and fully belongs to the set of possible pointer values. Every operation you can apply to a valid pointer may be applied (with undefined behavior) to a null pointer.

All these alternatives have in common that the error condition may be ignored and the execution may process a result value that could be invalid or null, causing run-time errors not necessarily close to where the error’s root cause occurred.

Better Alternatives

A better alternative is to extend the domain of the return value of a function to include error conditions. This forces the programmer to examine the returned value to extract the actual result, but this is only possible if the function has succeeded.

The first tool, available from C++17 is std::optional. Despite the interface being a mess and showing the C++ standard committee’s lack of understanding of functional programming (or how this data structure is designed in other languages), std::optional may be used for the purpose –

#include <iostream>
#include <optional>
#include <string_view>

std::optional<int> stringToInt( std::string_view s ) noexcept;

void f( std::string_view s )
{
    auto n = stringToInt( s );
    if( n.has_value() ) {
        std::cout << "number is " << n.value() << "\n";
    }
    else {
        std::cout << "'" << s << "' is not a number\n";
    }
}

The idea is that before accessing the result, you should check if the result is there or not. Trying to access a non-existent value through value() will throw an exception (std::bad_optional_access).

Well, that’s not entirely true – if you use the dereference operator to access the value, there won’t be any checks, and accessing a non-existing value will turn into UB. Go figure.
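The example only declares stringToInt. One possible implementation, sketched here on top of std::from_chars from C++17, could be the following. The choice of accepting the string only when it is consumed entirely is an assumption of mine:

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Accept the string only if std::from_chars consumes all of it
// without reporting an error (invalid input or out-of-range value).
std::optional<int> stringToInt(std::string_view s) noexcept
{
    int value = 0;
    const char* const end = s.data() + s.size();
    const auto [ptr, ec] = std::from_chars(s.data(), end, value);
    if (ec != std::errc{} || ptr != end) {
        return std::nullopt;
    }
    return value;
}

void demo()
{
    assert(stringToInt("42").value() == 42);
    assert(stringToInt("-7").value_or(0) == -7);
    assert(!stringToInt("4x").has_value());
    assert(!stringToInt("").has_value());
    assert(stringToInt("forty-two").value_or(-1) == -1);
}
```

value_or() is a handy third way to access the result: it never throws and never runs into undefined behavior, at the price of needing a sensible default.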

There exist different (non-standard) implementations of optional that allow for monadic operations – notably Sy Brand’s and mine. But talking about monadic operations would need a dedicated space.

std::optional may only tell if a value exists, so you can’t really know anything about the reason for the lack of value. If you are interested in such a reason, and you can use a C++23 compiler, you have to switch to a more elaborate object: std::expected.

The std::expected type contains either a result object or an error object, resulting in an entity that is always defined. As for std::optional you need to examine the object in order to determine whether the result is available or not:

#include <expected>
#include <iostream>
#include <string_view>

enum class StringToIntError {
  InvalidCharacter,
  OutOfRange,
  EmptyString
};

std::expected<int,StringToIntError> stringToInt( std::string_view s ) noexcept;

void f( std::string_view s )
{
    auto n = stringToInt( s );
    if( n.has_value() ) {
        std::cout << "n=" << n.value() << "\n";
    }
    else {
        std::cout << "error (" << int(n.error()) << ")\n";
    }
}

Also for this class, you may try to access the value in an unchecked way through the dereference operator, earning undefined behavior if the value is not there.

If you can’t use C++23, or you want monadic operations, once more you can use Sy Brand’s implementation or mine.

The only problem with std::optional and std::expected (or their equivalents from other libraries) is that these are new entries in the language and programmers may not be used to seeing them and reasoning about them in the codebase. I think more experienced programmers should push these idioms and help other programmers to become familiar with them and use them, because the advantages in terms of understandability and correctness vastly outweigh the cost of learning their use.

Guidelines

From great powers come great responsibilities, and spraying exceptions absent-mindedly all over your code is never a good idea. The C++ Core Guidelines have an entire section devoted to exceptions and error handling; it might be a good idea to start there. Here I’ll list only the most important.

Use Exceptions for Exceptional Cases

If you are parsing a string, you somewhat expect that the string might not be properly formatted. If you are waiting for input from an external device, you should acknowledge that it is part of the operation to handle the case that no input arrives before a timeout.

These cases are not exceptional and should not be handled by exceptions. Also consider that, in the case of parsing, throwing exceptions may make it harder to recover from wrong input and continue the parsing.

Don’t Use Exceptions for Flow Control

Normal flow control is better implemented by traditional constructs. In this context, exceptions are very similar to a goto that transfers control to an outer scope. This makes programs harder to understand and reason about, let alone less efficient.

Never in Destructors

Destructors may be called during stack unwinding and throwing another exception would cause the program to terminate immediately.

What if a destructor can’t complete its task? Well, there are not many options – possibly the best one is to rearrange the design so that potentially throwing operations are invoked in a different context. If this is not possible and the program could be left in an invalid state then you may want to assert (or actually throw), but maybe logging the problem could be a valid alternative.
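A sketch of the rearrangement described above: the operation that may fail is exposed as an explicit method, while the destructor only performs a last-resort attempt and logs instead of throwing. Journal, errorLog and the failOnFlush switch are invented for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> errorLog;     // stand-in for a real logging facility

class Journal {                        // hypothetical resource wrapper
public:
    explicit Journal(bool failOnFlush) : failOnFlush_(failOnFlush) {}

    // Explicit operation: callers that care about the failure call this.
    void flush()
    {
        if (failOnFlush_) throw std::runtime_error("flush failed");
        flushed_ = true;
    }

    // Last-resort cleanup: never lets an exception escape.
    ~Journal()
    {
        if (flushed_) return;
        try {
            flush();
        }
        catch (const std::exception& e) {
            try { errorLog.push_back(e.what()); } catch (...) {}
        }
        catch (...) {}
    }

private:
    bool failOnFlush_;
    bool flushed_ = false;
};

void demo()
{
    { Journal ok{false}; ok.flush(); }   // a failure here would reach the caller
    { Journal bad{true}; }               // the destructor swallows and logs
    assert(errorLog.size() == 1 && errorLog[0] == "flush failed");
}
```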

Not for bugs

If a component is invoked violating the precondition, the component for sure won’t be able to complete the operation. This condition may arise because of a bug or because the programmer is not reading the component documentation. Regardless of the reason an exception would just signal the presence of a bug, but there would be no way to recover.

The best option is to design components so that they can’t be misused (this is a variation of designing components so that an illegal state is unrepresentable). E.g. if start() must be called only after activate(), change the design so that start() is a method of an object returned from activate().
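A minimal sketch of this kind of design, where start() exists only on the object that activate() returns, so calling it too early is a compile error rather than a run-time failure. Motor and ActiveMotor are hypothetical names:

```cpp
#include <cassert>

class Motor;

// Only Motor::activate() can create this object, so start() is
// unreachable until activation has happened.
class ActiveMotor {
public:
    void start();
private:
    friend class Motor;
    explicit ActiveMotor(Motor& m) : motor_(&m) {}
    Motor* motor_;
};

class Motor {                          // hypothetical component
public:
    ActiveMotor activate() { return ActiveMotor{*this}; }
    bool running() const noexcept { return running_; }
private:
    friend class ActiveMotor;
    bool running_ = false;
};

inline void ActiveMotor::start() { motor_->running_ = true; }

void demo()
{
    Motor motor;
    // motor.start();                  // does not compile: Motor has no start()
    ActiveMotor active = motor.activate();
    active.start();
    assert(motor.running());
}
```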

When a function can’t deliver

A function is invoked to produce a value, an effect, or both. If the function discovers that it cannot deliver the required result, then it may report the failure by throwing an exception.

Typically this occurs in constructors. The goal of a constructor is to initialize an object acquiring the needed resources. Unfortunately, the constructor has no way to return a value therefore the only way it may return an error is by throwing an exception.

This may seem like a strong motivation for exceptions. After all, a constructor cannot return a failure result. But shifting the point of view a bit, you can see that a constructor can easily be transformed into a factory method that, thanks to std::expected or similar, returns either a properly built object or an error.
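Here is a sketch of such a factory. To stay within C++17 I use std::variant as the “or similar” in place of std::expected; Endpoint, ConfigError and the validation rules are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

enum class ConfigError { EmptyName, InvalidPort };   // hypothetical error set

class Endpoint {                                     // hypothetical class
public:
    // Factory: validation happens before any object exists, so an
    // Endpoint, once built, is always valid.
    static std::variant<Endpoint, ConfigError> create(std::string name, int port)
    {
        if (name.empty()) return ConfigError::EmptyName;
        if (port <= 0 || port > 65535) return ConfigError::InvalidPort;
        return Endpoint{std::move(name), port};
    }

    const std::string& name() const noexcept { return name_; }
    int port() const noexcept { return port_; }

private:
    // Private: the only way in is through create().
    Endpoint(std::string name, int port) : name_(std::move(name)), port_(port) {}

    std::string name_;
    int port_;
};

void demo()
{
    auto good = Endpoint::create("db", 5432);
    assert(std::holds_alternative<Endpoint>(good));
    assert(std::get<Endpoint>(good).port() == 5432);

    auto bad = Endpoint::create("", 5432);
    assert(std::get<ConfigError>(bad) == ConfigError::EmptyName);
}
```

With a C++23 compiler the return type would more naturally be std::expected<Endpoint, ConfigError>, with the same structure.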

Catch when you know how to handle

Move all the housekeeping to guards (to dispose of acquired resources in case of premature function exit), so that there is no need for catching and not handling the exception.

This guideline really implies that exception handling needs to be actually designed and not left as an afterthought. This is also why exceptions are so difficult to use – they are so well hidden that, unless you design the system beforehand, they are easy to forget about.

Conclusions

Exceptions are a powerful mechanism for handling errors and exceptional conditions. This power comes at a cost – using exceptions is hard and has a number of drawbacks (in space, time, and predictability) and – most important of all – makes code hard to write, hard to read, and hard to reason about.

Failing to handle exceptions may cause run-time crashes that are very hard, if not impossible, to predict from static analysis.

Alternatives exist and should be preferred when designing new code or refactoring existing code unless compelling reasons draw us to exceptions.

And maybe schools should start teaching how to write robust code right from the start, helping students to catch the differences between the happy path and reality, and presenting sound techniques for error handling including, but not limited to, exceptions.
