Garbage-Collecting Garbage

Here we go again: another system written in one of those advanced languages that promise to spare you the burden of memory management, another unavoidable performance issue, and yet another desperate rush to move everything out of the garbage man’s hands and into a set of dedicated memory pools.

According to Wikipedia, GC (Garbage Collection) was invented at the dawn of computer science (ca. 1959) for the LISP language. From its inception, GC promptly made it clear that it didn’t play well with time constraints (see the “Humorous Anecdote” in History of Lisp).

Dealing with dynamic memory management is a pain in the neck. You need to carefully plan and craft where memory chunks are allocated, how they proliferate and where and how they are no longer useful. A small distraction in the implementation or a minor flaw in the design and you end up with dangling pointers or memory leaks.

Garbage collection promises to relieve you of this burden. She says “just ignore the problem, I’ll take care of it”. And she does it… stopping your application here and there for a brief time.

Until some time ago you could say that, a few niches aside (mainly real-time and interactive systems), a short pause was acceptable. Nowadays things are getting rough, mainly for two reasons. First, GC-only languages, in an absurdly counterintuitive move, have been marketed for real-time applications (Java for games and embedded systems). Second, TFLIO (The Free Lunch Is Over): application software increasingly needs to be asynchronous and non-blocking.

So, what is the poor old programmer expected to do when the application is written in a GC-only language and the GC haunts its performance?

Let’s say that this is not a nice place to be. If the software has been properly designed and objects have a sensible life cycle, then you can set up one or more object pools and go back to manual memory management, in a language that was never intended for it.
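To make the idea concrete, here is a minimal sketch of such an object pool in Java. It is an illustration, not the code from my project: the names ObjectPool, acquire, release and available are hypothetical, and the sketch assumes single-threaded use and objects that can be wiped by a caller-supplied reset function.

```java
import java.util.ArrayDeque;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical minimal object pool (illustrative names); not thread-safe. */
final class ObjectPool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Consumer<T> reset;

    ObjectPool(int size, Supplier<T> factory, Consumer<T> reset) {
        this.factory = factory;
        this.reset = reset;
        // Pre-allocate up front so that steady-state use creates no new objects.
        for (int i = 0; i < size; i++) {
            free.push(factory.get());
        }
    }

    /** Hands out a recycled instance; allocates only if the pool is empty. */
    T acquire() {
        T obj = free.poll();
        return obj != null ? obj : factory.get();
    }

    /** Wipes the instance and puts it back; the caller must not use it afterwards. */
    void release(T obj) {
        reset.accept(obj);
        free.push(obj);
    }

    /** Number of instances currently waiting in the pool. */
    int available() {
        return free.size();
    }
}
```

A pool of StringBuilder instances, for example, could be created with new ObjectPool<>(16, StringBuilder::new, sb -> sb.setLength(0)). Note that release() is exactly the manual "free" that GC was supposed to make unnecessary: forget it and the pool drains, call it too early and you get the pooled equivalent of a dangling pointer.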

Chances are high that, because of the false sense of simplicity that GC conveys, and especially if the application has been developed by inexperienced programmers, you are dealing with a spaghetti architecture, i.e. a mess where objects are created everywhere and spread all around.

If that’s the case, you’ll need months to refactor the architecture, adding sense and order, before you can apply object pools.

If your code base is written using a functional approach you will face an additional problem – FP objects are not mutable.

Let it sink in – FP objects are not mutable.

Once they are created they never change their value. There are no setters and no clear(). That is a bit of a mismatch with object pools, i.e. services that recycle used objects, resetting them so they are as good as new for the next use. Basically, unless you resort to some black-magic reflection tricks beneath the language or break the FP tenets, object pooling in FP only works when you need exactly the same value as one you previously released.
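The one form of "pooling" that immutability does allow can be sketched as an intern table: an existing instance is handed back only when the requested value is identical to one already created. This is a rough Java illustration under my own assumptions, not something taken from the application; Point and PointCache are hypothetical names, and the allocations counter exists only to make the behaviour visible.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable value: final fields, no setters, no clear(). */
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

/** Hypothetical intern table: reuse is possible only for identical values. */
final class PointCache {
    private final Map<Long, Point> canonical = new HashMap<>();
    int allocations = 0; // counts how many Points were really created

    Point of(int x, int y) {
        // Pack both coordinates into a single key. Boxing the key to Long
        // may itself allocate, so a serious version would want a
        // primitive-keyed map instead of java.util.HashMap.
        long key = ((long) x << 32) | (y & 0xFFFFFFFFL);
        return canonical.computeIfAbsent(key, k -> {
            allocations++;
            return new Point(x, y);
        });
    }
}
```

Asking twice for of(1, 2) returns the same instance, while of(2, 1) forces a fresh allocation. That is the whole catch: if the values flowing through the system rarely repeat, as is typical for real-time data, the cache saves nothing and merely keeps more objects alive.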

If you have read this far, thank you; I imagine you would like to know how I solved this problem :-). I was lucky: in my scenario, real-time data is produced outside the Scala application, so I can add buffering before the data gets into the application, and an occasional collector hiccup doesn’t turn into data loss. Had the data been generated inside the JVM, I’d be in much worse shape.
