The Good, the Bad and the Ugly

Engineers are supposed to deal with systems, usually large and complex devices that are expected to work. Working is the bare minimum, though: the common expectation is that they keep working reliably under a wide range of circumstances, largely regardless of the environment. Consider from this point of view any system that results from an engineering process (say, a TV set or a car): it is not enough that the device works fine only at the seller's location. This should put in the right light one of the most used programmer defenses: “It works on my PC”.
For us, as programmers and programming team leaders, the point of concern is how to deliver robust products that work reliably even outside clean-room conditions. A lot of entropy has been sacrificed to this goal, and I don’t want to start anything new or promote any existing process methodology. I’d just like to recall what a friend of mine reported from one of his senior coworkers (thanks Xté).
This is a quick analysis that lets you understand how good a shape your project (or task) is in.
Consider two nearly binary variables: working and understanding. They give four combinations, which identify four states of the system composed of the engineer and the object the engineer is working on. Let’s go through the four states (I swear, I’ll be brief).
It doesn’t work and you don’t know why. This is the typical state at the start of a project: you have a black box that is not working as expected, and you are supposed to fix or implement it. This is not so bad if you have enough time to study, to ask other people, to analyze; in short, to build up the knowledge you need in order to fix it. Summing up, this can be fair or bad depending on the time you have.
It doesn’t work and you know why. This is quite good. After all, knowledge is power. With that knowledge you can both devise strategies and solutions and figure out the time or the means you need. Moreover, you have plenty of facts with which to explain the situation to management and to ask for the resources best suited to accomplishing the task.
It works and you know why. Perfect, you have achieved your goal. You fully understand your system and why it is working, so you can predict to a good extent when and how it is going to work.
It works, but you don’t know why. This is the worst case of all. Unfortunately, haste and the wrong assumption that if it works then who cares can lead to this really bad situation. The real problem here is that you cannot make any assumption about when the system will cease to work, and you have no clue how to deal with it, whether to repair, move or change it. You have no guarantee that the system will continue to work.
The fact that it works usually leads project management to consider it complete and to push to move on. A false sense of security may affect the team, clearing the way for greater disasters.

There are some quick corollaries to this analysis. First, always try to understand what you are doing, even if it may seem a waste of time. Second, always include learning time in your estimates. Third, poking around randomly to fix things is a dangerous way to further damage a system while creating the illusion of work.
Programmers rely strongly on tools. Basically there is no direct contact with the matter we design and develop; our tools are our manipulators and probes into the hidden work of electrons. We have to know our tools by heart. We cannot get away with a rough understanding of the language we use, because we cannot afford to let our ignorance let something slip in.
The natural question that arises is: “How far does this knowledge need to extend?” Does programming require me to understand OS internals? Digital electronics? Semiconductor physics?
Well, it really depends. Gone are the times when a single man could embrace the whole of human knowledge and art in his lifespan. Nowadays we have to stop at a given interface, taking for granted that what goes on behind it is good enough. I think that you should at least have a good knowledge of the first interface you are using, be it OS system calls, environment libraries or hardware if you are working closer to the metal. Still, an average knowledge up to the next barrier can only help when you are hunting for problems, since it lets you better exploit the environment and gives you helpful hints to get out of trouble… and trouble is where engineers spend most of their time.
