Journal Articles

CVu Journal Vol 17, #6 - Dec 2005 + Design of applications and programs

Title: A Reflection on Defensive Programming

Author: Administrator

Date: Fri, 02 December 2005 06:00:00 +00:00

In his last editorial Paul put out a plea for articles on four specific subjects, one of which was 'Defensive Programming'. This got me reflecting first of all about the basic question: "What is defensive programming?"

I started looking for an answer with Wikipedia [Wikipedia], whose entry began "Defensive programming is a form of defensive design intended to ensure the continuing function of a piece of software in spite of unforeseeable usage of said software". Although the whole article was a reasonable summary, I felt more could be said.

What is Defensive Programming Defending Against?

There are a number of different things covered by this phrase. In the first instance, a defensive programmer is protecting themselves^[1] against their own mistakes. All programmers are optimists [Brooks] and so each one of us tends to ignore the reality that every line of code we write has a probability of being wrong. The probability may be lower for some of us than for others (we all think that) but it is never zero, and the consequence is that the bigger the program is, the more certain it is to contain at least some incorrect code.

Most programs are not written by a single programmer, and so defensive programming also means defending your code against other people's misuse of it. In an ideal world it should be obvious to others how to use the code I've written, and this reduces the need for other forms of defence. However, even where other people do not find the code obvious to use, we should try to make it easy for them to understand what they have done wrong and how to get it right. This ideal can be quite hard to achieve in practice, but it is worth reminding ourselves of it from time to time!

Once written our programs will, we hope, be used. So we need to be defensive against users, possibly untrained and often not terribly computer literate, whose understanding of the purpose and use of our programs might be rather different from our own. Additionally, the program will be running on another person's machine and this could be a machine with a very different hardware configuration, resources, software versions, etc.

Again we hope our programs will continue to be used, perhaps for some time to come. So we have to prepare for change, and therefore another part of defensive programming is trying to make our programs safe against possible future changes; but this is hard. "It's tough to make predictions, especially about the future." (Yogi Berra.) A simple example from the world of SQL is avoiding the use of SELECT * FROM ... as this usually breaks when the database schema changes.

Finally one of the ever present needs in recent times is defence against malicious crackers, particularly but not exclusively on the Internet. It is part of being a computing professional in this day and age to consider the security requirements of every program that we write and then to make informed judgments about the correct level of security audit required.

How Does "Defensive" Code Behave?

The ultimate goal of writing defensively is simply to prevent problems occurring. An example from everyday life is that of an electric plug and socket with two prongs. A defensive design makes the plug slightly asymmetric so it can only be put together correctly.

In practice however not all problems can be prevented; in which case defensive programming means trying to make any problems that do occur localised and of low cost to the users of the program. A less defensive approach to the plug and socket in the previous paragraph is to protect the user (and the equipment) from any fatal consequences of inserting the plug incorrectly. As an example from the world of IT, I used a full screen editor in the 80s that never ever lost any work done in an edit session (except for the actual screenful being typed in) even when the mainframe we were using crashed. That was a defensively written program - and some modern word processors could do with emulating this behaviour!

A secondary aim is to make any problems that do occur as easy to resolve as possible. Defensive programming tries to ensure that, when an unexpected situation is encountered, enough information is collected and logged for the actual problem to be identified and resolved quickly. For example, this may mean writing code that stops the program quickly on an error rather than attempting to carry on with bad data.
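As a minimal sketch of stopping quickly with useful information, the function below rejects bad input at once and says which field was wrong and what it contained. The name parse_percentage and the 0 to 100 range are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: read a percentage from text, stopping at once on bad
// data and reporting enough context (field name and offending text) to find
// the problem quickly.
int parse_percentage(const std::string& field_name, const std::string& text)
{
    std::size_t used = 0;
    int value = 0;
    try {
        value = std::stoi(text, &used);
    } catch (const std::exception&) {
        throw std::invalid_argument(field_name + ": cannot read '" + text +
                                    "' as an integer");
    }
    if (used != text.size())
        throw std::invalid_argument(field_name + ": trailing characters in '" +
                                    text + "'");
    if (value < 0 || value > 100)
        throw std::out_of_range(field_name + ": " + text +
                                " is outside 0..100");
    return value;
}
```

The caller can log the exception text as it stands, which is usually more helpful than a wrong total discovered several screens later.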

If you are an active programmer, consider the current programs you are working on. With the possible exception of TeX [Tex] all substantial programs contain bugs, so the chances are that you are writing - and will ship - buggy code. The defensive programmer's approach is to try and ensure that the bugs do as little damage as possible and that they can be reported and resolved rapidly.

Why Are Programs Not Written Defensively?

I remarked above that "All programmers are optimists". The first step is realising that your own code needs to be written defensively. Depending on the individual programmer concerned this can take a long time to really sink in. Even after 20 years in IT I am still surprised by some of the stupid little bugs I introduce.

However, even when we recognise the need for it, there are several difficulties with writing defensive code. First of all, it is hard to write defensively since, almost by definition, you are trying to face the unexpected. One of the things that comes with improved levels of skill is better awareness of the sorts of things that can go wrong with programs and the techniques that can be used to obviate them. Every bug you find gives you an opportunity to improve your awareness of similar potential bugs in the future.

Secondly, many of the techniques that make programs more defensive add code (not all do - in particular defensive design decisions may have no impact on eventual code size). This extra code takes time to write and may have a detrimental effect on performance. However there are some factors standing against these downsides. Firstly, as I'm sure most readers of C Vu already believe, fixing problems usually becomes more expensive the later in the development process that they are discovered and so the cost of writing this extra code may well be recouped in the reduction of time spent fixing bugs later on. Secondly it is almost always better to get the right answer slowly than the wrong answer quickly; as most commentators on optimisation point out it is generally easier to make a slow program faster than it is to make an incorrect program correct.

A final reason may be lack of good examples. Have you noticed how many times in articles about computer programming the code shown comes with the comment "Error handling omitted for clarity"? Although this may be a true excuse the consequence is that we rarely see good examples of defensive code. In fact, I suppose you could say we see more of the opposite of 'defensive' code - 'offensive' code!

What Can We Do?

A good example of defensive programming comes from How to Hunt Elephants [4].

COMPUTER SCIENTISTS hunt elephants by exercising Algorithm A:

Go to Africa
Start at the Cape of Good Hope.
Work northward in an orderly manner, traversing the continent alternately east and west.
During each traverse pass:
- Catch each animal seen.
- Compare each animal caught to a known elephant.
- Stop when a match is detected.

EXPERIENCED COMPUTER PROGRAMMERS modify Algorithm A by placing a known elephant in Cairo to ensure that the algorithm will terminate.

This amusing example makes a serious point about writing code that copes with the unexpected. The experienced programmer in the story has already faced the possibility that his search finds no elephants before even starting the search! As we write code, we should be conscious of the assumptions that we are making (examples might be 'this file exists' or 'this security permission is held by this user') and make a decision about how to cope when our assumption is not valid before we even execute the program. So, for example, a parallel to placing an elephant in Cairo might be providing a default setting if a configuration file is unreadable.
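The configuration file case might look something like the sketch below. The function name, the one-setting-per-file layout and the warning text are all assumptions made for illustration; a real program would have its own file format.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: return the first line of a settings file, or the
// caller's default when the file cannot be opened or has nothing in it.
std::string read_setting_or_default(const std::string& path,
                                    const std::string& fallback)
{
    std::ifstream file(path);
    std::string line;
    if (file && std::getline(file, line) && !line.empty())
        return line;

    // Say that the default was used, so the fallback does not hide the problem.
    std::clog << "warning: could not read '" << path
              << "', using default '" << fallback << "'\n";
    return fallback;
}
```

The decision about what to do when the file is missing has been taken while writing the code, not left to whatever happens at run time.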

Another facet of this preparing for the worst comes into play when writing code for other people to use. We should ask ourselves how an interface might be abused (either accidentally or maliciously) and how we might prevent such abuse. This might be as simple as changing the arguments to a method call. For example, Herb Sutter recently pointed out that the two-input form of the C++ standard library call std::transform takes three input iterators (as well as the output iterator) and so it is all too easy to invoke it wrongly like this:

transform( in.begin(), in2.begin(), in2.end(),
           out ); // 1: oops

But if we had made the user write:

transform( range(in.begin(), in.end()),
           in2.begin(), out ); // 2: right

then the possibility of making a mistake would be drastically reduced, as the function arguments themselves make the use of the call clearer.
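One way such an interface could be put together is sketched here. The names iter_range, range and transform_range are hypothetical and not part of the standard library, and the sketch passes the binary operation that std::transform requires, which the fragments above leave out.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper type: the two iterators that delimit the first input.
template <typename Iter>
struct iter_range {
    Iter first;
    Iter last;
};

template <typename Iter>
iter_range<Iter> range(Iter first, Iter last)
{
    return iter_range<Iter>{first, last};
}

// Wrapper over the two-input std::transform. The begin/end pair of the first
// input can no longer be mixed up with the start of the second input.
template <typename Iter1, typename Iter2, typename Out, typename BinaryOp>
Out transform_range(iter_range<Iter1> in, Iter2 in2, Out out, BinaryOp op)
{
    return std::transform(in.first, in.last, in2, out, op);
}

// Example use: add two vectors element by element.
std::vector<int> add_elementwise(const std::vector<int>& a,
                                 const std::vector<int>& b)
{
    assert(b.size() >= a.size());   // the second input must not be shorter
    std::vector<int> out(a.size());
    transform_range(range(a.begin(), a.end()), b.begin(), out.begin(),
                    std::plus<int>());
    return out;
}
```

Passing a bare iterator where the range is expected now fails to compile, so the mistake in call 1 is caught before the program ever runs.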

A bigger problem with some interfaces is an implicit ordering of method calls that seemed obvious to the writer but is less obvious to the user. For example in objects with state - such as TCP sockets - it is often not documented which actions are valid in which state(s). To make things harder any restrictions are often not enforced and method calls may just fail silently if they are made when the object is in the incorrect state.
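A minimal sketch of enforcing such an ordering follows. The Connection class is hypothetical and only stands in for a real socket: each call checks the state it needs and complains loudly when it is made at the wrong time.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical class: a stand-in for an object with state, such as a socket.
class Connection {
public:
    enum class State { Closed, Open };

    State state() const { return state_; }

    void open()
    {
        require(State::Closed, "open");
        state_ = State::Open;
    }

    void send(const std::string& data)
    {
        require(State::Open, "send");
        sent_ += data;   // stand-in for real I/O
    }

    void close()
    {
        require(State::Open, "close");
        state_ = State::Closed;
    }

    const std::string& sent() const { return sent_; }

private:
    // Report a call made in the wrong state instead of failing silently.
    void require(State expected, const char* action) const
    {
        if (state_ != expected)
            throw std::logic_error(std::string("Connection::") + action +
                                   " called in the wrong state");
    }

    State state_ = State::Closed;
    std::string sent_;
};
```

Whether to throw, assert or return an error code is a judgement for each project; the point is that the restriction is written down in the code and a misuse cannot pass unnoticed.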

We should reflect on our own bug writing technique. What sort of bugs do we ourselves write most often? If we can identify something we get wrong habitually we can then look at improving our technique to try and prevent such cases in the future. For example, strcpy in 'C' often causes people problems with buffer overruns so using alternatives, whether strncpy or std::string in C++, can prevent problems before they occur.
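The alternatives need a little care themselves: strncpy does not write a terminating null when the source is too long for the buffer. The sketch below shows one way of wrapping it, alongside the std::string equivalent; the function names are invented for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy at most dest_size - 1 characters and always
// terminate, because std::strncpy leaves dest unterminated when src does
// not fit.
void copy_truncated(char* dest, std::size_t dest_size, const char* src)
{
    if (dest == nullptr || dest_size == 0)
        return;
    if (src == nullptr)
        src = "";
    std::strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';
}

// With std::string the storage grows as needed, so there is no fixed buffer
// to overrun.
std::string greeting_for(const std::string& name)
{
    return "Hello, " + name;
}
```

Silent truncation may itself be a bug in some programs, so a stricter version might report the overflow to the caller instead.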

When we try writing defensively, part of the work is examining what should be checked and deciding how certain the check must be. For example, if we are writing code that passes input from the user into a database query we must be aware of the possibility of 'hijacking' the query by embedding escape characters in the input field. This might be either accidental or malicious, and our judgement of the degree of checking required must take account of the risks involved: a public Web site usually needs better checking than an intranet-based team resource.
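One simple form of check is to accept only what is known to be safe and to refuse everything else. The sketch below assumes a hypothetical rule that a user name is 1 to 32 letters, digits or underscores, and that the program runs in the default "C" locale. It is not a replacement for passing user input as bound parameters where the database library supports them, which is generally the stronger defence.

```cpp
#include <cctype>
#include <string>

// Hypothetical policy: a user name is 1 to 32 letters, digits or underscores.
// Anything else, including quotes and semicolons, is refused before the value
// gets anywhere near a query.
bool is_acceptable_user_name(const std::string& input)
{
    if (input.empty() || input.size() > 32)
        return false;
    for (unsigned char ch : input) {
        if (!std::isalnum(ch) && ch != '_')
            return false;
    }
    return true;
}
```

How strict the rule should be is exactly the sort of judgement described above: a public site might want this check and more, while an internal tool might settle for less.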

Summary

Defensiveness in programming is not a single-issue subject but covers defence against a range of people and situations. Different situations will have different tradeoffs, but it is our responsibility to improve our awareness of the issues and to give ourselves feedback so we can make the code we write better defended.

Bibliography

[Wikipedia] http://en.wikipedia.org/wiki/Defensive_programming

[Brooks] Fred Brooks, "The Mythical Man-Month"

[Tex] TeX is reportedly bug free: http://web.mit.edu/klund/www/urk/texvword.html

^[1] The grammatically correct alternative "him or herself" is, in my opinion, just too ugly.
