Journal Articles

CVu Journal Vol 17, #3 - Jun 2005 (Project Management)

Title: Professionalism in Programming #32

Author: Administrator

Date: Sat, 04 June 2005 05:00:00 +01:00

Summary: 

In this article I present a series of thought-provoking questions based on some topics that I've covered in past issues.

Body: 

Imagination grows by exercise, and contrary to common belief, is more powerful in the mature than in the young. (W. Somerset Maugham)

It's time to put on our walking shoes again, and trudge back through the annals of history to revisit some of the themes we've encountered on our journey through the professionalism series. We've investigated an eclectic mix of topics over the years, all to do with the day-to-day programming experience. Throughout it all I've been trying to draw out how 'professional' programmers react, behave, and work. This is the second such nostalgia-inducing column I've written and, as before, this is an opportunity to assess exactly how 'professional' you are.

In this article I present a series of thought-provoking questions based on some topics that I've covered in past issues[1]. Mull them over and see what you think about each question; make the effort to come up with an answer. The motivation is simple: to improve your skill you have to get involved - passively soaking up information doesn't do you much good. Instead, you've got to invest some effort. I make no apologies for writing such a difficult article this issue - it's for your own good!

The plan is simple: I pose the questions first, and give you some time to think about them (go fetch a coffee, sit down peacefully, and think it all through). Then I provide some discussion of the questions. This discussion is not a definitive set of 'answers' (most of these questions have no single simple answer); it's more my immediate thoughts and responses. Compare your answers with mine.

Questions

First, here are the questions. We'll look at two different topics here. Spend a while considering your answer to each one before you move on to the following section.

The Need for Speed (C Vu 16.2, February 2004)

This article series focused on software optimisation, explaining the reasons not to optimise as well as the reasons you should. We looked at general classes of optimisation, and at some quite specific code-level optimisations.

  1. Why is optimisation an issue? Why don't we write efficient code? What stops us from using high performance algorithms in the first place?

  2. A List data type is implemented using an array. What is the worst case algorithmic complexity of each of the following List methods?

    • The constructor.

    • append - places a new item on the end of the list.

    • insert - slides a new item in between two existing list items, at a given position.

    • isEmpty - returns true if the list contains no items.

    • contains - returns true if the list contains a specified item.

    • get - returns the item with a given index.

  3. How important (honestly) is code performance in your current project? What is the motivator for this performance requirement?

  4. In your last optimisation attempt:

    • Did you use a profiler?

      • If yes: how much improvement did you measure?

      • If no: how did you know whether you made any kind of improvement?

    • Did you test that the code still worked after optimising?

      • If yes: How thoroughly did you test?

      • If no: Why not? How could you be sure the code still worked properly for all cases?

  5. How well specified are your program's performance requirements? Do you have a concrete plan to test that you meet these criteria?

Software Testing (C Vu 13.4, October 2001)

This article looked at the thorny topic of testing the software that we write. We looked at when, how and why we test software.

  1. Write a test harness for the following piece of code: a function to calculate the greatest common divisor of two integers. Make it as exhaustive as you can. How many individual test cases have you included?

    • How many of these passed?

    • How many failed?

    • Using these tests, identify any faults and repair the code.

      int greatest_common_divisor(int low, int high)
      {
        if (low > high)
        {
          int tmp = high;
          high    = low;
          low     = tmp;
        }

        int gcd = 0;
        for (int div = low; div > 0; --div)
        {
          if ((low % div == 0) &&
              (high % div == 0))
            if (gcd < div)
              gcd = div;
        }
        return gcd;
      }
      
  2. Should you test all of the test code that you write?

  3. How does a programmer's testing differ from a QA department member's testing?

  4. For what percentage of your code do you write tests? Are you happy with this? What sort of testing do you give the remaining code? Is this adequate? What will you do about it?

  5. What's your usual response to finding an error in your code?

Discussion and Answers

Lazy readers will have jumped here already. Please do spend some time considering your answer to each question first. It will be interesting to compare your response with mine. Do you disagree with anything? Do you agree? Let me know.

The Need for Speed

  1. Why is optimisation an issue? Why don't we write efficient code? What stops us from using high performance algorithms in the first place?

There are many perfectly valid reasons for not writing 'optimised' code on the first attempt:

  • You don't know the final pattern of usage. With no Real World test data, how can you choose the code design that will work best?

  • It's hard enough to get the program working, let alone fast. To prove it's feasible we choose designs that are easy to implement so prototypes get finished quickly.

  • 'High performance' algorithms can be more complex and daunting to implement. Programmers naturally shy away from them, since it's an area where faults can be easily introduced.

An obvious mistake is to think that the time taken to run something is proportional to the amount of effort spent writing it[2]. You might have written some file parsing code in hours; it will always take ages to execute, because disks are slow. The complex code you spent half a week getting right may only consume a few hundred processor cycles. The efficiency of a piece of code, or the amount of time you need to spend optimising it, bears no relation to the amount of time you spent writing it.

  2. A List data type is implemented using an array. What is the worst case algorithmic complexity of each of the following List methods?

  • The constructor is O(1) since it only needs to create an array; the list is initially empty. However, it's worth considering that the size of this array will affect the complexity of the constructor - most languages create arrays fully populated with objects, even if you don't plan to use them yet. If the constructors for these objects are non-trivial, then the List constructor will take some time to execute.

  • The array size might not be fixed - the constructor could take a parameter to determine this size (effectively setting the maximum list size possible). The method then becomes O(n).

  • The append operation is O(1). It simply has to write an array entry and update the list size.

  • insert is O(n). In the worst case you will be asked to insert an element at the very beginning of the list. This requires all the elements in the array to be shuffled up one place before writing the first element. The more items in the List, the longer this will take.

  • Unless you have a ridiculously bad implementation, isEmpty is O(1). The list size will be known, so the return value is a single calculation based on this number.

  • contains is O(n), presuming the list contents are unordered. In the worst case you will be asked to look for an item that doesn't exist, and will have to traverse every single list item.

  • get is O(1) thanks to the array implementation. Indexing an array is a constant time operation. If List had been implemented as a linked list, then this would have been an O(n) operation.
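
To make those costs concrete, here is a minimal sketch of one way an array-backed List might look in C++. The class and method names mirror the question; the fixed capacity passed to the constructor, the use of int elements, and the use of std::vector as a stand-in for a raw array are all illustrative assumptions, not a definitive implementation.

#include <cassert>
#include <cstddef>
#include <vector>

class List
{
public:
  explicit List(std::size_t capacity)     // O(n): the backing array is allocated
    : items(capacity), count(0) {}        // and every element default-initialised

  void append(int item)                   // O(1): write one slot, bump the count
  {
    assert(count < items.size());
    items[count++] = item;
  }

  void insert(std::size_t pos, int item)  // O(n): shuffle the tail up one place
  {
    assert(count < items.size() && pos <= count);
    for (std::size_t i = count; i > pos; --i)
      items[i] = items[i - 1];
    items[pos] = item;
    ++count;
  }

  bool isEmpty() const                    // O(1): a single comparison
  {
    return count == 0;
  }

  bool contains(int item) const           // O(n): may have to scan every entry
  {
    for (std::size_t i = 0; i < count; ++i)
      if (items[i] == item)
        return true;
    return false;
  }

  int get(std::size_t index) const        // O(1): direct array indexing
  {
    assert(index < count);
    return items[index];
  }

private:
  std::vector<int> items;
  std::size_t count;
};

Notice that insert is the only operation that has to move a number of elements proportional to the list size, which is exactly where its O(n) cost comes from; a linked-list implementation would swap that cost for an O(n) get.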

  3. How important (honestly) is code performance in your current project? What is the motivator for this performance requirement?

The performance requirements should not be arbitrarily chosen. They should be justified, not just a time limit pulled out of thin air. Every performance requirement is important - there are no specifications that don't matter. How much concern a particular requirement generates depends on how hard it is to meet. Whether it's hard or not, you still have to come up with a design that satisfies it.

  4. In your last optimisation attempt:

    • Did you use a profiler?

      • If yes: how much improvement did you measure?

      • If no: how did you know whether you made any kind of improvement?

    • Did you test that the code still worked after optimising?

      • If yes: How thoroughly did you test?

      • If no: Why not? How could you be sure the code still worked properly for all cases?

Only the most dramatic performance improvements can be detected without a profiler, or some other good timing tests. Human perception is easily fooled - when you've just slaved to speed up the program, it will always appear faster to you.

Test performance improvements carefully, and discard those that are not worthwhile. It's better to have clear code than a minuscule speed increase and unmaintainable logic.
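
If you don't have a profiler to hand, even a crude timing harness beats guesswork. Here's a minimal sketch using std::clock; the iteration count and the do_work callback are illustrative assumptions, and a real profiler will still tell you far more about where the time actually goes.

#include <cstdio>
#include <ctime>

// Time a routine by running it repeatedly and measuring the CPU time used.
// 'do_work' stands in for whatever code you are trying to speed up.
void time_it(void (*do_work)(), int iterations)
{
  std::clock_t start = std::clock();
  for (int i = 0; i < iterations; ++i)
    do_work();
  std::clock_t finish = std::clock();

  double seconds = double(finish - start) / CLOCKS_PER_SEC;
  std::printf("%d iterations took %.3f seconds of CPU time\n",
              iterations, seconds);
}

Run the same measurement before and after the 'optimisation', and only trust differences that are repeatable and comfortably larger than the run-to-run noise.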

  5. How well specified are your program's performance requirements? Do you have a concrete plan to test that you meet these criteria?

Without a clear specification, no one can really complain that your program isn't fast enough!

Software Testing

  1. Write a test harness for the following piece of code: a function to calculate the greatest common divisor of two integers. Make it as exhaustive as you can. How many individual test cases have you included?

    • How many of these passed?

    • How many failed?

    • Using these tests, identify any faults and repair the code.

There are a large number of tests you should run, even though there are very few 'invalid' input combinations. Thinking of invalid inputs first: test for zero. It may or may not be an invalid value (we've seen no spec, so we can't tell), but you'd expect the code to cope reasonably with it.

Next, write tests considering combinations of 'usual' inputs (say 1, 10, and 100 in all orders). Then try numbers with no common factor (other than 1), like 733 and 449. Test for some very large numbers, and also for some negative numbers.

How do you write these test cases? Use a good test framework, or put them all in a single test function. For each test, don't programmatically calculate what the correct output value should be[3]; just check against a known constant value. Keep your test code as simple as possible:

assert(greatest_common_divisor(10,  100) == 10);
assert(greatest_common_divisor(100, 10)  == 10);
assert(greatest_common_divisor(733, 449) == 1);
... more tests ... 

There are a surprisingly large number of tests for this simple function. You could argue that for such a small piece of code it's easier to inspect, review, and prove correctness rather than laboriously create a set of tests. This seems like a valid argument. But what if later on someone makes modifications? Without the tests you'd have to carefully re-inspect and re-validate the code, an easy task to overlook.

Did you find the mistake in greatest_common_divisor? There's a clue coming up. If you don't want the puzzle spoiled then look away now. Try feeding it a negative argument. This is a more robust (and more efficient) version written in C++:

#include <cstdlib>    // for std::abs
#include <algorithm>  // for std::min

int greatest_common_divisor(int a, int b)
{
  a = std::abs(a);
  b = std::abs(b);
  for (int div = std::min(a,b); div > 0; --div)
  {
    if ((a % div == 0) && (b % div == 0))
      return div;
  }
  return 0;
}

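For completeness, here is a sketch of how some of the tests described above might be pointed at this corrected version. The particular values, and the decision to accept 0 as the result for a zero argument, are illustrative assumptions rather than part of any specification.

#include <cassert>

void test_greatest_common_divisor()
{
  // 'Usual' inputs, in both argument orders
  assert(greatest_common_divisor(10,  100) == 10);
  assert(greatest_common_divisor(100, 10)  == 10);

  // Numbers with no common factor other than 1
  assert(greatest_common_divisor(733, 449) == 1);

  // Negative arguments - the case that caught out the original code
  assert(greatest_common_divisor(-12, 18)  == 6);
  assert(greatest_common_divisor(12,  -18) == 6);

  // Zero: no specification says what is 'right' here, but this
  // version falls back to returning 0 rather than misbehaving
  assert(greatest_common_divisor(0, 10) == 0);
  assert(greatest_common_divisor(0, 0)  == 0);
}
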
  2. Should you test all of the test code that you write?

If you think about this for long enough it will give you a headache. You can't keep testing test code forever - how can you be sure the test code for your test code's test code is correct? The trick is to keep tests as simple as possible. This way, the most likely testing errors will be missing important test cases, not problems with the actual lines of test code.

  3. How does a programmer's testing differ from a QA department member's testing?

Testers are more concerned with the black-box style of testing, and usually only perform product testing. It's rare to have testers working at the code level, because most products are executable software; there are comparatively few code library vendors. Programmers are more concerned with white-box tests, making sure their masterful creations work as they planned.

The secret aim of any programmer writing tests is to prove that their code works, not to find cases where it doesn't! I can easily write a load of tests to show how perfect my code is by deliberately avoiding all the bits I know are dodgy. This is a good argument for getting someone other than the original programmer to create test harnesses.

  4. For what percentage of your code do you write tests? Are you happy with this? What sort of testing do you give the remaining code? Is this adequate? What will you do about it?

Don't feel obliged to write a test harness for every scrap of code. Use your head. Inspections are fine for small functions. But you must be sure that you are performing the adequate and appropriate testing for which you are responsible, not just skipping an unpleasant task.

Ask yourself this: how many errors have bitten you recently which could have been prevented with a good set of tests? Make sure you do something about it.

  5. What's your usual response to finding an error in your code?

There are several possible reactions:

  • disgust and disappointment,

  • the urge to blame someone else,

  • happiness, if not downright excitement, and even

  • pretending you didn't find it, ignoring it, and hoping it will go away (as if that's likely).

Some of those are so plainly wrong that I'll assume you can rise above them.

Does it seem a little daft to suggest you might be happy to find a fault? Surely that's the reasonable reaction for a quality-conscious engineer, as it's far better to find faults during development than for a user to find them in the field.

Your level of excitement will depend on where in the development lifecycle the fault is found. Discovering a show-stopping bug the day before release won't make anyone smile.



[1] You might want to go back and revisit the specific articles before ploughing into these questions.

[2] That looks stupid when you see it written down, but it's a very easy trap to fall into at the codeface.

[3] This would open the door to more coding errors - imagine the pain of bugs in the test code!
