Programming Topics + CVu Journal Vol 30, #4 - September 2018

Title: The Ghost of a Codebase Past

Author: Bob Schmidt

Date: Sun, 09 September 2018 22:05:13 +01:00

Summary: Pete Goodliffe learns lessons by reviewing his own old code.

Body: 

I will live in the Past, the Present, and the Future.
The Spirits of all Three shall strive within me.
I will not shut out the lessons that they teach!
~ Charles Dickens, A Christmas Carol

Nostalgia isn’t what it used to be. And neither is your old code. Who knows what functional gremlins and typographical demons lurk in your ancient handiwork? You thought it was perfect when you wrote it—but cast a critical eye over your old code and you’ll inevitably bring to light all manner of code gotchas.

Programmers, as a breed, strive to move onwards. We love to learn new and exciting techniques, to face fresh challenges, and to solve more interesting problems. It’s natural. Considering the rapid turnover in the job market, and the average duration of programming contracts, it’s hardly surprising that very few software developers stick with the same codebase for a prolonged period of time.

But what does this do to the code we produce? What kind of attitude does it foster in our work? I maintain that exceptional programmers are determined more by their attitude to the code they write and the way they write it, than by the actual code itself.

The average programmer tends not to maintain their own code for too long. Rather than roll around in our own filth, we move on to new pastures and roll around in someone else’s filth. Nice. We even tend to let our own ‘pet projects’ fall by the wayside as our interests evolve.

Of course, it’s fun to complain about other people’s poor code, but we easily forget how bad our own work was. And you’d never intentionally write bad code, would you?

Revisiting your old code can be an enlightening experience. It’s like visiting an ageing, distant relative you don’t see very often. You soon discover that you don’t know them as well as you thought. You’ve forgotten things about them, about their funny quirks and irritating ways. And you’re surprised at how they’ve changed since you last saw them (perhaps for the worse).

Looking back at your older code will inform you about the improvement (or otherwise) in your coding skills.

Looking back at old code you’ve produced, you might shudder for a number of reasons.

Presentation

Many languages permit artistic interpretation in the indentation layout of code. Even though some languages have a de facto presentation style, there is still a large gamut of layout issues which you may find yourself exploring over time. Which ones stick depends on the conventions of your current project, or on your experiences after years of experimentation.

Different tribes of C++ programmers, for example, follow different presentation schemes. Some developers follow the standard library scheme:

  struct standard_style_cpp
  {
    int variable_name;
    bool method_name();
  };

Some have more Java-esque leanings:

  struct JavaStyleCpp
  {
    int variableName;
    bool methodName();
  };

And some follow a C# model:

  struct CSharpStyleCpp
  {
    int variableName;
    bool MethodName();
  };

A simple difference, but it profoundly affects your code in several ways.

Another C++ example is the layout of member initialiser lists. One of my teams moved from this traditional scheme:

  Foo::Foo(int param)
  : member_one(1),
    member_two(param),
    member_three(42)
  {
  }

to a style that places the comma separators at the beginning of the following line, thus:

  Foo::Foo(int param)
  : member_one(1)
  , member_two(param)
  , member_three(42)
  {
  }

We found a number of advantages with the latter style (it’s easier to ‘knock out’ parts in the middle via preprocessor macros or comments, for example). This prefix-comma scheme can be employed in a number of layout situations (e.g., many kinds of lists: members, enumerations, base classes, and more), providing a nice consistent shape. There are also disadvantages, one of the major cited issues being that it’s not as ‘common’ as the former layout style. IDEs’ default auto-layout also tends to fight with it.
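As a small illustration of the ‘knock out’ advantage, here is a sketch using hypothetical names (Widget, Colour, and the ENABLE_CACHE switch are assumptions made up for this example, not taken from any real codebase). With leading commas, every entry after the first sits on a self-contained line, so it can be disabled with a comment or a preprocessor guard without touching its neighbours:

```cpp
#include <cassert>

// Hypothetical switch: set to 0 to drop the optional member entirely.
#define ENABLE_CACHE 1

enum Colour
{
    red
  , green
//, amber     // knocked out with a single-line comment
  , blue
};

class Widget
{
public:
  explicit Widget(int param);
  int total() const;

private:
  int member_one;
  int member_two;
#if ENABLE_CACHE
  int cache_size;
#endif
  int member_three;
};

Widget::Widget(int param)
  : member_one(1)
  , member_two(param)
#if ENABLE_CACHE
  , cache_size(16)    // removed by the preprocessor when the switch is 0
#endif
  , member_three(42)
{
}

int Widget::total() const
{
  int sum = member_one + member_two + member_three;
#if ENABLE_CACHE
  sum += cache_size;
#endif
  return sum;
}

// Usage sketch: the values follow from the entries left enabled above.
void check_widget()
{
  assert(blue == 2);                  // amber is commented out
  assert(Widget(10).total() == 69);   // 1 + 10 + 16 + 42
}
```

The first entry in each list is the one exception: it carries no leading comma, so removing it still means editing the line that follows.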

I know that my own presentation style has changed wildly over the years, depending on the company I was working for at the time.

As long as a style is employed consistently in your codebase, this is really a trivial concern and nothing to be embarrassed about. Individual coding styles rarely make much of a difference once you get used to them, but inconsistent coding styles in one project make everyone slower.

The state of the art

Most languages have rapidly developed their in-built libraries. Over the years, the Java libraries have grown from a few hundred helpful classes to a veritable plethora of classes, with different variants of the library depending on the Java deployment target. Over C#’s revisions, its standard library has also burgeoned. As languages grow, their libraries accrete more features.

And as those libraries grow, some of the older parts become deprecated.

Such evolution (which is especially rapid early in a language’s life) can unfortunately render your code anachronistic. Anyone reading your code for the first time might presume that you didn’t understand the newer language or library features, when those features simply did not exist when the code was written.

For example, when C# added generics, the code you would have written like this:

  ArrayList list = new ArrayList(); // untyped

  list.Add("Foo");
  list.Add(3); // oops!

with its inherent potential for bugs, would have become:

  List<string> list = new List<string>();
  list.Add("Foo");
  list.Add(3); // compiler error - nice

There is a very similar Java example with surprisingly similar class names!

The state of the art moves much faster than your code. Especially your old, untended code.

Even the (relatively conservative) C++ library has grown considerably with each new revision. C++ language features and library support have made much old C++ code look old-fashioned. The introduction of a language-supported threading model renders third-party thread libraries (often implemented with rather questionable APIs) redundant. The introduction of lambdas removes the need for a lot of verbose handwritten ‘trampoline’ code. The range-based for helps remove a lot of syntactical trees so you can see the code-design wood. Once you start using these facilities, returning to older code without them feels like a retrograde step.
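To make the contrast concrete, here is a rough before-and-after sketch. The task (counting values above a threshold) and all of the names are assumptions invented for illustration. The first version is the kind of code you might have written before C++11; the other two use a lambda and a range-based for respectively.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Older style: a handwritten function object just to carry the threshold...
struct AboveThreshold
{
  explicit AboveThreshold(int t) : threshold(t) {}
  bool operator()(int value) const { return value > threshold; }
  int threshold;
};

// ...and a loop that spells out the full iterator type.
std::size_t count_above_old(const std::vector<int> &values, int threshold)
{
  AboveThreshold above(threshold);
  std::size_t count = 0;
  for (std::vector<int>::const_iterator it = values.begin();
       it != values.end(); ++it)
  {
    if (above(*it)) ++count;
  }
  return count;
}

// C++11 onwards: a lambda replaces the function object.
std::size_t count_above_lambda(const std::vector<int> &values, int threshold)
{
  return static_cast<std::size_t>(
    std::count_if(values.begin(), values.end(),
                  [threshold](int value) { return value > threshold; }));
}

// C++11 onwards: a range-based for removes the iterator noise.
std::size_t count_above_range_for(const std::vector<int> &values, int threshold)
{
  std::size_t count = 0;
  for (int value : values)
  {
    if (value > threshold) ++count;
  }
  return count;
}

// Usage sketch: all three versions agree.
void check_counts()
{
  const std::vector<int> values{1, 5, 8, 12};
  assert(count_above_old(values, 6) == 2);
  assert(count_above_lambda(values, 6) == 2);
  assert(count_above_range_for(values, 6) == 2);
}
```

None of the three is wrong, but the first one dates the code as surely as a photograph of your old haircut.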

Idioms

Each language, with its unique set of language constructs and library facilities, has a particular ‘best practice’ method of use. These are the idioms that experienced users adopt, the modes of use that have become honed and preferred over time.

These idioms are important. They are what experienced programmers expect to read; they are familiar shapes that enable you to focus on the overall code design rather than get bogged down in micro-level code concerns. They usually formalise patterns that avoid common mistakes or bugs.

It’s perhaps most embarrassing to look back at old code, and see how un-idiomatic it is. If you now know more of the accepted idioms for the language you’re working with, your old non-idiomatic code can look quite, quite wrong.

Many years ago, I worked with a team of C programmers moving (well, shuffling slowly) towards the (then) brave new world of C++. One of their initial additions to a new codebase was a max helper macro:

  #define max(a,b) (((a)>(b)) ? (a) : (b))
  // do you know why we have all those brackets?

  void example()
  {
    int a = 3, b = 10;
    int c = max(a, b);
  }

In time, someone revisited that early code and, knowing more about C++, realised how bad it was. They rewrote it in the more idiomatic C++ shown here, which fixed some very subtle lurking bugs (see Listing 1).

template <typename T>
inline T max(const T &a, const T &b)
{
  // Look mum! No brackets needed!
  return a > b ? a : b;
}
void better_example()
{
  int a = 3, b = 10;
  // with the macro, ++a would be evaluated twice
  // whenever it turned out to be the larger argument
  int c = max(++a, b);
}
			
Listing 1
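The two lurking macro problems can be demonstrated side by side. The following is a sketch only; the names BRACKETED_MAX, UNBRACKETED_MAX, and function_max are hypothetical, chosen to avoid clashing with any real max. The values are picked so that the incremented argument is the larger one, which is the case where a macro evaluates it a second time.

```cpp
#include <cassert>

// Hypothetical names, for illustration only.
#define BRACKETED_MAX(a, b) (((a) > (b)) ? (a) : (b))
#define UNBRACKETED_MAX(a, b) a > b ? a : b

template <typename T>
inline T function_max(const T &a, const T &b)
{
  return a > b ? a : b;
}

struct Outcome
{
  int a; // value of the incremented variable afterwards
  int c; // value returned by max
};

Outcome with_macro()
{
  int a = 10, b = 3;
  // Expands to (((++a) > (b)) ? (++a) : (b)): a is incremented twice.
  int c = BRACKETED_MAX(++a, b);
  return Outcome{a, c};
}

Outcome with_function()
{
  int a = 10, b = 3;
  // ++a is evaluated once, before the call.
  int c = function_max(++a, b);
  return Outcome{a, c};
}

// Without brackets the expansion is 2 * 1 > 3 ? 1 : 3,
// which parses as (2 * 1 > 3) ? 1 : 3.
int unbracketed_doubled() { return 2 * UNBRACKETED_MAX(1, 3); }

// With brackets the expansion is 2 * (((1) > (3)) ? (1) : (3)).
int bracketed_doubled() { return 2 * BRACKETED_MAX(1, 3); }

// Usage sketch documenting the results.
void check_max_pitfalls()
{
  assert(with_macro().a == 12 && with_macro().c == 12);
  assert(with_function().a == 11 && with_function().c == 11);
  assert(unbracketed_doubled() == 3); // not the 6 you wanted
  assert(bracketed_doubled() == 6);
}
```

The brackets cure the precedence problem but not the double evaluation; only a real function does that.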

The original version also had another problem: wheel reinvention. The best solution is simply to use the standard library’s std::max function (declared in <algorithm>), which had been there all along. It’s obvious in hindsight:

  #include <algorithm> // for std::max

  // don't declare any max function

  void even_better_example()
  {
    int a = 3, b = 10;
    int c = std::max(a, b);
  }

This is the kind of thing you’d cringe about now, if you came back to it. But you had no idea about the right idiom back in the day.

That’s a simple example, but as languages gain new features (e.g., lambdas) the kind of idiomatic code you’d write today may look very different from previous generations of the code.

Design decisions

Did I really write that in Perl; what was I thinking?! Did I really use such a simplistic sorting algorithm? Did I really write all that code by hand, rather than just using a built-in library function? Did I really couple those classes together so unnecessarily? Could I really not have invented a cleaner API? Did I really leave resource management up to the client code? I can see many potential bugs and leaks lurking there!

As you learn more, you realise that there are better ways of formulating your design in code. This is the voice of experience. You make a few mistakes, read some different code, work with talented coders, and pretty soon find you have improved design skills.
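Take the resource management question as an example. The sketch below assumes a made-up acquire/release API (Connection, open_connection, and close_connection are hypothetical stand-ins, and the global counter exists only so the example can be checked). The first function shows the design that leaves cleanup to the client; the second ties cleanup to an object’s lifetime using std::unique_ptr with a custom deleter, which is one common way to do it rather than the only one.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource API, standing in for any acquire/release pair.
struct Connection
{
  int id;
};

int open_connections = 0; // bookkeeping so the example can be checked

Connection *open_connection(int id)
{
  ++open_connections;
  return new Connection{id};
}

void close_connection(Connection *connection)
{
  --open_connections;
  delete connection;
}

// Old design: the client must remember to close on every path.
int use_connection_manually(int id)
{
  Connection *connection = open_connection(id);
  int result = connection->id * 2;
  close_connection(connection); // forget this, or return early, and it leaks
  return result;
}

// Revised design: the handle closes the connection when it goes out of scope.
struct ConnectionCloser
{
  void operator()(Connection *connection) const { close_connection(connection); }
};

using ConnectionHandle = std::unique_ptr<Connection, ConnectionCloser>;

ConnectionHandle make_connection(int id)
{
  return ConnectionHandle(open_connection(id));
}

int use_connection_safely(int id)
{
  ConnectionHandle connection = make_connection(id);
  return connection->id * 2; // closed automatically on every exit path
}

// Usage sketch: nothing is left open afterwards.
void check_connections()
{
  assert(use_connection_manually(3) == 6);
  assert(use_connection_safely(4) == 8);
  assert(open_connections == 0);
}
```

The manual version works as written; the trouble only appears when someone later adds an early return or an exception between the open and the close.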

Bugs

Perhaps this is the reason that drags you back to an old codebase. Sometimes coming back with fresh eyes uncovers obvious problems that you missed at the time. After you’ve been bitten by certain kinds of bugs (often those that the common idioms steer you away from) you naturally begin to see potential bugs in old code. It’s the programmer’s sixth sense.

Conclusion

No space of regret can make amends for one life’s opportunity misused.
~ Charles Dickens, A Christmas Carol

Looking back over your old code is like a code review for yourself. It’s a valuable exercise; perhaps you should take a quick tour through some of your old work. Do you like the way you used to program? How much have you learnt since then?

Does this kind of thing actually matter? If your old code isn’t perfect, but it works, should you do anything about it? Should you go back and ‘adjust’ the code? Probably not – if it ain’t broke don’t fix it. Code does not rot, unless the world changes around it. Your bits and bytes don’t degrade, so the meaning will likely stay constant. Occasionally a compiler or language upgrade or a third-party library update might ‘break’ your old code, or perhaps a code change elsewhere will invalidate a presumption you made. But normally, the code will soldier on faithfully, even if it’s not perfect.

It’s important to appreciate how times have changed, how the programming world has moved on, and how your personal skills have improved over time. Finding old code that no longer feels ‘right’ is a good thing: it shows that you have learnt and improved. Perhaps you don’t have the opportunity to revise it now, but knowing where you’ve come from helps to shape where you’re going in your coding career.

Like the Ghost of Christmas Past, there are interesting cautionary lessons to be learnt from our old code if you take the time to look at it.

Questions

Pete Goodliffe is a programmer who never stays at the same place in the software food chain. He has a passion for curry and doesn’t wear shoes. Pete can be contacted at pete@goodliffe.net or @petegoodliffe

Pete's latest book, Becoming a Better Programmer, is published by O'Reilly. It's available at http://shop.oreilly.com/product/0636920033929.do (and all good book stores).
