Programming Topics + CVu Journal Vol 30, #4 - September 2018

Title: The Ghost of a Codebase Past

Author: Bob Schmidt

Date: Sun, 09 September 2018 22:05:13 +01:00

Summary: Pete Goodliffe learns lessons by reviewing his own old code.

Body:

I will live in the Past, the Present, and the Future.
The Spirits of all Three shall strive within me.
I will not shut out the lessons that they teach!
~ Charles Dickens, A Christmas Carol

Nostalgia isn't what it used to be. And neither is your old code. Who knows what functional gremlins and typographical demons lurk in your ancient handiwork? You thought it was perfect when you wrote it, but cast a critical eye over your old code and you'll inevitably bring to light all manner of code gotchas.

Programmers, as a breed, strive to move onwards. We love to learn new and exciting techniques, to face fresh challenges, and to solve more interesting problems. It's natural. Considering the rapid turnover in the job market, and the average duration of programming contracts, it's hardly surprising that very few software developers stick with the same codebase for a prolonged period of time.

But what does this do to the code we produce? What kind of attitude does it foster in our work? I maintain that exceptional programmers are determined more by their attitude to the code they write and the way they write it, than by the actual code itself.

The average programmer tends not to maintain their own code for too long. Rather than roll around in our own filth, we move on to new pastures and roll around in someone else's filth. Nice. We even tend to let our own 'pet projects' fall by the wayside as our interests evolve.

Of course, it's fun to complain about other people's poor code, but we easily forget how bad our own work was. And you'd never intentionally write bad code, would you?

Revisiting your old code can be an enlightening experience. It's like visiting an ageing, distant relative you don't see very often. You soon discover that you don't know them as well as you think. You've forgotten things about them, about their funny quirks and irritating ways. And you're surprised at how they've changed since you last saw them (perhaps for the worse).

Looking back at your older code will inform you about the improvement (or otherwise) in your coding skills.

Looking back at old code you've produced, you might shudder for a number of reasons.

Presentation

Many languages permit artistic interpretation in the indentation layout of code. Even though some languages have a de facto presentation style, there is still a large gamut of layout issues which you may find yourself exploring over time. Which ones stick depends on the conventions of your current project, or on your experiences after years of experimentation.

Different tribes of C++ programmers, for example, follow different presentation schemes. Some developers follow the standard library scheme:

  struct standard_style_cpp
  {
    int variable_name;
    bool method_name();
  };

Some have more Java-esque leanings:

  struct JavaStyleCpp
  {
    int variableName;
    bool methodName();
  };

And some follow a C# model:

  struct CSharpStyleCpp
  {
    int variableName;
    bool MethodName();
  };

A simple difference, but it profoundly affects your code in several ways.

Another C++ example is the layout of member initialiser lists. One of my teams moved from this traditional scheme:

  Foo::Foo(int param)
  : member_one(1),
    member_two(param),
    member_three(42)
  {
  }

to a style that places the comma separators at the beginning of the following line, thus:

  Foo::Foo(int param)
  : member_one(1)
  , member_two(param)
  , member_three(42)
  {
  }

We found a number of advantages with the latter style (it's easier to 'knock out' parts in the middle via preprocessor macros or comments, for example). This prefix-comma scheme can be employed in a number of layout situations (e.g., many kinds of lists: members, enumerations, base classes, and more), providing a nice consistent shape. There are also disadvantages, one of the major cited issues being that it's not as 'common' as the former layout style. IDEs' default auto-layout also tends to fight with it.

I know over the years that my own presentation style has changed wildly, depending on the company I'm working for at the time.
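To make the 'knock out' point concrete, here is a minimal sketch. The Widget class and its accessors are hypothetical names invented for this illustration, and the default member initialiser assumes C++11 or later.

```cpp
// Hypothetical class, used only to illustrate the layout.
class Widget
{
public:
  explicit Widget(int param);

  int one() const   { return member_one; }
  int two() const   { return member_two; }
  int three() const { return member_three; }

private:
  int member_one;
  int member_two = 0; // fallback value while its initialiser is knocked out
  int member_three;
};

Widget::Widget(int param)
: member_one(1)
//, member_two(param)   // knocked out by commenting one self-contained line
, member_three(42)
{
  (void)param; // avoids an unused-parameter warning while the line is disabled
}
```

Each entry after the first carries its own separator, so disabling a middle or final entry touches exactly one line. The first entry is the exception: it owns the colon, so removing it still means editing its neighbour.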

As long as a style is employed consistently in your codebase, this is really a trivial concern and nothing to be embarrassed about. Individual coding styles rarely make much of a difference once you get used to them, but inconsistent coding styles in one project make everyone slower.

The state of the art

Most languages have rapidly developed their in-built libraries. Over the years, the Java libraries have grown from a few hundred helpful classes to a veritable plethora of classes, with different skews of the library depending on the Java deployment target. Over C#'s revisions, its standard library has also burgeoned. As languages grow, their libraries accrete more features.

And as those libraries grow, some of the older parts become deprecated.

Such evolution (which is especially rapid early in a language's life) can unfortunately render your code anachronistic. Anyone reading your code for the first time might presume that you didn't understand the newer language or library features, when those features simply did not exist when the code was written.

For example, when C# added generics, the code you would have written like this:

  ArrayList list = new ArrayList(); // untyped

  list.Add("Foo");
  list.Add(3); // oops!

with its inherent potential for bugs, would have become:

  List<string> list = new List<string>();
  list.Add("Foo");
  list.Add(3); // compiler error - nice

There is a very similar Java example with surprisingly similar class names!

The state of the art moves much faster than your code. Especially your old, untended code.

Even the (relatively conservative) C++ library has grown considerably with each new revision. C++ language features and library support have made much old C++ code look old-fashioned. The introduction of a language-supported threading model renders third-party thread libraries (often implemented with rather questionable APIs) redundant. The introduction of lambdas removes the need for a lot of verbose handwritten 'trampoline' code. The range-based for helps remove a lot of syntactical trees so you can see the code-design wood. Once you start using these facilities, returning to older code without them feels like a retrograde step.
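As a rough before-and-after sketch of two of those facilities, compare a handwritten function object with a lambda, and an explicit iterator loop with a range-based for. All of the names here (LongerThan, count_long_old and so on) are hypothetical, invented for this illustration; the 'new' versions assume C++11 or later.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// C++98 style: a handwritten function object, written just to carry one comparison.
struct LongerThan
{
  explicit LongerThan(std::size_t n) : limit(n) {}
  bool operator()(const std::string &s) const { return s.size() > limit; }
  std::size_t limit;
};

std::size_t count_long_old(const std::vector<std::string> &words, std::size_t limit)
{
  return static_cast<std::size_t>(
    std::count_if(words.begin(), words.end(), LongerThan(limit)));
}

// C++11 style: a lambda states the same comparison at the point of use.
std::size_t count_long_new(const std::vector<std::string> &words, std::size_t limit)
{
  return static_cast<std::size_t>(
    std::count_if(words.begin(), words.end(),
                  [limit](const std::string &s) { return s.size() > limit; }));
}

// C++98 style: the iterator type and loop mechanics dominate the line.
std::size_t total_length_old(const std::vector<std::string> &words)
{
  std::size_t total = 0;
  for (std::vector<std::string>::const_iterator it = words.begin();
       it != words.end(); ++it)
  {
    total += it->size();
  }
  return total;
}

// C++11 style: the range-based for leaves only the intent.
std::size_t total_length_new(const std::vector<std::string> &words)
{
  std::size_t total = 0;
  for (const std::string &word : words)
  {
    total += word.size();
  }
  return total;
}
```

Both pairs compute the same results; the newer forms simply have fewer trees in front of the wood.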

Idioms

Each language, with its unique set of language constructs and library facilities, has a particular 'best practice' method of use. These are the idioms that experienced users adopt, the modes of use that have become honed and preferred over time.

These idioms are important. They are what experienced programmers expect to read; they are familiar shapes that enable you to focus on the overall code design rather than get bogged down in micro-level code concerns. They usually formalise patterns that avoid common mistakes or bugs.

It's perhaps most embarrassing to look back at old code, and see how un-idiomatic it is. If you now know more of the accepted idioms for the language you're working with, your old non-idiomatic code can look quite, quite wrong.

Many years ago, I worked with a team of C programmers moving (well, shuffling slowly) towards the (then) brave new world of C++. One of their initial additions to a new codebase was a max helper macro:

  #define max(a,b) (((a)>(b)) ? (a) : (b))
  // do you know why we have all those brackets?

  void example()
  {
    int a = 3, b = 10;
    int c = max(a, b);
  }

In time, someone revisited that early code and, knowing more about C++, realised how bad it was. They rewrote it in the more idiomatic C++ shown here, which fixed some very subtle lurking bugs (see Listing 1).

template <typename T>
inline T max(const T &a, const T &b)
{
  // Look mum! No brackets needed!
  return a > b ? a : b;
}
void better_example()
{
  int a = 3, b = 10;
  // unsafe with the macro: ++a is evaluated twice
  // whenever the incremented a is greater than b
  int c = max(++a, b);
}

Listing 1
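To see those lurking bugs in action, here is a small sketch. The names MACRO_MAX, UNBRACKETED_MAX and function_max are hypothetical, chosen for this illustration so that nothing collides with std::max; the helper functions exist only to make the behaviour observable.

```cpp
// Hypothetical names throughout, chosen to avoid clashing with std::max.

// The fully bracketed macro.
#define MACRO_MAX(a, b) (((a) > (b)) ? (a) : (b))

// The same macro with the brackets left out.
#define UNBRACKETED_MAX(a, b) a > b ? a : b

// The template version from Listing 1, renamed.
template <typename T>
inline T function_max(const T &a, const T &b)
{
  return a > b ? a : b;
}

// Expands to (((++a) > (b)) ? (++a) : (b)):
// a is incremented a second time whenever the first increment exceeds b.
int macro_max_of_incremented(int &a, int b)
{
  return MACRO_MAX(++a, b);
}

// ++a is evaluated exactly once, before the call.
int function_max_of_incremented(int &a, int b)
{
  return function_max(++a, b);
}

// Expands to 2 * 1 > 2 ? 1 : 2, which parses as ((2 * 1) > 2) ? 1 : 2.
int doubled_unbracketed_max()
{
  return 2 * UNBRACKETED_MAX(1, 2);
}
```

Starting from a = 10 and b = 3, the macro version returns 12 and leaves a at 12, whereas the function version returns 11 and leaves a at 11. Starting from a = 3 and b = 10, both return 10 and leave a at 4, which is exactly why the bug is subtle: it only bites when the incremented argument wins. The unbracketed macro shows why all those brackets matter: doubled_unbracketed_max() returns 2, not the 4 you would expect.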

The original version also had another problem: wheel reinvention. The best solution is to just use the built-in std::max function that always existed. It's obvious in hindsight:

  #include <algorithm> // provides std::max

  // don't declare any max function
  
  void even_better_example()
  {
    int a = 3, b = 10;
    int c = std::max(a,b);
  }

This is the kind of thing you'd cringe about now, if you came back to it. But you had no idea about the right idiom back in the day.

That's a simple example, but as languages gain new features (e.g., lambdas) the kind of idiomatic code you'd write today may look very different from previous generations of the code.

Design decisions

Did I really write that in Perl; what was I thinking?! Did I really use such a simplistic sorting algorithm? Did I really write all that code by hand, rather than just using a built-in library function? Did I really couple those classes together so unnecessarily? Could I really not have invented a cleaner API? Did I really leave resource management up to the client code? I can see many potential bugs and leaks lurking there!

As you learn more, you realise that there are better ways of formulating your design in code. This is the voice of experience. You make a few mistakes, read some different code, work with talented coders, and pretty soon find you have improved design skills.

Bugs

Perhaps this is the reason that drags you back to an old codebase. Sometimes coming back with fresh eyes uncovers obvious problems that you missed at the time. After you've been bitten by certain kinds of bugs (often those that the common idioms steer you away from) you naturally begin to see potential bugs in old code. It's the programmer's sixth sense.

Conclusion

No space of regret can make amends for one life's opportunity misused.
~ Charles Dickens, A Christmas Carol

Looking back over your old code is like a code review for yourself. It's a valuable exercise; perhaps you should take a quick tour through some of your old work. Do you like the way you used to program? How much have you learnt since then?

Does this kind of thing actually matter? If your old code isn't perfect, but it works, should you do anything about it? Should you go back and 'adjust' the code? Probably not: if it ain't broke, don't fix it. Code does not rot, unless the world changes around it. Your bits and bytes don't degrade, so the meaning will likely stay constant. Occasionally a compiler or language upgrade or a third-party library update might 'break' your old code, or perhaps a code change elsewhere will invalidate a presumption you made. But normally, the code will soldier on faithfully, even if it's not perfect.

It's important to appreciate how times have changed, how the programming world has moved on, and how your personal skills have improved over time. Finding old code that no longer feels 'right' is a good thing: it shows that you have learnt and improved. Perhaps you don't have the opportunity to revise it now, but knowing where you've come from helps to shape where you're going in your coding career.

Like the Ghost of Christmas Past, there are interesting cautionary lessons to be learnt from our old code if you take the time to look at it.

Questions

How does your old code shape up in the modern world? If it doesn't look too bad, does that mean that you haven't learnt anything new recently?
How long have you been working in your primary language? How many revisions of the language standard or built-in library have been introduced in that time? What language features have been introduced that have shaped the style of the code you write?
Consider some of the common idioms you now naturally employ. How do they help you avoid errors?

Pete Goodliffe is a programmer who never stays at the same place in the software food chain. He has a passion for curry and doesn't wear shoes. Pete can be contacted at pete@goodliffe.net or @petegoodliffe

Pete's latest book, Becoming a Better Programmer, is published by O'Reilly. It's available at http://shop.oreilly.com/product/0636920033929.do (and all good book stores).
