ACCU Home page ACCU Conference Page
Search Contact us ACCU at Flickr ACCU at GitHib ACCU at Facebook ACCU at Linked-in ACCU at Twitter Skip Navigation

pinTaming Complexity: A Class Hierarchy Tale

Overload Journal #67 - Jun 2005 + Programming Topics   Author: Mark Radford

Introduction

Some years ago, I was involved in the design of the software for a security system1. The software had to process events from various sensor devices detecting motion in the building. However, consider the operation of such a system installed in a typical office building: obviously alarms shouldn't ring when people are in the building during normal working hours. Actually that's not strictly correct: there may be areas of the building that are normally off limits, and so should be monitored in normal working hours. Further, there may be cases when people are working during the night in certain parts of the building. Clearly this security system needed to be flexible - i.e. it needed a lot of configurability designing into it. To this end I had the task of designing a logic engine in support of configurability far beyond the simple description given above.

From this design comes the class hierarchy shown in Listing 1.

The trigger class represents a trigger event, i.e. an event that potentially participates in the triggering of an alert. Motion being detected in a certain area and a door that should be closed being open, are both examples of triggers going active. Further, a trigger can be a simple time interval, 1800-0800 for example - i.e. not during daytime office hours. In the domain terminology, such events are said to cause a trigger to "go active". It is possible that a single trigger event could trigger an alarm, motion being detected in an off-limits part of the building, for example. However, motion detected in certain (most) parts of the building between 1800 and 0800 - i.e. a combination of two trigger events - is the kind of combination of events requiring an alert to be raised. This is where the evaluator class hierarchy comes in. I don't propose to go any further into the problem domain analysis, but it turns out that when two trigger events are of interest, there are two cases where it may be necessary to raise an alert:

  • The case where both triggers need to be active (logical AND), and

  • The case where either or both triggers need to be active (logical OR)

The evaluator hierarchy shown in Listing 1 facilitates this, and also leaves open the possibility of extending the system to handle the logical XOR case. Note in passing, that I'm ignoring cases where an alert is raised if only a single trigger goes active - these do not form part of this illustration.

The C++ Three Level Hierarchy

I've used the evaluator class hierarchy and spent some time explaining it, because I like the idea of a realistic example to work with. The above comes from a real project, although what I've shown is a simplified version for illustration purposes.

I'm using this hierarchy as an example of the three level structures that are idiomatic in C++ class hierarchy design: interface class (see [Radford] for an explanation of the term) at the top, common (but abstract) implementation in the middle, and the most derived classes forming the (concrete) implementations.

However the tale of this hierarchy has a slight twist in it. Let me just digress for the moment and look at another three level hierarchy - one I trust will need no explanation.

The circular_shape hierarchy, shown in Listing 2 on page 26, uses the three level approach already discussed. However in this case, the common_circular_shape class (the middle class) captures state common across the concrete (most derived) classes, while the derived classes themselves have their own specific state. Also, note the constructors: there is a lot of repetition of code here, in fact the constructors of circle and ellipse differ in name only.

Compare the circular_shape and evaluator hierarchies. In the latter case, the concrete (most derived) classes have no state of their own - all state is common and captured in the common implementation. However, there is something and_evaluator and or_evaluator have in common with circle and ellipse: they are saddled with the repetitive constructor baggage. That is to say, the derived classes must each have an almost identical constructor just to initialise state that is kept entirely in a base class. The areas of variability in and_evaluator and or_evaluator are:

  • The name of the constructor - everything else about the constructors is the same for both classes

  • The two characters denoting logical operation in evaluate()- other than that, the implementation of this virtual function is exactly the same for both classes, and the same would apply if an xor_evaluator were ever to be added

Everything else is the same between these two classes (and this also applies to a future xor_evaluator).

This suggests a need to investigate ways of simplifying the design. Herein is the subject matter for the rest of this article.

I will investigate two ways to separate the commonality and variability: one using delegation, and the other using static parameterisation using C++ templates.

Approach Using Delegation

This approach draws on the STRATEGY design pattern [Gamma-]. First, let's deal with the evaluate() virtual function - I'm going to factor out and_evaluator and or_evaluator's implementations into and_operation and or_operation respectively, and derive these from the interface class operation. The resulting hierarchy is shown in Listing 3 on page 27.

The evaluator class no longer needs to be in a hierarchy. It now looks like Listing 4. This design has the benefit that there is no need for repetitive constructor code (the operation hierarchy is stateless). However there is a disadvantage - the lifetimes of operation objects must now be managed.

This can be addressed by taking the evolution of this design one step further - i.e. operations being stateless can be taken advantage of and, instead of using classes, function pointers can be used, as shown in Listing 5 on page 28.

However even at this stage in the evolution of the design, a crucial piece of commonality remains to be dealt with: the operation::result() virtual function implementations still differ only in the two characters denoting the logical operation.

Approach Using C++ Templates

This approach mixes run time polymorphism with static parameterisation. In C++ terms, it mixes classes that have virtual functions, with templates. (See Listing 6 on page 28).

The evaluator interface (base) class is just the same as it was in the original three level hierarchy we started with, and this design does not introduce a second hierarchy. The part that's changed is the derived classes - these have been replaced by a single class template called evaluator_implementor, derived (publicly) from evaluator. The evaluator_implementor class contains the state and implements the evaluate() virtual function using a function object supplied as a template parameter.

The idea is that and_evaluator and or_evaluator can now be implemented in a very straightforward manner using the standard library's logical_and<> and logical_or<> function object templates as template arguments:

typedef evaluator_implementor<std::logical_and<bool> > and_evaluator;
typedef evaluator_implementor<std::logical_or<bool> > or_evaluator;

Advantages:

  • The hierarchy is much simplified. A three level hierarchy has been collapsed into a two level hierarchy and there is only one class at each level!

  • There is only one implementation of evaluate() and only one derived class constructor - thus eliminating the repetition of code sharing much commonality.

  • The separation between commonality and variability is much more clearly stated. C++ templates have been used to very effectively express this separation of concerns. The structure is expressed in a base class with one class template derived from it. The implementation specifics are factored out into the template parameter.

Disadvantages:

  • With evaluator_implementor being a class template, there is no way to avoid the implementation of evaluate() leaking into the client code. This is not a major disadvantage and is unavoidable in some template code. Nevertheless, I regard it as being more desirable for such implementation leaks to be avoided if possible. There is a straightforward mechanism for dealing with this, which I'll come back to in a moment.

  • The final disadvantage is very much one of pragmatics. There are still very few programmers out there who are comfortable with templates and template techniques. Although more and more are becoming familiar with the STL, this only requires the knowledge to use a template, whereas the design under discussion requires the confidence to write one. In practice this usually means that if a development group has a programmer who is happy to design/write this sort of code, the programmer may be discouraged from doing so because other members of the group will not be comfortable with the code. It is sadly ironic that most working C++ programmers are likely to be happier with a design containing the noise of more repetition of code (and hence more opportunity to make a mistake) and less clarity of intent, just because it doesn't use a template.

Just before moving on, I need to illustrate the mechanism I alluded to in the first disadvantages point above.

Rather than following the traditional approach of placing the evaluator_implementor class template in a header file, it can be placed in an implementation (typically .cpp) file. The and_evaluator and or_evaluator typedefs can also be placed in this implementation file. Client code does not need to see the class template definition, or the typedefs, because all use of objects is intended to be via the evaluator interface class. This still leaves us with a slight problem - how can client code create and_evaluator and or_evaluator objects? Well, they can't do so directly, but simple factory functions making it possible can be provided. The code fragment in Listing 7 illustrates this.

The definitions of the factory functions are in the same implementation file as evaluator_implementor. Therefore, both implementation and instantiation of this class template are in the same file, and its definition is never required anywhere else in the code.

Finally

When complexity occurs in design, attempts to pretend it isn't there lead inevitably to trouble. Complexity can't be eliminated, it must be managed.

While the delegation based approach had to be explored, it didn't really buy us very much. It solved only the problem of constructor code repetition, but introduced the extra problem of managing more objects. Taking the approach one step further - i.e. using function pointers - eliminated this problem but the problem of repetitive constructor code remained.

The approach using C++ templates, however, is an excellent illustration of managing complexity by moving it out of the code and into the programming language. In stating the advantages and disadvantages of this approach I made the point about most programmers still not being happy to write templates. Templates are actually a complex feature of the C++ language, and no doubt this accounts for the reluctance of programmers to adopt them. However, the benefit of including a complex feature in the language is borne out in the design using templates - it is an excellent illustration of the C++ language absorbing complexity, rather than allowing the complexity to manifest itself in the actual code. This is the approach I adopted for the design of the security system described in the introduction.

Acknowledgements

Many thanks to Phil Bass and Alan Griffiths for their helpful comments.

References

[Gamma-] Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.

[Radford] Mark Radford, "C++ Interface Classes - An Introduction", http://www.twonine.co.uk/articles/CPPInterfaceClassesIntro.pdf

Overload Journal #67 - Jun 2005 + Programming Topics