Introduction
The interface to an object indicates what you can do with it, and what you can do with an object is a matter of design. C++ supports different kinds of interfaces, from the explicit compileable interface of a class to the more implicit requirements-based interfaces associated with the STL. Interfaces can be named and organized in terms of one another, again a matter of design.
The most common form of interface relationship is that of substitutability, which, in its popular form, is associated with the good use of inheritance – if class B is not a kind of class A, B should probably not publicly inherit from A. More generally, one type is said to be substitutable for another if the first satisfies the interface of the second and can be used in the same context. Satisfaction is a matter of direct realization, where all the features expected of an interface are provided, or specialization, where new features may be added, existing features constrained, or both.
Not all specialization is inheritance and not all substitutability is inheritance based [Henney2000a, Henney2000b, Henney2000c, Henney2001]. Substitutability acts as a more general guideline for the use of inheritance, conversions, operator overloading, templates, and const–volatile qualification.
Substitutability is relative to a given context or use. Normally the context and use are implied, so that statements about substitutability come out as absolutes. For example, when a developer says that instances of class B may be used where A is expected, this is normally a shorthand for saying that pointers or references to B may be used where pointers or references to A are expected. It is unlikely that copy and slice were intended, especially if A is an abstract class.
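The difference between substituting through a reference and substituting by copy can be sketched as follows. The classes A and B follow the naming in the paragraph above; the functions by_reference, by_value, and demo are hypothetical names introduced purely for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes named after the A and B of the text.
class A
{
public:
    virtual ~A() {}
    virtual std::string name() const { return "A"; }
};

class B : public A
{
public:
    virtual std::string name() const { return "B"; }
};

// Substitution through a reference: a B argument keeps its own behavior.
std::string by_reference(const A &a) { return a.name(); }

// Substitution by copy: only the A part of a B argument survives (slicing).
std::string by_value(A a) { return a.name(); }

void demo()
{
    B b;
    assert(by_reference(b) == "B"); // dynamic type preserved
    assert(by_value(b) == "A");     // sliced down to a plain A
}
```

If A were abstract, by_value would not compile at all, which is one reason the copy-and-slice reading is rarely the intended one.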
The idea that a const-qualified type is effectively a supertype of a non-const-qualified type was introduced in the last column [Henney2001]. Each ordinary class can be considered – with respect to const qualification – to have two interfaces, one of which is a specialization of the other. This interesting and different perspective can influence how you go about the design of an individual class, but does it have any other practical and tangible consequences? A single class can support multiple interfaces, but by the same token you can split a class into multiple classes according to interfaces. Consideration of mutability does not simply allow us a rationale for adorning our member functions with const–volatile qualifiers; it can also allow us to separate a single class concept into base and derived parts.
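The two-interfaces view of const qualification can be seen with any ordinary class. The following is a minimal sketch using std::vector<int>; the functions count, append_zero, and demo are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Needs only the const interface of std::vector<int>.
std::size_t count(const std::vector<int> &values) { return values.size(); }

// Needs the full, non-const interface.
void append_zero(std::vector<int> &values) { values.push_back(0); }

void demo()
{
    std::vector<int> writable(3, 1);
    const std::vector<int> read_only(2, 1);

    assert(count(writable) == 3);   // non-const used where const is expected
    assert(count(read_only) == 2);
    append_zero(writable);          // fine: the full interface is available
    // append_zero(read_only);      // would not compile: const does not substitute for non-const
    assert(count(writable) == 4);
}
```

The non-const vector can be used wherever the const one is expected, but not the other way around, which is the sense in which the const-qualified type behaves like a supertype.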
Ellipsing the Circle
Modeling the relationship and similarity between circles and ellipses (or squares and rectangles, or real numbers and complex numbers) is a recurring question in books, articles, and news groups [Cline+1999, Coplien1992, Coplien1999]. It pops up with a regularity you could almost set your system clock by. Some believe it to be a problem with inheritance – or indeed object orientation as a whole – whereas others believe it is one of semantics [Cline+1999]. But I’m getting ahead of myself: “A circle is a kind of ellipse, therefore circle inherits from ellipse”. Let’s start at the (apparent) beginning.
Redundant State
A direct transliteration of that English statement gives us the following C++:
class ellipse {
...
};
class circle : public ellipse {
...
};
No problems so far. Depending on your sensibilities, you will next focus on either the public interface or the private implementation. Let’s do this back to front and focus on the representation first, just to get one of the common problems out of the way. To keep things simple, let’s focus only on the ellipse axes:
class ellipse {
...
private:
double a, b;
};
class circle : public ellipse {
...
};
An ellipse’s shape can be characterized by its semi-major (a) and semi-minor axes (b), whereas a circle needs only its radius to describe it. This means that there is no need to add any additional state representation in circle as everything that it needs has already been defined in ellipse. So what’s the problem? Before you think it, the private specifier does not need to be protected – that would be a problem. The problem is that not only does circle not need any additional state, it actually has too much state: a == b is an assertable invariant of the circle, which means that there is redundant data. It is neither possible to uninherit features (a rock) nor desirable to maintain redundant state (a hard place).
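The redundancy can be made concrete with a small sketch. It assumes a constructor and two getters that the listing above elides, and the demo function is a hypothetical name used only to state the invariant.

```cpp
#include <cassert>

// Sketch only: a concrete ellipse holding both axes, as in the text.
class ellipse
{
public:
    ellipse(double semi_major, double semi_minor)
      : a(semi_major), b(semi_minor) {}
    double semi_major_axis() const { return a; }
    double semi_minor_axis() const { return b; }
private:
    double a, b;
};

// A circle built by inheriting that representation stores its radius twice.
class circle : public ellipse
{
public:
    explicit circle(double radius) : ellipse(radius, radius) {}
    double radius() const { return semi_major_axis(); }
};

void demo()
{
    circle unit(1.0);
    assert(unit.semi_major_axis() == unit.semi_minor_axis()); // invariant: a == b
    assert(sizeof(circle) >= 2 * sizeof(double));             // one of the two doubles is redundant
}
```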
Focusing on inheritance of representation alone could lead you to the topsy-turvy view that ellipse should inherit from circle because it would add to its state. Although bizarre, this view is entertained surprisingly often (but is entertaining only if the code is not yours to work with). From bad to worse, there would not only be the absence of any concept of substitutability whatsoever, the circle would also have the wrong state to inherit in the first place: ellipse would not just inherit a double, it would specifically inherit a radius. Now, is the radius the semi-major or the semi-minor axis? The proper conclusion of a representation-focused approach should be that circle and ellipse should not share an inheritance relationship at all. At least this conservative view is safe and free from strange consequences. For some systems it may also prove to be the right choice in practice: Premature generalization often leads to unnecessary coding effort in creating options that are never exercised.
Interface Classes
Let’s focus on the interfaces instead. How can you represent the interface for using instances of a class without actually representing the implementation? In particular, how can you do this if you have related but separate types of objects, e.g. circle versus ellipse? Abstract base classes are often thought of as a mechanism for factoring out common features – interface or implementation, but typically a bit of both – in a class hierarchy. Their most common use is as partially implemented classes, but it turns out that their most powerful use is as pure interfaces.
Interface classes are an idiom rather than a language feature, a method for using a mechanism:
- All ordinary member functions in an interface class are pure virtual and public.
- There are no data members in an interface class, except perhaps static const members.
- The destructor in an interface class is either virtual and public or non-virtual and protected. In the former case destruction through the interface class is permitted, and therefore must be made safe. In the latter case destruction is not one of the features offered by the interface, and restricting it also excludes the kinds of problem that require a public virtual destructor.
- An interface class may have type members, e.g. nested classes, which need not be abstract.
- These requirements apply recursively to any base classes.
- That’s it.
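The two destructor options in the list above can be sketched side by side. The names shape, named, unit_square, area_then_destroy, name_of, and demo are hypothetical and introduced only for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Interface that offers destruction: public virtual destructor.
class shape
{
public:
    virtual ~shape() {}
    virtual double area() const = 0;
};

// Interface that does not offer destruction: protected non-virtual destructor.
class named
{
public:
    virtual const char *name() const = 0;
protected:
    ~named() {}
};

class unit_square : public shape, public named
{
public:
    virtual double area() const { return 1.0; }
    virtual const char *name() const { return "unit square"; }
};

double area_then_destroy(shape *s)
{
    double result = s->area();
    delete s;           // safe: the destructor is virtual
    return result;
}

const char *name_of(const named &n)
{
    // delete &n;       // would not compile: ~named() is protected
    return n.name();
}

void demo()
{
    unit_square local;
    assert(std::strcmp(name_of(local), "unit square") == 0);
    assert(area_then_destroy(new unit_square) == 1.0);
}
```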
Writing a class that actually does nothing seems counter-intuitive, if not a little uncomfortable, for many developers. However, having a pure representation of interface gives the developer two clear benefits:
- The ability to publish the interface to objects without the potential cloud and clutter of implementation, the flip side of which is the ability to focus on implementation detail given a clear interface.
- Decoupling client code from implementation detail, which reduces build times (assuming that the concrete implementing class is not in the same header) and the rebuild effect of any change to the implementation (remembering that in software development the only constant is change).
It turns out that these two benefits can be summarized more generally as one: separation of concerns. Returning to our shapes, we can start establishing the usage for ellipses and circles without getting lost in representation:
class ellipse
{
public:
virtual double semi_major_axis() const = 0;
virtual double semi_minor_axis() const = 0;
virtual double area() const = 0;
virtual double eccentricity() const = 0;
...
};
class circle : public ellipse
{
public:
virtual double radius() const = 0;
...
};
Concrete Leaves
The separation of interface from implementation class makes things much clearer. Now, when we come to provide sample implementations, we can see that there is little to be gained from trying to share implementation detail:
class concrete_ellipse : public ellipse
{
public:
concrete_ellipse(double semi_major, double semi_minor)
: a(semi_major), b(semi_minor) {}
virtual double semi_major_axis() const
{
return a;
}
virtual double semi_minor_axis() const
{
return b;
}
virtual double area() const
{
return pi * a * b;
}
virtual double eccentricity() const
{
return sqrt(1 - (b * b) / (a * a));
}
...
private:
double a, b;
};
class concrete_circle : public circle
{
public:
explicit concrete_circle(double radius)
: r(radius) {}
virtual double radius() const
{
return r;
}
virtual double semi_major_axis() const
{
return r;
}
virtual double semi_minor_axis() const
{
return r;
}
virtual double area() const
{
return pi * r * r;
}
virtual double eccentricity() const
{
return 0;
}
...
private:
double r;
};
Should you wish to share implementation you can turn to other techniques to do so, e.g. the BRIDGE pattern [Gamma+1995], but the choice of representation is kept clear from the presentation of interface.
The hierarchy shown here has two very distinct parts to it: The classes that represent usage and the classes that represent implementation. Put another way, the classes exist either for interface, e.g.
void draw(point, const ellipse &);
or for creation, e.g.
concrete_circle unit(1);
draw(origin, unit);
In the class hierarchy only the leaves are concrete, and the rest of the hierarchy is abstract – in this case, fully abstract. This design recommendation to “make non-leaf classes abstract” [Meyers1996] can also be stated as “never inherit from concrete classes”. It often helps to simplify subtle class hierarchies by teasing apart the classes according to the distinct roles that they play.
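For readers who want to compile the hierarchy, here is a condensed sketch of the interface classes and concrete leaves described above. Several things are assumptions not given in the article: the value used for pi, the public virtual destructor on ellipse, and the hypothetical client function axis_ratio, which stands in for draw so that no point type is needed.

```cpp
#include <cassert>
#include <cmath>

const double pi = 3.14159265358979323846; // assumed definition; the article leaves pi unspecified

class ellipse
{
public:
    virtual ~ellipse() {}                  // assumed: destruction offered by the interface
    virtual double semi_major_axis() const = 0;
    virtual double semi_minor_axis() const = 0;
    virtual double area() const = 0;
    virtual double eccentricity() const = 0;
};

class circle : public ellipse
{
public:
    virtual double radius() const = 0;
};

class concrete_ellipse : public ellipse
{
public:
    concrete_ellipse(double semi_major, double semi_minor)
      : a(semi_major), b(semi_minor) {}
    virtual double semi_major_axis() const { return a; }
    virtual double semi_minor_axis() const { return b; }
    virtual double area() const { return pi * a * b; }
    virtual double eccentricity() const { return std::sqrt(1 - (b * b) / (a * a)); }
private:
    double a, b;
};

class concrete_circle : public circle
{
public:
    explicit concrete_circle(double radius) : r(radius) {}
    virtual double radius() const { return r; }
    virtual double semi_major_axis() const { return r; }
    virtual double semi_minor_axis() const { return r; }
    virtual double area() const { return pi * r * r; }
    virtual double eccentricity() const { return 0; }
private:
    double r;
};

// Hypothetical client, written against the interface class only.
double axis_ratio(const ellipse &e)
{
    return e.semi_minor_axis() / e.semi_major_axis();
}

void demo()
{
    concrete_circle unit(1);        // concrete class used for creation
    concrete_ellipse wide(2, 1);
    assert(axis_ratio(unit) == 1.0); // interface class used for manipulation
    assert(axis_ratio(wide) == 0.5);
    assert(unit.eccentricity() == 0);
}
```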
Implementation-Only Classes
If you wish to take decoupling a step further, the separation of concerns can be further reinforced by introducing the polar opposite of an interface class: an implementation-only class. This idiom allows developers to define classes that can only be used for object creation; subsequent object manipulation must be through interface classes [Barton+1994]:
- An implementation-only class inherits, using public derivation, from one or more interface classes.
- All ordinary functions in an implementation-only class are private.
- Constructors are public to allow creation of instances.
- All manipulation of instances is done via pointers or references to one of the base interface classes.
- The destructor is normally private if it is public in the base interface classes. This means that instances of implementation-only classes are normally on the heap, i.e. the result of the new expression is passed immediately to an interface class pointer.
- If the destructor must be public in the implementation-only class, this means that instances are normally value based and their lifetime is bound by the enclosing scope, e.g. local variables or data members. This is sometimes a little awkward because the object as held cannot have its members called without an explicit upcast! Hence, implementation-only classes make practical sense only for heap-based objects, which is not appropriate for all designs.
This idiom takes advantage of a feature of C++ that is often seen as a quirk: access specification and virtual function mechanisms are orthogonal. Here are alternative definitions for concrete circles and ellipses:
class creation_only_ellipse : public ellipse
{
public:
creation_only_ellipse(double semi_major, double semi_minor);
...
private:
virtual double semi_major_axis() const;
virtual double semi_minor_axis() const;
virtual double area() const;
virtual double eccentricity() const;
...
double a, b;
};
class creation_only_circle : public circle
{
public:
explicit creation_only_circle(double radius);
...
private:
virtual double radius() const;
virtual double semi_major_axis() const;
virtual double semi_minor_axis() const;
virtual double area() const;
virtual double eccentricity() const;
...
double r;
};
In use, we could see a circle being created and used as follows:
circle *unit = new creation_only_circle(1);
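A compilable sketch of the idiom follows, using reduced versions of the interfaces to keep it short. The public virtual destructor on ellipse, the private destructor on creation_only_circle, and the function demo are assumptions made for the example rather than details taken from the listings above.

```cpp
#include <cassert>

// Reduced versions of the article's interface classes.
class ellipse
{
public:
    virtual ~ellipse() {}   // assumed: public virtual destructor in the interface
    virtual double semi_major_axis() const = 0;
    virtual double semi_minor_axis() const = 0;
};

class circle : public ellipse
{
public:
    virtual double radius() const = 0;
};

class creation_only_circle : public circle
{
public:
    explicit creation_only_circle(double radius) : r(radius) {}
private:
    // Private overrides still take part in virtual dispatch.
    virtual ~creation_only_circle() {}
    virtual double radius() const { return r; }
    virtual double semi_major_axis() const { return r; }
    virtual double semi_minor_axis() const { return r; }
    double r;
};

double demo()
{
    circle *unit = new creation_only_circle(1);
    double result = unit->radius();   // access is checked against circle, where radius is public
    // creation_only_circle local(1); // would not compile: the destructor is private
    delete unit;                      // fine: destruction goes through the public virtual base destructor
    return result;
}
```

Access is checked against the static type of the expression, while the function that runs is chosen by the dynamic type, which is why the private overrides remain reachable through circle.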
Change Happens
Management summary of the story so far: circles substitutable for ellipses, some useful techniques, and no real problems. However, good as these techniques are in many designs, they have not resolved the challenge that really makes the circle–ellipse problem as infamous as it is. If you look back, or if you are already familiar with the problem, you will notice that there has been a little sleight of hand in the interfaces presented: The only ordinary member functions are const qualified. In other words, there are no modifiers. You cannot resize circles or stretch ellipses.
Let’s introduce setters corresponding to the axis getters:
class ellipse
{
public:
virtual void semi_major_axis(double) = 0;
virtual void semi_minor_axis(double) = 0;
... // as before
};
class circle : public ellipse
{
public:
virtual void radius(double) = 0;
... // as before
};
It is fairly obvious what an actual circle does for radius and an actual ellipse does for semi_minor_axis and semi_major_axis, but what does a circle do for the semi_minor_axis and semi_major_axis setters that it has inherited?
class concrete_ellipse : public ellipse
{
public:
virtual void semi_major_axis(double new_a)
{
a = new_a;
}
virtual void semi_minor_axis(double new_b)
{
b = new_b;
}
...
private:
double a, b;
};
class concrete_circle : public circle
{
public:
virtual void radius(double new_r)
{
r = new_r;
}
virtual void semi_major_axis(double new_a)
{
... // what goes here?
}
virtual void semi_minor_axis(double new_b)
{
... // what goes here?
}
...
private:
double r;
};
This is the issue that gets people excited, producing all kinds of fabulous suggestions, all of which violate substitutability in some way:
- Set the radius to the new semi-major or semi-minor axis, regardless. This means that semi_major_axis and semi_minor_axis break the contract expected of them, i.e. that they resize either one of the semi-major or semi-minor axis, but not both.
- Throw an exception, or crash the program with a diagnostic message, when one of the awkward functions is called.
- Pretend that there isn’t really a problem, and that users of the code should not expect anything specific to happen anyway. Besides, all it takes is a little semantic elastic to reinterpret the implied meaning of the semi_major_axis setter not as “reset the semi-major axis” but as “reset the semi-major axis, or do something magical and unspecified, or do nothing”.
None of these are reasonable. Distinct unrelated classes are preferable to any of them.
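To see how the first suggestion breaks substitutability, consider the following sketch. For brevity the circle here derives directly from ellipse, a virtual destructor is assumed, and the names radius_resetting_circle, widening_preserves_minor_axis, and demo are hypothetical.

```cpp
#include <cassert>

class ellipse
{
public:
    virtual ~ellipse() {}
    virtual double semi_major_axis() const = 0;
    virtual double semi_minor_axis() const = 0;
    virtual void semi_major_axis(double) = 0;
    virtual void semi_minor_axis(double) = 0;
};

class concrete_ellipse : public ellipse
{
public:
    concrete_ellipse(double semi_major, double semi_minor)
      : a(semi_major), b(semi_minor) {}
    virtual double semi_major_axis() const { return a; }
    virtual double semi_minor_axis() const { return b; }
    virtual void semi_major_axis(double new_a) { a = new_a; }
    virtual void semi_minor_axis(double new_b) { b = new_b; }
private:
    double a, b;
};

// Hypothetical circle following the first suggestion: either setter resets the radius.
class radius_resetting_circle : public ellipse
{
public:
    explicit radius_resetting_circle(double radius) : r(radius) {}
    virtual double semi_major_axis() const { return r; }
    virtual double semi_minor_axis() const { return r; }
    virtual void semi_major_axis(double new_a) { r = new_a; }
    virtual void semi_minor_axis(double new_b) { r = new_b; }
private:
    double r;
};

// Hypothetical client relying on the setter contract:
// changing the semi-major axis leaves the semi-minor axis alone.
bool widening_preserves_minor_axis(ellipse &e)
{
    const double before = e.semi_minor_axis();
    e.semi_major_axis(2 * e.semi_major_axis());
    return e.semi_minor_axis() == before;
}

void demo()
{
    concrete_ellipse oval(2, 1);
    radius_resetting_circle disc(1);
    assert(widening_preserves_minor_axis(oval));   // contract holds for a real ellipse
    assert(!widening_preserves_minor_axis(disc));  // contract broken by the circle
}
```

Client code written against ellipse has no way to know that the second call will misbehave, which is precisely the failure of substitutability.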
No Change
When you find yourself at the bottom of a hole, stop digging. The design we developed moved forward in stable, well-defined steps, and only became unstuck with a particular requirement.... Wait a minute, resizing circles and ellipses was never a requirement. In fact, we have no requirements or real statement of the original problem domain. So, we’re in the middle of a design that does not work, and we have no requirements; do we blame the design, the paradigm, or what? “A circle is a kind of ellipse” was about as much analysis as we – or indeed most people looking at the problem – set out with. We have not stated the purpose or context of our system: Design is context dependent, so if you remove the context you cannot have meaningful design.
Circles and ellipses also seem such a simple, familiar, and intuitive domain that we all feel like domain experts. However, intuition has misled us: When you were taught conic sections at school, were you ever taught that you could resize them? If you transform an ellipse by stretching it, you have a different ellipse; if you resize a circle, you have a different circle. In other words, if we chose to model our classes after the strict mathematical model, we would not evolve the design beyond the const-member-function-only version, although we might wish to add factory functions to accommodate the transformations. That’s it. What violated substitutability was the introduction of change: Eliminate it and you have a model that is both accurate and works [Henney1995].
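One way to picture the factory-function approach is with a simple immutable value type. This is only a sketch under stated assumptions: the article does not show such functions, the value-based ellipse below is a simplification of the interface hierarchy, and stretched and scaled are hypothetical names.

```cpp
#include <cassert>

// Immutable value type: const member functions only.
class ellipse
{
public:
    ellipse(double semi_major, double semi_minor)
      : a(semi_major), b(semi_minor) {}
    double semi_major_axis() const { return a; }
    double semi_minor_axis() const { return b; }
private:
    double a, b;
};

// Hypothetical factory function: stretching yields a different ellipse.
// Assumes factor >= 1 so that the first axis remains the semi-major one.
ellipse stretched(const ellipse &e, double factor)
{
    assert(factor >= 1);
    return ellipse(e.semi_major_axis() * factor, e.semi_minor_axis());
}

// Hypothetical factory function: uniform scaling of both axes.
ellipse scaled(const ellipse &e, double factor)
{
    assert(factor > 0);
    return ellipse(e.semi_major_axis() * factor, e.semi_minor_axis() * factor);
}
```

Because neither function modifies its argument, the original object keeps its shape, and a circle passed to such a function cannot be turned into a non-circle behind the caller's back.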
Now that we understand the real root of the problem – substitutability with respect to change – we can consider alternative approaches that accommodate both change and inheritance-based substitutability. If we do not need both of those features, we can choose to disallow modifiers or not to relate the classes by inheritance, as necessary.
Change is the Concern
Therefore, separate according to change. Here is a more useful interpretation of the original statement: “a circle is a kind of ellipse, so long as you don’t change it as an ellipse”. This is perhaps more constrained than is strictly required, i.e. geometrically resizing an ellipse on both of its axes is also safe for circles, but the constraint makes the issues and their solution clearer.
Given that statement, what we want is something like the following:
class ellipse {
...
};
class circle : public const ellipse {
// not legal C++
...
};
Other than that one minor nit – that the code isn’t legal C++ – we have a solution that allows circles to be substituted for ellipses where they cannot be changed. Where a function expects to change an ellipse object via reference or pointer a circle would not work, but references and pointers to const ellipse would allow circle objects to be passed.
However, the good news is that the wished-for version does give us a hint as to how we might organize our classes and functions. Factoring the const member functions into a separate interface class gives the following:
class const_ellipse
{
public:
virtual double semi_major_axis() const = 0;
virtual double semi_minor_axis() const = 0;
virtual double area() const = 0;
virtual double eccentricity() const = 0;
...
};
class ellipse : public const_ellipse
{
public:
using const_ellipse::semi_major_axis;
using const_ellipse::semi_minor_axis;
virtual void semi_major_axis(double) = 0;
virtual void semi_minor_axis(double) = 0;
...
};
class const_circle : public const_ellipse
{
public:
virtual double radius() const = 0;
...
};
class circle : public const_circle
{
public:
using const_circle::radius;
virtual void radius(double) = 0;
...
};
For consistency with the standard library’s naming conventions for iterators, the classes are named const_ellipse and ellipse rather than ellipse and mutable_ellipse, but your mileage may vary. For consistency within the same design, circle is also split into a const_circle and a circle class.
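The following sketch puts the split hierarchy to work with reduced interfaces. The concrete classes, the virtual destructor, and the client functions axis_ratio, widen, and demo are assumptions added for illustration; only the four interface classes come from the listing above.

```cpp
#include <cassert>

class const_ellipse
{
public:
    virtual ~const_ellipse() {}     // assumed: destruction offered by the interface
    virtual double semi_major_axis() const = 0;
    virtual double semi_minor_axis() const = 0;
};

class ellipse : public const_ellipse
{
public:
    using const_ellipse::semi_major_axis;
    using const_ellipse::semi_minor_axis;
    virtual void semi_major_axis(double) = 0;
    virtual void semi_minor_axis(double) = 0;
};

class const_circle : public const_ellipse
{
public:
    virtual double radius() const = 0;
};

class circle : public const_circle
{
public:
    using const_circle::radius;
    virtual void radius(double) = 0;
};

class concrete_ellipse : public ellipse
{
public:
    concrete_ellipse(double semi_major, double semi_minor)
      : a(semi_major), b(semi_minor) {}
    virtual double semi_major_axis() const { return a; }
    virtual double semi_minor_axis() const { return b; }
    virtual void semi_major_axis(double new_a) { a = new_a; }
    virtual void semi_minor_axis(double new_b) { b = new_b; }
private:
    double a, b;
};

class concrete_circle : public circle
{
public:
    explicit concrete_circle(double radius) : r(radius) {}
    virtual double radius() const { return r; }
    virtual void radius(double new_r) { r = new_r; }
    virtual double semi_major_axis() const { return r; }
    virtual double semi_minor_axis() const { return r; }
private:
    double r;
};

// Read-only client: accepts ellipses and circles alike.
double axis_ratio(const const_ellipse &e)
{
    return e.semi_minor_axis() / e.semi_major_axis();
}

// Modifying client: accepts ellipses only.
void widen(ellipse &e)
{
    e.semi_major_axis(2 * e.semi_major_axis());
}

void demo()
{
    concrete_ellipse oval(2, 1);
    concrete_circle unit(1);
    assert(axis_ratio(oval) == 0.5);
    assert(axis_ratio(unit) == 1.0);   // a circle is usable as a const_ellipse
    widen(oval);
    // widen(unit);                    // would not compile: a circle is not a modifiable ellipse
    assert(axis_ratio(oval) == 0.25);
    unit.radius(3);                    // circles can still be resized as circles
    assert(unit.radius() == 3);
}
```

The compiler now enforces the constraint that the earlier designs could only document: a circle may be read as an ellipse but never changed as one.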
Conclusion
Substitutability is not always about “is a kind of”, and not all forms of specialization fit an inheritance hierarchy out of the box. Natural language is often not a very good guide at this level, so we often need more phrasing options to reveal the useful relationships in our system, such as “implements” and “can be used where a … is expected”.
There are many criteria for partitioning classes into base and derived parts. Taxonomic classification is the most familiar to OO developers, but we have also seen that interface–implementation separation is compelling and that separation with respect to mutability also has a place.
Kevlin Henney
kevlin@curbralan.com
References
[Barton+1994] John J Barton and Lee R Nackman, Scientific and Engineering C++: An Introduction with Advanced Techniques and Examples, Addison-Wesley, 1994.
[Cline+1999] Marshall Cline, Greg Lomow, and Mike Girou, C++ FAQs, 2nd edition, Addison-Wesley, 1999.
[Coplien1992] James O Coplien, Advanced C++: Programming Styles and Idioms, Addison-Wesley, 1992.
[Coplien1999] James O Coplien, Multi-Paradigm Design for C++, Addison-Wesley, 1999.
[Gamma+1995] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.
[Henney1995] Kevlin Henney, “Vicious Circles”, Overload 8, June 1995.
[Henney2000a] Kevlin Henney, “From Mechanism to Method: Substitutability”, C++ Report 12(5), May 2000, also available from http://www.curbralan.com.
[Henney2000b] Kevlin Henney, “From Mechanism to Method: Valued Conversions”, C++ Report 12(7), May 2000, also available from http://www.curbralan.com.
[Henney2000c] Kevlin Henney, “From Mechanism to Method: Function Follows Form”, C/C++ Users Journal C++ Experts Forum, November 2000, http://www.cuj.com/experts/1811/henney.html.
[Henney2001] Kevlin Henney, “From Mechanism to Method: Good Qualifications”, C/C++ Users Journal C++ Experts Forum, January 2001, http://www.cuj.com/experts/1901/henney.html.
[Meyers1996] Scott Meyers, More Effective C++: 35 New Ways to Improve Your Programs and Designs, Addison-Wesley, 1996.
This article was originally published on the C/C++ Users Journal C++ Experts Forum in March 2001 at http://www.cuj.com/experts/1903/henney.htm
Thanks to Kevlin for allowing us to reprint it.
Overload Journal #53 - Feb 2003 + Programming Topics