Journal Articles

Overload Journal #50 - Aug 2002 + Programming Topics
Browse in : All > Journals > Overload > 50 (7)
All > Topics > Programming (877)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Template Metaprogramming: Shifting Down a Gear

Author: Administrator

Date: 02 August 2002 22:58:29 +01:00 or Fri, 02 August 2002 22:58:29 +01:00

Summary: 

Body: 

Template metaprogramming (MP) in C++ is a powerful technique but the syntax used can be obscure and difficult to understand. Here I propose an alternative approach in which a subset of the standard C++ language is used to write template metaprograms in a natural and familiar style.

This article assumes either an understanding of template metaprogramming or a pretty good ability to absorb new ideas. If you want to read up on the topic before reading this article: see [Veldhuizen1995] online for Todd Veldhuizen's historical paper and some useful links; and [Walker2001] for a recent Overload article on the subject.

There are a number of forms of MP but in this document we use only the general-purpose one presented in Andrei Alexandrescu's Modern C++ Design [Alexandrescu2001]; I will refer to this form of MP as AMP.

Template Metaprogramming

C++ template-metaprogramming uses standard features of the language to achieve computation in the type-domain, at compilation time.

This means that computation is done on types rather than on values. This may sound bizarre but in practice it can aid both in design abstraction and in time/space efficiency. Applications of MP include high-performance numerical computing [Blitz], matrix computation [GMCL], reflection [Attardi-], dimensional analysis [Barton-] and static configuration [Breymann-].

However, despite the power of the technique it isn't really used in the mainstream, and it's been 8 years since Erwin Unruh wrote the first MP program [Unruh1994].

This may be due in part to a lack of suitable C++ compilers in the past but another reason must be that MP is not a designed part of the language: it's really an accident resulting from the interaction of several language features. And - as so often happens when something is used for other than its intended purpose - MP code can be obscure and difficult to understand.

factorial Example

Let's start by implementing the factorial function as a template metaprogram.

Here is a standard implementation of the factorial function:

int factorial(int n) {
  if (n==1) {
    return 1;
  }
  else {
    return n*factorial(n-1);
  }
}
...
// example call
int x = factorial(5);

The two branches of the conditional statement return the two possible outcomes:

  • when n==1 the function simply returns 1

  • when n!=1 the function returns the result of calling itself recursively with an argument of n-1

Here is an MP implementation of the factorial function[1]:

// definition
template<int n>
struct factorial {
  enum {RET = n*factorial<n-1>::RET};
};
// partial specialization
template<>
struct factorial<1> {
  enum {RET = 1};
};
...
// example call
enum {x = factorial<5>::RET};

The definition and partial specialization of the factorial class template here give the two possible outcomes:

  • when n!=1 the result is given by the result of instantiating itself recursively with a template argument of n-1

  • when n==1 the result is simply 1

Compare the possible outcomes of the function with those of the class. Although the factorial class looks very different from the factorial function they have the same logical structure :-

  • return 1 if the parameter is 1

  • return n times the factorial of n-1 otherwise

The big difference between the two implementations is that the template computation happens at compile-time instead of when the program is run. Integer results of template computations are available as compile time constants, for example, in the expression factorial<5>::RET. This means that, for example, you could declare an array like this:

int buffer[factorial<5>::RET];

Types can be manipulated at compile time too, using typedef to name intermediate and final results in the same way that enum (or const int) is used to name integer values. There's an example which uses types later in the article.

The MP Execution Model

If you want to know what happens - in general - when you run a program in a given language you need to know its execution model: a specification of what happens when a program written in the language is run on a conforming implementation.

My first acquaintance with the idea of an execution model for MP was a talk by Gabriel Dos Reis at ACCU 2001 [Reis2001] in which he showed how C++ template metaprograms could be modelled in the Scheme language. Scheme is a good model for MP because both languages' execution models are essentially those of functional programming languages.

Dos Reis talked about M-values (M stands for meta) being the MP equivalent to values in most programming languages. An M-value is a type or anything else that can be manipulated at compile time.

AMP M-values:

  • template instantiation plays the role of a function call

  • template partial specialization provides conditional branching

  • enums set local aliases for complex expressions and return integer results

  • typedefs set local aliases for complex types and return type results

This information is summarized in Table 1.

<colgroup> <col> <col> <col></colgroup> <thead> </thead> <tbody> </tbody>
Feature MP Implementation Example Code
conditional branching template partial specialization
template<typename T>
struct Setup { /* code for general case */ };
template<>
struct Setup<Null> { /* code for T==Null */ };
// OR
template<int N>
struct Factorial{ /* code for general case */ };
template<>
struct Factorial<0> { /* code for N==0 */ };
integer expressions enum
enum {N=Length<T::Tail>+1};
set integer alias enum
return integer results enum
enum {value=N};
call MP "functions" template instantiation
typedef Next<T>::value NextType;
set type alias typedef
return type result typedef
typedef Next<T>::value value;

Table 1. AMP Execution Model

If you look back to the MP implementation of the factorial function you will see that it uses features 1, 2 and 3 with these pieces of code:

  1. factorial<n-1>::RET
    
  2. template<> struct factorial<1>
    
  3. enum {RET = 1}
    

Using these language features sophisticated programs can be written (even a Lisp interpreter [Czarnecki-]) but it's not easy: the syntax is unhelpful and the programs can't be effectively debugged (you can't single-step through a compilation ...).

Modelling AMP in Another Language

The C++ MP code in the factorial example and in Listings 1 and 2 can be difficult to follow but the abstract execution model is very simple. It has single assignment (variables are initialised on declaration and cannot be modified thereafter), conditional selection (choice, as in switch or if), but no iteration (no for, while or do loops - recursion is used instead).

Given the simplicity of the execution model we can consider writing programs in a source language with a more suitable syntax than C++ MP code. Programs written in this source language could be automatically translated to the correct C++ MP code.

But which language? A functional language such as Scheme or Haskell would have the right sort of execution model, but would not be taken up by many C++ programmers.

Instead I propose that we actually use a subset of C++ itself as the source language, which I've called typeshift. This will by definition be familiar to C++ programmers and there are other advantages, as we shall see.

Here's the factorial example in typeshift :-

int factorial(int n) {
  switch(n) {
    case 1:
      return 1;
    default:
      return n*factorial(n-1);
  }
}

It is, of course, the same code as the standard factorial function we gave earlier (which should be no surprise). This code is similar but not identical to the factorial function we gave earlier. It uses switch instead of if because typeshift will not initially support if.

The point of typeshift is this: you write a program in the typeshift language and then use a translator to convert it to C++ MP code. The translator would convert the above factorial program to this MP program (again, this code is identical to that of the earlier example):

// definition
template<int n>
struct factorial {
  enum {RET = n*factorial<n-1>::RET};
};
// partial specialization
template<>
struct factorial<1> {
  enum {RET = 1};
};

So now you can write a program in something resembling everyday C++ and have it converted to a template metaprogram.

Now this is just a tutorial example: a MP factorial program isn't very practical because it has to be recompiled every time you want to compute a different factorial. An MP program is only useful if you can make use of the results of the compile-time computation when the program is run. Later on we'll look at an admittedly abstract but genuinely useful example of template metaprogramming.

typeshift

typeshift is a small subset of C++. We take only those features which are required to support the AMP execution model:

  • classes/structs with simple data and function members

  • variable initialisation

  • switch statements

  • return statements

We do, of course, need to be able to represent types so that we can support template type-parameters. You might think that this is where it gets complicated, but in fact it doesn't - in "real" C++ types are very different from values but in typeshift they are quite similar: everything is just an M-value.

typeshift uses distinguished identifiers like type, fixed_type and template_type to declare variables (and subclasses) which to represent types and such variables behave as they do in MP.

This execution model of AMP as supported by typeshift is shown in Table 2 [next page] which you will want to compare with Table 1. Remember that this is only the first version of typeshift: over time its syntax and semantics will be extended to make it even easier to write template metaprograms.

I have not yet looked into mapping other forms of MP into typeshift but I hope that we will only need to add a few more features of C++ to the language for it to be able to model any current use of MP.

Typelist Example

Here's an example of MP which uses types. Listings 1 and 2 are an implementation of the typelist data-structure from Alexandrescu [Alexandrescu2001] (not actually his implementation). They demonstrate how AMP can be used to implement a simple type-data- structure (a linked list of types) and a type-function which finds the length of such a list.

Listings 3 and 4 [next page] implement the same program, but in typeshift.

The .h file in Listing 3 is very different from the .h file in Listing 1 but if you read them while referring to Tables 1 and 2 you should be able to follow how the two sets of code correspond.

<colgroup> <col> <col> <col></colgroup> <thead> </thead> <tbody> </tbody>
Feature typeshift Implementation Example Code
conditional branching (on types) switch
switch(T) {
  case Null: // code for T==Null
  default: // code for general case
};
// OR
switch(N) {
  case 0: // code for N==0
  default: // code for general case
}
integer expressions expression
int N=Length(T.Tail)+1;
set integer alias definition
return integer results return
return N;
call MP "functions" function call syntax
type NextType=Next(T);
set type alias type definition
return type result return
return NextType;

Table 2. typeshift Execution Model

One important difference between the two sets of code is the namespace: it was typelist in the original C++ MP but is metatypeshift::meta::_typelist in the typeshift code. This is because we propose to signal the presence of typeshift code by enclosing it in a distinguished namespace whose name begins with typeshift::meta, or in a namespace derived from this. An enhanced C++ compiler or external tool can use this to pick out the typeshift code from the "normal" C++ code.

The .cpp file in Listing 4 is practically identical to the .cpp file in Listing 2 - only the namespace identifier and the name for the null list-terminator are different. This is because I don't propose changing the syntax of references to the names in the generated C++ MP code (at least, not yet) because that is going to be rather more difficult to handle than just transforming the definition.

Coins Example

In [Reis2001] Gabriel Dos Reis gives an example from Abelson & Sussman [Abelson-1985] of a coin-counting program, first of all in Scheme and then in C++ MP code. Given an amount of money (in pennies) the program returns the number of ways in which change can be given using British coins.

Dos Reis's C++ MP code along with a test rig is given in Listings 5 and 6.

I translated it into typeshift and this (again, with a test rig) is given in Listings 7 and 8.

If you compare the two sets of listings I think you will agree that the typeshift version is clearer. But there's more - typeshift is a subset of C++ so we can actually compile and run a typeshift program without translating it to C++ MP code.

Both programs can be built[2] with gcc 2.91.66. When compiling the C++ MP code pass -ftemplate-depth-99 to g++ to maximise the size of the problem that can be handled.

When run with the same input the programs give identical results so they are in some sense operationally equivalent. However, the C++ MP code computes the answer at compile-time but the typeshift code computes the answer at run-time.

This operational equivalence means that metaprograms can be written in typeshift and tested and debugged in the value-domain. Only then need the program be transformed to C++ MP code for execution at compilation time.

Even typeshift programs that use types can be compiled, run and debugged in this way using the typeshift 'type' class library.

Implementation

There are two ways of implementing typeshift:

  • by extending a C++ compiler

  • by writing an external tool

A C++ compiler essentially already has the mechanism to compile typeshift because syntactically and semantically it is a true subset of C++. Two changes would be needed:

  • Only a subset of C++ features are allowed in typeshift so the standard parser would need to be adapted to handle this restricted "dialect".

  • The typeshift code needs to be transformed into the C++ MP code before it is compiled in the normal way. This is a fairly straightforward transformation.

There should be no impact on compilation outside the typeshift::meta_xxx namespace because from the point of view of code outside the namespace the names defined inside the namespace are from the transformed C++ code. It is not possible for code outside the namespace to "have any knowledge of" the original typeshift code.

Implementation by an External Tool

A proof-of-concept typeshift pre-processor is currently under development, and should be available from http://www.typeshift.org when this article is published. It will be downloadable in the form of source code for a C++ program released under the GPL (GNU General Public License [GPL]).

Conclusion

Template metaprogramming has traditionally been viewed as an esoteric and obscure area of C++, but using typeshift metaprograms can now be written (and even debugged) in a familiar language. Hopefully this will lead to metaprogramming being used much more widely in C++.

Acknowledgements

Many thanks to Gabriel Dos Reis for his advice and encouragement, and also for his permission to quote the example program from [Reis2001].

Thanks too to Mark Radford for his invaluable comments and suggestions, which have certainly made the article easier to understand.

References

[Veldhuizen1995] Veldhuizen, 1995: Template Metaprograms (http://www.osl.iu.edu/~tveldhui/papers/Template-Metaprograms/meta-art.html)

[Walker2001] Walker, 2001: "Template Metaprogramming: make your compiler work for you" Overload 46

[GMCL] Generative Matrix Computation Library (http://wwwia.tu-ilmenau.de/~czarn/gmcl/)

[Attardi-] Giuseppe Attardi, Antonio Cisternino: Reflection support by means of template metaprogramming (http://citeseer.nj.nec.com/451721.html)

[Barton-] John J. Barton, Lee R. Nackman: "Scientific and Engineering C++: Dimensional Analysis" C++ Report, vol 7 p39, Jan. 1995

[Breymann-] Ulrich Breymann, Krzysztof Czarnecki, Ulrich Eisenecker: Generative Components: One Step Beyond Generic Programming (http://home.t-online.de/home/Ulrich.Eisenecker/dag.htm)

[Unruh1994] Unruh, 1994: Prime number computation (ANSI X3J16-94-0075/ISO WG21-462)

[Alexandrescu2001] Alexandrescu, 2001: Modern C++ Design (Addison-Wesley, ISBN 0-201-70431-5)

[Reis2001] Dos Reis, 2001: Metaprogramming in C++ (http://www.cmla.ens-cachan.fr/~dosreis/C++/)

[Czarnecki-] Czarnecki, Eisenecker: metalisp.cpp (http://home.t-online.de/home/Ulrich.Eisenecker/meta.htm)

[Abelson-1985] Abelson and Sussman, 1985: Structure and Interpretation of Computer Programs (http://mitpress.mit.edu/sicp)

[GPL] GNU General Public License (http://www.gnu.org/licenses/gpl.html#SEC1)

[Reis2002] Dos Reis, 2002: (personal communication)



[1] The historical use of enum in this context was originally a workaround for compiler limitations. On a compiler that supports the latest version of the standard "const int RET = 1;" is not only perfectly legal, but perhaps a more idiomatic usage

[2] MSVC6 won't compile the C++ MP program because it does not support partial template specialisation (I have not tried MSVC7)

Notes: 

More fields may be available via dynamicdata ..