ACCU Home page ACCU Conference Page
Search Contact us ACCU at Flickr ACCU at GitHib ACCU at Facebook ACCU at Linked-in ACCU at Twitter Skip Navigation

pinFunctional Programming Using C++ Templates (Part 2)

Overload Journal #82 - Dec 2007 + Programming Topics   Author: Stuart Golodotz
Continuing his exploration of functional programming and template metaprogramming, Stuart Golodetz looks at some data structures.

Introduction

In my last article [Golodetz], I explored some of the similarities between functional programming in languages such as Haskell, and template metaprogramming in C++. This time, we'll look at how to implement compile-time binary search trees (see Figure 1 for an example). As mentioned previously, these are useful because they allow us to implement static tables that are sorted at compile-time.

Figure 1 is an example of a binary search tree. Note that an in-order walk of the tree produces a list which is sorted in ascending order.

An example of a binary search tree
Figure 1

Orderings

Our finished tree code will implement a compile-time map from a key type to a value type. Since binary search trees are sorted (an in-order walk of the tree will produce a sorted list), we'll need to define an ordering on the key type to do this.

We'll start by defining a default ordering of our various element types. Bearing in mind our definitions of the Int and Rational types in the last article, we can write this as follows:


      template <typename x, typename y> struct Order
      {
        enum { less = x::value < y::value };
        enum { greater = x::value > y::value };
      };

We'll also define ordering predicates for later use with Filter (Listing 1).

    template <typename y,
       template <typename,typename> class Ordering>
    struct LessPred
    {
      template <typename x>
      struct Eval
      {
        enum { value = Ordering<x,y>::less };
      };
    };

    template <typename y,
       template <typename,typename> class Ordering>
    struct GreaterPred
    {
      template <typename x>
      struct Eval
      {
        enum { value = Ordering<x,y>::greater };
      };
    };
        
Listing 1

As we'll see later, our trees will be constructed from a list of (key,value) pairs. We can define a simple compile-time pair type as follows:


      template <typename F, typename S>
      struct Pair
      {
        typedef F first;
        typedef S second;
      };

The above definition is for a generic pair (not just one which can be used with our tree code). We can define an ordering for a (key, value) pair as follows (note that I've called this KeyOrder rather than defining it as a partial specialization of the Order template - we don't really want there to be a 'default' ordering for pairs):


      template <typename x,
         typename y> struct KeyOrder;

      template <typename k1, typename v1,
         typename k2, typename v2>
      struct KeyOrder<Pair<k1,v1>, Pair<k2, v2> >
      {
        enum { less = Order<k1,k2>::less };
        enum { greater = Order<k1,k2>::greater };
      };

Building a balanced tree

For a binary search tree to be 'efficient', we want it to be as balanced as possible; in other words, we ideally want its height to be logarithmic rather than linear in the number of its elements. A good way to go about arranging this when constructing a tree is to start from a list l of (key,value) pairs (key1, value1) ... (keyn, valuen), where all the keyi values are distinct, and proceed as follows:

  • Choose a pair (keys, values) from l such that:
            | #{i:keyi < keys} - #[i:keyi > keys} |
    is minimised, i.e. such that keys divides the list of keys as evenly as possible. (Note that < and > in the above are defined in terms of the ordering on the key type.)
  • Generate two sublists:
            lL = {(keyi,valuei):keyi < keys}
    and
            lR = {(keyi,valuei):keyi > keys}
    (Note that lLlR∪{(keys,values)}=l).
  • Recursively construct a tree for each of the sublists (unless they are the empty list) and attach the generated trees as the children of the current node. Set the splitter at the current node to be (keys, values).

Modulo minor alterations for different scenarios, this algorithm is a standard way of building (reasonably) balanced trees. (A similar sort of approach is used when building binary space partitioning (BSP) trees or decision trees.) How do we go about coding it using templates?

Conceptually, it's probably easiest to use a top-down approach to the problem in this instance (especially given the way the description above is written). We start by defining the BuildNode template. This takes a list of (key, value) pairs and an ordering, and defines a typedef called result which will be the type defining the resulting tree (Listing 2).

    template <typename xs,
       template <typename,typename> class Ordering>
    struct BuildNode;

    template <template <typename,typename>
       class Ordering>
    struct BuildNode<NullType, Ordering>
    {
      typedef NullType result;
    };

    template <typename k, typename v, typename xs,
       template <typename,typename> class Ordering>
    struct BuildNode<List<Pair<k,v>,xs>, Ordering>
    {
      typedef List<Pair<k,v>,xs> list;
      typedef typename ChooseSplitter<list,list,
         Ordering>::result splitter;

      typedef typename Filter<LessPred<splitter,
         Ordering>::Eval, list>::result leftList;
      typedef typename Filter<GreaterPred<splitter,
         Ordering>::Eval, list>::result rightList;

      typedef typename BuildNode<leftList,
         Ordering>::result leftChild;
      typedef typename BuildNode<rightList,
         Ordering>::result rightChild;

      typedef TreeNode<splitter, leftChild,
         rightChild> result;
    };
        
Listing 2

The way this works is exactly as you'd expect. First we choose a splitter, then we filter the list l (list in the code) twice to get lL (leftList) and lR (rightList). Finally, we recursively build the subtrees and combine them into a TreeNode.

When it comes to looking up values in the tree (which we will see later), we will need to know what ordering was used when building it. The BuildTree template uses BuildNode to build the actual tree, then wraps the tree and the ordering into a compound type:


      template <typename xs, template <typename,
         typename> class Ordering = KeyOrder>
      struct BuildTree
      {
        typedef Tree<typename BuildNode<xs,Ordering>::
           result,Ordering> result;
      };

Choosing a splitter

We now need to decide how to choose a splitter for a list. The basics of the idea were given in the high-level description above; the way we implement it in practice is to evaluate a metric for each element of the list and pick the one with the lowest score (or any one of those with the lowest score, if necessary). The metric computes the absolute difference between the number of elements less than the potential splitter and the number of those greater than it (Listing 3).

    template <int n>
    struct Abs
    {
      enum { value = n >= 0 ? n : -n };
    };

    template <typename y, typename xs,
       template <typename,typename> class Ordering>
    struct EvaluateMetric;

    template <typename y,
       template <typename,typename> class Ordering>
    struct EvaluateMetric<y, NullType, Ordering>
    {
      enum { less = 0, greater = 0 };
    };

    template <typename y, typename x, typename xs,
       template <typename,typename> class Ordering>
    struct EvaluateMetric<y, List<x,xs>, Ordering>
    {
      enum { less = Ordering<y,x>::less +
         EvaluateMetric<y,xs,Ordering>::less };
      enum { greater = Ordering<y,x>::greater +
         EvaluateMetric<y,xs,Ordering>::greater };
      enum { value = Abs<(greater - less)>::value };
    };
        
Listing 3

The ChooseSplitter template then uses this to pick the best possible splitter each time (Listing 4).

    template <typename ys, typename xs,
       template <typename,typename> class Ordering>
    struct ChooseSplitter;

    template <typename y, typename xs,
       typename Ordering>
    struct ChooseSplitter<List<y,NullType>, xs,
       Ordering>
    {
      typedef y result;
      enum { metric = EvaluateMetric<y,xs,Ordering>::
         value };
    };

    template <typename y, typename ys, typename xs,
       template <typename,typename> class Ordering>
    struct ChooseSplitter<List<y,ys>, xs, Ordering>
    {
      typedef typename ChooseSplitter<ys,xs,
         Ordering>:: result candidate;
      enum { candidateMetric = ChooseSplitter<ys,xs,
         Ordering>:: metric };
      enum { metric = EvaluateMetric<y,xs,Ordering>::
         value };
      typedef typename Select<(
         metric < candidateMetric), y, candidate>::
         result result;
    };
        
Listing 4

Outputting a tree

We have code for building a tree, but before we can test it we need some way of outputting trees. The code in Listing 5 does exactly that, indenting each layer of the tree by a fixed amount to make it easier to read.

    template <typename T, int offset = 0>
    struct OutputNode;

    template <int offset>
    struct OutputNode<NullType, offset>
    {
      void operator()()
      {
        for(int i=0; i<offset; ++i) std::cout << ' ';
        std::cout << "Null\n";
      }
    };

    template <typename Splitter, typename Left,
       typename Right, int offset>
    struct OutputNode<TreeNode<Splitter, Left, Right>,
       offset>
    {
      void operator()()
      {
        for(int i=0; i<offset; ++i) std::cout << ' ';
        std::cout << Splitter::first::value << ' ' <<
           Splitter::second::value << '\n';
        OutputNode<Left, (offset+2)>()();
        OutputNode<Right, (offset+2)>()();
      }
    };
    template <typename Tree> struct OutputTree;

    template <typename Root,
       template <typename,typename> class Ordering>
    struct OutputTree<Tree<Root,Ordering> >
    {
      void operator()()
      {
        OutputNode<Root>()();
      }
    };
        
Listing 5

An example

Figure 2 shows a key -> value map, implemented using a binary search tree. Null nodes are not shown.

A key -> value map
Figure 2

Now that we can output trees, we can examine a concrete example of tree-building. Specifically, we will build the tree shown in Figure 2.

First of all, we define some macros to make the final code easier to read (see Listing 6).

#define KPLIST1(kt,vt,k1,v1)             List<Pair<kt<k1>,vt<v1> >, NullType>
#define KPLIST2(kt,vt,k1,v1,k2,v2)       List<Pair<kt<k1>,vt<v1> >, KPLIST1(kt,vt,k2,v2)>
#define KPLIST3(kt,vt,k1,v1,k2,v2,k3,v3) List<Pair<kt<k1>,vt<v1> >, KPLIST2(kt,vt,k2,v2,k3,v3)>
...
#define KPLIST7 ...

#define BUILD_TREE(L)   BuildTree<L>::result
#define OUTPUT_TREE(T)  OutputTree<T>()()
         
Listing 6

The KPLISTn macros take a key type and a value type, followed by a list of (key,value) pairs. Given the above definitions, we can build and output our tree as follows:

      OUTPUT_TREE(BUILD_TREE(KPLIST7(Int,Int,0,23,1,9,
      2,84,3,24,4,12,5,18,6,42)));

This gives the expected result:

      3 24
        1 9
          0 23
            Null
            Null
          2 84
            Null
            Null
        5 18
          4 12
            Null
            Null
          6 42
            Null
            Null

Lookup

Tree-building is all well and good, but to finish implementing compile-time maps, we now need to implement a lookup function on our trees. This turns out to be relatively straightforward: we locate the key we want to find in the tree recursively by comparing it against the key at the current node at each stage; if it's less than the nodal key, we recurse down to the left child, if it's greater than the nodal key, we recurse down to the right child, and if it's equal then we've found what we're looking for and 'return' the nodal value by means of a typedef. If the current node is null, the key can't be found in the tree and we 'return' null (or NullType). The code is shown in Listing 7.

    template <typename Key, typename Node,
       template <typename,typename> class Ordering>
    struct LookupInNode;

    template <typename Key,
       template <typename,typename> class Ordering>
    struct LookupInNode<Key, NullType, Ordering>
    {
      typedef NullType result;
    };

    template <typename Key, typename Value,
       typename Left, typename Right,
       template <typename,typename> class Ordering>
    struct LookupInNode<Key, TreeNode<Pair<Key,Value>,
       Left,Right>, Ordering>
    {
      typedef Value result;
    };

    template <typename Key, typename SplitKey,
       typename SplitValue, typename Left,
       typename Right, template <typename,typename>
       class Ordering>
    struct LookupInNode<Key, TreeNode<Pair<SplitKey,
       SplitValue>,Left,Right>, Ordering>
    {
      // Note that the = case is handled by the
      // partial specialization above.
      typedef typename Select<Ordering<Key,SplitKey>::
         less,LookupInNode<Key,Left,Ordering>,
         LookupInNode<Key,Right,Ordering>
         >::result::result result;
    };

    template <typename Key, typename Tree>
    struct Lookup;

    template <typename Key, typename Root,
       template <typename,typename> class Ordering>
    struct Lookup<Key, Tree<Root,Ordering> >
    {
      typedef typename LookupInNode<Key,Root,
         SubsidiaryOrdering<Ordering>::Eval>::
         result result;
    };
        
Listing 7

One slight trick lies in the way we handle orderings: the ordering stored with the tree itself is an ordering over (key,value) pairs, since it was an ordering over the list from which the tree was built. The ordering we want for lookup, on the other hand, is an ordering over the key type. There are two equivalent ways of dealing with this: either we initially define an ordering on the key type and lift it to (key,value) pairs, or we initially define one on (key,value) pairs and lower it back to the key type. The first way probably makes more intuitive sense (since the alternative relies on the fact that the (key,value) pair ordering we define depends only on the key) and is left as a mini-exercise for the reader. In the code in Listing 7, we use the SubsidiaryOrdering template to lower the (key,value) pair ordering back to the key type; its definition is as shown in Listing 8.

    template <template <typename,typename> class Ordering> struct SubsidiaryOrdering;

    template <>
    struct SubsidiaryOrdering<KeyOrder>
    {
      template <typename x, typename y>
      struct Eval
      {
        enum { less = Order<x,y>::less };
        enum { greater = Order<x,y>::greater };
      };
    };
        
Listing 8

Having thus finished our lookup implementation, we can try it out, again defining some macros to make things neater:


      #define LOOKUP(kt,kv,T)		Lookup<kt<kv>,T>::result
      #define OUTPUT_VALUE(v)		OutputValue<v>()()

      ...

      OUTPUT_VALUE(LOOKUP(Int,4,BUILD_TREE(KPLIST7(
      Int,Int,0,23,1,9,2,84,3,24,4,12,5,18,6,42))));

This correctly outputs the value 12, as we expect.

Conclusion

Let's take a step back from the code-face and take a look at what we've gained from this. Aside from seeing how to implement a compile-time map, which is a valuable technique in itself, we've seen that template metaprogramming in general can be a really useful technique, and (as discussed in the last article) one that needn't unnecessarily befuddle us if we keep in mind its similarities to functional programming in other languages. Aside from providing an interesting programming challenge in its own right, template metaprogramming has applications which are useful in the real world and is well worth looking into.

References

[Golodetz] Functional Programming Using C++ Templates (Part 1), Overload 81, October 2007.

Overload Journal #82 - Dec 2007 + Programming Topics