Journal Articles

Overload Journal #131 - February 2016 + Programming Topics
Browse in : All > Journals > Overload > o131 (7)
All > Topics > Programming (877)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Template Programming Compile Time Combinations & Sieves

Author: Martin Moene

Date: 04 February 2016 18:50:45 +00:00 or Thu, 04 February 2016 18:50:45 +00:00

Summary: Functional style frequently uses sequences. Nick Weatherhead applies these ideas to combinations in C++.

Body: 

Previously [Weatherhead15], I discussed the use of C++ templates to compile time program some well-known string katas. Template metaprogramming in this way imposes a functional style with the immutability of variables a key consideration. In this article I’m applying the same treatment to combinations of k elements and a variation on Eratosthenes sieve. The foundations for each are built as we go along including holding data within lists, adding and removing items from them, implementing queues, defining sequences and operating on them with sets. There is a bit to it so you may wish to skip ahead to a solution and come back to the detail. Either is fine but you’ll probably want to keep Listing 1 to hand for reference.

Lists

Without arrays and mutability at our disposal, a structure is required that can represent a list and after each operation produces a new variant. It’s common for functional languages to have an operation which adds an element to the beginning of a list and is typically known as cons (short for construct). This functional list differs from the imperative viewpoint of linearly linking elements with a head at one end, a body and a tail at the other. Instead it is woven from a compound pair of pairs. Each pair comprises a head plus tail that splits until a null tail terminates a given path. It’s good to have an appreciation of its recursive nature and the other features it affords see [SICP96]. However, for the most part, this article will attempt to adhere to the metaphor for a single chain.

  1 #include <iostream>
  2
  3 using namespace std;
  4
  5 struct nop {
  6
  7   static const char delim = '\0';
  8
  9   friend ostream& operator<<( ostream& os
 10   , const nop& ) { return os; }
 11 };
 12
 13 struct nil : nop {
 14
 15   typedef nil type;
 16
 17   template< size_t SIZE = 0 > struct size {
 18
 19     static const size_t value = SIZE;
 20
 21     friend ostream& operator<<( ostream& os
 22     , const size& ) { return os << SIZE; }
 23   };
 24
 25   template< class RHS >
 26   struct append  : RHS { typedef RHS type; };
 27   template< class RHS >
 28   struct prepend : append< RHS > { };
 29   template< class RHS >
 30   struct uunion  : append< RHS > { };
 31   template< class RHS >
 32   struct except  : nop { typedef nil type; };
 33   template< class RHS >
 34   struct intrsct : except< RHS > { };
 35
 36   template< size_t N, size_t I_SIZE, class O
 37   , size_t O_SIZE = 0 > struct kombine_
 38   : conditional< O_SIZE == N, O, nop >::type
 39   { };
 40 };
 41
 42 template< class HEAD, class TAIL
 43 , char DELIM = ',' > struct list_ {
 44
 45   #define t typename
 46
 47   typedef list_  type; typedef list_  LHS;
 48   typedef HEAD   head; typedef TAIL   tail;
 49
 50   static const char delim = DELIM;
 51
 52   friend ostream& operator<<( ostream& os
 53   , const list_& ) { return os
 54     << HEAD( ) << tail::delim << TAIL( ); }
 55
 56   template< size_t SIZE = 0 > struct size
 57   : tail::template size< SIZE + 1 > { };
 58
 59   template< class RHS > struct append
 60   : list_< LHS, RHS, DELIM > { };
 61   template< class RHS > struct prepend
 62   : list_< RHS, LHS, DELIM > { };
 63
 64   template< class LHS, class RHS >
 65   struct append_
 66   : LHS::type::template append< RHS >  { };
 67   template< class LHS, class RHS >
 68   struct prepend_
 69   : LHS::type::template prepend< RHS > { };
 70   template< class LHS, class RHS >
 71   struct except_
 72   : LHS::type::template except< RHS >  { };
 73   template< class LHS, class RHS >
 74   struct uunion_
 75   : LHS::type::template uunion< RHS >  { };
 76   template< class LHS, class RHS >
 77   struct intrsct_
 78   : LHS::type::template intrsct< RHS > { };
 79
 80   template< class RHS, bool = true >
 81   struct uunion
 82   : conditional< ( LHS::head::value <
 83                    RHS::head::value )
 84     , append_<   LHS::head
 85       , uunion_< LHS::tail,   RHS       > >
 86     , append_< t RHS::head
 87       , uunion_< LHS      , t RHS::tail > >
 88     >::type {
 89       typedef HEAD head; typedef TAIL tail; };
 90
 91   template< bool NA > struct uunion< nil, NA >
 92   : append_< head, tail > { };
 93
 94   template< class RHS, bool = true >
 95   struct intrsct
 96   : conditional< is_same< LHS, RHS >::value
 97     , LHS
 98     , t conditional< ( LHS::head::value ==
 99                        RHS::head::value   )
100       , append_< LHS::head
101         , intrsct_< LHS::tail, t RHS::tail > >
102       , t conditional< ( LHS::head::value <
103                          RHS::head::value  )
104         , intrsct_< LHS::tail,   RHS     >
105         , intrsct_< LHS      , t RHS::tail >
106         >::type
107       >::type
108     >::type {
109       typedef HEAD head; typedef TAIL tail; };
110
111   template< bool NA >
112   struct intrsct< nil, NA > : nil { };
113
114   template< class RHS, bool = true >
115   struct except
116   : conditional< is_same< LHS, RHS >::value
117     , nil
118     ,t conditional< ( LHS::head::value <
119                       RHS::head::value  )
120       , append_< t LHS::head
121         , except_< LHS::tail,   RHS       > >
122       , t conditional< ( LHS::head::value >
123                          RHS::head::value )
124         , except_< LHS      , t RHS::tail >
125         , except_< LHS::tail, t RHS::tail >
126         >::type
127       >::type
128     >::type {
129       typedef HEAD head; typedef TAIL tail; };
130
131   template< bool NA > struct except< nil, NA >
132   : append_< head, tail > { };
133
134   template< size_t N, size_t I_SIZE
135   , class O, size_t O_SIZE > class kkombine_ {
136
137     friend ostream& operator<<( ostream& os
138     , const kkombine_& that ) {
139
140       static const int i_size = I_SIZE - 1;
141
142       return os
143       << t TAIL::template kombine_< N, i_size
144        , prepend_< HEAD, O >, O_SIZE + 1 >( )
145       << t TAIL::template kombine_< N, i_size
146        , O                  , O_SIZE     >( );
147   } };
148
149   template< size_t N, size_t I_SIZE, class O
150     = nop, size_t O_SIZE = 0 > struct kombine_
151   : conditional< !O_SIZE && I_SIZE == N
152     , prepend_< type, nop >
153     , t conditional< O_SIZE == N, O
154       , t conditional< !I_SIZE, nil
155         , kkombine_< N, I_SIZE, O, O_SIZE >
156         >::type
157       >::type
158     >::type { };
159
160   template< size_t N > struct kombine
161   : kombine_< N, size< >::value > { };
162
163   template< size_t N, bool NA = true >
164   struct powerset_ {
165
166     friend ostream& operator<<( ostream& os
167     , const powerset_& that ) { return os
168       << powerset_< N - 1 >( )
169       << kombine< N >( ); }
170   };
171
172   template< bool NA >
173   struct powerset_< 0,  NA > : nop { };
174
175   struct powerset
176   : powerset_< size< >::value > { };
177
178   #undef t
179 };
180
181 template< class HEAD, class TAIL
182 , char DELIM = ',' > struct list
183 : list_< HEAD, TAIL, DELIM > { };
184
185 template< class TAIL, char DELIM >
186 struct list_< nop, TAIL, DELIM > : TAIL {
187
188   friend ostream& operator<<(
189     ostream& os, const list_& ) {
190     return os << ' ' << TAIL( ); }
191 };
192
193 template< class HEAD, char DELIM >
194 struct list< HEAD, nil, DELIM >
195 : list_< HEAD, nil, DELIM > {
196
197   typedef HEAD type;
198
199   friend ostream& operator<<(
200     ostream& os, const list& ) {
201     return os << HEAD::value; }
202
203   template< class RHS > struct append
204   : list< HEAD, RHS, DELIM > { };
205   template< class RHS > struct prepend
206   : list< RHS, HEAD, DELIM > { };
207 };
208
209 template< class T, T V, class VS, char DELIM >
210 struct v
211 : list< v< T, V, nil, DELIM >, VS, DELIM > {
212   static const T value = V; };
			
Listing 1

In Listing 1, list_ (42–179) takes a HEAD element and TAIL elements; an additional parameter DELIM defines the separator used between elements when printing. Appending an underscore to a class’s name is the convention used here to indicate internal usage, hence, list (181–183) exposes list_. No-operation nop (5–11) simply places a null marker into the output as a guard element. A specialisation of list_ (185–191) which has nop as its head element is used to indicate that it’s a list of lists and when printed each list can be delimited by another character. Another list (193–207) is provided for elements at the end of a list i.e. that’s TAIL is nil (13–40) where some functions are overridden whilst nil implements functions that operate on an empty list. When performingoperations between elements it is convenient to think in terms of what appears on the right and left hand sides, so list_ provides some internal functions that take RHS and LHS respectively. A value of type v (209–212) is itself a list_ with its HEAD being a single value element of the same type. Thus each element can be operated on in the same way.

Queues

Imperative lists and vectors describe consecutive elements, as does the functional list, and all can be used as the underlying structure of a queue. Removing the HEAD of a list_ requires a simple reference to the TAIL elements. To prepend (28–29, 61–62, 205–206) elements another list_ is created with the new content in the HEAD and the existing content in the TAIL; append (28–29, 59–60, 203–204) just reverses the concatenation of the HEAD and TAIL. There are also implementations of these that take explicit left and right hand side arguments – see append_ (65–66) and prepend_ (67–78).

Size

Functional programs don’t have loop constructs so rely on recursion to perform inductive operations. Templates don’t optimise tail recursive calls so each grows the stack, hence compilers place a limit on its depth (note that despite this if an expression is sufficiently long it’s still feasible to exhaust the memory). A classic way to observe this is to use a size function size< > (17–23, 56–57) that iterates over a list to obtain a count. What’s not always clear is that functors used as default template parameter arguments can be evaluated independently of the template to which they’re applied. Take list_<HEAD, TAIL>::kombine_<N, I_SIZE, O = nop, O_SIZE = 0> (149–158), if the input size I_SIZE is defaulted to size<> one might expect it to be calculated, as per a regular function, on invocation. However, as size<>’s template parameters are not dependent on kombine_’s the compiler instantiates it whenever list_ is called. This is okay for k-combinations but would unintentionally execute for primes too. To prevent this the default is wrapped by kombine<N> (160–161). Alternatively how about having a size attribute i.e. size = 1 for single elements and size = HEAD::size + TAIL::size as they are combined? This is fine when the HEAD and TAIL are lists. However, they are also used for list expressions; size isn’t intrinsic to these and their resultant type is unknown until fully evaluated.

Combinations

The brief is to find all the unique combinations of k elements from a set of distinct values. For the purpose of this discussion each element has its own letter and the input set is in alphabetical order.

In Figure 1 (k-combinations) the input elements are treated as a queue; at each step the last item is removed from the top of the queue and either placed at the bottom of the output queue when branching right or dropped when going left. If the output reaches the desired length before the input queue is exhausted then it forms a terminal node and branching ceases. Further, if the input is of the required length and the output queue is empty then the input can be directly substituted for the output without further branching.

Here k-combinations (Listing 2) are represented with consecutive characters, specified by c and cs. As previously mentioned list_<HEAD, TAIL>::kombine<N> (160–161) wraps the initial call list_<HEAD, TAIL>::kombine_<N, size< >, nop, 0>. This takes parameters for the desired combination length, the size of the input list which when first called is the size of the initial list, the output list and its size which are initially empty. The underlying definition list_<HEAD, TAIL>::kombine_<N, I_SIZE, O, O_SIZE> (149–158) makes several checks. The first sees if the output can be directly substituted with the input. Otherwise a check is made to see if the output list has reached the desired length. If neither of these are satisfied list_<HEAD, TAIL >::kkombine_<N, I_SIZE, O, O_SIZE> (134–147) is called to branch left and right. If the end of the input list is reached i.e. nil::kombine_<N, size<>, nop, 0> (36–39) branching ceases with or without the output having reached the specified length. The powerset (175–176) is implemented as all the combinations of k for 0 to n elements.

…
template< char C, class CS = nil
, char DELIM = '\0' > struct c
: v< char, C, CS, DELIM >            { };
template< char C, char... CS >
struct cs      : c< C, cs< CS... > > { };
template< char C >
struct cs< C > : c< C >              { };

int main( ) {

  /* kombine('abcd', 2)=' ab ac ad bc bd cd'  *
   * powerset('abcd')=' a b c d ab ac ad bc   *
   * bd cd abc abd acd bcd abcd'              */
  cout
   << "\nkombine('abcd', 2)='"
   << cs<'a','b','c','d'>::kombine<2>( )<< "'"
   << "\npowerset('abcd')='"
   << cs<'a','b','c','d'>::powerset( ) << "'";
}
			
Listing 2

It can be seen from the enumeration of elements that the permutations, whilst not strictly necessary, are in lexicographic order. Another way of representing combinations is in binary whereby each bit set maps to a corresponding value to print. Investigation of this is left as an additional exercise.

Member template specialisation

You may have noticed that there are a number of member template functors that take an additional, seemingly redundant, parameter; in each case these have a specialisation. For example the power set is all the subsets of an input set including itself and the empty set. In the general case this is lists between the length of the input list powerset_<N> to powerset_<1> (163–170). A specialisation powerset_<0> (172–173) terminates the set with an empty list. Nested classes are dependent on their enclosing template types, hence if explicitly specialised the enclosing classes need to be too. A work-around to this restriction is to provide partial specialisations by adding a dummy parameter.

Sequences

Lists can be long; however, if the content is not arbitrarily distributed then it can be described as a function rather than with individual elements. Evaluating the function and generating the list can then be deferred until first use. This is the case for sieving ranges of numbers with set intervals.

In Listing 3 (Sequences & Sets) a cons of integers i is defined for use as a sequence seq. The common case produces all integers in ascending order between a low LO and high HI value. The interval I can be altered to increase the step size. Reversing direction by beginning at a high value, ending with a low value and specifying a negative increment will produce a sequence in descending order. At each step the current value is appended to the list and, whilst within bounds, the next step is defined as a sequence of itself with an incremented starting range.

…
template< int I, class IS = nil
, char DELIM = ',' > struct i
: v< int, I, IS, DELIM >             { };
template< int I, int... IS >
struct is      : i< I, is< IS... > > { };
template< int I >
struct is< I > : i< I >              { };

template< int LO, int HI, int I = 1
, char DELIM = ',' > struct seq
: conditional<
  ( I > 0 ? LO + I <= HI : LO >= HI - I )
  , i< LO, seq< LO + I, HI, I, DELIM > >
  , i< LO > >::type { };

int main( ) {

  /* seq(-3, 3, 2).uunion(seq(-2, 2, 2))= *
   *     -3,-2,-1,0,1,2,3                 *
   * seq(-3, 3).intrsct(seq(-2, 2, 2))=   *
   *     -2,0,2                           *
   * seq(-3, 3).except(seq(-2, 2, 2))=    *
   *     -3,-1,1,3                        */
  cout
  << "\nseq(-3, 3, 2).uunion(seq(-2, 2, 2))="
  << seq<-3, 3, 2>::uunion< seq<-2, 2, 2>>( )
  << "\nseq(-3, 3).intrsct(seq(-2, 2, 2))="
  << seq<-3, 3>::intrsct<seq< -2, 2, 2>>( )
  << "\nseq(-3, 3).except(seq(-2, 2, 2))="
  << seq<-3, 3>::except<seq<-2, 2, 2>>( );
}
			
Listing 3

Sets

In addition to sequences some set theory is required to sieve. Union, intersection and difference of ascending ordered sets are provided (a description of each follows) although not all will be required. When combining operations it’s feasible that earlier operations result in an empty list, hence the member templates for sets (30–34) in nil.

Union

The general case list<HEAD, TAIL>::uunion<RHS> (80–89) determines which HEAD, the LHS or RHS, is less than the other. Whichever is chosen becomes the HEAD of a new list with the rest being the union of its TAIL with the other list. The union of a list with an empty list is the list itself (91–92). As it isn’t fully defined at the point of parsing its creation is delayed with a concatenation operation i.e. append_<HEAD, TAIL>.

Intersection

The general case list<HEAD, TAIL>::intrsct<R> (94–109) determines whether the LHS and RHS’s HEADs are of equal value; if they are one becomes the HEAD of the resulting list with the rest comprising the intersection of the respective list’s TAILs. Otherwise, the HEAD with least value is discarded and the intersection between the remaining elements is performed. Intersection with an empty list (111–112) is, of course, nil.

Set difference

Known here as list<HEAD, TAIL>::except<R> (114–129) it is the equivalent of minus where any elements in the LHS that also appear in the RHS are removed. If both the LHS and RHS are identical then they cancel one another out resulting in the empty list nil; otherwise, the HEAD of each is compared. If the LHS is less than the RHS then it becomes the HEAD of a new list with the rest being the set difference between its TAIL and the RHS’s. If it’s greater the RHS’s HEAD is removed and if equal both HEADs are removed with the set difference calculated between the remaining elements. If the RHS is an empty list then there is nothing to minus (131–132); as with union the list isn’t fully defined at the point of parsing so a concatenation operation i.e. append_<HEAD, TAIL> delays its creation.

Sieves

Eratosthenes proposed a simple way for finding primes up to a specified integer using the efficient principal of sieving. There are other well-known variations such as Euler’s sieve. The basic mechanism removes multiples of each integer between 2 and n thereby leaving only those that are divisible by one and themselves. There are some quick ways to refine this algorithm which also reduce recursive calls. All even numbers with the exception of 2 can be removed from the initial range; that is all multiples seeded from an even index and every other value from those with an odd. Further, as in table 1, higher order progressions have some values that overlap with lower ones. Thus sequences can begin at the square of their seed. Similarly it is unnecessary to go beyond an index that is √n as if n is not prime it must have a divisor less than or equal to √n [SICP96]. Table 1 shows the arithmetic progressions required to sieve integers between 2 and 100.

Index (Odd N) Interval (N + N) Begin (N x N) Arithmetic Progression
3 6 9 9 15 21 27 ...99
5 10 25 ...15 25 35 45 ...95
7 14 49 ...21 35 49 63 ...98
9 18 81 ...27 45 63 81 99
Table 1

A common solution is to have an array of elements indexed from 1 to n, marking 1 for removal, then generating multiples from 2 to the square root of n, beginning each sequence with the square of its index and marking these for removal too, and finally printing all unmarked values. Instead the algorithm can be conceived in terms of operations on ordered sets.

In Listing 4 (Eratosthenes Sieve) the template parameter HI is the limit value, N is the current sieving multiple beginning at 3 and incrementing by 2 for odd intervals. O is the output list which will be successively sieved; it begins as a list of a fixed value of 2 followed by the sequence of odd numbers up to and including the limit value. Whilst the square of the multiple i.e. LO = N * N is less than or equal to the HI boundary every other odd multiple N + N is sieved from the output list.

…
template< int HI, int N = 3
, class O = i< 2, seq< 3, HI, 2 > >
, int LO = N * N > struct primes
: conditional< LO <= HI
  , primes< HI, N + 2
    , typename O::type::template except<
      seq< LO, HI, N + N > > >
  , O >::type { };

int main( ) {

  /* primes(350)=2,3,5,7,11,13,17,19,23,29,31 *
   * ,37,41,43,47,53,59,61,67,71,73,79,83,89  *
   * ,97,101,103,107,109,113,127,131,137,139  *
   * ,149,151,157,163,167,173,179,181,191,193 *
   * ,197,199,211,223,227,229,233,239,241,251 *
   * ,257,263,269,271,277,281,283,293,307,311 *
   * ,313,317,331,337,347,349                 */
  cout << "\nprimes(350)=" << primes<350>( );
}
			
Listing 4

Summary

In [Weatherhead15] I covered ASCII to integer conversion, roman numerals and palindromes. Each used a variant of the list construct to represent strings. Here it was used to generate combinations with a queue and to sieve sequences by applying set operations. Representative of the way runtime classes are developed the data and methods were encapsulated for reuse. Further ways to experiment with this include implementing other sieves and adding functions to filter and sort.

Note

All the code in this article compiles with GCC 4.9.2 and Clang 3.5.2 using -std=c++11. However, your mileage may vary with other compilers. Also whilst a null character should not be displayed in a terminal, some platforms show them as a space.

References

[SICP96] Harold Abelson and Gerald Jay Sussman with Julie Sussman. Structure and Interpretation of Computer Programs Second Edition pp.53–54, 116, 139–145, The MIT Press, 1996.

[Weatherhead15] Nick Weatherhead. Template Programming Compile Time String Functions, Overload 128, August 2015.

Further reading

Chris Oldwood. List incomprehension, CVu Volume 26 Issue 2 pp.7–8, May 2014.

Stuart Golodetz. Functional Programming Using C++ Templates (Part 1), Overload 81, October 2007.

Acknowledgements

I’d like to thank the Overload review team for providing their feedback which enabled me to elevate the content presented in this article.

Notes: 

More fields may be available via dynamicdata ..