    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: C++ Range and Elevation</title>
        <link>https://members.accu.org/index.php/articles/1833</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + Design of applications and programs + Overload Journal #117 - October 2013</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c65/">Programming</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c67/">Design</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c78/">Overload</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c330/">o117</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c65-67-330/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c65+67+330/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;C++ Range and Elevation</h1>
<p><strong>Author:</strong>&nbsp;Martin Moene</p>
<p>
<strong>Date:</strong> 02 October 2013 22:38:59 +01:00 or Wed, 02 October 2013 22:38:59 +01:00</p>
<p><strong>Summary:</strong>&nbsp;C++ provides many features for higher-level programming, but lacks some common ones present in other languages.</p>
<p><strong>Body:</strong>&nbsp;<p>Back in 2009, Andrei Alexandrescu gave a presentation at the ACCU Conference about Ranges. The short version is that although you can <em>represent</em> a range in C++98 using a pair of iterators, the <em>usage</em> is cumbersome for most simple tasks. Even for a simple loop, using iterators can be a trial.</p>

<pre class="programlisting">
  for( vector&lt; int &gt;::iterator pos = data.begin();
    pos != data.end(); ++pos )
  {
    .... do interesting things with ints
  }
</pre>
  
<p>The C++ Standard Library provides a simple enough algorithm which improves on this a bit, but you need to provide the action to be performed as a functor argument, thus losing the locality of reference for that action.</p>

<pre class="programlisting">
  for_each( data.begin(), data.end(), action );
</pre>

<p>And all thatâ€™s just using the iterators the C++ Standard Library gives you; defining your own iterator is notoriously complex. So, Andrei introduced a much simpler â€“ and more powerful â€“ abstraction: the Range [<a href="#[Alexandrescu09]">Alexandrescu09</a>].</p>

<p>There have been a few attempts to implement Andreiâ€™s ideas in C++ (he implemented them for D, and they form the basis of the D Standard Library) but for one reason or another, it just hasnâ€™t caught on. Part of the reason for that is that in order to take advantage of this new abstraction as Andrei envisioned it, you need a rewrite of the Standard Library that uses Ranges instead of Iterators. For some reason, there seems to be little appetite for this. Some implementations have arisen that are interoperable with C++ Standard Library algorithms [<a href="#[Boost]">Boost</a>], [<a href="#[Github1]">Github1</a>], but even they appear to not have had as much traction in the C++ community at large as might have been hoped. Similarly, there have been proposals to the C++ Standards effort [<a href="#[Standards]">Standards</a>], but still, not much apparent interest in something that is little more than a thin wrapper around the existing C++ container types which are, after all, really just iterator factories.</p>

<p>Part of the reason Andrei himself implemented his idea of Ranges in D, rather than C++, was that C++ at the time didnâ€™t provide good enough language support to make it straightforward. In particular, there was a C++ proposal at the time for <code>auto</code> variable declarations, which would have been crucial, but had experimental support in only one widely used compiler. Now, C++ officially has <code>auto</code>, and itâ€™s very widely supported. But not a widely-used â€“ let alone standard â€“ range type.</p>

<p>All of this is highly relevant, of course, but misses an important point: what do these range types actually achieve? What problem are they attempting to solve?</p>

<p>Time for a quick segue into a different world....</p>

<h2>The cross-pollination conundrum</h2>

<p>Itâ€™s common knowledge among experienced programmers that an intimate understanding of a few different languages (and a possibly less intimate knowledge of many) is a good thing. Techniques from one language can inform and inspire neat solutions to problems in other languages. Itâ€™s also common knowledge among experienced programmers that idiomatic features of one language are not necessarily transferable to other languages, and that doing so can result in code that is truly incomprehensible to its readers. With both of those things in mind, I want to explore a little modern C# idiom: <code>IEnumerable</code>, the C# Iterator.</p>

<p>This interface is what permits the <code>foreach</code> loop in C#:</p>

<pre class="programlisting">
  foreach( var item in container )
  {
      .... do interesting things with items
  }
</pre>  
  
<p>C# has had <code>IEnumerable</code> from the very beginning, although itâ€™s undergone a few revisions over the years.</p>

<p><code>IEnumerable</code> forms the basis of a much higher-level abstraction than merely accessing the contents of containers, however. It underpins all the functionality of LINQ<sup><a href="#FN01">1</a></sup>, introduced in Visual Studio 2005 with .Net 3.5, and builds on one key feature of .Net 2.0 â€“ the <code>yield </code>keyword, which creates an <code>IEnumerable</code> <strong>on demand</strong> (actually, it creates an implementation of <code>IEnumerator&lt;T&gt;</code>, which is the <em>real</em> iterator type). This facility means that iterating over an <code>IEnumerable</code> is lazy â€“ access to an element isnâ€™t performed until itâ€™s asked for. In C#, this is referred to as Deferred Execution.</p>

<pre class="programlisting"> 
 var results =008512&quot;
   container.Where( item =&gt; item.Id == expected )
   .Select( filtered =&gt; filtered.Count.ToString() )
   .Take( 2 );</pre>
   
   
<p>The reason lazy access is important is that no matter how many elements <code>container</code> has, the clauses for <code>Where</code> and <code>Select</code> will be called a maximum of 2 times. Obviously, this is significant if <code>container</code> has 20 million items in it.</p>

<p>So what has all this to do with C++? In the first case, the reason that <code>IEnumerable</code> works <em>as an interface</em> in C# is that it is implemented by all the standard containers, and is in fact very simple to implement for your own container types. C++ has no such interface, and in fact, there is no actual relationship â€“ inheritance or otherwise â€“ between the standard containers, or their iterator types. In the second case, does this entirely idiomatic C# translate at all into C++? Or does it make for an incomprehensible mess? That is the nature of the cross-pollination conundrum.</p>

<h2>The missing lin[qk]?</h2>

<p>It should be clear that <code>IEnumerable</code> in C# has much more in common with Andrei Alexandrescuâ€™s vision of a Range than it does with C++ iterators â€“ or even iterator pairs. C++11 introduces many language features which allow a neat syntax for it, such as lambda and type deduction facilities. Suppose there were a range type in C++; would it on its own enable the implementation of something like <code>Select</code> or <code>Where</code> from C#? What might that look like?</p>

<pre class="programlisting">
  auto result =
     container.where( [&amp;]( const thing &amp; t )
     { return t.name == expected; } )
     .select( []( const thing &amp; t )
     { return to_string( t.count ); } )
           .take( 2 );
</pre>
		   
<p>Itâ€™s not inconceivable.</p>

<p>But.</p>

<p>What type is container? We could implement a single type that has <code>where</code>, <code>select</code>, <code>take</code> and all the other things needed as member functions. What really makes C#â€™s <code>IEnumerable</code> work is the existence of extension methods. The power of that mechanism really shines through when you need to write your own function that fits in with other LINQ facilities, and C++ has no analogue for that.</p>

<p>A much neater idea would be closer to the Alexandrescu range concept whereby <code>select</code> returns a specific kind of range, and <code>where</code> returns a different kind of range. The chaining of those operations together as shown above would still require some common base interface, and would still be hard to extend.</p>

<p>Lastly there is also the question of lazy evaluation (or Deferred Execution) and efficiency. Itâ€™s hard to see how to make lazy evaluation safe in the above scenario, without resorting to passing functions around under the hood and capturing state. As for efficiency, this idea of a single base class is raising the spectre of allocating stuff on the heap (<strong>gasp!</strong>), and too many C++ programmers have got used to the flexibility and efficiency of templates and type deduction of iterators to give it up that easily.</p>

<p>So, there are questions.</p>

<p>Perhaps we should wind back our expectations a little, start with something very, very simple, and see if it can be implemented. Then we can look to see if we can build on small beginnings to do something more elaborate.</p>

<p>So. Where do we start?</p>

<h2>Write a test</h2>

<p>For the purposes of testing examples, assumptions and results, Iâ€™m using Catch [<a href="#[Github2]">Github2</a>] because itâ€™s easy to read, clear to write and not too verbose.</p>

<p>It should be obvious by now that the first step is to define a very simple and lightweight range type, upon which we can somehow â€˜hangâ€™ all of the operations we need. This range should be trivially initialisable from some standard container. Iâ€™m going to make a conceptual leap here, because itâ€™s clear the range type needs a way to access the â€˜currentâ€™ element, and to move forwards one position. With these operations, itâ€™s possible to make a simple check that the â€˜contentsâ€™ of the range match the original data. Andrei Alexandrescu asserted that the pointer-like interface for iterators is a Bad ThingTM, but I think it makes for a neat syntax, so I will stick with it. (See Listing 1.)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Range is constructable from standard
  collection and is iterable&quot;,
  &quot;[range][init][iterable][stl]&quot; )
{
  std::vector&lt; int &gt; data { 1, 2, 3 };
  auto range = make_range( data );
  REQUIRE( *range++ == data[ 0 ] );
  REQUIRE( *range++ == data[ 1 ] );
  REQUIRE( *range++ == data[ 2 ] );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 1</td>
	</tr>
</table>

<p>What is required to make this test compile? The most obvious thing is <code>make_range</code> â€“ is that a function or a class? In order to make it as general as possible, it should (obviously) be a template, and to take full advantage of type-deduction it should be a function returning....what? Some type which exhibits the right interface. With just the information in the test, we can sketch it out. Note that, with the use of <code>auto</code>, the actual type is never named. This isnâ€™t a crucial observation, but does give us a lot of leeway on the choice of name. I have been unimaginitive, however. (See Listing 2.)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; typename iterator_type &gt;
class iterable_range
{
  public:
    iterable_range( iterator_type begin,
                    iterator_type end )
      : pos{ begin }, end_pos{ end } { }
    auto operator*() const -&gt; 
      typename iterator_type::value_type
    {
      return *pos;
    }
    auto operator++( int ) -&gt; iterable_range
    {
      iterable_range tmp{ *this };
      ++pos;
      return tmp;
    }
  private:
    iterator_type pos, end_pos;
};
template&lt; typename container_type &gt;
auto make_range( const container_type &amp; ctr ) -&gt;
   iterable_range
   &lt; typename container_type::const_iterator &gt;
{
  return iterable_range
    &lt; typename container_type::const_iterator &gt;
  { begin( ctr ), end( ctr ) };
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 2</td>
	</tr>
</table>

<p>There is nothing particularly startling about this. The <code>iterable_range</code> class just squirrels away a pair of iterators (the observant will already have noticed that <code>end</code> isnâ€™t used, but its purpose should be obvious!), and the <code>make_range</code> function is really just a convenience for creating an instance of the class. The <code>iterable_range</code> is a kind of â€˜proto-rangeâ€™; itâ€™s not terribly useful on its own, but it does form the basis for other things.</p>

<h2>True to type</h2>

<p>Itâ€™s already time to do something a bit harder. Listing 3 is another test.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Transformation of elements results in
   new range leaving originals intact&quot;,
   &quot;[range][transform]&quot; )
{
  std::vector&lt; int &gt; data { 1, 2, 3 };
  auto range = make_range( data );
  auto result = select( range, []( int i ) {
    return std::to_string( i ); } );
  std::string expected[] = { &quot;1&quot;, &quot;2&quot;, &quot;3&quot; };
  REQUIRE( *result++ == expected[ 0 ] );
  REQUIRE( *result++ == expected[ 1 ] );
  REQUIRE( *result++ == expected[ 2 ] );
  REQUIRE( *range++ == data[ 0 ] );
  REQUIRE( *range++ == data[ 1 ] );
  REQUIRE( *range++ == data[ 2 ] );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 3</td>
	</tr>
</table>

<p>I already mentioned the idea of having different range types, e.g. <code>select</code> provides a â€˜selecting rangeâ€™, and <code>where</code> provides a â€˜filtering rangeâ€™. The line <code>auto result = select( ...</code> now needs some function <code>select</code>, and something to return â€“ a transforming range type. Itâ€™s going to need the same basic operations as the <code>iterable_range</code>, and it needs to operate on an underlying range. The operations on the new range defer to that underlying range type â€“ which in this case is an <code>iterable_range</code>.</p>

<pre class="programlisting">
template&lt; typename range_type,
  typename transformation &gt;class transforming_range
{
  public:
    transforming_range( range_type r,
      transformation fn ) : r{ r }, fn{ fn } { }
  private:
    range_type r;
    transformation fn;
};
</pre>

<p>The incrementing operator should be straightforward enough â€“ it needs to increment the underlying range object and return a copy of its previous self.</p>

<pre class="programlisting">
  auto operator++( int ) -&gt; transforming_range
  {
    transforming_range tmp{ *this };
    r++;
    return tmp;
  }
</pre>
  
<p>What about the dereference operator? It needs to call the transformation function <code>fn</code> with the current element, and return its result.</p>

<pre class="programlisting">
  auto operator*() const -&gt; ???
  {
    return fn( *r );
  }
</pre>
  
<p>C++11 provides an army of tools for determining the types of things at compile time for situations like this. The one with the most visibility, <code>decltype</code>, provides the declared type of an expression â€“ including the result of calling a function.</p>

<pre class="programlisting">
  auto fn( int ) -&gt; bool { return true; }
  auto fn( double ) -&gt; int { return 10; }
  auto type = decltype( fn( 10 ) );
  // type is bool - type returned if fn were
  // called with an int
</pre>
  
<p>This makes determining the result of <code>operator*()</code> very simple. The only restriction on this use is that both <code>fn</code> and <code>r</code> need to have already been â€˜seenâ€™, which leads to the need to declare them before they are used in the <code>decltype</code> expression.</p>

<pre class="programlisting">
  private:
    range_type r;
    transformation fn;
	
  public:
    auto operator*() const -&gt; decltype( fn( *r ) )
    {
      return fn( *r );
    }
</pre>
	
<p>I normally much prefer to declare classes with the <code>public</code> section at the top, where itâ€™s most obvious and visible. However, it seems a fair trade in this case to allow the use of <code>decltype</code> in such a simple fashion. The alternative is <strong>much</strong> worse!<sup><a href="#FN02">2</a></sup></p>

<p>With this addition, the <code>transforming_range</code> class should pass all the tests, after we add the now-trivial <code>select</code> function. (See Listing 4.)</p>
 
<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; typename range_type,
          typename transformation &gt;
auto select( range_type r, transformation fn ) -&gt; transforming_range&lt; range_type, transformation &gt;
{
  return transforming_range&lt; range_type,
         transformation &gt;{ r, fn };
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 4</td>
	</tr>
</table>

<h2>Just the good ones</h2>

<p>Time for another test (Listing 5).</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Filtering elements contains just the matches&quot;, &quot;[range][filter]&quot; )
{
  std::vector&lt; int &gt; data { 1, 2, 3 };
  auto range = make_range( data );

  auto result = where( range, []( int i ) {
    return i % 2 != 0; } );

  REQUIRE( *result++ == data[ 0 ] );
  REQUIRE( *result++ == data[ 2 ] );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 5</td>
	</tr>
</table>

<p>With all the funky stuff learned implementing the <code>transforming_range</code>, implementing a <code>filtering_range</code> to be returned by a <code>where</code> function should be pretty straightforward (Listing 6).</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; typename range_type, 
          typename unary_predicate &gt;
class filtering_range
{
  private:
    range_type r;
    unary_predicate fn;
  public:
    filtering_range( range_type r,
       unary_predicate fn ) : r{ r },
       fn{ fn } { }
    auto operator*() const -&gt; decltype( *r )
    {
      return *r;
    }
    auto operator++( int ) -&gt; filtering_range
    {
      filtering_range tmp{ *this };
      r++;
      while( !fn( *r ) ) r++;
      return tmp;
    }
};
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 6</td>
	</tr>
</table>

<p>The cleverness is all in <code>operator++()</code>, which keeps incrementing until the predicate is false. The <code>where</code> function is simplicity itself â€“ almost identical to <code>select</code>, just returning a different range type.</p>

<pre class="programlisting">
  template&lt; typename range_type, typename filter &gt;
  auto where( range_type r, filter fn ) -&gt;
    filtering_range&lt; range_type, filter &gt;
  {
    return filtering_range&lt; range_type, filter &gt;{
      r, fn };
  }
</pre>
  
<p>Run the tests....and they all pass. Time for the next one....</p>

<h2>Wait a moment!</h2>

<p>Once again, the observant among you will have already spotted the completely fatal flaws in that code. Just to labour the point, Listing 7 is a test that exposes one of them.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Filtering range returns correct matches when first element is a mismatch&quot;, &quot;[range][filter][empty]&quot; )
  {
    std::vector&lt; int &gt; data { 2, 3, 4 };
    auto range = make_range( data );

    auto result = where( range, []( int i ) { return i % 2 != 0; } );

    REQUIRE( *result++ == data[ 1 ] );
  }
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 7</td>
	</tr>
</table>

<p>The problem here is that the filtering range is fine if the first element in the range matches the predicate. In any other case, it fails the test. Hopefully, this starts a chain of thought leading to questions like â€˜what if no elements match?â€™ and then â€˜what if the initial range is empty?â€™. <strong>Now</strong> we know why the <code>iterable_range</code> has an <code>end</code> iterator! We need some sort of check that the range object is valid â€“ that it hasnâ€™t run out of elements. Fortunately, it is trivial to implement a safe conversion to <code>bool</code> on all three of our range types. For <code>iterable_range</code>, it returns <code>false</code> if <code>pos==end_pos</code> and <code>true</code> otherwise. For <code>transforming_range</code> and <code>filtering_range</code>, it simply returns the value for the underlying range. C++11 provides an <code>explicit</code> conversion operator â€“ which means the new function wonâ€™t take part in any arithmetic or otherwise unsafe operations.</p>

<pre class="programlisting">
  explicit iterable_range::operator bool() const 
    { return pos != end_pos; }
  explicit transforming_range::operator bool()
    const { return !!r; }
</pre>
	
<p>The slightly odd <code>!!r</code> says what it does: â€˜not notâ€™, invoking the underlying rangeâ€™s <code>bool</code> conversion. Because the conversion is <code>explicit</code>, just <code>return r;</code> wonâ€™t work, of course!<sup><a href="#FN03">3</a></sup></p>
 
<p>With the addition of this rather important facility, we can add a helper function to <code>
filtering_range</code> called <code>find_next_match</code>.</p>

<pre class="programlisting">
  void find_next_match()
  {
    while( r &amp;&amp; !fn( *r ) )
      ++r;
  }
</pre>
  
<p>This function is invoked by <code>operator*()</code> (thus making <code>filtering_range</code> lazy-evaluated), so finds the first matching element. The increment operator also needs to invoke it to ensure a range with no further matches becomes invalidated.</p>

<p>This implies that client code must first check that the range is valid before invoking <code>operator*()</code><sup><a href="#FN04">4</a></sup>, and so <code>operator bool()</code> must <em>also</em> invoke it in order to detect a range with no matching elements.</p>

<p>Time for some more tests! (Listing 8)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Filtering range is immediately
  invalid when no elements match&quot;,
  &quot;[range][filter][nomatch]&quot; )
{
  std::vector&lt; int &gt; data { 2, 3, 4 };
  auto range = make_range( data );
  auto result = where( range, []( int ) 
  { return false; } );

  REQUIRE( ! result );
}

TEST_CASE( &quot;Filtering range returns correct
   matches when first element is a mismatch&quot;,
   &quot;[range][filter][empty]&quot; )
{
  std::vector&lt; int &gt; data { 2, 3, 4 };
  auto range = make_range( data );
  auto result = where( range, []( int i ) 
  { return i % 2 != 0; } );

  REQUIRE( !!result );
  REQUIRE( *result++ == data[ 1 ] );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 8</td>
	</tr>
</table>

<p>Whilst weâ€™re at it, weâ€™ll add a prefix <code>operator++()</code> to all the range types, too (did I manage to slip that one past you in the implementation of <code>find_next_match</code>?), since efficiency is one of our design principles.</p>

<p>For <code>filtering_range</code>, that operator is used by the postfix version, and looks like the following:</p>

<pre class="programlisting">
  auto operator++() -&gt; filtering_range &amp;
  {
    ++r;
    find_next_match();
    return *this;
  }
</pre>
  
<p>Now weâ€™re really cooking. It must be time to turn the world upside down yet again.</p>

<h2>Joined up</h2>

<p>It should be fairly clear how to implement <code>take</code> using the techniques already discussed here. Whatâ€™s missing is the ability to chain expressions together. In fact, our existing API allows a limited form of composing expressions.</p>

<pre class="programlisting">
  auto result = select( where( []( int i ) {
    return i % 2 != 0; } ),
  []( int i ) { return i * 2; } );
</pre>
  
<p>This is somewhat unwieldy, however, especially when composing more than a small number of expressions.</p>

<p>As already noted, we could add a common base class declaring all the high-level functionality like <code>
select</code>, <code>where</code> and so on. The trouble with this approach is that itâ€™s inflexible; it would be difficult to extend with new operations.</p>

<p>Itâ€™s not possible to overload <code>operator.()</code> in C++, so directly mimicking the syntax of extension methods is out, but there are other operators we could use. There is something appealing about hijacking the â€˜pipeâ€™ operation common in filesystem operations. Time for another test. (Listing 9.)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Range results can be composed using
   simple syntax&quot;, &quot;[range][composition]&quot; )
{
  std::vector&lt; int &gt; data { 1, 2, 3 };
  auto range = make_range( data );

  auto result = range 
  | where( []( int i ) 
    { return i % 2 != 0; } )
  | select( []( int i ) 
    { return i * 10; } );

  REQUIRE( *result++ == data[ 0 ] * 10 );
  REQUIRE( *result++ == data[ 2 ] * 10 );
  REQUIRE( ! result );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 9</td>
	</tr>
</table>

<p>The main thing to note here is that the signatures of the range functions <code>select</code> and <code>where</code>
 have changed â€“ they no longer take a range object â€“ almost as if the range is being passed in through some â€˜standard inputâ€™. This suggests some global operator, perhaps like this:</p>

<pre class="programlisting">
  template&lt; typename left_range_type,
            typename right_range_type &gt;
  auto operator|( left_range_type left,
                  right_range_type right ) -&gt; ???
  {
    ???
  }
</pre>
  
<p>The question is â€“ what should it return? Perhaps it would be simpler to see the solution to this using actual named funtions instead of overloaded operators.</p>

<pre class="programlisting">
  auto result = 
  range.apply( where( []( int i )
             { return i % 2 != 0; } ) )
       .apply( select( []( int i ) 
             { return i * 10; } ) );
</pre>
			 
<p>This is similar enough to actual chaining, there appears to be some merit to following it to see where it leads.</p>

<p>The first obstacle is this redefinition of <code>select</code> and <code>where</code>. To take <code>where</code> as an example, it cannot construct an instance of <code>filtering_range</code>, because that type requires an underlying range object on which to operate. The range object isnâ€™t supplied until <code>apply</code> is called â€“ whatever that is. It looks from here very much like a member function common to all range types. Given that <code>where</code> canâ€™t return a <code>filtering_range</code>, but must capture the filtering predicate somehow, another intermediate type is indicated. This intermediate is what gets passed to <code>apply</code>, and is a factory for the real range type. The <code>
apply</code> function then invokes that factory to create a real range type.</p>

<p>One approach to this might be to have a separate intermediate factory for every range type; this would certainly make the implementation simple: the <code>filtering_range</code> would have a <code>filtering_range_factory</code>, and the implementation of that would know how to construct a <code>filtering_range</code> given an underlying range object to construct it with.</p>

<p>However, for the cases in this example at least, there is a more general solution. Once again we look to Andrei Alexandrescu, and make use of a simple policy template. The <code>where</code> function â€˜knowsâ€™ it requires a <code>filtering_range</code>, but has insufficient data or template parameters to actually create one. If the <code>filtering_range_factory</code> makes use of a template-template parameter, it can create a <code>filtering_range</code> when all the parameters required are available.</p>
 
<p>This can be generalised out to a factory that can create any range type that takes two template parameters. (See Listing 10.)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; template&lt; typename, typename &gt; 
  class range_type, typename expression_type &gt;
  class range_factory
{
  public:
    range_factory( expression_type action ) :
      action{ action } { }
    template&lt; typename range_of &gt;
    auto operator()( range_of r ) const 
      -&gt; range_type&lt; range_of, expression_type &gt;
    {
      return range_type&lt; range_of,
        expression_type &gt;{ r, action };
    }
  private:
    expression_type action;
};
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 10</td>
	</tr>
</table>

<p>Now, the <code>where</code> function instantiates the factory with the required range <em>type</em> (<code>filtering_range</code>), and the predicate function to be captured.</p>

<pre class="programlisting">
  template&lt; typename unary_predicate &gt;
  auto where( unary_predicate fn )
    -&gt; range_factory&lt; filtering_range,
       unary_predicate &gt;
  {ID=&quot;pgfId-1007831&quot;&gt;
    return range_factory&lt; filtering_range,
           unary_predicate &gt;{ fn };
  }
</pre>
  
  
<p>The <code>select</code> function can do the analogous operation using <code>transforming_range</code>.</p>

<p>With this information, we can now implement the <code>apply</code> function. Our original vision was to replace <code>apply</code> with an overload of <code>operator|()</code>. Whilst <code>apply</code> needed to be a member function to allow chaining calls together, the overloaded operator does <strong>not</strong> need to be a member. We can sidestep <code>apply</code> altogether, and jump straight to the <code>operator|()</code> implementation, to make our original test for this pass. All thatâ€™s needed is to call the function-call operator on the provided factory with the provided underlying range object. (Listing 11.)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; typename range_type,
   typename range_factory_type &gt;
  auto operator|( range_type range,
    range_factory_type factory ) -&gt; 
    decltype( factory( range ) )
{
  return factory( range );
}

TEST_CASE( &quot;Range results can be composed using
   simple syntax&quot;, &quot;[range][composition]&quot; )
{
  std::vector&lt; int &gt; data { 1, 2, 3 };
  auto range = make_range( data );

  auto result = range | where( []( int i ) 
    { return i % 2 != 0; } )
    | select( []( int i ) { return i * 10; } );

  REQUIRE( *result++ == data[ 0 ] * 10 );
  REQUIRE( *result++ == data[ 2 ] * 10 );
  REQUIRE( ! result );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 11</td>
	</tr>
</table>

<h2>Extensions</h2>

<p>One of the motivations for the ease of chaining operations was to make the API easy to extend. Letâ€™s see how well weâ€™ve achieved that, and write <code>take</code>. The idea is that <code>take</code> produces a range of up to a given number of elements.</p>

<p>If the original range has fewer than the requested number, then all are returned. (See Listing 12.)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Range can be limited to a number of
   elements with take&quot;, &quot;[range][take]&quot; )
{
  std::vector&lt; int &gt; data { 1, 2, 3, 4, 5 };
  auto range = make_range( data );

  auto result = range | take( 2 );

  REQUIRE( *result++ == data[ 0 ] );
  REQUIRE( *result++ == data[ 1 ] );
  REQUIRE( !result );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 12</td>
	</tr>
</table>

<p>Implementing the range used â€“ letâ€™s call it <code>limited_range</code> â€“ looks straightforward; <code>operator bool()</code> just needs to return <code>false</code> after a certain number of iterations of the underlying range. The problem with this one is the implementation of the helper function: <code>take</code> itself. Up to this point, the <code>range_factory</code> has had a simple task of creating any one of a number of different range types that all had in common their template parameters and construction; each one with two template parameters, and a constructor that took a range-type and functor-type.</p>

<p>Letâ€™s take a quick look at the basic construction of a <code>limited_range</code>:</p>

<pre class="programlisting">
  template&lt; typename range_type &gt;
  class limited_range
  {
    public:
      limited_range( range_type r, int count )
</pre>
	  
<p>Only one template parameter, and the count, which canâ€™t be made into a template parameter because it might not be a constant. The <code>range_factory</code> now needs to do something different, i.e. be able to create objects having either one or two template parameters.</p>

<p>The existing <code>range_factory</code> exists because the type of the underlying range isnâ€™t known until the range operation (e.g. <code>select</code>) is invoked via the <code>operator|()</code> mechanism. Introducing a new range type with only one template parameter means the original â€˜action expressionâ€™, which for <code>take</code> is just a number, still needs to be captured before the range object is created, but the range type is still not known.</p>

<p>This means the basic mechanism is the same, but the implementation of <code>operator()()</code> needs to vary according to the number of template arguments for the target range type. The ideal solution to this would be to create a partial specialization of <code>range_factory</code> using variadic templates to represent the differences. Something like Listing 13.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; template&lt; typename, typename... &gt; 
   class range_type, typename expression_type &gt;
  struct range_factory { };

  template&lt; template&lt; typename, typename,
     typename... &gt; class range_type,
     typename expression_type &gt;
  struct range_factory&lt; range_factory
    &lt; range_type, expression_type &gt; &gt; { };
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 13</td>
	</tr>
</table>

<p>This would rely on factory types with two or more template parameters matching the specialisation, and those with only one template parameter matching the primary. However, the rules of partial specialization donâ€™t allow this; the primary templateâ€™s implicit types (<code>range_type</code> and <code>expression_type</code>) arenâ€™t distinguishable from the specialization, so the second <code>struct</code> is ambiguous.</p>

<p>It doesnâ€™t prevent a new factory type which understands how to create types having only one template parameter, and having the <code>take</code> function use it directly.</p>

<pre class="programlisting">
  template&lt; template&lt; typename &gt; class range_type,
     typename expression_type &gt;
  class range_factory_1
  {
</pre>
  
<p>It is otherwise idential to <code>range_factory</code>, except for <code>operator()()</code></p>


<pre class="programlisting">
  template&lt; typename range_of &gt;
  auto operator()( range_of r ) const 
     -&gt; range_type&lt; range_of &gt;
  {
    return range_type&lt; range_of &gt;{ r, action };
  }
</pre>
  
<p>The only difference here is the number of template arguments provided to the <code>range_type</code> in the factory function.</p>

<p>This solution works, but it does require any other extensions to know which ...<code>factory</code> class to invoke. Itâ€™s not a catastrophe, but it could be made easier. Instead of variadic templates and specialization, we turn to ordinary function overloading to come to the rescue. Instead of creating the correct factory type directly, <code>take</code> can use a call to a function which is overloaded based on the same template-template upon which we wished we could specialize <code>range_factory</code>. (Listing 14.)</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; template&lt; typename, typename &gt; 
  class range_type, typename expression_type &gt;
auto make_range_factory( expression_type expr ) 
  -&gt; range_factory&lt; range_type, expression_type &gt;
{
  return range_factory&lt; range_type,
               expression_type &gt;{ expr };
}

template&lt; template&lt; typename &gt; class range_type,
   typename expression_type &gt;
  auto make_range_factory( expression_type expr )
     -&gt; range_factory_1&lt; range_type,
                         expression_type &gt;
{
  return range_factory_1&lt; range_type,
                        expression_type &gt;{ expr };
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 14</td>
	</tr>
</table>



<p>This pair of function overloads selects the correct factory class to construct based on the number of template parameters required by the <code>range_type</code> template-template. Any function that now needs a <code>range_factory</code> or <code>range_factory_1</code> can just use the overloaded function and provide the correct type for that factory to create, and overloading will do the rest (Listing 15).</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
auto take( size_t count ) -&gt; 
  decltype( make_range_factory&lt; limited_range &gt;
  ( count ) )
{
  return make_range_factory&lt; limited_range &gt;
    ( count );
}

template&lt; typename unary_predicate &gt;
auto where( unary_predicate fn ) -&gt; 
  decltype( make_range_factory
    &lt; filtering_range &gt;( fn ) )
{
  return make_range_factory&lt; filtering_range &gt;
  { fn };
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 15</td>
	</tr>
</table>

<p>Here is the <code>take</code> function implemented to use the new factory factory function (<code>!</code>), and using the same function return type deduction as used previously, with <code>decltype</code>. Iâ€™ve also re-implemented <code>
where</code> to show the usage is identical; these two functions make use of <em>different</em> range factory types, but that is merely â€˜implementation detailâ€™.</p>

<p>Itâ€™s not unreasonable to imagine range implementation classes needing more template arguments. In such cases, the corresponding <code>range_factory</code>_N and <code>make_range_factory</code> overload pair would be needed, but in practice, one and two template parameter range types cover most of the most useful things.</p>

<h2>All done</h2>

<p>This article set out with the stated aim of implementing something not dissimilar to the simple C# LINQ expression which does simple filtering, transformation and range-limiting. With the implementation so far, itâ€™s <strong>very</strong> similar (but not exactly the same, for some good reasons). See Listing 16.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
var result = container.Where
   ( item =&gt; item.Id == expected )
 .Select( filtered =&gt; filtered.Count.ToString() )
 .Take( 2 );

auto result = range 
  | where ( [&amp;]( const thing &amp; t )
    { return t.name == expected; } )
  | select( []( const thing &amp; t )
    { return to_string( t.count ); } )
  | take( 2 );
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 16</td>
	</tr>
</table>

<p>There was a mention in passing, however, of being able to interoperate with the existing C++ Standard Library implementations, without losing efficiency. The high-level API achieved here is certainly not just a thin wrapper around C++ iterators; it provides a very rich and type-safe platform which is extended fairly easily (at least, no more difficult than extending C#â€™s <code>IEnumerable</code> facilities). How hard is it to go the extra step, and allow this new range type to play nicely with C++ algorithms?</p>

<p>With some restrictions, itâ€™s very simple. Those restrictions again are inspired by C#â€™s LINQ: <code>IEnumerable</code> is an <em>immutable</em> interface. You cannot modify elements of it, nor modify the range represented by it, without going via some concrete container type, which in C# means calling <code>ToList()</code> or <code>ToArray()</code> on the range.</p>
 
<p>I believe immutability isnâ€™t an unreasonable restriction on this kind of programming. With increased focus on concurrency and multi-processing to achieve better performance, and with both of those techniques benefiting greatly from the use of immutable data, making the ranges that this API operates on immutable isnâ€™t a restriction, itâ€™s a design principle. Turning an immutable range into mutable data which works with mutating algorithms requires the C++ equivalent of <code>ToList()</code>.</p>

<p>Letâ€™s see if we can express what is required for that in another test (Listing 17).</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
TEST_CASE( &quot;Range can be used to populate a
   standard lib container&quot;,
   &quot;[range][export][stl]&quot; )
{
  std::vector&lt; int &gt; data { 1, 2, 3 };
  auto range = make_range( data ) |
               select( []( int x )
  { return std::to_string( x ); } );

  std::vector&lt; std::string &gt; result 
  { begin( range ), end( range ) };

  REQUIRE( result.size() == data.size() );
  REQUIRE( std::to_string( data[ 0 ] ) ==
     result[ 0 ] );
  REQUIRE( std::to_string( data[ 1 ] ) ==
     result[ 1 ] );
  REQUIRE( std::to_string( data[ 2 ] ) ==
     result[ 2 ] );
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 17</td>
	</tr>
</table>

<p>This highlights the fact that in C++, containers and algorithms work with <em>pairs</em> of iterators â€“ the <code>[begin, end)</code> â€˜rangeâ€™, whereas the range types described in this article encapsulate the pair of iterators into a single item.</p>

<p>Implementing <code>begin</code> for the range type looks straightforward â€“ perhaps the necessary operations (only an <code>InputIterator</code>â€™s operations are required) could be added to the range types, so equivalence operators <code>!=</code> and <code>==</code>.</p>

<p>The possible difficulty might be in implementing <code>end</code>.</p>

<p>There is also precedence for writing <code>end</code> when an end position isnâ€™t known: <code>std::ostream_iterator</code> uses a sentry type (effectively an invalid position created by the default constructor), so a similar technique could be employed here. The only way to find the end of a Range is to consume it, which is obviously undesirable.</p>

<p>Thereâ€™s more to a C++ Iterator than <code>++</code>, <code>*</code>, <code>!=</code> and <code>==</code> however, in practice. There are some embedded typedefs to consider as well, which is usually captured by inheriting from <code>std::iterator</code>. So, instead of making changes to all of the range types, or some common base class, it seems to be a better separation of concerns to implement the necessary iterator type separately. As already noted, only the operations and types associated with <code>InputIterator</code>s are needed if range types are immutable. Most of the required operations are easy enough to write (see Listing 18).</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; typename range_type,
  typename value_type &gt;
  class range_output_iterator : public
     std::iterator&lt; std::input_iterator_tag,
                    value_type &gt;
{
  public:
    range_output_iterator( const range_type &amp; r )
      : r{ r }
    {
    }
    auto operator==
       ( const range_output_iterator &amp; )
       const -&gt; bool
    {
      ???
    }
    auto operator!=
       ( const range_output_iterator &amp; r )
       const -&gt; bool
    {
      return !operator==( r );
    }
    auto operator++() -&gt; range_output_iterator &amp;
    {
      ++r;
      return *this;
    }
    auto operator*() const -&gt; value_type
    {
      return *r;
      }
  private:
    range_type r;
};
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 18</td>
	</tr>
</table>

<p>The sticking point is to decide how to implement <code>operator==()</code>. How to compare two ranges to see if their current positions are the same, in the same range of iterators? Time to step back and consider: how would the <code>std::vector</code> constructor taking two iterators be implemented? Something very like:</p>

<pre class="programlisting">
  vector( iterator b, iterator e )
  {
    while( b != e )
      push_back( *b++ );
  }
</pre>
  
<p>The point here is that the most important consideration is the comparison to the â€˜endâ€™ position, which our ranges <em>already do</em> to determine <code>operator bool()</code>, needed for other things internally. If we make the assumption that such comparisons are only ever between a valid range and the end of the same range, then <code>operator==()</code> can use <code>operator bool()</code> â€“ two ranges always compare equal if the first is valid.</p>

<p>A side effect of this is that the right-hand-side of an expression <code>range_l == range_r</code> is never used, so it doesnâ€™t matter what <code>end</code> returns â€“ it doesnâ€™t even need to be a default-constructed sentry value (which would be difficult, given the lack of default constructors of the range types).</p>

<pre class="programlisting">
  auto operator==
    ( const range_output_iterator &amp; ) const -&gt; bool
  {
    return !r;
  }
</pre>
  
<p>The implementations of <code>begin</code> and <code>end</code> therefore become as shown in Listing 19</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
template&lt; typename range_type &gt;
auto begin( range_type r ) -&gt; range_output_iterator&lt; range_type,
                       decltype( *r ) &gt;
{
  return range_output_iterator&lt; range_type,
                       decltype( *r ) &gt;{ r };
}
template&lt; typename range_type &gt;
auto end( range_type r ) -&gt;
  range_output_iterator&lt; range_type,
                         decltype( *r ) &gt;
{
  return range_output_iterator&lt; range_type,
                         decltype( *r ) &gt;{ r };
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 19</td>
	</tr>
</table>

<p>Thatâ€™s not a printing error: they are identical in implementation. <code>end</code> merely has to return something of the correct <em>type</em> â€“ it is never used, not even to compare with anything.</p>

<p>A side effect of this is that interoperability with non-mutating C++ algorithms is achieved for free, for example</p>

<pre class="programlisting">
  {std::copy( begin( r ), end( r ),
              std::back_inserter( my_list ) ); }.</pre>

<h2>To conclude</h2>

<p>There have been a few range libraries developed for C++ over recent years, but none seem to have had the same take-up as ranges have in D, for example. I think itâ€™s partly because C++ iterator pairs have become such a central part of writing C++ programs that use the Standard Library, partly because such range libraries that there are are either clunky to use, or have disappointing behaviour regarding the C++ Standard Library, and partly because there are actually no really compelling use cases for them that canâ€™t be achieved using other techniques.</p>

<p>In this article, I set out to demonstrate some uses for a very lightweight range type thatâ€™s very easy to use, and provides facilities that are very widely used in a different language â€“ C# â€“ which in turn took the ideas of functional languages and coupled them with what some people describe as the ultimate declarative language, SQL.</p>

<p>C++ has many functional facilities, and many of the standard algorithms mirror functional constructs. <code>std::transform</code> is essentially a list comprehension, but lacks the brevity and simplicity of transformations in truly functional languages, at least in part because it uses two iterator-pairs to represent separate ranges.</p>

<p>The C# <code>Select</code> expression to transform one <code>IEnumerable</code> into another captures the simple case that is most common â€“ turn every element into a new range of some new type â€“ and C++â€™s <code>transform</code> doesnâ€™t really even compete. A simple range type for C++ overcomes the problems with having that simplicity, whilst still allowing simple and efficient interoperability with existing C++ Standard Library facilities.</p>

<p>All the code in this article compiles with GCC 4.6.3 (with -std=c++0x) on Ubuntu, and GCC 4.8.1 (with -std=c++11) and Microsoft Visual Studio 2013 CP on Windows. Itâ€™s available from <a href="https://github.com/essennell/narl">https://github.com/essennell/narl</a>.</p>

<h2>Acknowledgements</h2>

<p>Many thanks to Andy Sawyer and Roger Orr for fixing some of my misunderstandings of <code>decltype</code> and trailing return types, Jon Wakely for helping me understand <strong>why</strong> partial specialisation of types based just on the number of template-template parameters doesnâ€™t work, and to Frances Buontempo, Roger Orr and Chris Oldwood for reading early revisions of the article and spotting my inevitable errors.</p>

<h2>References</h2>

<p class="bibliomixed"><a id="[Alexandrescu09]"></a>[Alexandrescu09]  <a href="http://accu.org/content/conf2009/AndreiAlexandrescu_iterators-must-go.pdf">http://accu.org/content/conf2009/AndreiAlexandrescu_iterators-must-go.pdf</a></p>

<p class="bibliomixed"><a id="[Boost]"></a>[Boost]  <a href="http://www.boost.org/doc/libs/1_54_0/libs/range/doc/html/range/reference.html">http://www.boost.org/doc/libs/1_54_0/libs/range/doc/html/range/reference.html</a></p>

<p class="bibliomixed"><a id="[Github1]"></a>[Github1]  <a href="https://github.com/dietmarkuehl/kuhllib/wiki/STL-2.0#andrei-alexandrescus-ranges">https://github.com/dietmarkuehl/kuhllib/wiki/STL-2.0#andrei-alexandrescus-ranges</a></p>

<p class="bibliomixed"><a id="[Github2]"></a>[Github2]  <a href="https://github.com/philsquared/Catch">https://github.com/philsquared/Catch</a></p>

<p class="bibliomixed"><a id="[Standards]"></a>[Standards]  <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html</a></p>

<p class="footnotes"></p>

<ol>
<li><a id="FN01"></a>Specifically, LINQ for Objects, which operate on containers rather than DataSources.</li>
<li><a id="FN02"></a>Alright, youâ€™d need something like <code>auto operator*() const -&gt; typename std::result_of&lt; transformation( decltype( *std::declval&lt; range_type &gt;() ) ) &gt;::type</code> I hope you agree that declaring <code>
private</code> at the top is a small price to pay! </li>
<li><a id="FN03"></a>More verbose but perhaps more descriptive might be<code>return static_cast&lt; bool &gt;( r );</code></li>
<li><a id="FN04"></a>The â€˜functionalâ€™ approach to this problem is <code>any</code>, which returns <code>false</code> if a range contains no elements.</li>
</ol>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
