    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Are You Afraid of The Dark Too? Making More Sense of the
STL</title>
        <link>https://members.accu.org/index.php/articles/433</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + Overload Journal #44 - Aug 2001</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c65/">Programming</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c78/">Overload</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c160/">44</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c65-160/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c65+160/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Are You Afraid of The Dark Too? Making More Sense of the
STL</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 26 August 2001 17:46:07 +01:00 or Sun, 26 August 2001 17:46:07 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a></h2>
</div>
<p>On a previous expedition to this territory [<a href=
"#Love2001">Love2001</a>], we explored some of the basic ideas
behind the STL, and how to make simple use of it effectively. This
time we'll be shining a light on the associative containers,
something we missed out on before. In passing, we'll see some more
algorithms, some more functors, and some more ways to extend the
STL and bend it to your will. So get your hats on! We're going
underground...</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e25" id="d0e25"></a>Stalagmites
&amp; Stalactites Revisited - Containers &amp; Iterators</h2>
</div>
<p>A quick recap then, since we're passing through. The C++
Standard Library provides us with seven main container classes,
consisting of three sequence containers, and four associative
containers. On our last trip we looked a little at the sequence
containers, <tt class="classname">list</tt>, <tt class=
"classname">vector</tt> and <tt class="classname">deque</tt>, and
saw some of their common characteristics, and some of their
individualities.</p>
<p>This time, we'll look at the associative containers, <tt class=
"classname">map</tt>, <tt class="classname">multimap</tt>,
<tt class="classname">set</tt> and <tt class=
"classname">multiset</tt>, see what sets them apart from the
sequences, and what distinguishes them from each other. Of course,
they also have lots in common with the sequence containers, and
we'll have a look at those, too.</p>
<p>There is one other container class we'll have a look at here as
well. It's well hidden in a dark corner right at the back; even the
C++ Standard seems to mention it only as an afterthought (
[<a href="#ISO1998">ISO1998</a>], 23.1.2/1 (&quot;Associative
Containers&quot;) does not mention it at all, but it is described at the
end of that section). The standard associative container called
<tt class="classname">bitset</tt> is useful, after all, and we will
see some common ways to put it to work.</p>
<p>First off, though, let's look at everyone's favourite punch-bag.
Love it, or hate it, we can, I'm sure, find some use for it. A word
about...</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e63" id="d0e63"></a>Bed Of Flint -
Standard String</h2>
</div>
<p>Much has been written on this chap of late, quite a lot of it
not terribly complimentary, and the truth is, the string class has
its flaws. It isn't pretty, and it doesn't always do what you'd
expect, but if you're careful and treat it kindly, you'll have a
friend for life.</p>
<p>Strings are commonly used to store information such as names,
sentences and so on. Given a string such as &quot;Steve&quot;, or &quot;My house
is on a hill&quot; it's easy, and you are in fact encouraged to think of
the string as a singular entity, rather than a sequence of
characters. Hence, it is useful to think of adding strings together
(one time when addition is not commutative, because &quot;Steve&quot; +
&quot;Love&quot; isn't the same as &quot;Love&quot; + &quot;Steve&quot;) and reading from a
file:</p>
<pre class="programlisting">
string name;
cin &gt;&gt; name;
</pre>
<p>It <span class="emphasis"><em>is</em></span> sometimes useful to
remember that a string is a sequence of characters, however. As it
happens, the standard string class conforms to all the requirements
of a standard Sequence container<sup>[<a name="d0e77" href=
"#ftn.d0e77" id="d0e77">1</a>]</sup>. This isn't hard - you must
have iterators, and a <tt class="methodname">begin()</tt> -
<tt class="methodname">end()</tt> pair of member functions which
describe the sequence<sup>[<a name="d0e93" href="#ftn.d0e93" id=
"d0e93">2</a>]</sup>. This turns out to be extremely useful. In our
previous visit, we looked at the standard algorithm <tt class=
"function">find_if()</tt>, which in the case of a string could be
used to find a single character. This is a generally useful
technique for parsing a string; so is finding a partial string, a
sub-sequence. Again, the C++ Standard Library provides:</p>
<pre class="programlisting">
string s( &quot;My house is on a hill&quot; );
string sub( &quot; is&quot; );
string::iterator p = 
  search( s.begin(), s.end(), 
    sub.begin(), sub.end() );
s = string( s.begin(), p );
cout &lt;&lt; s &lt;&lt; &quot;\n&quot;;
</pre>
<p>This will print &quot;My house&quot; to the console.</p>
<p>While we're on the topic, let's have a look at one way you can
use the techniques of iterators, algorithms and sequences to reduce
the above code. The elements are already there; <tt class=
"function">search()</tt> returns an iterator to the start of the
found sub-sequence (<tt class="literal">End</tt> if it can't find
it). You can make a new string from a sequence of iterators. So how
about:</p>
<pre class="programlisting">
s = string( s.begin(), 
  search( s.begin(), s.end(), 
    sub.begin(), sub.end() ) );
</pre>
<p>We're replacing the string <tt class="varname">s</tt> with a new
string created from the range [<tt class="literal">s.begin().
search(...)</tt> ). To the uninitiated, this expression may appear
over-complicated, but then if you're not a C++ or C programmer, so
does any usage of <tt class="function">strtok()</tt>.</p>
<p>So, the standard <tt class="classname">string</tt> isn't
strictly a part of the C++ STL. It is an extension to it, albeit a
standard one, and shows how easy it is to roll your own container
classes which work seamlessly with the rest of STL.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e133" id="d0e133"></a>Reflections -
Unique Associative Containers</h2>
</div>
<p>On the subject of strings, how many of you have ever wanted to
do this:</p>
<pre class="programlisting">
void function( const string &amp; s )
{
  switch( s )
  {
  case &quot;Jan&quot;:
    //...
  case &quot;Feb&quot;:
    //...
  }
}
</pre>
<p>It seems perfectly reasonable, and yet can't be done; string is
a class after all, not a built-in integral type. But there must be
a better way than</p>
<pre class="programlisting">
void function( const string &amp; s )
{
  if( s == &quot;Jan&quot; )
  {
    //...
  }
  else if( s == &quot;Feb&quot; )
  {
    //...
  }
  //...
}
</pre>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e144" id="d0e144"></a>Map</h3>
</div>
<p>Well, you guessed it. If there weren't, I wouldn't be talking
about it. Enter the multi-talented map container. <tt class=
"classname">map</tt> is a Pair Associative Container; it
<span class="emphasis"><em>maps</em></span> a key to a value. More
than that, a map is sorted by its key, which has the fortunate
effect of making it easily searchable by its key. Correspondingly,
the type used for a key in a map container must be Less Than
Comparable [<a href="#Austern1998">Austern1998</a>]<sup>[<a name=
"d0e158" href="#ftn.d0e158" id="d0e158">3</a>]</sup>.</p>
<p>A map stores two objects per element; one is the key, the other
the value. So that both key and value types can be any type,
built-in or user-defined, the C++ Standard Library has a helper
class called <tt class="classname">pair</tt>, which is
parameterised on its first and second types:</p>
<pre class="programlisting">
template &lt;class _T1, class _T2&gt;
struct pair {
  pair( const _T1 &amp; f, const _t2 &amp; s )
    : first( f ), second( s ) { }
  _T1 first;
  _T2 second;
}; // from [<a href="#STLPort2001">STLPort2001</a>]
</pre>
<p><tt class="classname">map</tt> stores elements of type
<tt class="classname">pair</tt> and uses first as the key and
second as the value. (The pair type is much more generally useful,
as we shall see). There is also a helper function for making pairs
allowing us to add items to a map like this:</p>
<pre class="programlisting">
map&lt; string, int &gt; months;
months.insert( make_pair( &quot;Jan&quot;, 1 ) );
months.insert( make_pair( &quot;Feb&quot;, 2 ) );
</pre>
<p>Here we create a map with <tt class="classname">string</tt> as
the key type and <span class="type">int</span> as the value, and
start adding months to the container.</p>
<p>There are two things to note about this piece of code. First,
we've created a map and associated the value 1 with the string
&quot;Jan&quot; and the value 2 with the string &quot;Feb&quot;. This is
straightforward enough. The second item of note is that a map is
sorted on its key; printing the map &quot;in order&quot; demonstrates
this<sup>[<a name="d0e191" href="#ftn.d0e191" id=
"d0e191">4</a>]</sup>:</p>
<pre class="programlisting">
void print_me(const pair&lt;string, int&gt; 
    &amp;value){
  cout &lt;&lt; value.first &lt;&lt; &quot;,&quot; 
       &lt;&lt; value.second &lt;&lt; &quot;\n&quot;;
}
// ...
for_each( months.begin(), months.end(), 
        print_me );
// Output:
// Feb,2
// Jan,1
</pre>
<p>Remember that the key is the <tt class="classname">string</tt>
type for our map months, and so &quot;Feb&quot; comes before &quot;Jan&quot;, lexically
speaking. Remember also that map stores pairs of elements, so
naturally its iterators will point to a <tt class=
"classname">pair</tt>. There is a third thing to note there: this
isn't really what you'd normally do with a <tt class=
"classname">map</tt>. Sure we can use a <tt class=
"classname">map</tt> as a method of sorting elements in a
container, but its best feature is that it associates elements -
keys and values. Being sorted is a side-effect of <tt class=
"classname">map</tt>'s greatest attribute - the ability to
efficiently find a value given its key.</p>
<p>map can be used as an associative array; it directly supports
the array notation (square brackets [] in C++) to provide a key and
obtain its associated value. Because our value type is a built-in
integral type (which the switch statement requires) we can do the
following:</p>
<pre class="programlisting">
void function( const string &amp; s ) {
  switch(months[ s ] ) {
  case 1: // Jan
  case 2: // Feb
  }
}
</pre>
<p>Which is much more like our original requirement<sup>[<a name=
"d0e223" href="#ftn.d0e223" id="d0e223">5</a>]</sup>.</p>
<p>There is a snag to this; you might ask yourself what does the
<tt class="classname">map</tt> do if the element identified by s
doesn't exist? We appear to have no way of knowing whether the
lookup failed, because it isn't reasonable for <tt class=
"function">operator[]</tt> to return an invalid value, since it
cannot know what is valid. This suggests that <tt class=
"function">operator[]</tt> for map never fails, which is in fact
the case. If the lookup fails to locate the key provided, a new
element with that key is created. The value for that key will be
default constructed. <tt class="function">operator[]</tt> actually
returns a reference to the value, so you can change it using
assignment:</p>
<pre class="programlisting">
months[ &quot;Mar&quot; ] = 3;
</pre>
<p>The snag is that this may not do what you expect - you may be
creating a new element and not know it.</p>
<p>There is another method of searching a map which, in keeping
with the standard algorithms, returns an iterator to the found
element, or End if the item is not found. map provides its own
member function for this because firstly the map contains
<tt class="classname">pair</tt>s, so using the <tt class=
"classname">find()</tt> or <tt class="classname">find_if()</tt>
algorithms would be clumsy, and partly because <tt class=
"classname">map</tt> itself is optimised for searching, and the
algorithms don't know about this. Using the member function, you
search directly for the key, and an iterator is returned:</p>
<pre class="programlisting">
map&lt; string, int &gt;::iterator p =
         months.find( &quot;Jan&quot; );
if(p != months.end()) {
  cout &lt;&lt; p-&gt;second;  // the value
}
</pre>
<p>The other useful attribute of <tt class="classname">map</tt> is
that it is a Unique Associative Container, meaning that it does not
store duplicate keys. If you try to insert two values with the same
key, the second insertion will fail. This failure is indicated by
the return value from <tt class="methodname">map::insert()</tt>,
which is a <tt class="classname">pair</tt> object containing a
<tt class="classname">map iterator</tt> and a <span class=
"type">bool</span>. The iterator points either to the newly
inserted element, in which case the <span class="type">bool</span>
is <tt class="literal">true</tt>, or to the already existing
element with the same key, in which case the <span class=
"type">bool</span> is <tt class="literal">false</tt>:</p>
<pre class="programlisting">
pair&lt;map&lt;string, int&gt;::iterator, bool&gt;
 p =  months.insert( make_pair(&quot;Apr&quot;, 4));
if(! p-&gt;second) { // the insertion failed
  cout &lt;&lt; p-&gt;first-&gt;second;   
// print the value associated with &quot;Mar&quot;
// remember an iterator points to a pair 
// element
}
</pre>
<p>In practice, you won't need all this syntax. You will either
want to know whether the insertion took place, or you will want to
do something with value. In the first case, you can use:</p>
<pre class="programlisting">
if(!months.insert( 
      make_pair( &quot;May&quot;, 5)-&gt;second){
// do something if insertion failed
}
</pre>
<p>and in the second, using the array syntax <tt class=
"function">operator[]</tt> works fine:</p>
<pre class="programlisting">
months[ &quot;Jun&quot; ] = 6;
</pre>
<p>Whether it is less efficient to overwrite the value of an
existing element or search for it first will depend upon what the
value type is. As they say on the newsgroups, YMMV.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e305" id="d0e305"></a>Set</h3>
</div>
<p>The <tt class="classname">set</tt> class has similarities with
map; for instance, it contains only unique elements (it's a model
of Unique Associative Container), is sorted by its key and is
optimised for searching. It differs in one very important way: it
contains single elements, rather than <tt class=
"classname">pair</tt>s. It is a Simple Associative Container
[<a href="#Austern1998">Austern1998</a>], which means that its key
and value types are the same thing.</p>
<p>In particular, this difference means you cannot use set as an
associative array; it does not have an <tt class=
"function">operator[]</tt> for array-like access to elements. So
why have a set container at all?</p>
<p>Well, OK now we get into the realms of speculation (by some
well-respected figures, it has to be said). The STL is <span class=
"emphasis"><em>not</em></span> a collection of containers. It has
been said that STL is merely a &quot;specification of a library of
algorithms&quot; [<a href="#Henney2001">Henney2001</a>], not
<span class="emphasis"><em>actually</em></span> a collection of
anything, really. What it means is that if you write your own
container, algorithm, iterator, and it conforms to the requirements
that STL places on such things, then it is just as much a part of
the STL as those components shipped with the C++ Standard
Library.</p>
<p>So back to <tt class="classname">set</tt>, and its <span class=
"emphasis"><em>raison d'&ecirc;tre</em></span>. We need to look
behind the <tt class="classname">set</tt> container, at what makes
it useful, before we can see its utility. We can guess that it
models the concept of Set from mathematics. In fact, that would be
inaccurate; mathematical set theory really concerns the operations
on sets: Union; Intersection; Difference; Contains. In fact the STL
has algorithms for providing these services, and they require that
the range of iterators upon which they operate is sorted. This is a
key issue, which we passed over fairly quickly in our last trip.
The STL algorithms don't work on containers at all, they work with
iterators. The container classes can be easily seen as various
convenient ways of providing iterators and iterator ranges.</p>
<p><tt class="classname">set</tt> is no exception. It is a
container which happily allows the STL set algorithms to work
according to the expected mathematical concepts of those operations
( [<a href="#Austern1998">Austern1998</a>]). It makes those
operations efficient. For example, let's take <tt class=
"function">set_intersection()</tt> as an example.</p>
<pre class="programlisting">
set&lt; string &gt; girl_names;
set&lt; string &gt; boy_names;
set&lt; string &gt; common_names;
set_intersection(
  girl_names.begin(), girl_names.end(),
  boy_names.begin(), boy_names.end(),
  inserter(common_names, 
            common_names.begin()));
</pre>
<p><tt class="function">set_intersection()</tt> takes two input
ranges, and copies its result to an output range. Here we are using
a particular type of iterator adaptor, called inserter, to add new
elements to an output sequence, <i class=
"parameter"><tt>common_names</tt></i> in this case. We'll be
looking at inserters later in our visit.</p>
<p>If you're familiar at all with mathematical sets, you will know
that taking the intersection of two sets results in the elements
common to both. The results we might expect from this operation
could be:</p>
<pre class="programlisting">
Sam
Tracey
Nicola
Max
Eddie
Alex
Hilary
</pre>
<p>The point is that <tt class="function">set_intersection()</tt>
can be used with <tt class="classname">vector</tt>, or <tt class=
"classname">list</tt>, provided they are sorted in advance. The
<tt class="classname">set</tt> container is well suited to this
operation, and the other set operations for the same reason,
because it is always sorted. We could have used <tt class=
"classname">list</tt> as the result sequence; it allows efficient
random insertion of elements, but the fact is that the intersection
of two sets results in a new, possibly empty, set in mathematics.
The purpose of the set algorithms in STL is to closely model the
mathematical meaning of them, and the <tt class=
"classname">set</tt> container provides support for that goal
[<a href="#Austern1998">Austern1998</a>].</p>
<div class="sidebar">
<p class="title c2">Pseudonyms</p>
<p>This last example demonstrates how useful <tt class=
"literal">typedef</tt> can be to make otherwise impenetrable code
readable. It could in fact have been improved, and careful use of
<tt class="literal">typedef</tt> can make code more
maintainable.</p>
<p>In the shown code we have a <tt class="classname">multimap</tt>
containing two types, <tt class="classname">author</tt> and
<tt class="classname">publication</tt>. From use, we can guess that
they are both <tt class="classname">string</tt> types, probably
<tt class="literal">typedef</tt>s. This is in fact pretty
irrelevant, and we're more interested (as maintenance programmers)
in the fact that this multimap contains a mapping for authors to
publications. The (supposed) <tt class="literal">typedef</tt>s of
the identifiers makes our intent (as original programmers) much
more clear. That's one case for <tt class=
"literal">typedef</tt>.</p>
<p>We've discussed some ways of removing the need to use container
iterators directly by using the standard algorithms [<a href=
"#Love2001">Love2001</a>], but there are times, such as here, when
we must use them. In doing so, we have to make the whole type,
including the container and its template type(s) clear to the
compiler, so it knows exactly which iterator we're after. If we'd
made a <tt class="literal">typedef</tt> of the container name, not
only can we shorten the code, we can make it resilient to a change
in the container itself:</p>
<pre class="programlisting">
typedef string author;  
typedef string publication;  
typedef multimap&lt; author, publication &gt; index;
pair&lt; index::iterator, index::iterator &gt; range;
</pre>
<p>The names you use will reflect the scope in which they are used.
Of course, in our example, whatever index becomes, should it
change, will need to have the same facilities as <tt class=
"classname">multimap</tt>, including <tt class=
"methodname">count()</tt> and <tt class=
"methodname">equal_range()</tt></p>
</div>
<p>You may find that the enforced uniqueness of elements in a
<tt class="classname">set</tt>, and possibly the fact it is an
ordered container, useful characteristics in and of themselves,
too. <tt class="classname">set</tt> is by no means limited in its
use to the standard algorithms.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e454" id="d0e454"></a>Refractions -
Multiple Associative Containers</h2>
</div>
<p>As we have seen, map and set both enforce the rule that elements
must be unique. This often makes a great deal of sense. For
example, sets in mathematics have no concept of duplicated members,
and so the STL set container enforces this because it is intended
to support the mathematical definition of a set.</p>
<p>There are two other Associative Containers which differ from set
and map only in that they allow duplicate elements. They are,
appropriately enough, <tt class="classname">multimap</tt> and
<tt class="classname">multiset</tt>.</p>
<p>The set algorithms discussed in the previous section are general
algorithms which work the same way for <tt class=
"classname">multiset</tt> as they do for <tt class=
"classname">set</tt>. They impose no requirement that the input
sequences must be unique ranges, only that they are sorted, and
<tt class="classname">multiset</tt> is a Sorted Associative
Container just as <tt class="classname">set</tt> is. Another useful
member of <tt class="classname">multiset</tt> is the <tt class=
"function">count()</tt> function, which returns the number of items
matching a given key.</p>
<p><tt class="classname">multimap</tt> is useful for one-to-many
relationships (associations) between a key and some values. For
example, a publication will generally run articles by several
authors, each of whom may submit articles to several publications.
A <tt class="classname">multimap</tt> can handle this situation
easily.</p>
<p>Both <tt class="classname">multimap</tt> and <tt class=
"classname">multiset</tt> have two particularly useful member
functions, <tt class="methodname">count()</tt> and <tt class=
"methodname">equal_range()</tt>. The <tt class=
"function">count()</tt> function returns the number of elements
with a given key:</p>
<pre class="programlisting">
multimap&lt; author, publication &gt; articles;
cout &lt;&lt; articles.count(&quot;Gordon Bennett&quot;);
</pre>
<p><tt class="function">equal_range()</tt> returns a pair of
iterators describing all elements with a particular key:</p>
<pre class="programlisting">
typedef multimap&lt; author,     publication &gt;::iterator article_ptr;
pair&lt; article_ptr, article_ptr &gt; range = 
    articles.equal_range(&quot;Arthur Dent&quot;);
for_each(range.begin(), range.end(),
                       print_me);
</pre>
<p>In fact, <tt class="classname">set</tt> and <tt class=
"classname">map</tt> have these, too, but since they store only one
of each key, these functions have little utility directly.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e527" id="d0e527"></a>The False
Stalagmite</h2>
</div>
<p>Finally, the most inconspicuous, quiet mannered, unobtrusive
specimen provided as standard in the C++ Standard Library.
<tt class="classname">bitset</tt> gives you the ability to use
bitmasks without the need to understand hexadecimal numbers. OK,
knowledge of hex notation is probably useful anyway, but the real
utility of <tt class="classname">bitset</tt> is that it provides a
much more convenient way of manipulating individual bits in a
mask.</p>
<p>It is often useful to be able to represent a collection of flags
which can be on or off, set or not set. Computers generally make
extensive use of this (digital computers anyway) and provide the
programmer with tools to emulate it. If you've heard the expression
&quot;it's all ones and zeros&quot;, this is where it comes from - binary
representation of values.</p>
<p>A 32 bit number has, unsurprisingly, 32 bit locations, each of
which can be a one or a zero. <tt class="classname">bitset</tt>
makes it easy to set or unset any one of those values, and to test
an individual bit for its state. Doing this in plain C can be
tortuous - a knowledge of hexadecimal numbers makes it less painful
- because you cannot address an individual bit. A byte is the
smallest addressable location in C and C++ alike.</p>
<p>For example, to set the fifth<sup>[<a name="d0e547" href=
"#ftn.d0e547" id="d0e547">6</a>]</sup> bit to one, you need to
bitwise OR it with 1:</p>
<pre class="programlisting">
mask |= 0x10  // binary 10000
</pre>
<p>to test the same bit, you need a bitwise AND operation:</p>
<pre class="programlisting">
if( mask &amp; 0x10 ){
  //...
}
</pre>
<p>What <tt class="classname">bitset</tt> provides is a way to test
an individual bit by its position in the mask. Thus, the equivalent
forms of the above might be:</p>
<pre class="programlisting">
mask.set( 5 );
if( mask.test( 5 ) ) {
  // ...
}
</pre>
<p>You create a bitset with a fixed number of bits, which, unlike
unsigned numbers in C++, can be any size. Thus, a bitset of 50
locations is declared thus:</p>
<pre class="programlisting">
bitset&lt; 50 &gt; mask;
</pre>
<p>You can create a <tt class="classname">bitset</tt> with a
starting mask by providing a number, which is the equivalent of the
bare numerical bitmask. You can also use a <tt class=
"classname">string</tt> to represent the bitmask, which is much
more convenient, because deciphering which bits are set in 0x06730
(or worse, 26416) can be difficult. The following is very useful
when you are interested not in the value itself, but in which bits
are set and unset:</p>
<pre class="programlisting">
bitset&lt;50&gt; mask(string
            (&quot;110011100110000&quot;));
</pre>
<p>I think you'll agree, it's much easier to see what is going on
here. Note that the provided mask represents the least-significant
bits; this means that the mask will be zero-filled to the left
giving &quot;0...0110011100110000&quot;.</p>
<p><tt class="classname">bitset</tt> also has <tt class=
"methodname">count()</tt> and <tt class="methodname">any()</tt>
member functions which return, respectively, the number of set bits
and whether any bits are set in the mask.</p>
<p>Of course, you may actually be interested in conversions between
binary and decimal representations of numbers, and <tt class=
"classname">bitset</tt> can be used for this, too. You can convert
to an <span class="type">unsigned long integer</span> using the
<tt class="function">to_ulong()</tt> member. You can also write a
<tt class="classname">bitset</tt> to a stream in the usual manner.
Thus, converting a decimal number to its binary representation
might be done like this:</p>
<pre class="programlisting">
bitset&lt; 20 &gt; mask ( 516 );
cout &lt;&lt; mask &lt;&lt; &quot;\n&quot;;
// result
// 00000000001000000100
</pre>
<p>Or conversely, converting a binary number to a number like
this:</p>
<pre class="programlisting">
bitset&lt; 20 &gt; mask ( string( &quot;10110100&quot;));
cout &lt;&lt; mask.to_ulong() &lt;&lt; &quot;\n&quot;;
// result
//  180
</pre>
<p>Now, as to why <tt class="classname">bitset</tt> is a &quot;false&quot;
stalagmite, it isn't really a container in the STL sense, because
it has no iterators. It does have the feel of an associative array
of bits, however, allowing you to do the following:</p>
<pre class="programlisting">
mask[ 5 ] = true;
if( mask[ 5 ] ){
  // ...
}
</pre>
<p>As with map, however, this does <span class=
"emphasis"><em>not</em></span> indicate that <tt class=
"classname">bitset</tt> is a Random Access container.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e625" id="d0e625"></a>Summary</h2>
</div>
<p>The associative containers' single most important difference
from the sequence containers is that they are sorted. This makes it
much more efficient to find a particular element by its key, or in
the case of multiple associative containers, to locate the range of
elements with a given key.</p>
<p><tt class="classname">map</tt> and <tt class=
"classname">set</tt> serve specific purposes, and care should be
taken when using them that you're not just shoe-horning the
requirement because you want a sorted container. It is possible to
sort any of the sequence containers using the standard <tt class=
"function">sort()</tt> algorithm.</p>
<p>Remember, <tt class="classname">pair</tt> is your friend. Even
though it looks like it's a library internal type to make things
like <tt class="classname">map</tt> work, as you use it more,
you'll think of new and interesting ways to use it more...</p>
<p>Finally, prefer <tt class="classname">bitset</tt> over manual
bit-twiddling techniques. It supports the &quot;traditional&quot;
bit-operators &amp;, |, ~ and ^, along with the shift operators
&lt;&lt; and &gt;&gt;, but allows you to specify your intent more
cleanly when you use bitmasks.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e653" id="d0e653"></a>A Look In The
Cauldron - More Adaptors</h2>
</div>
<p>As we saw in a previous visit here, the adaptors are what make
the STL such an interesting brew, and adding your own ingredients
is not just allowed, but encouraged. We mentioned in passing an
iterator adaptor called inserter. We've already seen Function
Adaptors in action ( [<a href="#Love2001">Love2001</a>]). Here
we'll look at some iterator adaptors and container adaptors to
complete our investigation of the containers in STL. For this part,
we reopen our boundaries to include the sequence containers once
more.</p>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e661" id="d0e661"></a>Waterfalls... -
Container Adaptors</h3>
</div>
<p>There are three container adaptors provided with the C++
Standard Library:</p>
<pre class="programlisting">
stack
queue
priority_queue
</pre>
<p>In Computer Science, these are discrete data structures in their
own right, each with distinct properties and behavioural
characteristics. In STL, however, they are implemented in terms of
the other containers, simply because they can.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e670" id="d0e670"></a>Stack</h3>
</div>
<p>Consider the operations on a <tt class="classname">stack</tt>.
You, as the programmer, get access only to the top element in the
container. It is a LIFO (last-in, first-out) container. You can add
(push) an element, and remove (pop) that element. You cannot look
at every element (without popping each one in turn), nor can you
sort a stack. In this case, then, we could easily use one of the
three sequence containers, and use only those operations permitted
by the interface of a <tt class="classname">stack</tt>.</p>
<p>The stack container does this for you, and &quot;adapts&quot; the
interface of <tt class="classname">deque</tt>, such that <tt class=
"methodname">push()</tt> becomes <tt class=
"methodname">deque&lt;&gt;::push_back()</tt>, and <tt class=
"methodname">pop()</tt> becomes <tt class=
"methodname">deque&lt;&gt;::pop_back()</tt>. The main utility in
using the STL stack adaptor over using deque directly is that you
are stating your intent: this program requires a <tt class=
"classname">stack</tt>, and is restricting its usage to that of a
<tt class="classname">stack</tt>.</p>
<p>There is one catch to this. Normally a stack's <tt class=
"methodname">pop()</tt> method will remove the top element of the
<tt class="classname">stack</tt> and return it:</p>
<pre class="programlisting">
my_stack&lt; int &gt; s;
// ...
int p = s.pop();
</pre>
<p>The STL stack adaptor's <tt class="methodname">pop()</tt> method
returns nothing, and this may seem rather odd. Looking deeper,
however, we notice that there is a method <tt class=
"methodname">top()</tt> which returns a reference to the top
element.</p>
<p>The reason for this is efficiency. One of the things you may
need to do with a <tt class="classname">stack</tt> is to pop off a
number of elements, discarding each one. Since the element is being
removed, if <tt class="methodname">pop()</tt> were to return it, it
would have to make a copy and return it by value. If that value is
then discarded, at least one copy of a possibly expensive object
has been made. The alternative is to <span class=
"emphasis"><em>allow</em></span> a copy to be made prior to its
removal if that value is needed:</p>
<pre class="programlisting">
stack&lt; string &gt; s;
// ...
string name = s.top();
s.pop();
</pre></div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e735" id="d0e735"></a>Queue</h3>
</div>
<p>Again, a queue is a well-known data structure which allows
first-in, first-out (FIFO) functionality. This means it operates
like a double-ended <tt class="classname">stack</tt>, where you
have access to the elements at either end of the container. As with
<tt class="classname">stack</tt>, you cannot access the other
elements. <tt class="methodname">push()</tt> adds a new element to
the back of the <tt class="classname">queue</tt>, <tt class=
"methodname">pop()</tt> removes one from the front of the
<tt class="classname">queue</tt>. The methods <tt class=
"methodname">back()</tt> and <tt class="methodname">front()</tt>
provide references to those elements respectively, allowing you to
make copies if required.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e764" id="d0e764"></a>Priority
Queue</h3>
</div>
<p><tt class="classname">priority_queue</tt> is like <tt class=
"classname">stack</tt> more than <tt class="classname">queue</tt>,
the difference being that the top element is always the largest (it
doesn't have a good four letter acronym!). Hence, <tt class=
"methodname">push()</tt> adds a new element to the <tt class=
"classname">priority_queue</tt>, and <tt class=
"methodname">pop()</tt> removes the top one, which will always be
the largest, according to some comparison. The default comparison
is the standard functor <tt class="classname">less&lt;&gt;</tt>,
which by default uses <tt class="function">operator&lt;()</tt> for
the given type, but you can provide your own comparator if
required.</p>
<p>All of the container adaptors allow you to choose your own
underlying container upon which they will be based. By default,
<tt class="classname">stack</tt> and <tt class=
"classname">queue</tt> use <tt class="classname">deque</tt> as
their implementation, and <tt class="classname">priority_queue</tt>
uses <tt class="classname">vector</tt>. Any container you provide
must conform to the adaptor's requirements. These are ( [<a href=
"#Austern1998">Austern1998</a>]):</p>
<div class="table"><a name="d0e812" id="d0e812"></a>
<p class="title c2">Table 1. </p>
<table border="1" cellspacing="0">
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;&lt;/colgroup&gt;
&lt;tbody&gt;
<tr>
<td>stack</td>
<td>Back Insertion Sequence</td>
</tr>
<tr>
<td rowspan="2">queue</td>
<td>Back Insertion Sequence</td>
</tr>
<tr>
<td>Front Insertion Sequence</td>
</tr>
<tr>
<td>priority_queue</td>
<td>Random Access Container</td>
</tr>
&lt;/tbody&gt;
</table>
</div>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e833" id="d0e833"></a>...And Rapids
- Iterator Adaptors</h2>
</div>
<p>All of the standard algorithms work with iterators. The reason
for this is generality (or more accurately, genericity). It means
the algorithms can work for any container whose iterators fulfil
the requirements of the algorithm.</p>
<p>So far, so good.</p>
<p>Consider then the standard algorithm <tt class=
"function">copy()</tt>. In its simplest form, it takes three
iterators, the first two describing the source range, and the third
describing the start position (iterator) of the target. Here is a
plausible example:</p>
<pre class="programlisting">
list&lt; string &gt; names;
// initialise names with suitable elements
list&lt; string &gt;target( names.size() );
copy( names.begin(), names.end(),
               target.begin() );
</pre>
<p>Note that we must make enough space in target before copying the
elements from names to it, because the default assignment for an
iterator is to overwrite the previous contents. This makes perfect
sense, usually. However, what we'd really like to achieve here is
to make <tt class="function">copy()</tt> perform its actions by
using <tt class="methodname">push_back()</tt> on the list, thereby
removing the need to pre-size the target container.</p>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e855" id="d0e855"></a>Insert
Iterators</h3>
</div>
<p>We cannot change the behaviour of <tt class=
"function">copy()</tt> easily, but we could easily create an
iterator type that overrides the default meaning of <tt class=
"methodname">operator=()</tt> to perform <tt class=
"methodname">push_back()</tt> on the container.</p>
<p>Enter the Back Insert Iterator. It works with any Back Insertion
Sequence (a container with a <tt class=
"methodname">push_back()</tt> method, which means <tt class=
"classname">list</tt>, <tt class="classname">vector</tt> and
<tt class="classname">deque</tt> only, in the C++ Standard Library)
and works like this:</p>
<pre class="programlisting">
list&lt; string &gt; names;
// initialise names with suitable elements
list&lt; string &gt;target;
copy( names.begin(), names.end(),
           back_inserter( target ) );
</pre>
<p>There is a corresponding Front Insert Iterator which works for
containers with <tt class="methodname">push_front()</tt>, which
means <tt class="classname">list</tt> and <tt class=
"classname">deque</tt> only.</p>
<p>However, what happens if your target isn't a Back Insertion
Sequence, or you wish to add elements to the middle of the
container? Ah well, these library designers they think of
everything. There is another Insert Iterator called, well,
inserter. As well as the container itself, you provide an iterator
to where you would like insertions to start. Thus, the following
snippet is equivalent to our previous example:</p>
<pre class="programlisting">
list&lt; string &gt; names;
// initialise names with suitable elements
list&lt; string &gt;target;
copy( names.begin(), names.end(), inserter( target, target.end() ) );
</pre>
<p>The difference being that inserter requires that the target
container has an insert method which takes an iterator as the
insertion point as well as the value to be added. This includes the
Associative Containers we looked at previously. You will recall
that the Associative Containers are all sorted. In their case, this
insertion-point iterator is merely a hint - the position at which
to start looking for an existing element. When using an inserter
into a <tt class="classname">map</tt> or <tt class=
"classname">set</tt>, it's usually best where possible to use
<tt class="methodname">begin()</tt> or <tt class=
"methodname">end()</tt> as the iterator argument to inserter.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e914" id="d0e914"></a>Reverse
Iterators</h3>
</div>
<p>It sometimes makes sense to read a container backwards. It may
be for the purposes of finding any subliminal messages, or perhaps
reading in reverse alphabetical order. For whatever reason, there
are two ways to do this.</p>
<p>The first and simplest method is when you are working directly
with a container. Each of the standard containers provide an
<tt class="methodname">rbegin()</tt> and <tt class=
"methodname">rend()</tt> method. <tt class=
"methodname">rbegin()</tt> returns a Reverse Iterator, which is the
equivalent in position of <tt class="methodname">end()</tt>, except
that incrementing it moves it backwards. Similarly, <tt class=
"methodname">rend()</tt> has the equivalent position of <tt class=
"methodname">begin()</tt> in the container.</p>
<p>The difficulty comes when you have an algorithm which accepts
Forward Iterators, but for reasons of efficiency perhaps wants to
work with Reverse Iterators. Consider a function called <tt class=
"function">find_last_of()</tt>. It takes an iterator range and a
value to find. You don't really wish to expose the implementation
too much by requiring Reverse Iterators, but then searching the
entire range from start to finish might be very inefficient. You
need some way of converting a Forward Iterator into a Reverse
Iterator. Once again, the C++ Standard is ahead of you, and
provides an Iterator Adaptor for exactly this purpose.</p>
<p><tt class="classname">reverse_iterator</tt> changes the
direction of a given Bi-Directional Iterator. All of the standard
containers provide Bi-Directional Iterators from their <tt class=
"methodname">begin()</tt> and <tt class="methodname">end()</tt>
methods. This means you can now write <tt class=
"function">find_last_of()</tt> like this:</p>
<pre class="programlisting">
template&lt;typename Iterator, typename Value&gt;
Iterator find_last_of(Iterator begin,
       Iterator end, Value v ){
  reverse_iterator&lt;Iterator&gt; rbegin(end);
  reverse_iterator&lt;Iterator&gt; rend(begin);
  reverse_iterator&lt; Iterator &gt; result = 
              find( rbegin, rend, v );
  return result.base();
}
</pre>
<p>That last line which returns the <tt class=
"methodname">base()</tt> of the Reverse Iterator is needed because
a Reverse Iterator cannot just convert back to a normal one. For
technical reasons. It has to do with the fact that an iterator
range is a half-open range, meaning it contains the first but not
the last elements [<a href="#Love2001">Love2001</a>], [<a href=
"#Josuttis1999">Josuttis1999</a>]. The reversed End (one after the
end) is now one before the start, which is not a valid position.
Therefore, some hand-waving is needed to make it valid.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e970" id="d0e970"></a>Summary</h2>
</div>
<p>The C++ Standard Library provides you with three common data
structures which use the basic containers in specialised ways. The
reason for this is mainly documentation: by using a stack instead
of a vector, you are making your intentions clear to any programmer
reading that code. Where possible, you should follow this lead, and
instead of writing your own container from scratch, consider
re-using one of the standard containers, and adapting it. That way,
you save yourself time and effort, at the same time as the time and
effort of the poor old maintainer - which might be you. What's not
to like about that?</p>
<p>The Iterator Adaptors give you amazing flexibility when using
the standard algorithms, and when writing your own algorithms.
Insert Iterators really are the last word in decoupling algorithms
from data structures, which in turn has the fortunate effect of
giving you the ability to use those algorithms in ways which
otherwise may have required a custom function. Reusing tried and
tested code is a cornerstone of software development, because if
something is already tested, you (shouldn't) don't have to test it
again.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e977" id="d0e977"></a>Back Safe and
Sound</h2>
</div>
<p>As an end-note, I will mention a discussion I had regarding the
number of lines of code required for a particular algorithm. This
discussion revolved around the pros and cons of STL in particular,
but was really focussed on using STL containers over C style
arrays. I calculated that using only C arrays it would need 60
lines of code to perform a job that I could perform in 2 using STL.
The response was that I was forgetting about the (probably more
than) 58 lines in the STL code that were still required. My closing
response to that was, &quot;well, that's 58 lines of code I don't have
to write - or test.&quot;</p>
<p>The moral of the story - write less code. Go on. You know you
want to.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e984" id=
"d0e984"></a>Acknowledgements</h2>
</div>
<p>Thanks once again to Nigel Dickens for reading through the first
drafts of this article. Thank you also to Kevlin Henney who once
again sent me advance copies of yet-to-be published articles of
his. Also during the writing of this, I find myself time and again
referring to both [<a href="#Josuttis1999">Josuttis1999</a>] and
[<a href="#Austern1998">Austern1998</a>]. They deserve more than a
biblio entry. Perhaps a beer or two if we ever meet face to
face...</p>
</div>
<div class="bibliography">
<div class="titlepage">
<h2><a name="d0e995" id=
"d0e995"></a>Bibliography</h2>
</div>
<div class="bibliomixed"><a name="ISO1998" id="ISO1998"></a>
<p class="bibliomixed">[ISO1998] International Standard:
Programming Language - C++, ISO/IEC 14882:1998(E), 1998</p>
</div>
<div class="bibliomixed"><a name="Austern1998" id=
"Austern1998"></a>
<p class="bibliomixed">[Austern1998] Matthew H Austern, Generic
Programming And The STL,Addison Wesley 1998</p>
</div>
<div class="bibliomixed"><a name="Josuttis1999" id=
"Josuttis1999"></a>
<p class="bibliomixed">[Josuttis1999] Nicolai Josuttis, The C++
Standard Library - A Tutorial and Reference,Addison Wesley 1999</p>
</div>
<div class="bibliomixed"><a name="Love2001" id="Love2001"></a>
<p class="bibliomixed">[Love2001] Steve Love, Are You Afraid Of The
Dark?,Overload 43, 2001</p>
</div>
<div class="bibliomixed"><a name="Stroustrup2000" id=
"Stroustrup2000"></a>
<p class="bibliomixed">[Stroustrup2000] Bjarne Stroustrup, The C++
Programming Language, Special Edition,Addison Wesley 2000</p>
</div>
<div class="bibliomixed"><a name="STLPort2001" id=
"STLPort2001"></a>
<p class="bibliomixed">[STLPort2001] <span class=
"bibliomisc"><a href="http://www.stlport.org" target=
"_top">www.stlport.org</a></span>, Portable and free STL
implementation,</p>
</div>
<div class="bibliomixed"><a name="Henney2001" id="Henney2001"></a>
<p class="bibliomixed">[Henney2001] Various - background</p>
</div>
</div>
<div class="footnotes"><br>
<hr class="c3" width="100">
<div class="footnote">
<p><sup>[<a name="ftn.d0e77" href="#d0e77" id=
"ftn.d0e77">1</a>]</sup> std::string exposes Random Access
iterators [<a href="#Josuttis1999">Josuttis1999</a>], and so also
meets the requirements of Reversible Container [<a href=
"#ISO1998">ISO1998</a>]</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e93" href="#d0e93" id=
"ftn.d0e93">2</a>]</sup> In addition, multiple iterations over the
sequence must return the elements in the same order [<a href=
"#Austern1998">Austern1998</a>]</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e158" href="#d0e158" id=
"ftn.d0e158">3</a>]</sup> Exactly what it says; for some x and y, x
&lt; y must be well-defined.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e191" href="#d0e191" id=
"ftn.d0e191">4</a>]</sup> Note we've used a normal function rather
than a function object here. for_each will receive a pointer to a
function, which can be dereferenced (called) in the same way as a
function. See [<a href="#Stroustrup2000">Stroustrup2000</a>] for
details.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e223" href="#d0e223" id=
"ftn.d0e223">5</a>]</sup> Don't confuse this with Random Access as
typified by vector and deque. It is merely a notational convenience
in map.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e547" href="#d0e547" id=
"ftn.d0e547">6</a>]</sup> From the right - the first bit is the
least significant bit.</p>
</div>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
