    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: A More Flexible Container</title>
        <link>https://members.accu.org/index.php/articles/321</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + Overload Journal #58 - Dec 2003</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c65/">Programming</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c78/">Overload</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c154/">58</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c65-154/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c65+154/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;A More Flexible Container</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 02 December 2003 21:55:55 +00:00 or Tue, 02 December 2003 21:55:55 +00:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a>The Standard
Set</h2>
</div>
<p>Suppose I want to store pointers to some <tt class=
"classname">Employee</tt> records in a set, and then order that set
by <tt class="classname">Employee</tt> name. That is easy; just
make a functor that compares two <tt class=
"classname">Employee</tt> names, and apply it to a set. The snippet
below shows how. The example stores bare pointers, but smart
pointers are often a better choice for storing in containers.</p>
<pre class="programlisting">
class Employee {
public:
  const std::string&amp; GetName() const;
  // ... other functions
};

struct CompareEmployees {
  inline bool operator()(const Employee* l,
                         const Employee* r) const {
    return (l-&gt;GetName() &lt; r-&gt;GetName());
  }
};

typedef std::set&lt;Employee*, CompareEmployees&gt; EmployeeSet;
EmployeeSet employees;
</pre>
<p>Now, suppose I want to find a certain Employee record that
matches any given name. Well, to do that, I have to pass in an
<tt class="classname">Employee</tt> pointer with the properties I
am looking for. This is because the set functions that search -
<tt class="function">find</tt>, <tt class="function">count</tt>,
<tt class="function">lower_bound</tt>, <tt class=
"function">upper_bound</tt>, <tt class="function">equal_range</tt>,
and <tt class="function">erase</tt> - require a reference to the
same type as the key of the set. These functions, except <tt class=
"function">erase</tt>, are commonly called the &quot;Special Set
Operations&quot;. Using the example above, I have to pass in a pointer
to a bogus <tt class="classname">Employee</tt> record. The bogus
object only exists as a comparison value for finding the real
record. Assuming an <tt class="classname">Employee</tt> object can
be constructed using just a name, then the code looks something
like this:</p>
<pre class="programlisting">
Employee* FindEmployee(const std::string&amp; name) {
  Employee bogus(name);
  EmployeeSet::iterator it = employees.find(&amp;bogus);
  return (employees.end() == it) ? 0 : *it;
}
</pre>
<p>But, what if I can't make an Employee object so easily? Perhaps
no <tt class="classname">Employee</tt> constructor accepts just a
name. Or maybe constructing any <tt class="classname">Employee</tt>
object is so expensive that making a temporary on the stack is not
worth the effort. Why is it necessary to make the bogus <tt class=
"classname">Employee</tt> record anyway? It would be so much
simpler to just pass in the name itself to the <tt class=
"function">set::find</tt> function and have it return the iterator
to the target object.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e82" id="d0e82"></a>Template Member
Functions</h2>
</div>
<p>The <tt class="function">std::set</tt>'s search functions
typically require a reference to the same type as the set's key.
They look like this:</p>
<pre class="programlisting">
template&lt;class Key,class Compare,class Alloc&gt;
class set {
  // ... other parts of set class
public:
  template&lt;class Key&gt;
    size_type erase(const Key&amp; x);
};
</pre>
<p>Passing any type into these functions is possible only if the
functions themselves are templates. These functions are not
templated in the STL's associative containers - and would cause
code bloat if they were, which I shall discuss later. The next code
snippet shows the declarations of these functions as if they are
template member functions. Notice that the <tt class=
"classname">Compare_Type</tt> used for each templated function is
not among the templates for the class itself. As long as the
<tt class="classname">Compare_Type</tt> is comparable to the
<tt class="classname">Key</tt> type, it can be used by the functor.
For completeness, listed below are both the <tt class=
"function">const</tt> and non-<tt class="function">const</tt>
versions. I chose the unimaginative name of <tt class=
"function">flex_set</tt> for the container class. Other than the
name, and templated member functions, it behaves just like
<tt class="function">std::set</tt>.</p>
<pre class="programlisting">
template&lt;class Key,class Compare,class Alloc&gt;
class flex_set {
  // ... other parts of flex_set class
public:
  template&lt;class Compare_Type&gt;
    size_type erase(const Compare_Type&amp; x);

  template&lt;class Compare_Type&gt;
    size_type count(const Compare_Type&amp; x) const;

  template&lt;class Compare_Type&gt;
    iterator find(const Compare_Type&amp; x);

  template&lt;class Compare_Type&gt;
    const_iterator find(
      const Compare_Type&amp; x) const;

  template&lt;class Compare_Type
    iterator lower_bound(
      const Compare_Type&amp; x);

  template&lt;class Compare_Type&gt;
    const_iterator lower_bound(
      const Compare_Type&amp; x) const;

  template&lt;class Compare_Type&gt;
    iterator upper_bound(
      const Compare_Type&amp; x);

  template&lt;class Compare_Type&gt;
    const_iterator upper_bound(
      const Compare_Type&amp; x) const;

  template&lt;class Compare_Type&gt;
    pair&lt;iterator,iterator&gt; equal_range(
      const Compare_Type&amp; x);

  template&lt;class Compare_Type&gt;
    pair&lt;const_iterator,const_iterator&gt;
      equal_range(const Compare_Type&amp; x) const;
};
</pre>
<p>This looks good so far, but how will these template member
functions know how to compare some arbitrary type to the set's key
type? The answer to that lies within the comparison functor, or
comparator, used to order the set's elements. The comparator is
overloaded to compare an <tt class="classname">Employee</tt> record
to a <tt class="function">const std::string</tt>. There are
overloads so the name can be on the right or left side for
symmetric comparisons. (Indeed, some member functions of <tt class=
"classname">flex_set</tt> require both.) The result is shown below
along with a more efficient <tt class="function">FindEmployee</tt>
function.</p>
<pre class="programlisting">
std::binary_function&lt;const Employee*,
                     const Employee*,
                     bool&gt; {
  inline bool operator()(const Employee* l,
                         const Employee* r) const {
    return (l-&gt;GetName() &lt; r-&gt;GetName());
  }

  inline bool operator()(const Employee* l,
                         const std::string&amp; r) const {
    return (l-&gt;GetName() &lt; r );
  }

  inline bool operator()(const std::string&amp; l,
                         const Employee* r) const {
    return ( l &lt; r-&gt;GetName() );
  }
};

Employee* FindEmployee(const std::string&amp; name) {
  EmployeeSet::iterator
               it(employees.find(name));
    return (employees.end() == it) ? 0 : *it;
}
</pre>
<p>Okay, this is much better. No need to make a bogus <tt class=
"classname">Employee</tt> object just to find the real <tt class=
"classname">Employee</tt>. Just pass in the employee name, and the
templated <tt class="function">flex_set::find</tt> function returns
the proper iterator.</p>
<p>For every possible type that can search through the container,
simply add that type to the comparator and the compiler
automatically makes the templated search functions. Using that
idea, the example above can be further overloaded to compare a
C-style string as shown here. I typically need only one additional
type in the functor besides the key type, so here I would have
chosen either <tt class="function">std::string</tt> or <tt class=
"function">const char *</tt>, but probably not both. (Admittedly,
having a C-style string overload means I do not have to convert a
<tt class="function">const char *</tt> to <tt class=
"function">std::string</tt> before doing the search, so one less
object to construct.) Now that I can search using a type other than
the key type of the set, I often use the key type only when
inserting, and so the <tt class=
"classname">Employee</tt>-to-<tt class="classname">Employee</tt>
comparison in the functor is only used for inserting.</p>
<pre class="programlisting">
struct CompareEmployees :
  std::binary_function&lt;const Employee*,
                       const Employee*,
                       bool&gt; {
  // rest of functor as shown above.

  inline bool operator()(const Employee* l,
                         const char* r) const {
    return (l-&gt;GetName() &lt; r );
  }

  inline bool operator()(const char* l,
                         const Employee* r) const {
    return (l &lt; r-&gt;GetName());
  }
};
</pre></div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e166" id="d0e166"></a>An Alternate
Solution</h2>
</div>
<p>There is another method for searching through associative
containers using types other than the key type. A bunch of
wrappers, like that shown below, can be stored inside a <tt class=
"function">std::set</tt>, and allow us to do searches by name.</p>
<pre class="programlisting">
struct EmployeeWrap {
  EmployeeWrap(Employee*);                // not explicit
  EmployeeWrap(const char*);              // not explicit
  EmployeeWrap(const std:string&amp;);        // not explicit

  // Either of these is used, but not both.
  Employee* m_Employee;
  const std::string m_name;
  inline const std::string&amp; GetName(void) const {
    return (0 == m_Employee) ? m_name : m_Employee-&gt;GetName();
  }

  bool operator&lt;(const EmployeeWrap&amp; l,
                 const EmployeeWrap&amp; r);
};

  typedef std::set&lt;EmployeeWrap&gt; EmployeeWrapperSet;

  Employee* FindEmployee(const std::string&amp; name) {
    EmployeeWrapperSet::iterator
                        it(employeeWrappers.find(name));
    return (employeeWrappers.end() == it)
           ? 0 : (*it).m_Employee;
}
</pre>
<p>This wrapper has several disadvantages. One is that it requires
a special wrapper class to hold all the types needed for comparing.
To compare <tt class="classname">Employee</tt> records with
additional types, those types would have to be added to the
wrapper, instead of just overloading the functor as needed for a
<tt class="function">flex_set</tt>. The example above has 2
elements, and the <tt class="function">std::string</tt> has to be
instantiated even if it is never used. The <tt class=
"function">sizeof(std::string)</tt> varies from 4 bytes to 28 bytes
depending upon the implementation. This increases the overall
memory consumed by <tt class="classname">std::set</tt> even though
all instances of <tt class="classname">EmployeeWrap</tt> within the
set will not use the <tt class="classname">std::string</tt>.
Another disadvantage is that the code above constructs an unnamed
temporary of <tt class="classname">EmployeeWrap</tt> to pass into
the <tt class="function">std::set::find</tt>. This construction
cost is small, and most of the cost can be optimized away. Lastly,
<tt class="function">EmployeeWrap::GetName</tt> imposes a small
runtime cost to determine which data member has the name - a cost
not required by the <tt class="function">flex_set</tt> method. The
argument against the wrapper method becomes: &quot;If <tt class=
"function">flex_set</tt> is available, then why pay for several
costs that are not needed?&quot;</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e214" id="d0e214"></a>Other
Associative Containers</h2>
</div>
<p>Another alternate solution is, &quot;Why not just use map instead of
set?&quot; The <tt class="classname">Employee</tt> container will
be:</p>
<pre class="programlisting">
std::map&lt;std::string,Employee*&gt; employees
</pre>
<p>There are two answers for that, one is complicated, and the
other is more complicated. The complicated answer is that although
that provides the desired ordering, it also requires storing a copy
of the <tt class="classname">Employee</tt> name as a key for the
map. (The <tt class="classname">EmployeeWrap</tt> example above
also stores one <tt class="classname">std::string</tt> with an
<tt class="classname">Employee</tt> pointer, but <tt class=
"classname">std::map</tt> actually uses each <tt class=
"classname">std::string</tt>.) An intuitive reason to avoid this
practice is that the key, <tt class="classname">Employee</tt> name,
is part of the <tt class="classname">Employee</tt> object, and it
does not makes sense to store the key separately. A more practical
reason is that storing the name separately is inefficient. Another
practical reason is that someday, an <tt class=
"classname">Employee</tt> name will change, but the copy won't.
That copy is the key within a map, and people will be reluctant to
change the key in a map, but may not know that a property of the
value is the key, and so change name of the <tt class=
"classname">Employee</tt> without updating the container.</p>
<p>The more complicated answer is that <tt class=
"classname">std::map</tt> cannot provide the same flexible search
capabilities as those afforded by <tt class=
"classname">flex_set</tt>. To provide that flexibility for the
other associative containers, <tt class="classname">flex_map</tt>,
<tt class="classname">flex_multiset</tt>, and <tt class=
"classname">flex_multimap</tt> are needed, which are based upon
same idea. Many STL implementations use the same underlying
implementation for all four associative containers. By changing the
implementation to make any associative container more flexible, the
other three also become more flexible with little extra effort.</p>
<p>Using the templated <tt class="function">flex_map::find</tt>
member function allows for searches in a map of <tt class=
"classname">Employee</tt>s to <tt class=
"classname">Assignment</tt>s using just the name. This code shows a
<tt class="classname">flex_map</tt> of <tt class=
"classname">Employee</tt>s to their current <tt class=
"classname">Assignment</tt>s.</p>
<pre class="programlisting">
typedef flex_map&lt;Employee*,Assignment*,
                 CompareEmployees&gt; EmployeeAssignments;
EmployeeAssignments employeeAssignments;

Assignment* GetEmployeeCurrentTask(
            const std::string&amp; employeeName) {
  EmployeeAssignments::iterator it(
                       employeeAssignments.find(employeeName));
    return (employeeAssignments.end() == it)
           ? 0 : it-&gt;second;
}
</pre>
<p>If <tt class="classname">std::map</tt> were used, then something
similar to this is required.</p>
<pre class="programlisting">
typedef std::map&lt;Employee*,Assignment*,
                 CompareEmployees&gt; EmployeeAssignments;
EmployeeAssignments employeeAssignments;

Assignment* GetEmployeeCurrentTask(
            const std::string&amp; employeeName) {
  Employee bogus(employeeName);
  mployeeAssignments::iterator it(
                      employeeAssignments.find(&amp;bogus));
    return (employeeAssignments.end() == it)
           ? 0 : it-&gt;second;
}
</pre></div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e302" id=
"d0e302"></a>Considerations</h2>
</div>
<p><tt class="classname">CompareEmployees</tt> inherits from
<tt class="classname">binary_function</tt> for use with adapters
such as <tt class="classname">not2</tt>, <tt class=
"classname">bind1st</tt>, and <tt class="classname">bind2nd</tt>.
The adapters will only work with the functor's function that has
the types specified as templates for <tt class=
"classname">binary_function</tt> and ignore the overloaded
functions. If an adaptable functor is needed that accepts some
other type, the simple solution is to make such a functor. The code
below shows a functor that derives from <tt class=
"classname">binary_function</tt> and can be used with <tt class=
"classname">std::string</tt>. Assuming that the <tt class=
"varname">hourlyEmployees</tt> container is sorted by name, the
<tt class="function">find_if</tt> call will locate the first
<tt class="classname">Employee</tt> with a name less than the given
name. One of the costs of the increased container flexibility is
that more functors are needed. But, these additional functors are
often needed for other purposes and searches on other containers
anyway, such as vector.</p>
<pre class="programlisting">
struct CompareEmployeeToName :
std::binary_function&lt;const Employee*,
                     const std::string&amp;, bool&gt; {
  inline bool operator()(const Employee* l,
                         const std::string&amp; r) const {
    return (l-&gt;GetName() &lt; r);
  }
};

typedef std::vector&lt;Employee*&gt; EmployeeVector;

EmployeeVector hourlyEmployees;
// Populate vector ...
EmployeeVector::iterator it(
  find_if(hourlyEmployees.begin(),
          hourlyEmployees.end(),
          bind2nd(CompareEmployeeToName(),
                  name)));
</pre>
<p>Another consideration is code bloat. Many C++ developers
complain that templates cause code bloat, and making template
functions out of otherwise normal functions allows for bloat.
Instead of having just one <tt class="function">flex_set::find
function</tt>, there can now be several. (My own experience is that
I rarely use more than one type of a function, so I am not paying
for more than one version of <tt class=
"function">flex_set::find</tt> anyway.) Fortunately, most search
functions in the associative containers are small and inline, so
code bloat will be minimal for them. Unfortunately, some functions
in the underlying implementation are not as small, but still
manageable. Still, the more flexible search abilities are worth a
little extra bloat because temporary objects are no longer created.
Since the temporary object is no longer needed, the cost of
constructing and destroying it goes away, which may actually shrink
the overall code size. Whether the flexible searching is worth the
little extra bloat is a decision that must be made for each type
and container.</p>
<p>The <tt class="classname">flex_set</tt> is not the ideal
container for primitive types. A container of type <tt class=
"classname">flex_set&lt;long&gt;</tt> allows both <tt class=
"function">flex_set::erase(const long&amp; x)</tt> and <tt class=
"function">flex_set::erase(const short&amp; x)</tt>. The compiler
created both functions instead of promoting the <tt class=
"classname">short</tt> to a <tt class="classname">long</tt> and
using only one function. This kind of code bloat can easily be
avoided by using <tt class="classname">std::set&lt;long&gt;</tt>
instead. A corollary to this is that member functions in <tt class=
"classname">std::set</tt> should not be templated. (Some simple
experiments with changing <tt class="classname">std::set</tt>
convinced me that making a separate <tt class=
"classname">flex_set</tt> class was a better solution.) The
<tt class="classname">flex_set</tt> is useful for containers that
store objects or store pointers to objects, but I prefer <tt class=
"classname">std::set</tt> for containers that store primitives.</p>
<p>Instead of making another container class, perhaps std::set
itself can be extended by adding additional functions that use
predicates. The algorithm functions <tt class=
"function">std::find_if</tt> and <tt class=
"function">std::count_if</tt> are similar to <tt class=
"function">std::find</tt> and <tt class="function">std::count</tt>
except that they use a predicate instead of a value. Would a
putative <tt class="function">std::set::find_if</tt> member
function that accepts a predicate work? Not really, because the
predicate needs to compare an element to a value, and so <tt class=
"function">std::set::find_if</tt> will need to receive both the
value and the predicate. Which means the compare value passed into
<tt class="function">std::set::find_if</tt> would have to be
constructed. This just reintroduces the original problem of
constructing a temporary. Nor could adding predicates to the
<tt class="function">erase</tt>, <tt class=
"function">lower_bound</tt>, <tt class="function">upper_bound</tt>,
<tt class="function">equal_range</tt>, and <tt class=
"function">binary_search</tt> member functions of <tt class=
"classname">std::set</tt> provide any greater efficiency than what
<tt class="classname">flex_set</tt> already provides for these
functions. Using a unary predicate which stores the compare value
does not work either, since unary predicates cannot change the
order for the compare and key values, and the predicate must be
able to use the compare value on both the left and right side.</p>
<p>Could any other member functions of <tt class=
"classname">flex_set</tt> be templatized in the same way? Only one
other function accepts a reference to a <tt class=
"classname">const</tt> key type as a parameter, and that function
is <tt class="function">insert</tt>. Inserting anything but a key
value is meaningless, so the answer is negative. The special set
operations, and one of the erase functions, are the only candidates
for becoming templates.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e442" id="d0e442"></a>Summary</h2>
</div>
<p>More flexible variations of the four associative containers are
possible by changing the search functions into template member
functions. Each data type used as a parameter to these functions
requires overloading the comparator to compare these data types to
the key value of the container's elements. A caveat that goes with
the overloading by type is that each type has to be comparable to
the element type. The flexible containers have a beneficial side
effect of resulting in more efficient code because named temporary
variables are no longer necessary. Nothing is gained by changing
std::set or the other STL associative containers. Changing member
functions into templated functions causes bloat for containers of
primitive types, and passing predicates as parameters into
<tt class="classname">std::set</tt> functions does not work.</p>
<p>You may use the source code provided at: <a href=
"http://www.richsposato.com/software.html" target=
"_top">http://www.richsposato.com/software.html</a> as a
replacement for the associative containers provided with the GCC
compiler.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e455" id=
"d0e455"></a>Acknowledgements</h2>
</div>
<p>Many thanks to Don Organ, Gerald Chan, Phil Bass, Alan
Griffiths, Juan Carlos Aguilar, and Cherryl Smith for their
feedback and insightful comments.</p>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
