    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Code Critique Competition 93</title>
        <link>https://members.accu.org/index.php/articles/2101</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + CVu Journal Vol 27, #2 - May 2015</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c65/">Programming</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c349/">272</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c65-349/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c65+349/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Code Critique Competition 93</h1>
<p><strong>Author:</strong>&nbsp;Martin Moene</p>
<p>
<strong>Date:</strong> 01 May 2015 17:21:12 +01:00 or Fri, 01 May 2015 17:21:12 +01:00</p>
<p><strong>Summary:</strong>&nbsp;Set and collated by Roger Orr. A book prize is awarded for the best entry.</p>
<p><strong>Body:</strong>&nbsp;<p class="EditorIntro">Participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.</p>

<p class="EditorIntro">Note: If you would rather not have your critique visible online, please inform me. (We will remove email addresses!)</p>

<h2>Last issueâ€™s code</h2>

<p class="blockquote">Iâ€™m trying to use the new(ish) <code>unordered_map.</code> and my simple test program works with some compilers and options but not with others. I canâ€™t see why it wouldnâ€™t work consistently â€“ are some of the compilers just broken? I do get a compiler warning about conversion from string constant to cstring â€“ is that whatâ€™s wrong?</p>

<p>Can you explain what is wrong and help this programmer to fix (and improve) their simple test program? The code is in Listing 1.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#include &lt;iostream&gt;
#include &lt;string&gt;
#include &lt;unordered_map&gt;
typedef char *cstring;
typedef std::unordered_map&lt;cstring, cstring&gt; AddressMap;
// Hold email addresses
class Addresses
{
public:
  // add addr for name
  void add(cstring name, cstring addr)
  {
    addresses[name] = addr;
  }
  // get addr from name or null if not found
  cstring get(cstring name)
  {
    if (addresses.count(name))
    {
      return addresses[name];
    }
    return 0;
  }
private:
  AddressMap addresses;
};

int main()
{
  Addresses addr;
  addr.add(&quot;roger&quot;,
    &quot;rogero@howzatt.demon.co.uk&quot;);
  addr.add(&quot;chris&quot;,
    &quot;chris@gmail.com&quot;);
  cstring  myadd = addr.get(&quot;roger&quot;);
  if (myadd == 0)
  {
    std::cout &lt;&lt; &quot;Didn't find details&quot;
      &lt;&lt; std::endl;
    return 1;
  }
  std::cout &lt;&lt; &quot;Found &quot; &lt;&lt; myadd &lt;&lt; std::endl;
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 1</td>
	</tr>
</table>

<h2>Critiques</h2>

<h3>Tom BjÃ¶rkholm</h3>

<p>There are several problems with this code in Code Critique 92.</p>

<p>The main problem is a confusion about values and pointers. The way the code is written it stores pointers (addresses to memory locations) not string values in the unordered map. This is a fundamental problem. The C++ standard containers are designed to hold values not pointers.</p>

<p>More specifically the values stored in the container <code>addresses</code> are the pointers (memory address values) not the strings at those memory locations. This is fundamentally flawed as in any real program those memory locations would probably be used for other data later in the life of the program, causing the program to crash (or exhibit other strange behavior such as incorrect results).</p>

<p>So why does this work at all? Why doesnâ€™t this program crash at once? The reason is that only compile time constant strings are used as data in this example. The strings <code>&quot;roger&quot;</code>, <code>&quot;chris&quot;</code>, etc. in the main program are all compile time constant strings. These strings are stored by the compiler in a (usually read-only) data segment. Thus, in this example the data on the memory locations used, does not change.</p>

<p>The fact that the compiler is supposed to store these compile-time constant strings in read-only memory is the reason for the compiler warning. Having a non-<code>const</code> pointer to data that is a compile-time constant is not good. The warning can be fixed by changing</p>

&lt;CodeListing&gt;
  typedef char *cstring;&lt;/CodeListing&gt;

<p>to</p>

&lt;CodeListing&gt;
  typedef const char *cstring;&lt;/CodeListing&gt;

<p>With this change, there are no compiler warnings, but the program is still fundamentally broken as the key used for lookup in the map is the value of a memory address, not the name string.</p>

<p>So why does this produce the expected result on some compilers? Some compilers notice that the identical string <code>&quot;roger&quot;</code> appears both in one of the <code>add()</code> function calls and in the <code>get()</code> function call. The compiler is then free to store the string only once in memory, and use the same address (i.e. pointer value) in both function calls. However, the compiler is also allowed to store both strings <code>&quot;roger&quot;</code> in memory at different addresses. With a compiler that stores the string <code>&quot;roger&quot;</code> at only one memory location, the program produces the expected result. But that is because of a comparison of memory addresses used as keys, not because of a comparison on key strings.</p>

<p>The correct way to fix this program is to use <code>std::string</code> instead of character pointers. Then the values stored in the <code>unordered_map</code> are real string values, and the keys that are compared are the strings (not the memory locations).</p>

<p>The <code>get</code> function can also be improved. As it is written now the lookup is done twice, once to count the number of matching elements (zero or one) and once to find the element to return. (In general it is a good habit to avoid <code>count()</code> and <code>size()</code> if the task can be solved by using <code>empty()</code> or <code>find()</code> instead.)</p>

<p>When <code>std::string</code> is used for the keys and values, it will no longer be possible to indicate â€˜not foundâ€™ as a null pointer. Not found can either be indicated using an empty string, or the <code>get()</code> method could return a <code>struct</code> with a string value and a <code>bool</code> flag.</p>

<p>Personally I would have used a <code>using</code> alias inside the <code>Addresses</code> class instead of a global <code>typedef</code>, but this is a matter of personal style preferences. To keep the program recognizable to the original author, I have opted to not make changes based on personal style preferences. I have opted to let the empty string indicate not found here (as no valid email address can be the empty string). The complete fixed program would then be:</p>

<pre class="programlisting">
  #include &lt;iostream&gt;
  #include &lt;string&gt;
  #include &lt;unordered_map&gt;
  typedef std::unordered_map&lt;std::string,
          std::string&gt; AddressMap;
  // Hold email addresses
  class Addresses{
  public:
    // add addr for name
    void add(const std::string &amp; name,
            const std::string &amp;addr){
      addresses[name] = addr;
    }
    // get addr from name or null if not found
    std::string get(const std::string &amp; name) const
    {
      const AddressMap::const_iterator it =
          addresses.find(name);
      if (addresses.end() != it) {
          return it-&gt;second;
      }
      return &quot;&quot;;
    }

  private:
    AddressMap addresses;
  };

  int main() {
    Addresses addr;
    addr.add(&quot;roger&quot;,
             &quot;rogero@howzatt.demon.co.uk&quot;);
    addr.add(&quot;chris&quot;,
             &quot;chris@gmail.com&quot;);
    std::string myadd = addr.get(&quot;roger&quot;);
    if (myadd == &quot;&quot;) {
      std::cout &lt;&lt; &quot;Didn't find details&quot;
                  &lt;&lt; std::endl;
      return 1;
    }
    std::cout &lt;&lt; &quot;Found &quot; &lt;&lt; myadd
              &lt;&lt; std::endl;
  return 0;
}</pre>


<h3>Simon Brand</h3>

<p>Note: Any standards references are from C++ draft N4296 (<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf</a>).</p>

<p>You are coming across an issue because you are attempting to use <code>cstrings</code> as keys.</p>

<p>Why is this an issue? Because <code>cstrings</code> are simply pointers to a contiguous area of memory which is null-terminated. If you attempt to check two <code>cstrings</code> for equality using <code>operator==</code> you are checking if they point to exactly the same area of memory, not if the data they point to is the same.</p>

<p>To see why this causes an issue with <code>unordered_map</code>, we need to know a little about how it is implemented. An <code>unordered_map</code> is an implementation of a hash map (Â§23.2.5.9) (some nice explanations here: <a href="http://stackoverflow.com/questions/730620/how-does-a-hash-table-work">http://stackoverflow.com/questions/730620/how-does-a-hash-table-work</a>), so it needs a hash function to convert a key into a bucket index, then a comparison function to resolve collisions where keys hash to the same bucket.</p>

<p>Â§23.5.4 outlines the <code>unordered_map</code> class template, which can optionally take in functors as template arguments to use as the hashing and equality functions. Now we can see that <code>unordered_map</code> uses <code>std::equal_to&lt;Key&gt;</code> as its default equality function and <code>std::hash&lt;Key&gt;</code> as the hashing function. Unless otherwise specialised, <code>equal_to</code> simply invokes <code>operator==</code>, and it has no specialisation for <code>char*</code> (Â§20.9.6). This means that whenever you try and perform a key comparison in the map, you are just comparing the pointers. <code>std::hash</code> is implemented such that if <code>k1 == k2</code>, <code>hash(k1) == hash(k2)</code> (Â§23.2.5.1.4), so the same applies. As such, using <code>char*</code> as the key type in <code>unordered_map</code> means that the keys must be pointing to the same part of memory.</p>

<p>So why does this work for some compilers and not for others? The answer is optimisation. Most compilers will see that you have used the same string literal twice and so will just use the same memory for any instances of that literal. In this example, both times you use <code>&quot;roger&quot;</code>, you could be getting pointers to completely different areas of memory, but the compiler optimises and just reuses the same data. Now when you hash or compare your keys, you just so happen to be passing the same pointer in due to a compiler optimisation, so you get the expected result. If you were, for example, reading the string in at runtime, this would fail every time on a conforming compiler.</p>

<p>By now you might be thinking â€œWhy didnâ€™t I notice I was comparing pointers?â€. I think your problem is that you <code>typedef</code>â€™d <code>char*</code> to <code>cstring</code>. Hiding implementation details like that is often a good idea, but using raw pointers is often error prone, so you just make yourself complacent by trying to forget that youâ€™re using pointers.</p>

<p>In the case of <code>AddressMap</code>, this abstraction is a really good idea because if you want to change the implementation of your class to use ordered maps instead, you just need to change it in one place. A possible improvement to this would be:</p>

<pre class="programlisting">
  class Addresses
  {
  private:
    using map_type =
      std::unordered_map&lt;cstring, cstring&gt;;
  }</pre>

<p>This is called a nested type name (Â§9.9) and is preferable to your version because it doesn't pollute the global namespace. Making it private ensures that you don't leak implementation details. Substituting the <code>typedef</code> for an alias declaration (Â§7.1.3.2) is mostly just for consistency here. This syntax is more powerful as it allows you to use templates in your type alias, which you canâ€™t do with <code>typedef</code>s. Because it provides a superset of functionality, I prefer to use it everywhere Iâ€™d usually use a <code>typedef</code> (Scott Meyers discusses this in Item 9 of his book <em>Effective Modern C++</em>).</p>

<p>Iâ€™ll now outline a couple of ways to fix your code.</p>

<h4>Version 1</h4>

<p>As a preface to this fix, this is more an example for completeness of how you could use a <code>char*</code> in an <code>unordered_map</code>. As Iâ€™ll show later, you should not do this and use <code>std::string</code> instead.</p>

<p>As stated above, the issue is that <code>unordered_map</code> compares your <code>char*</code>s by their pointer, not by the string they point to. In order to modify this, we can pass a hash and comparison function in to our map as either a template or constructor argument.</p>

<pre class="programlisting">
  struct AddressHash
  {
    //taken from Stroustrup's The C++
    //Programming Language
    size_t operator()(const char* ptr) const {
      size_t h = 0;
      while (*ptr)
      {
        h = h &lt;&lt; 1 ^ *ptr;
        ++ptr;
      }
      return h;
    }
  };

  struct AddressComparison {
    bool operator()(const char* str1,
                    const char* str2) const {
        return strcmp(str1, str2) == 0;
    }
  };</pre>
  
<p>This all looks fine, but because we are using pointers, this will fail if we add a pointer in which is later invalidated. For example:</p>

<pre class="programlisting">
  void addstuff(Addresses &amp;addr)
  {
    char rog[6] = &quot;roger&quot;;
    addr.add(rog,
      &quot;rogero@howzatt.demon.co.uk&quot;);
  }</pre>
  
<p>That call to <code>addr.add</code> will decay the <code>char[]</code> to a <code>char*</code>, copy the pointer (not the data) and store it in the map. Then when the function exits, the <code>char[]</code> which holds the string is destroyed, so trying to access it through the pointer we saved is undefined. Oops! We could probably solve this issue by adding a level of abstraction and managing copies to these strings, but therein lies madness. If only there were a class which is like a <code>char*</code> but has sensible value semantics...</p>

<h4>Version 2</h4>

<p>I know, weâ€™ll use <code>std::string</code> instead!</p>

<p><code>std::string</code> already has the comparison and hashing functions we need, so simply modifying your interface like so fixes most of our issues:</p>

<pre class="programlisting">
  typedef  std::unordered_map&lt;std::string,
           std::string&gt; AddressMap;

  class Addresses {
  public:
    void add(const std::string &amp;name,
      const std::string *addr)
    std::string &amp;get(const std::string &amp;name)
  };</pre>

<p>Another problem is the use of flag values for the return of <code>Addresses::get</code>. Returning a value which means an error has occurred and assuming that the client will remember to check for it is usually ill-advised. Itâ€™s better to throw an exception so that they <strong>need</strong> to deal with it unless they want their application to crash and die. Fortunately, <code>unordered_map::at</code> already does this, throwing an <code>out_of_range</code> exception if the key doesnâ€™t exist. Now we can modify our <code>get</code> function like so:</p>

<pre class="programlisting">
  std::string &amp;get(const std::string &amp;name)
  {
    return addresses.at(name);
  }</pre>
  
<p>You should always ensure that your functions are named in a way which correctly reflects their semantics. The function <code>add</code> is maybe a bit misleading, because you could pass in a key which already has a value and it would be updated rather than adding a new entry. We could call it something like <code>addOrUpdate</code>, but thereâ€™s already a more natural version:</p>

<pre class="programlisting">
  std::string &amp;operator[] (
    const std::string&amp; key)
  {
    return addresses[key];
  }</pre>
  

<p>Now we can update our code like so:</p>

<pre class="programlisting">
  int main()
  {
    Addresses addr;
    addr[&quot;roger&quot;] =
      &quot;rogero@howzatt.demon.co.uk&quot;;
    addr[&quot;chris&quot;] = &quot;chris@gmail.com&quot;;
    try
    {
      std::string &amp;myadd = addr.get(&quot;roger&quot;);
      std::cout &lt;&lt; &quot;Found &quot; &lt;&lt; myadd
        &lt;&lt; std::endl;
    }
    catch (const std::out_of_range&amp; oor)
    {
      std::cout &lt;&lt; &quot;Didn't find details&quot;
        &lt;&lt; std::endl;
      return 1;
    }
  }</pre>
 
<p>We should now check our functions to ensure that they are const-correct, flexible and fast. Our <code>operator[]</code> takes a <code>const std::string&amp;</code>, returns a <code>std::string&amp;</code> and is not marked <code>const</code>. The lack of <code>const</code> is correct, because calling <code>operator[]</code> on an <code>unordered_map</code> adds a new key if one does not exist. Returning a <code>std::string&amp;</code> is correct, because it allows us to update the object held in the map. Taking the argument as a <code>const</code> reference, however, can be improved. Imagine we are populating our map with data parsed from a large external file like this:</p>

<pre class="programlisting">
  while (stillHaveData())
    addr[getNextKey()] = getNextValue();</pre>
	
<p>The only use for the return value of <code>getNextKey()</code> is to store it in our internal map. It canâ€™t possibly be used anywhere else, because we donâ€™t store a reference or pointer to the data. Unfortunately, because <code>operator[]</code> is taking its argument as a <code>const&amp;</code>, we are going to be copying every single key string into the map. This could get expensive; weâ€™d much rather just reuse the data rather than copying it. To achieve this we can implement perfect forwarding (http://thbecker.net/articles/rvalue_references/section_07.html) on <code>operator[]</code>: if it is passed an <code>lvalue</code>, copy it; if it is passed an <code>rvalue</code>, move it.</p>

<pre class="programlisting">
  template &lt;typename T&gt;
  std::string&amp; operator[] (T&amp;&amp; key)
  {
    return addresses[std::forward&lt;T&gt;(key)];
  }</pre>
  
<p>This is much more efficient, as <code>std::forward</code> (Â§20.2.4) ensures that the reference type is preserved through our call, so the correct version of <code>std::unordered_map::operator[]</code> is called and the copy is avoided if necessary. We get this efficiency for free with the actual assignment, because <code>std::string</code> provides a move constructor which will be called when we attempt to assign an <code>rvalue</code>.</p>
<p>Now for our get function. It takes a <code>const std:string&amp;</code>, returns a <code>std::string&amp;</code> and is not marked <code>const</code>. We want to be able to check the contents of the map when we have a <code>const Addresses</code> object, but also to use it to update values when itâ€™s non-<code>const</code>, so we need a <code>const</code> and non-<code>const</code> version of the function. Taking the argument by <code>const&amp;</code> is fine here, as we arenâ€™t going to be doing a copy. We could optimize by using <code>std::experimental::string_view</code>, and you should certainly have a look at that class, but you can be forgiven for waiting until itâ€™s actually in the standard. Returning by <code>std::string&amp;</code> is correct as we want to be able to update our value if possible, but it needs to be a <code>const&amp;</code> when using a <code>const Addresses</code>. Now the function looks like:</p>

<pre class="programlisting">
  const std::string &amp;get(
    const std::string &amp;name) const
  {
    return addresses.at(name);
  }
  std::string &amp;get(const std::string &amp;name)
  {
    return addresses.at(name);
  }</pre>
  
<p>Now the final version of our class:</p>

<pre class="programlisting">
  class Addresses
  {
  public:
    template &lt;typename T&gt;
    std::string &amp;operator[] (T&amp;&amp; rhs)
    { return addresses[rhs]; }
    const std::string &amp;get(
      const std::string &amp;name) const
    { return addresses.at(name); }
    std::string &amp;get(const std::string &amp;name)
    { return addresses.at(name); }
  private:
    using map_type = std::unordered_map&lt;
       std::string, std::string&gt;;
    map_type addresses;
  };</pre>

<h3>Jim Segrave</h3>

<p>The problem of compiler errors is trivial to fix â€“ string literals have the type <code>char const *</code>, but the <code>add()</code> member function expects arguments of <code>char *</code>. A <code>char const *</code> can only be used with a cast. Changing the <code>typedef</code> of <code>cstring</code> to be <code>typedef char const *</code> will stop the compiler errors and the program will work.</p>

<p>But there are larger problems.</p>

<p>The keys to the map are pointers to <code>char</code> arrays. Two separate character arrays with identical contents will produce two separate entries in the map. For example, a <code>get()</code> for the name <code>&quot;roger&quot;</code> which uses a different character array to hold the text <code>&quot;roger&quot;</code> than the one used for inserting will not find the entry. The key needs to be the actual value, not a pointer to the value. Further the key has to remain valid and unchanged as long as the map exists, which is not a guarantee that a pointer can make.</p>

<p>The values are also stored as pointers to character arrays, which opens new opportunities for failures. What happens if the array passed as a value for the address goes out of scope? You now have an invalid pointer in the map and if you do a lookup and get that pointer returned, any use of it leads to undefined behaviour. What happens if you look up an entry and keep the pointer to the address character array and that array goes out of scope? You now have a dangling pointer which is an invitation to undefined behaviour.</p>

<p>My updated version stores a copy of the text of the name, so the lifetime of the name passed to the <code>add()</code> member function no longer affects the map. Since Iâ€™m copying the text, using a <code>std::string</code> instead of a character pointer adds flexibility â€“ the name can be a string literal, a <code>char</code> array or a <code>std::string</code>.</p>

<p>I changed the map so that the address stored is a <code>shared_ptr</code> to a newly allocated copy of the string originally passed to the <code>add()</code> member function. Now if the string passed to <code>add()</code> is deleted, the contents of the map are still valid. If you use <code>get()</code> to obtain a pointer to an entry in the map, that pointer will remain valid even if the map itself is deleted. Only when the map is deleted and all the pointers to data that was in the map are deleted do all the address strings get removed, but you donâ€™t have to do the bookkeeping to make that happen.</p>

<p>In order to test the map fully, I added a function to exercise the map. It uses an array of <code>struct</code>s holding a name and address, so adding additional test cases requires simply inserting a {name, address} braced pair into the array. It inserts all the pairs from the array, then looks up each name in the array to see that the address data is correct. It correctly flags when a name is entered twice with different addresses.</p>

<p>It then looks up a name known not to be in the map to test the not-found behaviour. It also deliberately updates an entry with an address taken from a local variable which will go out of scope when the function returns. In <code>main</code>, that entry is looked up and the address is checked to see that it matches what was in the local variable.</p>

<p>My updated version</p>

<pre class="programlisting">
  #include &lt;iostream&gt;
  #include &lt;string&gt;
  #include &lt;unordered_map&gt;
  #include &lt;memory&gt;
  using std::string;
  using shp = std::shared_ptr&lt;string&gt;;
  typedef std::unordered_map&lt;string, shp&gt;
    AddressMap;
  // Hold email addresses
  class Addresses
  {
  public:
    // add addr for name
    void add(const string&amp; name,
      const string&amp; addr)
    {
      addresses[name] = 
      std::make_shared&lt;string&gt;(addr); 
    }
  // get shared ptr to addr from name or null 
  // if not found
  shp get(string name)
  {
    if (addresses.count(name))
    {
      return addresses[name];
    }
    return shp(nullptr);
  }

  private:
    AddressMap addresses;
  };

  // pairs of names and addresses for testing
  struct test_val {
    string name;
    string addr;
  };
  test_val tests[] = {
    {&quot;roger&quot;, &quot;rogero@howzatt.demon.co.uk&quot;},
    {&quot;chris&quot;, &quot;chris@gmail.com&quot;},
    // duplicate name with different address
    {&quot;roger&quot;, &quot;rogero@gmail.com&quot;},
  };

  // test insertions and lookups, including a
  // deliberate lookup failure
  // Update an entry with local auto strings for 
  // the name and address
  void test_map(Addresses&amp; a) {
    // remember longest name so we can create a 
    // name known not to be in the map
    string longest = &quot;&quot;;
    for(auto test: tests) {

    if(longest.size() &lt; test.name.size()) {
        longest = test.name;
    }
    a.add(test.name, test.addr);
  }
  // check that all the names have been 
  // inserted correctly
  // (the duplicated name will generate an
  // error report)
  for(auto test: tests) {
    shp addr = a.get(test.name);
    if(addr == nullptr) {
      std::cout &lt;&lt; &quot;Insert of &quot; &lt;&lt; test.name 
                &lt;&lt; &quot; failed&quot; &lt;&lt; std::endl;
    }

    if(*addr != test.addr) {
      std::cout &lt;&lt; &quot;Expected lookup of &quot;
                &lt;&lt; test.name
                &lt;&lt; &quot; to return &quot; &lt;&lt; test.addr 
                &lt;&lt; &quot; but returned &quot;
                &lt;&lt; *addr &lt;&lt; std::endl;
    }
  }
  // update an entry using a local auto
  // variable
  string new_val{&quot;auto_variable&quot;};
  string new_name{&quot;test_auto&quot;};
  a.add(new_name, new_val);

  // generate a name known not to be in the
  // map
  if(a.get(longest + &quot;x&quot;) != nullptr) {
    std::cout &lt;&lt; &quot;Map reported that &quot;
              &lt;&lt; longest &lt;&lt; &quot; is in the map, &quot;
                 &quot;although it should not be&quot;
                 &lt;&lt; std::endl;
    }
  }

  int main()
  {
    Addresses addr;
    test_map(addr);
    // at this call to addr.get, the string used
    // to set the address is gone
    shp check = addr.get(&quot;test_auto&quot;);

    if(check == nullptr) {
      std::cout &lt;&lt; &quot;Failed to find entry for&quot;
                   &quot; \&quot;test_auto\&quot;&quot;
                &lt;&lt; std::endl;
    }

    if(*check != &quot;auto_variable&quot;) {
      std::cout &lt;&lt; &quot;Expected lookup of&quot;
                   &quot; \&quot;test_auto\&quot; to be &quot;
                &lt;&lt; &quot;\&quot;auto_variable\&quot;&quot;
                &lt;&lt; &quot; but got &quot;
                &lt;&lt; *check &lt;&lt; std::endl;
    }
  }</pre>

<h3>James Holland</h3>

<p>I can understand the programmer being baffled by the program behaving differently when using different compilers. After all, although it may not be perfect, the source code seems reasonably well constructed and should work as expected. So what is going on? Before we dig deeper, letâ€™s get rid of the warning message issued by some compilers.</p>

<p>Literal strings such as <code>&quot;roger&quot;</code> are considered to be constant and, therefore, should be incapable of being modified. The warning is saying that a pointer of type <code>char *</code> is being assigned the address of the first character of a constant string thus allowing the string to be modified by use of the pointer. This makes a mockery of the fact that the string is meant to be constant. Such assignments have been deprecated since C++98 and so compilers should, at least, issue a warning. Sadly, not all do. What is required is a pointer that is incapable of changing what it is pointing to. This is easily achieved, in this case, by adding a <code>const</code> to the declaration of <code>cstring</code> giving <code>typedef const char * cstring</code>. This will keep compilers happy.</p>

<p>Unfortunately, removing the warning has not altered the behaviour of the code. It still produces different results on different compilers. So something else must be causing the problem. It turns out to be a combination of the type of the unordered map key and the way in which some compilers optimise the code. Letâ€™s start with the key type.</p>

<p>From the sample code, it can be seen that <code>Addresses</code> member functions <code>add</code> and <code>get</code> have parameters of type <code>cstring</code> where <code>cstring</code> is a <code>typedef</code> for <code>const char *</code> (the <code>const</code> has just been added as described above). The strings that are passed to the functions are of type <code>const char[n]</code> where <code>n</code> is the length of the string. For example, <code>&quot;roger&quot;</code> is of type <code>const char[6]</code>. This difference in type is permitted because array types decay into pointers as they are assigned to the function parameters. The body of the functions, therefore, deal with pointers to strings.</p>

<p>Also, the unordered map has been defined to have both key and value of type <code>cstring</code>. In other words, the key is a pointer to a string and not the string itself. This is important because the unordered map determines whether an entry already exists by discovering whether any of the stored pointers have the same value as the address of the string passed (in our case) to its <code>operator[]</code> function. The address of a string is simply the location in memory where it is stored. So, the question is at what memory location is a string constant stored. This brings us back to compiler optimisations.</p>

<p>One compiler optimisation technique is called string pooling. This involves placing only one copy of identical string constants in memory instead of having multiple copies of the same string. If this optimisation takes place, p and q in the following code would be equal in value.</p>

<pre class="programlisting">
  const char * p = &quot;This is a string&quot;;
  const char * q = &quot;This is a string&quot;;</pre>

<p>If this optimisation is not performed, the value of <code>p</code> and <code>q</code> would be different as two separate, but identical, strings would be stored.</p>

<p>This optimisation is critical to the program as to whether strings will be found. If string pooling takes place, the string <code>&quot;roger&quot;</code> used in the add function will have the same address as the identical string used in the <code>get</code> function. The unordered mapâ€™s count function will, therefore, consider the string as found. If string pooling does not take place, the two strings, despite being identical in value, will have different addresses and so a match will not be found.</p>

<p>Incidentally, and ignoring the identified problems for a moment, there is an inefficiency in the <code>get</code> function. Two searches of the unordered map are made; one in the <code>count</code> function and one in the <code>operator[]</code> function. The hash calculation is, therefore, performed twice. Clearly, it would be more efficient if the hash value was calculated once. This can be achieved by using the unordered mapâ€™s <code>find</code> function as follows.</p>

<pre class="programlisting">
  cstring get(cstring name)
  {
    return addresses.find(name) ==
      addresses.end() ? nullptr : name;
  }</pre>

<p>Clearly, it is not acceptable for the behaviour of the program to depend on compiler optimisations. Fortunately, there are a few things that can be done to remedy the situation.</p>

<p>As stated above, the unordered map compares the value of pointers to strings to determine whether strings are identical and (as also explained above) this only works when string pooling takes place. What is really wanted is to compare the strings themselves, not the pointers to the strings. Doing this would result in the software behaving as expected irrespective of whether string pooling was in effect. This can be achieved by providing a user-defined hash function and equivalence criterion. These take the form of function objects that are passed to the constructor of the unordered map.</p>

<p>A suitable function object that defines the hash function is shown below.</p>

<pre class="programlisting">
  class String_hash
  {
    public:
    std::size_t operator()(cstring s) const
    {
      return std::hash&lt;std::string&gt;()(s);
    }
  };</pre>

<p>Providing a good hash function can be difficult so I have cheated somewhat by using the hash function supplied by the standard library. The library provides hash functions for most common types. In this case I have used <code>std::string</code>. The constructor of <code>std::string</code> takes the pointer to a <code>char</code> and creates an <code>std::string</code> object that is passed to <code>std::hash</code> from which the hash code is generated.</p>

<p>A suitable equivalence criterion function object is shown below.</p>

<pre class="programlisting">
  class String_equal
  {
    public:
    bool operator()(cstring s1, cstring s2)
      const
    {
      return strcmp(s1, s2) == 0;
    }
  };</pre>

<p>The <code>strcmp</code> function takes a pointer to each of the strings to be compared and returns a value of zero if the two strings are equal. This value is compared with zero to give a return value of <code>true</code> if the two strings are identical and <code>false</code> otherwise. Note that the <code>strcmp</code> function is made available by adding <code>#include &lt;cstring&gt;</code> to the program.</p>

<p>The two function objects are passed, as template arguments, to the unordered map constructor as shown below.</p>

<pre class="programlisting">
  typedef std::unordered_map&lt;cstring, cstring,
  String_hash, String_equal&gt; AddressMap;</pre>

<p>The program will now work as expected and does not depend on any compiler optimisation techniques. It has, however, been quite an effort to get to this stage and the modifications were not all that intuitive. There must be a simpler way of achieving our goals.</p>

<p>The problem lies in not choosing language features that provide the right level of abstraction. The student programmer has chosen <code>cstring</code>s (as provided by the original C programming language) to represent ordinary textual strings. While these can be used, they do have disadvantages. One problem is that they do not behave quite as the novice may expect. For example, I am sure the student was under the impression that the strings themselves were being stored in the map. This illusion may have been reinforced by naming the first deftype <code>cstring</code>. The name implying, perhaps, that strings are being stored, whereas the deftype actually refers to a pointer to <code>char</code>s. Also, the student included the unnecessary library header <code>&lt;string&gt;</code> thus giving the impression that strings were being manipulated.</p>

<p>The standard library provides a string class (<code>std::string</code>) that is designed to be intuitive and to present few or no problems to the user. It is this class that should be used in place of <code>cstring</code>s. The <code>std::string</code> class is a more appropriate level of abstraction for this application (and just about all others).</p>

<p>Rewriting the program to use <code>std::string</code>s gives the following.</p>

<pre class="programlisting">
  #include &lt;iostream&gt;
  #include &lt;string&gt;
  #include &lt;unordered_map&gt;

  typedef std::unordered_map&lt;std::string,
    std::string&gt; AddressMap;
  class Addresses
  {
    public:
    void add(std::string name, std::string addr)
    {
      addresses[name] = addr;
    }
    std::string get(std::string name)
    {
      return addresses.find(name) ==
        addresses.end() ? &quot;&quot; : name;
    }
    private:
    AddressMap addresses;
  };

  int main()
  {
    Addresses addr;
    addr.add(&quot;roger&quot;,
      &quot;rogero@howzatt.demon.co.uk&quot;);
    addr.add(&quot;chris&quot;, &quot;chris@gmail.com&quot;);
    std::string myadd = addr.get(&quot;roger&quot;);

    if (myadd == &quot;&quot;)
    {
      std::cout &lt;&lt; &quot;Didn't find details&quot; 
        &lt;&lt; std::endl;
      return 1;
    }
    std::cout &lt;&lt; &quot;Found &quot; &lt;&lt; myadd &lt;&lt; std::endl;
  }</pre>

<p>Making the interface of <code>Addresses</code> as similar as possible to the original has resulted in slightly awkward code in the <code>get</code> function. The unordered mapâ€™s <code>find</code> function returns an iterator that is used to determine whether the string was found. If the iterator has a value of â€˜one passed the endâ€™(indicating that the string was not found), a null string is returned from <code>get</code>; otherwise the located string is returned. In the <code>main</code> program, the string returned from <code>get</code> is compared with a null string to determine whether the email details were found. Despite this, the software is easy to understand and behaves as expected regardless of compiler optimisations; something that could not be said for the original version.</p>

<h3>Paul Floyd &lt;paulf@free.fr&gt;</h3>

<p>The non-answer which fixes the warning is to change the <code>cstring typedef</code> to use <code>const char*</code>.</p>

<p>The fundamental issue is that the <code>std::unordered_map</code> does not have a hash function that hashes the string content of pointers to character arrays (C strings if you prefer). It will hash pointers and <code>std::string</code>s. So in this case, it is the string pointer that is being hashed. Assuming that itâ€™s the C string contents that you want to use as the key, then this will work as long as there is a 1:1 relationship between the pointer and the string. If string literals (as in this case) are de-duplicated, then it will work. This is not guaranteed, resulting in implementation defined behaviour. I did try this with a few compilers (and debug/optimized modes), and it worked in all cases.</p>

<p>To make this work reliably, use a <code>std::string</code>, which does have a hash function defined for it, e.g.,</p>

<pre class="programlisting">
  typedef std::unordered_map&lt;
    std::string, std::string&gt; AddressMap;</pre>

<p>I donâ€™t think that the <code>add</code> function serves much purpose.</p>

<p>The <code>get</code> function does avoid inserting/returning a default element as <code>operator[]</code> would do. However, it could also be modified to avoid two lookups:</p>

<pre class="programlisting">
  std::string get(const std::string&amp; name)
  {
    AddressMap::const_iterator citer =
       addresses.find(name);
    if (citer != addresses.end())
    {
      return citer-&gt;second;
    }
    else
    {
       return std::string();
    }
  }</pre>

<p>Incidentally, I donâ€™t like inline bodies in class definitions. I always try to put them outside as â€˜inlineâ€™.</p>

<p>It isnâ€™t clear to me whether empty strings should be allowed in the map. If so, this would cause problems as here an empty string is being used to test whether the key was found in the map. Here the <code>add</code> function could be useful, by not allowing empty strings to be added.</p>

<p>Testing wise, at least one test that is expected to fail should be added.</p>

<h2>Commentary</h2>

<p>This problem amused me because the code worked because of string pooling â€“ this is a well known problem with Java code (where using <code>==</code> on strings has the same sort of issues as in this case) but less common in C/C++!</p>

<p>I donâ€™t think there is much left to add to the comments made in the critiques above. I was interested to notice that after, converting the solution to use <code>std::string</code>, two of the entries check for failure of <code>get()</code> by using <code>if (value == &quot;&quot;)</code> â€“ I feel in this case it is more readable to use <code>if (value.empty())</code> but â€˜your mileage may varyâ€™.</p>

<p>I was a little surprised that none of the entries added any input validation to the <code>add</code> method â€“ in particular it would seem sensible to prevent adding the â€˜not foundâ€™ value!</p>

<h2>The winner of CC92</h2>

<p>There were some good critiques in this batch and it wasnâ€™t easy to decide which one was best. Nearly everyone explained about the need for adding <code>const</code> to the <code>typedef</code>, but then explained that this wouldnâ€™t solve the underlying issue. I particularly liked Tomâ€™s well-placed reminder that the â€˜standard containers are designed to hold values not pointersâ€™ as this gives the programmer a higher level explanation of what the issue is.</p>

<p>Jimâ€™s approach of using a <code>shared_ptr</code> was quite elegant, and nicely solves the problem of reporting failure from <code>get</code>, but I fear this makes the solution harder to use. Is in this case an empty email address seems a perfectly good way to indicate â€˜not foundâ€™. I liked Simonâ€™s approach to handling failure in the <code>get</code> function by using the <code>at</code> method in <code>std::unordered_map</code>, this also allowed him to support <code>const</code> correctness which was something other solutions didnâ€™t explain (even when <code>const</code> was added to a signature.)</p>

<p>Overall I felt Simonâ€™s solution was the most complete (although it did omit explaining the compiler warning) and so I have awarded him the prize against fairly stiff competition. Thanks to all the entrants!</p>

<h2>Code Critique 93</h2>

<p>(Submissions to scc@accu.org by June 1st)</p>

<p class="blockquote">Iâ€™m trying to write a simple program to demonstrate the Mandelbrot set and Iâ€™m hoping to get example output like this</p>

<pre class="programlisting">
                                      *
                                    ****
                                    ****
                                    ****

                              *  **********
                             ******************
                              *****************
                             *****************
                            *******************
                           **********************
                           *********************
                 * **     **********************
                 *******  **********************
                ********* **********************
                ********* **********************
              * ********* *********************
 *********************************************
              * ********* *********************
                ********* **********************
                ********* **********************
                 *******  **********************
                 * **     **********************
                           *********************
                           **********************
                            *******************
                             *****************
                              *****************
                             ******************
                              *  **********

                                    ****
                                    ****
                                    ****
                                      *
</pre>

<p class="blockquote">but I just get 6 lines of stars and the rest blank. Can you help me get this program working?&quot;</p>

&lt;IMAGE xml:link=&quot;simple&quot; href=&quot;CCC-1.gif&quot; show=&quot;embed&quot; actuate=&quot;auto&quot;/&gt;

<p>Can you find out what is wrong and help this programmer to fix (and improve) their simple test program? The code is in Listing 2.</p>

<p>You can also get the current problem from the accu-general mail list (next entry is posted around the last issueâ€™s deadline) or from the ACCU website (http://www.accu.org/journals/). This particularly helps overseas members who typically get the magazine much later than members in the UK and Europe.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#include &lt;array&gt;
#include &lt;complex&gt;
#include &lt;iostream&gt;

using row = std::array&lt;bool, 60&gt;;
using matrix = std::array&lt;row, 40&gt;;

std::ostream&amp; operator&lt;&lt;(std::ostream &amp;os, row const &amp;rhs)
{
  for (auto const &amp;v : rhs)
    os &lt;&lt; &quot;* &quot;[v];
  return os;
}
std::ostream&amp; operator&lt;&lt;(std::ostream &amp;os, matrix const &amp;rhs)
{
  for (auto const &amp;v : rhs)
    os &lt;&lt; v &lt;&lt; '\n';
  return os;  
}

class Mandelbrot
{
  matrix data = {};
  std::complex&lt;double&gt; origin = (-0.5);
  double scale = 20.0;
public:
  matrix const &amp; operator()()
  {
    for (int y(0); y != data.size(); ++y)
    {
      for (int x(0); x != data[y].size(); ++x)
      {
        std::complex&lt;double&gt;
          c = (xcoord(x), ycoord(y)), z = c;
        int k = 0;
        for (; k &lt; 200; ++k)
        {
          z = z*z + c;
          if (abs(z) &gt;= 2)
          {
            data[y][x] = true;
            break;
          }
        }
      }
    }
    return data;
  }
private:
  double xcoord(int xpos)
  {
     return origin.real() + (xpos -
       data[0].size()/2) / scale;
  }
  double ycoord(int ypos)
  {
     return origin.imag() + (data.size()/2 -
       ypos) / scale;
  }
};
int main()
{
  Mandelbrot set;
  std::cout &lt;&lt; set() &lt;&lt; std::endl;
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 2</td>
	</tr>
</table>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
