Title: Code Critique Competition 93

Author: Martin Moene

Date: 01 May 2015 17:21:12 +01:00 or Fri, 01 May 2015 17:21:12 +01:00

Summary: Set and collated by Roger Orr. A book prize is awarded for the best entry.

Body:

Participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.

Note: If you would rather not have your critique visible online, please inform me. (We will remove email addresses!)

Last issueâ€™s code

Iâ€™m trying to use the new(ish) unordered_map. and my simple test program works with some compilers and options but not with others. I canâ€™t see why it wouldnâ€™t work consistently â€“ are some of the compilers just broken? I do get a compiler warning about conversion from string constant to cstring â€“ is that whatâ€™s wrong?

Can you explain what is wrong and help this programmer to fix (and improve) their simple test program? The code is in Listing 1.

#include <iostream>
#include <string>
#include <unordered_map>
typedef char *cstring;
typedef std::unordered_map<cstring, cstring> AddressMap;
// Hold email addresses
class Addresses
{
public:
  // add addr for name
  void add(cstring name, cstring addr)
  {
    addresses[name] = addr;
  }
  // get addr from name or null if not found
  cstring get(cstring name)
  {
    if (addresses.count(name))
    {
      return addresses[name];
    }
    return 0;
  }
private:
  AddressMap addresses;
};

int main()
{
  Addresses addr;
  addr.add("roger",
    "rogero@howzatt.demon.co.uk");
  addr.add("chris",
    "chris@gmail.com");
  cstring  myadd = addr.get("roger");
  if (myadd == 0)
  {
    std::cout << "Didn't find details"
      << std::endl;
    return 1;
  }
  std::cout << "Found " << myadd << std::endl;
}

Listing 1

Critiques

Tom BjÃ¶rkholm

There are several problems with this code in Code Critique 92.

The main problem is a confusion about values and pointers. The way the code is written it stores pointers (addresses to memory locations) not string values in the unordered map. This is a fundamental problem. The C++ standard containers are designed to hold values not pointers.

More specifically the values stored in the container addresses are the pointers (memory address values) not the strings at those memory locations. This is fundamentally flawed as in any real program those memory locations would probably be used for other data later in the life of the program, causing the program to crash (or exhibit other strange behavior such as incorrect results).

So why does this work at all? Why doesnâ€™t this program crash at once? The reason is that only compile time constant strings are used as data in this example. The strings "roger", "chris", etc. in the main program are all compile time constant strings. These strings are stored by the compiler in a (usually read-only) data segment. Thus, in this example the data on the memory locations used, does not change.

The fact that the compiler is supposed to store these compile-time constant strings in read-only memory is the reason for the compiler warning. Having a non-const pointer to data that is a compile-time constant is not good. The warning can be fixed by changing

<CodeListing> typedef char *cstring;</CodeListing>

<CodeListing> typedef const char *cstring;</CodeListing>

With this change, there are no compiler warnings, but the program is still fundamentally broken as the key used for lookup in the map is the value of a memory address, not the name string.

So why does this produce the expected result on some compilers? Some compilers notice that the identical string "roger" appears both in one of the add() function calls and in the get() function call. The compiler is then free to store the string only once in memory, and use the same address (i.e. pointer value) in both function calls. However, the compiler is also allowed to store both strings "roger" in memory at different addresses. With a compiler that stores the string "roger" at only one memory location, the program produces the expected result. But that is because of a comparison of memory addresses used as keys, not because of a comparison on key strings.

The correct way to fix this program is to use std::string instead of character pointers. Then the values stored in the unordered_map are real string values, and the keys that are compared are the strings (not the memory locations).

The get function can also be improved. As it is written now the lookup is done twice, once to count the number of matching elements (zero or one) and once to find the element to return. (In general it is a good habit to avoid count() and size() if the task can be solved by using empty() or find() instead.)

When std::string is used for the keys and values, it will no longer be possible to indicate â€˜not foundâ€™ as a null pointer. Not found can either be indicated using an empty string, or the get() method could return a struct with a string value and a bool flag.

Personally I would have used a using alias inside the Addresses class instead of a global typedef, but this is a matter of personal style preferences. To keep the program recognizable to the original author, I have opted to not make changes based on personal style preferences. I have opted to let the empty string indicate not found here (as no valid email address can be the empty string). The complete fixed program would then be:

  #include <iostream>
  #include <string>
  #include <unordered_map>
  typedef std::unordered_map<std::string,
          std::string> AddressMap;
  // Hold email addresses
  class Addresses{
  public:
    // add addr for name
    void add(const std::string & name,
            const std::string &addr){
      addresses[name] = addr;
    }
    // get addr from name or null if not found
    std::string get(const std::string & name) const
    {
      const AddressMap::const_iterator it =
          addresses.find(name);
      if (addresses.end() != it) {
          return it->second;
      }
      return "";
    }

  private:
    AddressMap addresses;
  };

  int main() {
    Addresses addr;
    addr.add("roger",
             "rogero@howzatt.demon.co.uk");
    addr.add("chris",
             "chris@gmail.com");
    std::string myadd = addr.get("roger");
    if (myadd == "") {
      std::cout << "Didn't find details"
                  << std::endl;
      return 1;
    }
    std::cout << "Found " << myadd
              << std::endl;
  return 0;
}

Simon Brand

Note: Any standards references are from C++ draft N4296 (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf).

You are coming across an issue because you are attempting to use cstrings as keys.

Why is this an issue? Because cstrings are simply pointers to a contiguous area of memory which is null-terminated. If you attempt to check two cstrings for equality using operator== you are checking if they point to exactly the same area of memory, not if the data they point to is the same.

To see why this causes an issue with unordered_map, we need to know a little about how it is implemented. An unordered_map is an implementation of a hash map (Â§23.2.5.9) (some nice explanations here: http://stackoverflow.com/questions/730620/how-does-a-hash-table-work), so it needs a hash function to convert a key into a bucket index, then a comparison function to resolve collisions where keys hash to the same bucket.

Â§23.5.4 outlines the unordered_map class template, which can optionally take in functors as template arguments to use as the hashing and equality functions. Now we can see that unordered_map uses std::equal_to<Key> as its default equality function and std::hash<Key> as the hashing function. Unless otherwise specialised, equal_to simply invokes operator==, and it has no specialisation for char* (Â§20.9.6). This means that whenever you try and perform a key comparison in the map, you are just comparing the pointers. std::hash is implemented such that if k1 == k2, hash(k1) == hash(k2) (Â§23.2.5.1.4), so the same applies. As such, using char* as the key type in unordered_map means that the keys must be pointing to the same part of memory.

So why does this work for some compilers and not for others? The answer is optimisation. Most compilers will see that you have used the same string literal twice and so will just use the same memory for any instances of that literal. In this example, both times you use "roger", you could be getting pointers to completely different areas of memory, but the compiler optimises and just reuses the same data. Now when you hash or compare your keys, you just so happen to be passing the same pointer in due to a compiler optimisation, so you get the expected result. If you were, for example, reading the string in at runtime, this would fail every time on a conforming compiler.

By now you might be thinking â€œWhy didnâ€™t I notice I was comparing pointers?â€. I think your problem is that you typedefâ€™d char* to cstring. Hiding implementation details like that is often a good idea, but using raw pointers is often error prone, so you just make yourself complacent by trying to forget that youâ€™re using pointers.

In the case of AddressMap, this abstraction is a really good idea because if you want to change the implementation of your class to use ordered maps instead, you just need to change it in one place. A possible improvement to this would be:

  class Addresses
  {
  private:
    using map_type =
      std::unordered_map<cstring, cstring>;
  }

This is called a nested type name (Â§9.9) and is preferable to your version because it doesn't pollute the global namespace. Making it private ensures that you don't leak implementation details. Substituting the typedef for an alias declaration (Â§7.1.3.2) is mostly just for consistency here. This syntax is more powerful as it allows you to use templates in your type alias, which you canâ€™t do with typedefs. Because it provides a superset of functionality, I prefer to use it everywhere Iâ€™d usually use a typedef (Scott Meyers discusses this in Item 9 of his book Effective Modern C++).

Iâ€™ll now outline a couple of ways to fix your code.

Version 1

As a preface to this fix, this is more an example for completeness of how you could use a char* in an unordered_map. As Iâ€™ll show later, you should not do this and use std::string instead.

As stated above, the issue is that unordered_map compares your char*s by their pointer, not by the string they point to. In order to modify this, we can pass a hash and comparison function in to our map as either a template or constructor argument.

  struct AddressHash
  {
    //taken from Stroustrup's The C++
    //Programming Language
    size_t operator()(const char* ptr) const {
      size_t h = 0;
      while (*ptr)
      {
        h = h << 1 ^ *ptr;
        ++ptr;
      }
      return h;
    }
  };

  struct AddressComparison {
    bool operator()(const char* str1,
                    const char* str2) const {
        return strcmp(str1, str2) == 0;
    }
  };

This all looks fine, but because we are using pointers, this will fail if we add a pointer in which is later invalidated. For example:

  void addstuff(Addresses &addr)
  {
    char rog[6] = "roger";
    addr.add(rog,
      "rogero@howzatt.demon.co.uk");
  }

That call to addr.add will decay the char[] to a char*, copy the pointer (not the data) and store it in the map. Then when the function exits, the char[] which holds the string is destroyed, so trying to access it through the pointer we saved is undefined. Oops! We could probably solve this issue by adding a level of abstraction and managing copies to these strings, but therein lies madness. If only there were a class which is like a char* but has sensible value semantics...

Version 2

I know, weâ€™ll use std::string instead!

std::string already has the comparison and hashing functions we need, so simply modifying your interface like so fixes most of our issues:

  typedef  std::unordered_map<std::string,
           std::string> AddressMap;

  class Addresses {
  public:
    void add(const std::string &name,
      const std::string *addr)
    std::string &get(const std::string &name)
  };

Another problem is the use of flag values for the return of Addresses::get. Returning a value which means an error has occurred and assuming that the client will remember to check for it is usually ill-advised. Itâ€™s better to throw an exception so that they need to deal with it unless they want their application to crash and die. Fortunately, unordered_map::at already does this, throwing an out_of_range exception if the key doesnâ€™t exist. Now we can modify our get function like so:

  std::string &get(const std::string &name)
  {
    return addresses.at(name);
  }

You should always ensure that your functions are named in a way which correctly reflects their semantics. The function add is maybe a bit misleading, because you could pass in a key which already has a value and it would be updated rather than adding a new entry. We could call it something like addOrUpdate, but thereâ€™s already a more natural version:

  std::string &operator[] (
    const std::string& key)
  {
    return addresses[key];
  }

Now we can update our code like so:

  int main()
  {
    Addresses addr;
    addr["roger"] =
      "rogero@howzatt.demon.co.uk";
    addr["chris"] = "chris@gmail.com";
    try
    {
      std::string &myadd = addr.get("roger");
      std::cout << "Found " << myadd
        << std::endl;
    }
    catch (const std::out_of_range& oor)
    {
      std::cout << "Didn't find details"
        << std::endl;
      return 1;
    }
  }

We should now check our functions to ensure that they are const-correct, flexible and fast. Our operator[] takes a const std::string&, returns a std::string& and is not marked const. The lack of const is correct, because calling operator[] on an unordered_map adds a new key if one does not exist. Returning a std::string& is correct, because it allows us to update the object held in the map. Taking the argument as a const reference, however, can be improved. Imagine we are populating our map with data parsed from a large external file like this:

  while (stillHaveData())
    addr[getNextKey()] = getNextValue();

The only use for the return value of getNextKey() is to store it in our internal map. It canâ€™t possibly be used anywhere else, because we donâ€™t store a reference or pointer to the data. Unfortunately, because operator[] is taking its argument as a const&, we are going to be copying every single key string into the map. This could get expensive; weâ€™d much rather just reuse the data rather than copying it. To achieve this we can implement perfect forwarding (http://thbecker.net/articles/rvalue_references/section_07.html) on operator[]: if it is passed an lvalue, copy it; if it is passed an rvalue, move it.

  template <typename T>
  std::string& operator[] (T&& key)
  {
    return addresses[std::forward<T>(key)];
  }

This is much more efficient, as std::forward (Â§20.2.4) ensures that the reference type is preserved through our call, so the correct version of std::unordered_map::operator[] is called and the copy is avoided if necessary. We get this efficiency for free with the actual assignment, because std::string provides a move constructor which will be called when we attempt to assign an rvalue.

Now for our get function. It takes a const std:string&, returns a std::string& and is not marked const. We want to be able to check the contents of the map when we have a const Addresses object, but also to use it to update values when itâ€™s non-const, so we need a const and non-const version of the function. Taking the argument by const& is fine here, as we arenâ€™t going to be doing a copy. We could optimize by using std::experimental::string_view, and you should certainly have a look at that class, but you can be forgiven for waiting until itâ€™s actually in the standard. Returning by std::string& is correct as we want to be able to update our value if possible, but it needs to be a const& when using a const Addresses. Now the function looks like:

  const std::string &get(
    const std::string &name) const
  {
    return addresses.at(name);
  }
  std::string &get(const std::string &name)
  {
    return addresses.at(name);
  }

Now the final version of our class:

  class Addresses
  {
  public:
    template <typename T>
    std::string &operator[] (T&& rhs)
    { return addresses[rhs]; }
    const std::string &get(
      const std::string &name) const
    { return addresses.at(name); }
    std::string &get(const std::string &name)
    { return addresses.at(name); }
  private:
    using map_type = std::unordered_map<
       std::string, std::string>;
    map_type addresses;
  };

Jim Segrave

The problem of compiler errors is trivial to fix â€“ string literals have the type char const *, but the add() member function expects arguments of char *. A char const * can only be used with a cast. Changing the typedef of cstring to be typedef char const * will stop the compiler errors and the program will work.

But there are larger problems.

The keys to the map are pointers to char arrays. Two separate character arrays with identical contents will produce two separate entries in the map. For example, a get() for the name "roger" which uses a different character array to hold the text "roger" than the one used for inserting will not find the entry. The key needs to be the actual value, not a pointer to the value. Further the key has to remain valid and unchanged as long as the map exists, which is not a guarantee that a pointer can make.

The values are also stored as pointers to character arrays, which opens new opportunities for failures. What happens if the array passed as a value for the address goes out of scope? You now have an invalid pointer in the map and if you do a lookup and get that pointer returned, any use of it leads to undefined behaviour. What happens if you look up an entry and keep the pointer to the address character array and that array goes out of scope? You now have a dangling pointer which is an invitation to undefined behaviour.

My updated version stores a copy of the text of the name, so the lifetime of the name passed to the add() member function no longer affects the map. Since Iâ€™m copying the text, using a std::string instead of a character pointer adds flexibility â€“ the name can be a string literal, a char array or a std::string.

I changed the map so that the address stored is a shared_ptr to a newly allocated copy of the string originally passed to the add() member function. Now if the string passed to add() is deleted, the contents of the map are still valid. If you use get() to obtain a pointer to an entry in the map, that pointer will remain valid even if the map itself is deleted. Only when the map is deleted and all the pointers to data that was in the map are deleted do all the address strings get removed, but you donâ€™t have to do the bookkeeping to make that happen.

In order to test the map fully, I added a function to exercise the map. It uses an array of structs holding a name and address, so adding additional test cases requires simply inserting a {name, address} braced pair into the array. It inserts all the pairs from the array, then looks up each name in the array to see that the address data is correct. It correctly flags when a name is entered twice with different addresses.

It then looks up a name known not to be in the map to test the not-found behaviour. It also deliberately updates an entry with an address taken from a local variable which will go out of scope when the function returns. In main, that entry is looked up and the address is checked to see that it matches what was in the local variable.

My updated version

  #include <iostream>
  #include <string>
  #include <unordered_map>
  #include <memory>
  using std::string;
  using shp = std::shared_ptr<string>;
  typedef std::unordered_map<string, shp>
    AddressMap;
  // Hold email addresses
  class Addresses
  {
  public:
    // add addr for name
    void add(const string& name,
      const string& addr)
    {
      addresses[name] = 
      std::make_shared<string>(addr); 
    }
  // get shared ptr to addr from name or null 
  // if not found
  shp get(string name)
  {
    if (addresses.count(name))
    {
      return addresses[name];
    }
    return shp(nullptr);
  }

  private:
    AddressMap addresses;
  };

  // pairs of names and addresses for testing
  struct test_val {
    string name;
    string addr;
  };
  test_val tests[] = {
    {"roger", "rogero@howzatt.demon.co.uk"},
    {"chris", "chris@gmail.com"},
    // duplicate name with different address
    {"roger", "rogero@gmail.com"},
  };

  // test insertions and lookups, including a
  // deliberate lookup failure
  // Update an entry with local auto strings for 
  // the name and address
  void test_map(Addresses& a) {
    // remember longest name so we can create a 
    // name known not to be in the map
    string longest = "";
    for(auto test: tests) {

    if(longest.size() < test.name.size()) {
        longest = test.name;
    }
    a.add(test.name, test.addr);
  }
  // check that all the names have been 
  // inserted correctly
  // (the duplicated name will generate an
  // error report)
  for(auto test: tests) {
    shp addr = a.get(test.name);
    if(addr == nullptr) {
      std::cout << "Insert of " << test.name 
                << " failed" << std::endl;
    }

    if(*addr != test.addr) {
      std::cout << "Expected lookup of "
                << test.name
                << " to return " << test.addr 
                << " but returned "
                << *addr << std::endl;
    }
  }
  // update an entry using a local auto
  // variable
  string new_val{"auto_variable"};
  string new_name{"test_auto"};
  a.add(new_name, new_val);

  // generate a name known not to be in the
  // map
  if(a.get(longest + "x") != nullptr) {
    std::cout << "Map reported that "
              << longest << " is in the map, "
                 "although it should not be"
                 << std::endl;
    }
  }

  int main()
  {
    Addresses addr;
    test_map(addr);
    // at this call to addr.get, the string used
    // to set the address is gone
    shp check = addr.get("test_auto");

    if(check == nullptr) {
      std::cout << "Failed to find entry for"
                   " \"test_auto\""
                << std::endl;
    }

    if(*check != "auto_variable") {
      std::cout << "Expected lookup of"
                   " \"test_auto\" to be "
                << "\"auto_variable\""
                << " but got "
                << *check << std::endl;
    }
  }

James Holland

I can understand the programmer being baffled by the program behaving differently when using different compilers. After all, although it may not be perfect, the source code seems reasonably well constructed and should work as expected. So what is going on? Before we dig deeper, letâ€™s get rid of the warning message issued by some compilers.

Literal strings such as "roger" are considered to be constant and, therefore, should be incapable of being modified. The warning is saying that a pointer of type char * is being assigned the address of the first character of a constant string thus allowing the string to be modified by use of the pointer. This makes a mockery of the fact that the string is meant to be constant. Such assignments have been deprecated since C++98 and so compilers should, at least, issue a warning. Sadly, not all do. What is required is a pointer that is incapable of changing what it is pointing to. This is easily achieved, in this case, by adding a const to the declaration of cstring giving typedef const char * cstring. This will keep compilers happy.

Unfortunately, removing the warning has not altered the behaviour of the code. It still produces different results on different compilers. So something else must be causing the problem. It turns out to be a combination of the type of the unordered map key and the way in which some compilers optimise the code. Letâ€™s start with the key type.

From the sample code, it can be seen that Addresses member functions add and get have parameters of type cstring where cstring is a typedef for const char * (the const has just been added as described above). The strings that are passed to the functions are of type const char[n] where n is the length of the string. For example, "roger" is of type const char[6]. This difference in type is permitted because array types decay into pointers as they are assigned to the function parameters. The body of the functions, therefore, deal with pointers to strings.

Also, the unordered map has been defined to have both key and value of type cstring. In other words, the key is a pointer to a string and not the string itself. This is important because the unordered map determines whether an entry already exists by discovering whether any of the stored pointers have the same value as the address of the string passed (in our case) to its operator[] function. The address of a string is simply the location in memory where it is stored. So, the question is at what memory location is a string constant stored. This brings us back to compiler optimisations.

One compiler optimisation technique is called string pooling. This involves placing only one copy of identical string constants in memory instead of having multiple copies of the same string. If this optimisation takes place, p and q in the following code would be equal in value.

  const char * p = "This is a string";
  const char * q = "This is a string";

If this optimisation is not performed, the value of p and q would be different as two separate, but identical, strings would be stored.

This optimisation is critical to the program as to whether strings will be found. If string pooling takes place, the string "roger" used in the add function will have the same address as the identical string used in the get function. The unordered mapâ€™s count function will, therefore, consider the string as found. If string pooling does not take place, the two strings, despite being identical in value, will have different addresses and so a match will not be found.

Incidentally, and ignoring the identified problems for a moment, there is an inefficiency in the get function. Two searches of the unordered map are made; one in the count function and one in the operator[] function. The hash calculation is, therefore, performed twice. Clearly, it would be more efficient if the hash value was calculated once. This can be achieved by using the unordered mapâ€™s find function as follows.

  cstring get(cstring name)
  {
    return addresses.find(name) ==
      addresses.end() ? nullptr : name;
  }

Clearly, it is not acceptable for the behaviour of the program to depend on compiler optimisations. Fortunately, there are a few things that can be done to remedy the situation.

As stated above, the unordered map compares the value of pointers to strings to determine whether strings are identical and (as also explained above) this only works when string pooling takes place. What is really wanted is to compare the strings themselves, not the pointers to the strings. Doing this would result in the software behaving as expected irrespective of whether string pooling was in effect. This can be achieved by providing a user-defined hash function and equivalence criterion. These take the form of function objects that are passed to the constructor of the unordered map.

A suitable function object that defines the hash function is shown below.

  class String_hash
  {
    public:
    std::size_t operator()(cstring s) const
    {
      return std::hash<std::string>()(s);
    }
  };

Providing a good hash function can be difficult so I have cheated somewhat by using the hash function supplied by the standard library. The library provides hash functions for most common types. In this case I have used std::string. The constructor of std::string takes the pointer to a char and creates an std::string object that is passed to std::hash from which the hash code is generated.

A suitable equivalence criterion function object is shown below.

  class String_equal
  {
    public:
    bool operator()(cstring s1, cstring s2)
      const
    {
      return strcmp(s1, s2) == 0;
    }
  };

The strcmp function takes a pointer to each of the strings to be compared and returns a value of zero if the two strings are equal. This value is compared with zero to give a return value of true if the two strings are identical and false otherwise. Note that the strcmp function is made available by adding #include <cstring> to the program.

The two function objects are passed, as template arguments, to the unordered map constructor as shown below.

  typedef std::unordered_map<cstring, cstring,
  String_hash, String_equal> AddressMap;

The program will now work as expected and does not depend on any compiler optimisation techniques. It has, however, been quite an effort to get to this stage and the modifications were not all that intuitive. There must be a simpler way of achieving our goals.

The problem lies in not choosing language features that provide the right level of abstraction. The student programmer has chosen cstrings (as provided by the original C programming language) to represent ordinary textual strings. While these can be used, they do have disadvantages. One problem is that they do not behave quite as the novice may expect. For example, I am sure the student was under the impression that the strings themselves were being stored in the map. This illusion may have been reinforced by naming the first deftype cstring. The name implying, perhaps, that strings are being stored, whereas the deftype actually refers to a pointer to chars. Also, the student included the unnecessary library header <string> thus giving the impression that strings were being manipulated.

The standard library provides a string class (std::string) that is designed to be intuitive and to present few or no problems to the user. It is this class that should be used in place of cstrings. The std::string class is a more appropriate level of abstraction for this application (and just about all others).

Rewriting the program to use std::strings gives the following.

  #include <iostream>
  #include <string>
  #include <unordered_map>

  typedef std::unordered_map<std::string,
    std::string> AddressMap;
  class Addresses
  {
    public:
    void add(std::string name, std::string addr)
    {
      addresses[name] = addr;
    }
    std::string get(std::string name)
    {
      return addresses.find(name) ==
        addresses.end() ? "" : name;
    }
    private:
    AddressMap addresses;
  };

  int main()
  {
    Addresses addr;
    addr.add("roger",
      "rogero@howzatt.demon.co.uk");
    addr.add("chris", "chris@gmail.com");
    std::string myadd = addr.get("roger");

    if (myadd == "")
    {
      std::cout << "Didn't find details" 
        << std::endl;
      return 1;
    }
    std::cout << "Found " << myadd << std::endl;
  }

Making the interface of Addresses as similar as possible to the original has resulted in slightly awkward code in the get function. The unordered mapâ€™s find function returns an iterator that is used to determine whether the string was found. If the iterator has a value of â€˜one passed the endâ€™(indicating that the string was not found), a null string is returned from get; otherwise the located string is returned. In the main program, the string returned from get is compared with a null string to determine whether the email details were found. Despite this, the software is easy to understand and behaves as expected regardless of compiler optimisations; something that could not be said for the original version.

Paul Floyd <paulf@free.fr>

The non-answer which fixes the warning is to change the cstring typedef to use const char*.

The fundamental issue is that the std::unordered_map does not have a hash function that hashes the string content of pointers to character arrays (C strings if you prefer). It will hash pointers and std::strings. So in this case, it is the string pointer that is being hashed. Assuming that itâ€™s the C string contents that you want to use as the key, then this will work as long as there is a 1:1 relationship between the pointer and the string. If string literals (as in this case) are de-duplicated, then it will work. This is not guaranteed, resulting in implementation defined behaviour. I did try this with a few compilers (and debug/optimized modes), and it worked in all cases.

To make this work reliably, use a std::string, which does have a hash function defined for it, e.g.,

  typedef std::unordered_map<
    std::string, std::string> AddressMap;

I donâ€™t think that the add function serves much purpose.

The get function does avoid inserting/returning a default element as operator[] would do. However, it could also be modified to avoid two lookups:

  std::string get(const std::string& name)
  {
    AddressMap::const_iterator citer =
       addresses.find(name);
    if (citer != addresses.end())
    {
      return citer->second;
    }
    else
    {
       return std::string();
    }
  }

Incidentally, I donâ€™t like inline bodies in class definitions. I always try to put them outside as â€˜inlineâ€™.

It isnâ€™t clear to me whether empty strings should be allowed in the map. If so, this would cause problems as here an empty string is being used to test whether the key was found in the map. Here the add function could be useful, by not allowing empty strings to be added.

Testing wise, at least one test that is expected to fail should be added.

Commentary

This problem amused me because the code worked because of string pooling â€“ this is a well known problem with Java code (where using == on strings has the same sort of issues as in this case) but less common in C/C++!

I donâ€™t think there is much left to add to the comments made in the critiques above. I was interested to notice that after, converting the solution to use std::string, two of the entries check for failure of get() by using if (value == "") â€“ I feel in this case it is more readable to use if (value.empty()) but â€˜your mileage may varyâ€™.

I was a little surprised that none of the entries added any input validation to the add method â€“ in particular it would seem sensible to prevent adding the â€˜not foundâ€™ value!

The winner of CC92

There were some good critiques in this batch and it wasnâ€™t easy to decide which one was best. Nearly everyone explained about the need for adding const to the typedef, but then explained that this wouldnâ€™t solve the underlying issue. I particularly liked Tomâ€™s well-placed reminder that the â€˜standard containers are designed to hold values not pointersâ€™ as this gives the programmer a higher level explanation of what the issue is.

Jimâ€™s approach of using a shared_ptr was quite elegant, and nicely solves the problem of reporting failure from get, but I fear this makes the solution harder to use. Is in this case an empty email address seems a perfectly good way to indicate â€˜not foundâ€™. I liked Simonâ€™s approach to handling failure in the get function by using the at method in std::unordered_map, this also allowed him to support const correctness which was something other solutions didnâ€™t explain (even when const was added to a signature.)

Overall I felt Simonâ€™s solution was the most complete (although it did omit explaining the compiler warning) and so I have awarded him the prize against fairly stiff competition. Thanks to all the entrants!

Code Critique 93

(Submissions to scc@accu.org by June 1st)

Iâ€™m trying to write a simple program to demonstrate the Mandelbrot set and Iâ€™m hoping to get example output like this

                                      *
                                    ****
                                    ****
                                    ****

                              *  **********
                             ******************
                              *****************
                             *****************
                            *******************
                           **********************
                           *********************
                 * **     **********************
                 *******  **********************
                ********* **********************
                ********* **********************
              * ********* *********************
 *********************************************
              * ********* *********************
                ********* **********************
                ********* **********************
                 *******  **********************
                 * **     **********************
                           *********************
                           **********************
                            *******************
                             *****************
                              *****************
                             ******************
                              *  **********

                                    ****
                                    ****
                                    ****
                                      *

but I just get 6 lines of stars and the rest blank. Can you help me get this program working?"

Can you find out what is wrong and help this programmer to fix (and improve) their simple test program? The code is in Listing 2.

You can also get the current problem from the accu-general mail list (next entry is posted around the last issueâ€™s deadline) or from the ACCU website (http://www.accu.org/journals/). This particularly helps overseas members who typically get the magazine much later than members in the UK and Europe.

#include <array>
#include <complex>
#include <iostream>

using row = std::array<bool, 60>;
using matrix = std::array<row, 40>;

std::ostream& operator<<(std::ostream &os, row const &rhs)
{
  for (auto const &v : rhs)
    os << "* "[v];
  return os;
}
std::ostream& operator<<(std::ostream &os, matrix const &rhs)
{
  for (auto const &v : rhs)
    os << v << '\n';
  return os;  
}

class Mandelbrot
{
  matrix data = {};
  std::complex<double> origin = (-0.5);
  double scale = 20.0;
public:
  matrix const & operator()()
  {
    for (int y(0); y != data.size(); ++y)
    {
      for (int x(0); x != data[y].size(); ++x)
      {
        std::complex<double>
          c = (xcoord(x), ycoord(y)), z = c;
        int k = 0;
        for (; k < 200; ++k)
        {
          z = z*z + c;
          if (abs(z) >= 2)
          {
            data[y][x] = true;
            break;
          }
        }
      }
    }
    return data;
  }
private:
  double xcoord(int xpos)
  {
     return origin.real() + (xpos -
       data[0].size()/2) / scale;
  }
  double ycoord(int ypos)
  {
     return origin.imag() + (data.size()/2 -
       ypos) / scale;
  }
};
int main()
{
  Mandelbrot set;
  std::cout << set() << std::endl;
}

Listing 2

Notes:

More fields may be available via dynamicdata ..