Title: C++ Antipatterns

Author: Martin Moene

Date: 03 August 2016 22:15:32 +01:00 or Wed, 03 August 2016 22:15:32 +01:00

Summary: Certain mistakes crop up frequently in C++. Jonathan Wakely offers some pro-tips to help you avoid common errors.

Body:

This article documents some common mistakes that I see again and again in bug reports and requests for help on sites like StackOverflow.

Reading from an istream without checking the result

A very frequently asked question from someone learning C++ looks like this:

I wrote this program but it doesnâ€™t work. It reads from the file correctly but does the calculation wrong.

    #include <iostream>
    #include <fstream>
    
    int calculate(int a, int b, int c)
    {
      // blah blah blah complex calculation
      return a + b + c;
    }

    int main()
    {
      std::ifstream in("input.txt");
      if (!in.is_open())
      {
        std::cerr << "Failed to open file\n";
        return 1;
      }

      int i, j, k;
      in >> i >> j >> k;
      std::cout << calculate(i, j, k);
    }

Why doesnâ€™t the calculation work?

In many, many cases the problem is that the in >> ... statement failed, so the variables contain garbage values and so the inputs to the calculation are garbage.

The program has no way to check the assumption â€˜it reads from the file correctlyâ€™, so attempts to debug the problem are often just based on guesswork.

The solution is simple, but seems to be rarely taught to beginners: always check your I/O operations.

The improved version of the code in Listing 1 only calls calculate(i, j, k) if reading values into all three variables succeeds.

int i, j, k;

if (in >> i >> j >> k)
{
  std::cout << calculate(i, j, k);
}

else
{
  std::cerr << 
    "Failed to read values from the file!\n";
  throw std::runtime_error("Invalid input file");
}

Listing 1

Now if any of the input operations fails you donâ€™t get a garbage result, you get an error that makes the problem clear immediately. You can choose other forms of error handling rather than throwing an exception, the important bit is to check the I/O and not just keep going regardless when something fails.

Recommendation: always check that reading from an istream succeeds.

Locking and unlocking a std::mutex

This is always wrong:

  std::mutex mtx;
  void func()
  {
    mtx.lock();
    // do things
    mtx.unlock();
  }

It should always be done using one of the RAII scoped lock types such as lock_guard or unique_lock e.g.

  std::mutex mtx;
  void func()
  {
    std::lock_guard<std::mutex> lock(mtx);
    // do things
  }

Using a scoped lock is exception-safe, you cannot forget to unlock the mutex if you return early, and it takes fewer lines of code.

Recommendation: always use a scoped lock object to lock and unlock a mutex.

Be careful that you donâ€™t forget to give a scoped lock variable a name! This will compile, but doesnâ€™t do what you expect:

  std::mutex mtx;
  void func()
  {
    std::unique_lock<std::mutex> (mtx); // OOPS!
    // do things, but the mutex is not locked!
  }

This default-constructs a unique_lock object called mtx, which has nothing to do with the global mtx object (the parentheses around (mtx) are redundant and so itâ€™s equivalent to simply std::unique_lock<std::mutex> mtx;).

A similar mistake can happen using braces instead of parentheses:

  std::mutex mtx;
  void func()
  {
    std::unique_lock<std::mutex> {mtx};  // OOPS!
    // do things, but the mutex is not locked!
  }

This does lock the global mutex mtx, but it does so in the constructor of a temporary unique_lock, which immediately goes away and unlocks the mutex again.

Testing for istream.eof() in a loop

A common mistake when using istreams is to try and use eof() to detect when there is no more input:

  while (!in.eof())
  {
    in >> x;
    process(x);
  }

This doesnâ€™t work because the eofbit is only set after you try to read from a stream that has already reached EOF. When all the input has been read, the loop will run again, reading into x will fail, and then process(x) is called even though nothing was read.

The solution is to test whether the read succeeds, instead of testing for EOF:

  while (in >> x)
  {
    process(x);
  }

You should never read from an istream without checking the result anyway, so doing that correctly avoids needing to test for EOF.

Recommendation: test for successful reads instead of testing for EOF.

Inserting into a container of smart pointers with emplace_back(new X)

When appending to a std::vector<std::unique_ptr<X>>, you cannot just say v.push_back(new X), because there is no implicit conversion from X* to std::unique_ptr<X>.

A popular solution is to use v.emplace_back(new X) because that compiles (emplace_back constructs an element in-place from the arguments, and so can use explicit constructors).

However, this is not safe. If the vector is full and needs to reallocate memory, that could fail and throw a bad_alloc exception, in which case the pointer will be lost and will never be deleted.

The safe solution is to create a temporary unique_ptr that takes ownership of the pointer before the vector might try to reallocate:

  v.push_back(std::unique_ptr<X>(new X))

(You could replace push_back with emplace_back but there is no advantage here because the only conversion is explicit anyway, and emplace_back is more typing!)

In C++14 you should just use std::make_unique and itâ€™s a non-issue:

  v.push_back(std::make_unique<X>())

Recommendation: do not prefer emplace_back just because it allows you to call an explicit constructor. There might be a good reason the class designer made the constructor explicit that you should think about and not just take a short cut around it.

(Scott Meyers discusses this point as part of Item 42 in Effective Modern C++.)

Defining â€˜less thanâ€™ and other orderings correctly

When using custom keys in maps and sets, a common mistake is to define a â€˜less thanâ€™ operation as in Listing 2.

struct X
{
  int a;
  int b;
};

inline bool operator<(const X& l, const X& r)
{
  if (l.a < r.a)
    return true;
  if (l.b < r.b)
    return true;
  return false;
}

Listing 2

This operator< does not define a valid ordering. Consider how it behaves for X{1, 2} and X{2, 1}:

  X x1{1, 2};
  X x2{2, 1};
  assert( x1 < x2 );
  assert( x2 < x1 );

The operator< defined above means that x1 is less than x2 but also that x2 is less than x1, which should be impossible!

The problem is that the operator< says that l is less than r if any member of l is less than the corresponding member of r. Thatâ€™s like saying that 20 is less than 11 because when you compare the second digits '0' is less than '1' (or similarly, that the string "20" should be sorted before "11" because the second character '0' is less than the second character '1').

In technical terms the operator< above fails to define a Strict Weak Ordering.

Another way to define a bogus order is:

  inline bool operator<(const X& l, const X& r)
  {
    return l.a < r.a && l.b < r.b;
  }

Where the first example gave the nonsensical result:

  x1 < x2 && x2 < 1

this definition gives the result:

  !(x1 < x2) && !(x1 < x2)

In other words, the two values are considered to be equivalent, and so only one of them could be inserted into a unique associative container such as a std::set. But then if you compare those values to X x3{1, 0}, you find that x1 and x3 are equivalent, but x1 and x2 are not. So depending which of x1 and x2 is in the std::set affects whether or not you can add x3!

An invalid order like the ones above will cause undefined behaviour if it is used where the Standard Library expects a correct order, e.g. in std::sort, std::set, or std::map. Trying to insert the x1 and x2 values above into a std::set<X> will give strange results, maybe even crashing the program, because the invariant that a setâ€™s elements are always sorted correctly is broken if the comparison function is incapable of sorting correctly.

This is discussed further in Effective STL by Scott Meyers, and in the CERT C++ Coding Standard.

A correct implementation of the function would only consider the second member when the first members are equal i.e.

  inline bool operator<(const X& l, const X& r)
  {
    if (l.a < r.a)
      return true;
    if (l.a == r.a && l.b < r.b)
      return true;
    return false;
  }

Since C++11 defining an order correctly is trivial, just use std::tie:

  inline bool operator<(const X& l, const X& r)
  {
    return std::tie(l.a, l.b) < std::tie(r.a, r.b);
  }

This creates a tuple referring to the members of l and a tuple referring to the members of r, then compares the tuples, which does the right thing (only comparing later elements when the earlier ones are equal).

Recommendation: When writing your own less-than operator make sure it defines a Strict Weak Ordering, and write tests to ensure that it doesnâ€™t give impossible results like (a < b && b < a) or (a < a). Prefer using std::tie() to create tuples which can be compared, instead of writing your own error-prone comparison logic.

Consider whether defining a hash function and using either

std::unordered_set
std::unordered_map

would be better anyway than

std::set
std::map

Unordered containers require an equality operator, but thatâ€™s harder to get wrong.

In summary

The only common theme to the items above is that I see these same mistakes made again and again. I hope documenting them here will help you to avoid them in your own code, and in code you review or comment on. If you know of other antipatterns that could be covered let me or the Overload team know about them, so they can be included in a follow-up piece (or write a follow up piece yourself !).

Notes:

More fields may be available via dynamicdata ..