Journal Articles

CVu Journal Vol 26, #6 - January 2015
Browse in : All > Journals > CVu > 266 (9)

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Code Critique Competition 91

Author: Martin Moene

Date: 02 January 2015 21:27:18 +00:00 or Fri, 02 January 2015 21:27:18 +00:00

Summary: Set and collated by Roger Orr. A book prize is awarded for the best entry.

Body: 

Participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.

Note: we are investigating putting code critique articles online: if you would rather not have your critique visible please inform me. (We will remove email addresses!)

Last issue’s code

I’m trying to instrument some code to find out how many iterator operations are done by some algorithms I use that operate on vectors. I’ve got some code that worked with C++03 and I’m trying to get it working with the new C++11 features. Mostly the C++11 code is just nicer but I can’t get my wrapper vector to work with the new style for loop. I’ve stripped out the instrumentation for other methods in this example code so it only counts the increment operations and when I run the two simple examples below I get this output:

     C: >example03.exe
     Increments: 3

     C: >example11.exe
     Increments: 0

I expected the same output from both examples as all I’ve done is re-write the for loop using the new style for syntax.’

Can you find out why it doesn’t work as expected and suggest some ways to improve (or replace) the mechanism being used?

The code is in Listing 1.

// -- wrapped_vector.hxx --
#include <vector>

template<typename T>
struct wrapped_vector : std::vector<T>
{
  typedef typename std::vector<T> vector;
  typedef typename vector::size_type
    size_type;
  // Sort the constructors
  #if __cplusplus >= 201103
    using vector::vector; // Nice :-)

  #else
    // legacy: need to spell them out ...
    wrapped_vector() {}
    explicit wrapped_vector(size_type n,
      const T& value = T())
    : vector(n, value) {}
    template <class InputIterator>
    wrapped_vector(InputIterator first,
      InputIterator last)
    : vector(first, last) {}
    // ...
  #endif // C++11
  // print the stats
  static void dump();
  // instrumented iterator
  struct iterator : vector::iterator
  {
    typedef typename vector::iterator base;
    iterator() {}
    iterator(base it) : base(it) {}
    iterator& operator ++();
    // (other instrumented methods removed)
    static int increments;
  };
  // const_iterator (unused so removed)
};
#include "wrapped_vector.inl"
// -- wrapped_vector.inl --
#include <iostream>
template <typename T>
void wrapped_vector<T>::dump()
{
  std::cout
    << "Increments: "
    << iterator::increments
    << std::endl;
}
template <typename T>
typename wrapped_vector<T>::iterator& wrapped_vector<T>::iterator::operator ++()
{
  base::operator ++();
  ++increments;
  return *this;
}
template <typename T>
int wrapped_vector<T>::iterator::increments;
// -- example03.cpp –
// (also works with C++11)
#include <cassert>
#include "wrapped_vector.hxx"
int main()
{
  wrapped_vector<int> iVec;
  iVec.push_back(1);
  iVec.push_back(0);
  iVec.push_back(2);
  int total(0);
  for (wrapped_vector<int>::iterator
    iter(iVec.begin()), past(iVec.end());
    iter != past; ++iter)
  {
    total += *iter;
  }
  assert(total == 3);
  wrapped_vector<int>::dump();
}
// -- example11.cpp –- (C++11 example)
#include <cassert>
#include "wrapped_vector.hxx"
int main()
{
  wrapped_vector<int> iVec;
  iVec.push_back(1);
  iVec.push_back(0);
  iVec.push_back(2);
  int total(0);
  for (auto & p : iVec)
  {
     total += p;
  }
  assert(total == 3);
  wrapped_vector<int>::dump();
}
			
Listing 1

Critiques

Alex Paterson

The problem area is around analysis of algorithm performance, with this case specifically looking at iterator usage. Analysis of code in this manner often proves tricky, but can provide vital insights into how efficient our code actually is rather than just how efficient we think it is.

The C++03 and C++11 code specimens have one major difference that is the root of the problem; the type of iterator used to iterate over the collection is different – one of them using the instrumented iterator and the other is not. This problem is introduced by the different method of iterating over the collection in the C++11 code. The C++03 method of declaring and incrementing an iterator is replaced with the range-based for loop.

The range-based for statement (RBF) is type of statement that can be referred to as ‘syntactic sugar’; it is a construct that performs some task that is already possible using other language features, but does so with simpler and/or a more succinct syntax. In other words, the RBF does not add any new functionality to C++, but it reduces the amount or code we have to write (and thus read and maintain). The C++ standard specifies how RBF usage translates into the more long-winded syntax, encouraging compiler writers to add support for the new syntax by implementing it in terms of existing functionality. If I ignore usage for array types, the standard states that the compiler will use argument-dependent lookup to try and find suitable begin() and end() functions for the given type; if all goes well, the type of iterator used in the RBF loop will be the type returned from begin().

  class X { /* … */ };
  vector<X> container_of_x;

  // C++11 Range Based For
  for (auto& x : container_of_x)
  { /* do something with x */ }

  //Long Winded Equivalent
  for (vector<X>::iterator it =
    begin(container_of_x);
    it != end(container_of_x); ++it)
  {
    X& x = *it;
    /* do something with x */
  }

The C++03 code compiles and produces the expected output when compiled against either the C++03 or C++11 standard, which reinforces the suspicion that the two problem code specimens are not equivalent. Making a small change in the C++03 code specimen to use auto as the iterator type reveals the problem behaviour.

  //C++03 Code
  for (wrapped_vector<int>::iterator
    iter(iVec.begin()), past(iVec.end());
    iter != past; ++iter)
  {
    total += *iter;
  }
  // Output: Increments: 3

  //C++03 Code Modified
  for (auto iter(iVec.begin()),
      past(iVec.end());
      iter != past; ++iter)
    {
      total += *iter;
    }
  // Output: Increments: 0

The conclusion from the output is that the loop using auto is not using the instrumented iterator. In case this is not obvious, we can stream out the type of iterator using typeid(iter).name(), which shows different type names. [Editor: it may show different type names – but the usefulness of the output from name() varies widely.]

In both cases iVec.begin() is used to initialise the iterator (which returns std::vector<T>::iterator), but in the first loop this is wrapped in the instrumented iterator (of type wrapped_vector<T>::iterator). In the second loop, the wrapper is not present, so the instrumentation code never gets run. Additionally, a better name for the iterator type in wrapper_vector could have perhaps made it easier to spot this issue; as the code stands, there are two definitions of iterator in wrapped_vector<T>, one at the wrapped_vector<T> scope and another one at the std::vector<T> scope.

This explains the behaviour we are seeing, but the question now is how can we change things so that we can still use the pretty new syntax and be able to determine how many iterator increments occur?

Inheriting from std::vector in this manner is not recommended (see stackoverflow – http://stackoverflow.com/questions/ 6806173/subclass-inherit-standard-containers, http://stackoverflow.com/ questions/4353203/thou-shalt-not-inherit-from-stdvector). Using private inheritance in the original code improves the situation [slightly] and reveals that three functions are being used from the base class: begin(), end() and push_back(). using push_back; can be used to pull the private base class declaration into public visibility, however we need to write our own implementations of begin() and end() in order to provide an instrumented iterator using our wrapper class, as follows (remembering that at the wrapped_vector<T> scope, the type ‘iterator’ is our instrumented iterator);

  template<typename T>
  struct wrapped_vector : private std::vector<T>
  {
  //…
    using std::vector<T>::push_back;
    iterator begin()
    { return iterator(std::vector<T>::begin()); }

    iterator end()
    { return iterator(std::vector<T>::end()); }
    //…
  };

This gives us the expected output for our C++11 loop, however it is still inheriting from std::vector so is far from ideal. Extending std::vector in this manner is not recommend, even if it does yield the right results. The standard library collection classes are not designed to be extended. std::vector<T> does not have a virtual destructor and we therefore run the risk of the instrumented vector’s destructor not getting run.

This would lead to a resource leak if our vector class had any non-static member variables. E.g.

  std:vector<int>* my_naked_vector_ptr
    = new wrapped_vector<int>;
  delete my_naked_vector_ptr; // <-- risk of
                              // resource leak!

So if inheritance is not an option, then how can we analyse how efficient our algorithms are? Here are two options that I have used: alter the behaviour of the container or alter the behaviour of the contained objects. To alter the behaviour of the container, we can provide a custom allocator to track allocations of the contained objects. To alter the behaviour of the contained objects, we can substitute a test object that can track events like construction, destruction, assignment and comparison.

I’ve found that using a test object is the preferred technique as it provides far more opportunities to examine different behaviours, but there are cases where it is not desirable/possible to change the type of the contained objects, so altering the container is still a useful technique to know.

Altering the container

The template parameters of the standard containers provide the opportunity to specify not only the type of object that is contained, but also how they are allocated.

We implement an instrumented allocator by placing our counter increment code before we delegate the allocate and construct calls to their default implementations.

  template<typename T> struct wrapped_allocator
    : std::allocator<T>
  {
    typedef T* pointer;
    pointer allocate(
      typename std::allocator<T>::size_type n,
      std::allocator<void>::const_pointer hint
         = 0)
    {
      allocations++;
      return std::allocator<T>::allocate(n,
        hint);
    }
    template< class U, class... Args >
    void construct( U* p, Args&&... args )
    {
      constructions++;
      return std::allocator<T>::construct(p,
        std::forward<Args>(args)...);
    }
    static void reset_counters()
    { allocations = constructions = 0; }
    static void dump_counters()
    {
      std::cout
        << "allocations: " << allocations 
        << ", constructions: " << constructions
        << std::endl;
    }
    static int allocations;
    static int constructions;
  };
  template<typename T> int
    wrapped_allocator<T>::allocations;
  template<typename T> int
    wrapped_allocator<T>::constructions;
  int main()
  {
    std::vector<int, wrapped_allocator<int>> 
      v{5,4,3,2,1};
    wrapped_allocator<int>::reset_counters();
    std::sort(begin(v), end(v));
    wrapped_allocator<int>::dump_counters();
  }

Output: allocations: 0, constructions: 0

Oh, so maybe not so useful in this case, but what about comparisons? For this we need to move to our own object where we can define our own comparison operator.

Using a test object

A test object provides a way to examine the behaviour of our algorithm by instrumenting various functions. I usually add instrumentation to the constructors, assignment operator and inequality operators, but of course what gets instrumented depends on what we are trying to find out.

Combining these typical traces together, we can get a pretty good picture of what the algorithm is doing in terms of object creation and comparison.

  class CTestObject
  {
  public:
    CTestObject(int n) :
      m_payload(n)
    { constructions++; }
    CTestObject(const CTestObject& src) :
      m_payload(src.m_payload)
    { constructions++; }
    CTestObject& operator=(
      const CTestObject& rhs)
    {
      assignments++;
      m_payload = rhs.m_payload;
      return *this;
    }
  friend bool operator<(const CTestObject& lhs,
    const CTestObject& rhs)
  {
    CTestObject::comparisons++;
    return lhs.m_payload < rhs.m_payload;
  }
  friend bool operator==(const CTestObject& lhs,
    const CTestObject& rhs)
  {
    CTestObject::comparisons++;
    return lhs.m_payload == rhs.m_payload;
  }
  static void reset_counters()
  {
    assignments = constructions = comparisons=0;
  }
  static void dump_counters()
  {
    std::cout << "CTestObject, assignments: " 
      << assignments << ", constructions: "
      << constructions << ", comparisons: "
      << comparisons << std::endl;
  }
  private:
    int m_payload;
    static int assignments;
    static int constructions;
    static int comparisons;
  };
  int CTestObject::assignments;
  int CTestObject::constructions;
  int CTestObject::comparisons;
  int main()
  {
    std::vector<CTestObject> v{5,4,3,2,1};
    CTestObject::reset_counters();
    std::sort(begin(v), end(v));
    CTestObject::dump_counters();
  }

Output: CTestObject, assignments: 16, constructions: 8, comparisons: 9

Of course, the number of assignments, constructions or comparisons is not necessarily the be-all and end-all of whether you have a good algorithm. It may be more important in certain cases to ensure that your algorithm is cache-friendly, which may mean extra allocations or a significant increase in the number of comparisons, but it may ultimately be faster.

James Holland

I may have the wrong end of the stick, but I do not think there is much wrong with the author’s code. It does exactly what is required. Specifically, it counts and displays the number of iterator increment operations. The confusing thing, perhaps, is that the C++11 style range-based for loop shown in the code listing does not use iterators. It works directly on the type the vector contains, namely, ints. This is why a count of zero iterator increment operations are displayed when using the C++11 loop.

If the author wants to monitor the operations performed on ints, one way of proceeding is to replace the plain ints contained in the vector with a class with the same behaviour as ints but with additional instrumentation. This will allow the number of operations performed on the instrumented class to be counted. One version of such a class, named My_int is listed below.

  class My_int
  {
    int i;
    static int operator_plus_plus;
    static int operator_plus_equals;
    public:
    My_int(int ii):i(ii){}
    My_int & operator+=(My_int my_int)
    {
      i += my_int.i;
      ++operator_plus_equals;
      return *this;
    }
    My_int & operator++()
    {
      ++i;
      ++operator_plus_plus;
      return *this;
    }
    static void dump_operator_plus_plus()
    {
      std::cout << "operator++(): "
        << operator_plus_plus << std::endl;
    }
    static void dump_operator_plus_equals()
    {
      std::cout << "operator+=(): "
        << operator_plus_equals << std::endl;
    }
  };
  int My_int::operator_plus_plus = 0;
  int My_int::operator_plus_equals = 0;

My_class includes definitions of operator+=() and operator++() that when called increment corresponding static members operator_plus_equals and operator_plus_plus. Other operations could be defined if required.

All the author has to do is replace the vector of ints with a vector of My_ints, execute the algorithms that use the vector and then call My_int member functions dump_operator_plus_plus() and dump_operator_plus_equals() to display the total count of operations.

Incidentally, it is interesting to consider the construction of the C++03 style for loop. The author has declared a variable, named past, and assigned it the value of iVec.end(). The loop then used past, as opposed to iVec.end(), to determine when to terminate. I was wondering what the advantage of this method affords. I can only think it is marginally faster. It would be interesting to hear what others have to say about this construction.

Commentary

This critique shows how hard it can be to derive from a class that is not designed for derivation. In particular, if new methods are added to the base class the assumptions used to originally produce the derived class may fail completely. C++11 added methods to the many of the standard classes, such as vector and its iterators, as well as mechanisms to support the ‘range-base for’ syntax.

The fundamental problem is that none of the methods of vector and the iterator are virtual, and so it is all too easy to call one of the original methods being overloaded. In the presenting case that's what has happened to the iterator type: the fragment for (auto & p : iVec) deduces the type of the underlying iterator as a ‘vanilla’ vector iterator rather than the wrapped iterator type.

In this case the fix is simple: add member functions begin() and end() to the wrapped_vector:

  iterator begin() { return vector::begin(); }
  iterator end() { return vector::end(); }

to ensure – in this case, anyway – that the range-based for loop uses the desired iterator.

Alex’s suggestion of using private inheritance is a good solution to the difficulty as there is then no implicit conversion (outside the wrapped type) to the underlying class, and so missing methods in the wrapped class will normally cause compilation errors rather than silently using methods from the base – as happened here. (This also prevents the dangerous conversion to a base class without a virtual destructor.)

The presenting problem, “I’m trying to instrument some code to find out how many iterator operations are done by some algorithms I use that operate on vectors” begs the question of why – I strongly suspect there’s a performance or scalability problem somewhere.

It is often better to instrument the whole program, or a small test program, using standard tools (you may recall articles about the valgrind set of tools in recent issues of Overload) as otherwise the danger is, as Alex points out, that the measure you are collecting is not the right one.

The winner of CC90

Alex and James both suggest instrumenting the payload rather than the iterator; although this mechanism may not permit exactly the same measurement it does provide a clean way to instrument other operations.

Alex also gave clear explanation of why the code was broken so I am happy to award him the prize for this critique.

CC89 reprise…

Silas Brown wrote in, following his entry in CVu 26.5, to say:

In Code Critique 89 I incorrectly stated that all cards with odd value will also have colour set to red, whereas all cards with even value will also have colour set to black. This is not the case: as pointed out by Paul Floyd in his critique on the same page, the single loop works because 4 and 13 are relatively prime. The value is taken modulus 13, so the iterations are out of phase and eventually all combinations will be assigned. We can confirm that simply enough by typing the following into a Python prompt:

      sorted((i%4,i%13) for i in range(52)) == \
      [(x,y) for x in range(4) for y in range(13)]

and Python will reply:

      True

Silly me didn’t see it: that’s what sometimes happens when you think about code without a compiler on hand to actually try it. However, the rest of what I said still stands, including the part about doing it with multiple loops being much clearer. If I can misunderstand the code in this way after more than 20 years of programming, then your colleagues might misunderstand it as well. The only disadvantage of having multiple loops that I can see is that more CPU registers would be required, but that’s only an issue nowadays if you’re programming a microcontroller. And if that’s what you’re doing, taking modulus 13 on every step might turn out to cost more CPU resources than the nested loops would anyway.

Code Critique 91

(Submissions to scc@accu.org by February 1st)

I’m trying to migrate my skill set from C to C++ so thought I’d get started with a simple program to fill in a set of strings and print them. But I’m getting a compilation problem on the call to std::copy that makes no sense to me, although I thought I’d copied it from some working code on the Internet. Can you help me get it to compile?

Can you help this programmer to get past this presenting problem and help them to identify any other issues with their code? The code is in Listing 2.

#include <iostream>
#include <iterator>
#include <set>
// compare *contents* not raw pointers
bool string_less(char const *, char const *);
template <class T, class U>
void test(std::set<T, U> s, T p)
{
  if (s.insert(p).second)
    std::cout << "Added " << p << std::endl;
  else
    std::cout << p << " already present";
}

int main()
{
  std::set<char const *,
    decltype(&string_less)> s(string_less);
  test(s, "A");
  test(s, "B");
  test(s, "AB");
  std::copy(s.begin, s.end,
    std::ostream_iterator<
      char const *>(std::cout , " "));
  return 0;
}
bool string_less(const char *s, const char *t)
{
  while (*s == *t)
  {
    s++;
    t++;
  }
  return(*s < *t);
}
			
Listing 2

You can also get the current problem from the accu-general mail list (next entry is posted around the last issue’s deadline) or from the ACCU website (http://www.accu.org/journals/). This particularly helps overseas members who typically get the magazine much later than members in the UK and Europe.

Notes: 

More fields may be available via dynamicdata ..