Journal Articles

CVu Journal Vol 30, #1 - March 2018 + ACCU Topics + Student Code Critiques from CVu journal.

Browse in :

All > Journals > CVu > 301 (10)
All > Topics (2)
All > Journal Columns > Code Critique (70)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Code Critique Competition 110

Author: Bob Schmidt

Date: 04 March 2018 00:00:00 +00:00 or Sun, 04 March 2018 00:00:00 +00:00

Summary: Set and collated by Roger Orr. A book prize is awarded for the best entry.

Body:

Please note that participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.

Note: If you would rather not have your critique visible online, please inform me. (Email addresses are not publicly visible.)

Last issueâ€™s code

Thanks to Jason Spencer for the random number generator suggestion leading to this issueâ€™s critique!

Iâ€™m trying to write a very simple dice game where the computer simulates two players each throwing dice. The higher score wins and after a (selectable) number of turns the player whoâ€™s won most times wins the game. (Iâ€™m going to make the game cleverer once itâ€™s working.) But the games always seem to be drawn and I canâ€™t see why. Here is what the program produces:

    dice_game
    Let's play dice
    How many turns? 10
    Drawn!
    How many turns? 8
    Drawn!
    How many turns? ^D

Whatâ€™s going wrong, and how might you help the programmer find the problem? As usual, there may be other suggestions you might make of some other possible things to (re-)consider about their program.

Listing 1 contains zipit.h
Listing 2 contains dice game.cpp

// Class to 'zip' together a pair of iterators
template <typename T>
class zipit : public std::pair<T, T>
{
  zipit &operator+=(std::pair<int,int> const &rhs)
  {
    this->first += rhs.first;
    this->second += rhs.second;
    return *this;
  }

Listing 1

public:
  using std::pair<T, T>::pair;
  zipit &operator+=(int n)
  {
    return *this += std::make_pair(n, n);
  }
  zipit &operator-=(int n)
  {
    return *this += std::make_pair(-n, -n);
  }
  zipit &operator++()
  {
    return *this += 1;
  }
  zipit &operator--()
  {
    return *this += -1;
  }
  auto operator*()
  {
    return std::make_pair(
      *this->first, *this->second);
  }

Listing 2

Critiques

James Holland

In main(), the user has created an object named generator of type randomize and then has sensibly passed it by reference to play(). Within play() the randomize object is then passed to the two generate() functions by value. Passing by value makes a copy of the object and this where problems begin. The randomize objects contain state that is used in a predetermined way to generate the pseudo random numbers. Two randomize objects that start in the same state will produce the same sequence of numbers. The two calls to generate() each make use of identical copies of randomize and so both will produce exactly the same sequence of random numbers. This is why all runs of the dice game end in a draw.

What we would like to happen is for both calls of generate() to use the same randomize object. In this way when one call to generate() has obtained a set of random numbers, the other call to generate() will continue to use the same randomize object and so will obtain a different sequence of numbers.

One way to achieve this is to pass the randomize object to generate() by reference thus ensuring there is only ever one randomize object. This can be done by employing the standard library reference wrapper ref(). This wrapper conveys references to function templates that take the parameters by value. Fortunately, generate() is such a function. The calls to generate() simply become as shown below.

  std::generate(player1.begin(), player1().end(),
    std::ref(generator);
  std::generate(player2.begin(), player2().end(),
    std::ref(generator);

While we are talking about random number generators, it is recommended not to use one of the standard library generators, such as MT19937, on its own but instead to use it with a distribution, also available from the standard library. There are problems associated with dividing the output from a generator by a constant and taking the reminder. Using a distribution will ensure a high quality series of random numbers.

Running dice_game will now probably produce the desired result. I say â€˜probablyâ€™ because the studentâ€™s software contains a serious flaw. The free functions begin() and end() pass their values by value. This means their parameters are copied. The functions begin() and end() construct a zipit object from iterators that point to copies of player1 and player2 arrays. When the functions return, the copies of player1 and player2 will be destroyed. The zipit object will be returned to its caller but the iterators it contains will be invalid as they point to objects that no longer exist. It may be the case that the memory locations where the copies of player1 and player2 existed maintain their values for as long as the program requires them. It is in this way that the program appears to work correctly. The behaviour of software should not be left to chance and so this bug must be corrected. Fortunately, this is a relatively simple matter.

Instead of passing player1 and player2 to begin() and end() by value, they should be passed by reference. In this way no copies are made and the iterators returned in the zipit object will point to the original (and still existing) versions of player1 and player2. The program is now running with a firm footing!

Although the software now works correctly, it seems quite complicated for what it does. Perhaps there is another way of structuring the program that is shorter and easier to understand. The studentâ€™s software involves the creation of two vectors that contain the results of rolling the dice, one for each of the players. There is also quite an elaborate arrangement of operator functions that are required to manipulate iterators that point to the two vectors. Instead of having a pair of vectors, each element of which contains a single value, I propose having a single vector that contains a pair of dice values, one for each player. Iterating through a single vector is much simpler that trying to iterate through two vectors at the same time. There is no danger of getting out of step. I suggest my version of the software is also considerably simpler as none of the code within zipit.h is required. I have kept the interface to Play() the same as in the studentâ€™s code and so main() can remain the same. I have modified the Randomize class to take advantage of the standard libraryâ€™s uniform integer distribution as shown below.

  #include <algorithm>
  #include <functional>
  #include <iostream>
  #include <random>
  #include <vector>
  class Randomize
  {
    public:
    Randomize():d(1, 6){}
    std::pair<int, int> operator()()
    {
      return {d(e), d(e)};
    }
    private:
    std::mt19937 e;
    std::uniform_int_distribution<> d;
  };
  void play(int turns, Randomize & generator)
  {
    std::vector<std::pair<int, int>>
      players(turns);
    std::generate(players.begin(),
      players.end(), std::ref(generator));
    int total{};
    for (const auto & turn : players)
    {
      const auto p1_score = turn.first;
      const auto p2_score = turn.second;
      if (p1_score != p2_score)
      {
        const auto diff = p1_score - p2_score;
        total += diff > 0 ? 1 : -1;
      }
    }
    if (total > 0)
    {
      std::cout << "Player 1 wins\n";
    }
    else if (total < 0)
    {
      std::cout << "Player 2 wins\n";
    }
    else
    {
      std::cout << "Drawn!\n";
    }
  }
  int main()
  {
    Randomize generator;
    int turns;
    std::cout << "Let's play dice\n";
    while (std::cout << "How many turns? ",
      std::cin >> turns)
    {
      play(turns, generator);
    }
  }

Randomize now simulates rolling the dice for both players within one call of operator()(). I felt that copysign() is a slightly strange function to use in this context as it converts its integer parameters to doubles before copying the sign of the second parameter to the first. The resulting double is then converted back to an integer when added to total. The same effect can be simply achieved by a conditional operator as I have used. No conversion to and from double is required.

Paul Floyd

There are two main problems with this code. The first is the cause of the draws, the second results in crashes.

For the cause of the draws, we have a misuse of the interface of std::generate:

  template< class ForwardIt, class Generator >
  void generate( ForwardIt first,
    ForwardIt last, Generator g );

(from http://en.cppreference.com/w/cpp/algorithm/generate)

So in the original code, where the call to this is

  std::generate(player1.begin(),
    player1.end(), generator);
  std::generate(player2.begin(),
    player2.end(), generator);

then the variable generator is passed to std::generate by value (the template deduces a type of randomize).

This is borne out by nm (output slightly reformatted for printing):

  nm -C cc109 | grep generate
  00000000004013d3 W void std::generate
  <__gnu_cxx::__normal_iterator<int*,
   std::vector<int, std::allocator<int>>>,
   randomize>(
    __gnu_cxx::__normal_iterator<int*,
    std::vector<int, std::allocator<int>>>,
    __gnu_cxx::__normal_iterator<int*,
    std::vector<int, std::allocator<int>>>,
    randomize)

This means that the original copy does not get updated by the first call, so the second call to std::generate gets a copy of the same object. The fix for this would be either to make the call to the template function explicit (not nice) or to just use std::ref so that the template will be initialised with a reference type rather than passing by value. Now with a reference the variable generator does have its state modified by the first call.

Hereâ€™s nm again when using std::ref:

  nm -C cc109 | grep generate
  00000000004013ca W void std::generate
  <__gnu_cxx::__normal_iterator<int*,
   std::vector<int, std::allocator<int>>>,
   std::reference_wrapper<randomize>>(
    __gnu_cxx::__normal_iterator<int*, 
    std::vector<int, std::allocator<int>>>,
    __gnu_cxx::__normal_iterator<int*, 
    std::vector<int, std::allocator<int>>>,
    std::reference_wrapper<randomize>)

At first I thought that was the end of the story. But then I ran the code through AddressSanitizer (compiled with clang++ and the -fsanitize=address option), and I saw:

Let's play dice
How many turns? 10
===================================================
==2657==ERROR: AddressSanitizer: heap-use-after-free on address 0x6040000003d0 at pc 0x0001085aeca3 bp 0x7ffee7654010 sp 0x7ffee7654008
READ of size 4 at 0x6040000003d0 thread T0
    #0 0x1085aeca2 in zipit<std::__1::__wrap_iter<int*> >::operator*() .zipit.h:32
    #1 0x1085ad4f0 in play(int, randomize&) dice_game.cpp:30
    #2 0x1085af3e6 in main dice_game.cpp:59
    #3 0x7fff5c710114 in start (libdyld.dylib:x86_64+0x1114)
0x6040000003d0 is located 0 bytes inside of 40-byte region [0x6040000003d0,0x6040000003f8)
freed by thread T0 here:
    #0 0x1086226ab in wrap__ZdlPv (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x646ab)
    #1 0x1085af8f4 in std::__1::__vector_base<int, std::__1::allocator<int> >::~__vector_base() vector:445
    #2 0x1085af4c4 in std::__1::vector<int, std::__1::allocator<int> >::~vector() iterator:1386
    #3 0x1085ae384 in std::__1::vector<int, std::__1::allocator<int> >::~vector() iterator:1386
    #4 0x1085ad0b5 in play(int, randomize&) dice_game.cpp:27
    #5 0x1085af3e6 in main dice_game.cpp:59
    #6 0x7fff5c710114 in start (libdyld.dylib:x86_64+0x1114)

previously allocated by thread T0 here:
    #0 0x1086220ab in wrap__Znwm (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x640ab)
    #1 0x1085b0279 in std::__1::vector<int, std::__1::allocator<int> >::allocate(unsigned long) vector:925
    #2 0x1085b2631 in std::__1::vector<int, std::__1::allocator<int> >::vector(std::__1::vector<int, std::__1::allocator<int> > const&) vector:1200
    #3 0x1085ae35c in std::__1::vector<int, std::__1::allocator<int> >::vector(std::__1::vector<int, std::__1::allocator<int> > const&) vector:1193
    #4 0x1085acfcd in play(int, randomize&) dice_game.cpp:27
    #5 0x1085af3e6 in main dice_game.cpp:59
    #6 0x7fff5c710114 in start (libdyld.dylib:x86_64+0x1114)

So there is an error in the dereference operator for a vector that is created and destroyed on the same line. That smells like a local variable being created and destroyed.

The problem is with the zipit begin() and end() functions.

  template <typename T>
  auto begin(T one, T two)
    -> zipit<typename T::iterator>
  â€¦

This is instantiated here

  for (auto it = begin(player1, player2);

Where player1 and player2 are the vectors of int. So T is a vector of int, and one and two are passed by copy. The local copy gets destroyed at the end of the call, leaving the returned pair of iterators dangling.

A third opinion from nm:

  nm -C cc109 | grep begin
  00000000004015a0 W zipit<std::vector<int,
    std::allocator<int>>::iterator>
  begin<std::vector<int, std::allocator<int>>>(
    std::vector<int, std::allocator<int>>,
    std::vector<int, std::allocator<int>>)
  0000000000401b9c W std::vector<int, 
    std::allocator<int>>::begin() const
  0000000000401356 W std::vector<int, 
    std::allocator<int>>::begin()

The fix for this is to make the begin/end arguments references:

  template <typename T>
  auto begin(T& one, T& two)
    -> zipit<typename T::iterator>
  {
    return {one.begin(), two.begin()};
  }
  template <typename T>
  auto end(T& one, T& two)
    -> zipit<typename T::iterator>
  {
    return {one.end(), two.end()};
  }

OK, now for the minor nits. I get a warning about the unused cout in the while condition. This can be silenced by a cast

  while((void)(std::cout << "How many turns? "),
    std::cin >> turns)
    {
      play(turns, generator);
    }
  }

A bit ugly.

There are no include guards, and the cpp file does not include zipit.h before the other files. This means that users that include zipit.h will have the misfortune of playing â€˜guess the header dependencyâ€™ (or fixing the header).

Lastly, the read of the int for the number of turns does not prevent against reading negative values. This then gets converted to size_t in the vector constructor. If the allocation succeeds then itâ€™s going to be a very long game.

Russel Winder

First and foremost why try to invent a custom zip capability, and risk getting it wrong? Where are the tests? I am sure there is a good criticism to be made of the code in zipit.h, but I shall just circumvent all that and say: use boost::combine to undertake the zip and delete zipit.h.

The original code also has output scattered around. OK so only in two functions, but still, the design principle applies: keep all input and output in one place and have functions return values to represent the result. Doing this enhances testability, so must be a good thing. In this case play should not do output but should return a â€˜who wonâ€™ result. Seems like a good place for an enumerated type. The output of the result then happens in main.

So ignoring the time it took to make those changes and get the code to actually work (because the developer failed to read the documentation correctly), we find it still delivers a draw in all cases. A bit of a WTF moment, but surely one it was intended one would have. So what is the problem? If you print out the value of the two std::vectors of rolls, you find that you always get the same sequence of values. Always. Fairly obvious then the same start state is the case for all calls to the Mersenne Twister generator. It turns out that std::generate seems to clone the generator passed so the player1 generation and the player2 generation are identical since the randomize class creates a new generator for each instance initialised exactly the same. A cure for this is to make the mt instance shared by making it static. Since this is a single threaded application this does not seem to be a problem, even if global shared state is deemed anathema by some.

With this change in place, and resulting in the program shown in Listing 1, the program seems to work as originally intended.

Of course, one has yet again to ask the question whether C++ is the best programming language to use for this application. Let us assume no. So what to choose? Well to investigate this letâ€™s look at Python, D and Rust.

Python, see Listing 2, has a built in zip function. Also the sum function acting over an iterable makes for a much nicer expression of the core part of the algorithm - which is rolling the dice and checking for who wins. Even though Python is a dynamic language at run time, type annotations have been used so that using the mypy command, we can be sure that all the function calls are correctly statically typed. The random numbers are provided via the random package, using a Mersenne Twister algorithm, which is â€˜globalâ€™ so there are no problems about generating the same sequence of random numbers. It would seem that Python makes it harder to make errors in this sort of application.

D, see Listing 3, also has a builtin zip function, so no difficulties on that front. Comparing the C++, Python, and D, one gets the sneaking suspicion that D is the best language for writing this code: less faff. Well except maybe for the random number generator. Rather than using a random number generator creating integers from the set {1, 2, 3, 4, 5, 6} with each integer result assumed to have equal probability, The std.random dice function is used to create a random number generator of the values with an explicit probability. Equal probabilities are used here, but it allows for experimentation with â€˜loaded diceâ€™. Another â€˜featureâ€™ of the code worth noting is the function call chaining approach as opposed to function application as used in C++ and Python. This is blurring the â€˜function applicationâ€™ and â€˜method call on an objectâ€™ that is quite rigid in C++ and Python. It is an aspect of trying to be as declarative as possible in what is an imperative language.

Of course Rust, see Listing 4, is reputedly the language of the moment, set to replace C, and perhaps C++, as the language of first choice for new projects. Immediately obvious is the very different way enumerations work. C++, Python, and D have very similar ways of defining and working with enumerations, Rust takes a very different approach. In particular an enumeration value can carry data of undefined value, just defined type. Rust, like D, emphasises being declarative, hence the creation of pipelines of data manipulation as seen in the D code. Of course Rust tries to go further and mandates the use of monads to create what is arguably a functional programming language (disguising the imperative language). This is seen most clearly in main. The code seems to a bit bigger, but there an appeal that error handling is enforced but in a quite pleasing way; especially highlighting the nice use of enumerations.. After the interesting dice emulation of D, in this Rust code returns to the generate an integer in a range assuming equal probabilities. The random number generator is not global, but it is shared, so no possibility of generating the same sequence for each player.

So is C++ the best language for this application? Probably not, but for something like this using the language you are most â€˜at homeâ€™ with is likely the best choice. Unless you are challenging yourself as a programmer by using a programming language you are not entirely familiar with. Personally I quite favour the D code.

Why is Go not in this list of languages likely to be better than C++? It should be, but I ran out of time to complete that version of the code. Bad project management I admit.

Nota bene: the code presented here does not represent a reasonable formatting of the code, and indeed violates many de facto and de jure code formatting guidelines. However, the journal formatting structure requires badly formatted code to be presented. :-(

PS There is clearly a difference in the languages behaviour here since the different languages produce different frequencies of draws. Hummmâ€¦

[Editor: since the programs all model the same die throwing game then statistically significant differences imply at least one is buggy.]

Listing 1

#include <algorithm>
#include <chrono>
#include <iostream>
#include <numeric>
#include <random>
#include <boost/range/combine.hpp>
#include <boost/tuple/tuple.hpp>
class randomize {
  private:
    static std::mt19937 mt;
  public:
    int operator()() { return mt() % 6 + 1; }
};
std::mt19937 randomize::mt {
  (unsigned long int)std::chrono::system_clock::now()
  .time_since_epoch()
  .count()
};

enum class result {
  draw = 0,
  player1_win = 1,
  player2_win = 2,
};
typedef boost::tuple<int, int> int_pair;
result play(int const turns, randomize & generator) {
  std::vector<int> player1(turns);
  std::vector<int> player2(turns);
  std::generate(player1.begin(), player1.end(), generator);
  std::generate(player2.begin(), player2.end(), generator);
  auto pairs = boost::combine(player1, player2);
  int_pair const t_t = std::accumulate(
    pairs.begin(),
    pairs.end(),
    int_pair{0, 0},
    [](int_pair t, int_pair i) {
      int a, b;
      boost::tie(a, b) = i;
      return int_pair{
      boost::get<0>(t)
        + (a > b ? 1 : (a < b ? -1 : 0)),
                    0};
            });
    int const total = boost::get<0>(t_t);
    return total == 0
      ? result::draw
      : (total > 0 ? result::player1_win 
      : result::player2_win);
}
int main() {
    randomize generator;
    int turns;
    std::cout << "Let's play dice" << std::endl;
    while (
            std::cout << "How many turns? ",
            std::cin >> turns) {
        auto const result = play(turns, generator);
        if (result == result::draw) {
            std::cout << "Drawn!" << std::endl;
        } else {
            std::cout
              << "Player "
              << (int)result
              << " wins"
              << std::endl;
        }
    }
}

Listing 2

#!/usr/bin/env python3
from enum import Enum
import random
from typing import Tuple
class Result(Enum):
    draw = 0
    player1_win = 1
    player2_win = 2
def player1_win(p: Tuple[int, int]) -> int:
    if p[0] > p[1]:
        return 1
    elif p[0] < p[1]:
        return -1
    return 0
def play(turns: int, gen) -> Result:
    player1_wins = sum(
        player1_win(p) for p in zip(
            (gen() for _ in range(turns)),
            (gen() for _ in range(turns))
        ))
    if player1_wins > 0:
        return Result.player1_win
    elif player1_wins < 0:
        return Result.player2_win
    return Result.draw
def main() -> None:
    print("Let's play dice")
    random.seed()
    while True:
        try:
            turns = int(input('How many turns? '))
        except ValueError:
            return
        result = play(
            turns,
            lambda: random.randint(1, 6)
        )
        if result == Result.draw:
            print('Drawn!')
        else:
            print('Player {} wins.'.format(result.value))
if __name__ == '__main__':
    try:
        main()
    except (EOFError, KeyboardInterrupt):
        print()

Listing 3

#!/usr/bin/env rdmd
import std.algorithm: map, sum;
import std.array: array;
import std.conv: to;
import std.random: dice;
import std.range: iota, zip;
import std.stdio: readln, write, writefln, writeln;
import std.string: strip;
enum Result {
    draw = 0,
    player1_win = 1,
    player2_win = 2,
}
Result play(uint turns, uint delegate(uint) generator) {
    auto player1_wins =
      zip(
          iota(turns).map!generator(),
          iota(turns).map!generator())
      .map!"a[0] > a[1] ? 1 : (a[0] < a[1] ? -1 : 0)"
      .sum;
    return player1_wins == 0
      ? Result.draw
      :  player1_wins > 0
         ? Result.player1_win
         : Result.player2_win;
}
void main() {
    writeln("Let's play dice");
    try {
        while (true) {
            write("How many turns? ");
            uint turns = to!uint(readln().strip());
            auto result = play(turns, delegate uint(uint) {
                    return to!uint(
                            dice(0.0,
                                 16.67,
                                 16.67,
                                 16.67,
                                 16.67,
                                 16.67,
                                 16.67));
                });
            if (result == Result.draw) {
                writeln("Drawn!");
            } else {
                writefln("Player %d wins.", result);
            }
        }
    } catch (Exception) {
        writeln();
    }
}

Listing 4

extern crate itertools;
extern crate rand;
use std::io::{self, Write};
use itertools::Itertools;
use rand::distributions::{Range, Sample};
enum Result {
    PlayerWon(u8),
    Draw
}
fn play(
    turns: u32,
    dice_roll: &mut FnMut()->u8) -> Result
{
    let player1_wins: i32 =
        itertools::repeat_call(dice_roll)
        .tuples::<(_, _)>()
        .map(|p|
             if p.0 > p.1 { 1 }
             else if p.0 < p.1 { -1 }
             else { 0 }
        )
        .take(turns as usize)
        .sum();
    if player1_wins != 0 {
        Result::PlayerWon(
            if player1_wins > 0 { 1 }
            else { 2 }
        )
    }
    else { Result::Draw }
}
fn main() {
    println!("Let's play dice");
    let mut dice = Range::new(1u8, 7u8);
    let mut rng = rand::thread_rng();
    let mut buffer = String::new();
    loop {
        print!("How many turns? ");
        io::stdout()
            .flush()
            .expect("Could not flush stdout.");
        buffer.clear();
        match io::stdin().read_line(&mut buffer) {
            Ok(_) => match buffer.trim().parse::<u32>() {
                Ok(turns) => match play(
                    turns,
                    &mut ||{ dice.sample(&mut rng) }
                ) {
                    Result::PlayerWon(p) =>
                        println!("Player {} wins.", p),
                    Result::Draw =>
                        println!("Drawn!"),
                },
                Err(_) => break
            },
            Err(_) => break
        }
    }
}

Jason Spencer

The reason for all games ending in a draw is that the random number generator actually creates a deterministic sequence of numbers, and since the calls to std::generate copy the generator object, that is to say its internal state, the die throws generated for both players are identical. There is, however, another bug with invalidated iterators that potentially corrupts the result calculation, and may cause a crash or non-draw result.

The issues with the code are as follows:

zipit.h

The file requires the following includes: <utility> for std::pair and std::make_pair. Also <iterator> for std::advance, std::begin and std::end, which will be added next. An include guard for the whole file is also a good idea.
class zipit is derived from std::pair<T,T> with public access. Consider making it protected, otherwise a user can access .first and .second and change their values, putting them out of sync. This might be a wanted behaviour, however. I think it breaks encapsulation and the inheritance should be protected.
Consider renaming the template parameter from T to Iter, to express the meaning. T is used so often that it can get confusing.
Two operator* functions (one const, one non-const) arenâ€™t required, only one â€“ the function doesnâ€™t change the zipit instance â€“ although dereferencing may then lead to a change in the pointed to variable, but that doesnâ€™t mean the zipit instance changes (compare const char * ptr and char * const ptr). So only the const version is needed, and the non-const should be removed.
operator* at the moment returns an rvalue, so zipit only models an input iterator. We may want to return an lvalue to model an output iterator. See the discussion and implementation later.
zipit &operator+=(std::pair<int,int> const &rhs) should use std::advance instead of += to advance this->first and this->second. If the underlying iterator is not a random access iterator it may not have an operator+= overload. std::advance will either use operator++ or operator+=, preferring the latter as it has constant time complexity, while the former is linear.
operator-> is indeed required to at least satisfy the requirements of InputIterator (see later for discussion of iterator types).
The begin(T,T) and end(T,T) free functions should both have their arguments passed by non-const reference. Both of the functions extract iterators from the arguments passed and store them in a zipit instance and return that. However, since the arguments passed to these functions are currently temporary copies the iterators are invalid as soon as the function returns. Since the score is later calculated from dice throws stored in memory that has been deallocated the program either crashes or has invalid results.
In begin(T,T) and end(T,T) the use of T::iterator assumes that the typedef exists in T. By using std::begin(one) instead of one.begin() and changing the return type of these functions they can be adapted to also support C-style arrays, thusly (just s/begin/end/g for the end function):

    template <typename T>
    auto begin(T & one, T & two) -> 
    zipit<decltype(std::begin(std::declval<T&>()))>
    {
      return { std::begin(one), std::begin(two) };
    }

dice_game.cpp

In class randomize the mt19937 instance should not be left un-seeded. Either pass the seed as an argument to the constructor or call mt19937::seed in the randomize constructor, otherwise every time the program is executed the numbers returned by generator will be the same (and therefore the game results will be the same for the same number of games played). I propose two constructors for class randomize â€“ a default constructor which will use a source of entropy (some OS or hardware source, but worst case the current time) to seed, and another constructor that takes the seed as an argument. See below for a sample implementation and later for a discussion on seeding.
In randomize::operator() the use of %6+1 to scale the output of mt() may produce a non-uniform distribution. While it is good enough for this use, thereâ€™s a very tiny bias. If mt() were to generate numbers uniformly distributed over the interval [0,7] then taking modulo 6 of these numbers 0 would map to 0, 1=>1, 2=>2, 3=>3, 4=>4, 5=>5, 6=>0, 7=>1, so the probability of getting a 0 or 1 output is twice the probability of the other numbers. In our case, however, mt() will produce numbers in the range 0â€“2³²-1, so the bias is unnoticeable with a 1.4 * ^10-7% higher probability of output die rolls 1,2,3. Even so, prefer using std::uniform_int_ distribution to change the distribution of the mt19937 instance output â€“ it doesnâ€™t have the bias issue and is more descriptive in terms of intent.
The randomize class then would look something like this:
```
    class randomize {
       std::mt19937 rng;
       std::uniform_int_distribution<int> D6;
    public:
       randomize(decltype(rng)::result_type seed)
       : rng(seed), D6(1,6) { }
       randomize ()
       : randomize ( std::random_device()() ) { }
       int operator()() { return D6(rng); }
    };
```
It might be an idea to log the seed value generated by std::random_device, in case the program has to be debugged later.
In play(...) the last argument of both std::generate calls should be std::ref(generator). Since std::generate takes the third argument by copy std::ref should be used to wrap a reference to generator in an std::reference_wrapper instance. As discussed later the random number generator actually produces a deterministic sequence, rather than true unpredictable random numbers. Without the wrapper the first call to std::generate makes a temporary copy of the generator object for use within the function, and all the internal state is copied with it, afterwards in play(...) the internal state of the generator hasnâ€™t changed, so in the second call to std::generate the sequence of numbers is repeated. The result is that both players get the exact same die throws, hence the repeated draws for the headline bug.
In play(...) this is personal choice, but I prefer int total = 0 over int total{} as the expected value is clear.
In play(...) the use of an if and then a copysign call is perhaps a little confusing. The intent is to do a three way comparison. While thereâ€™s a proposal for such a feature in C++ [1] it doesnâ€™t currently exist. There are a number of ways to do this but my preferred approach is a ternary conditional expression operator â€“ it is the most expressive, and simple to execute:
```
    total += (diff<0) ? -1 : ( (diff==0)? 0 : 1 );
```
Here the intent and the result values are explicit (-1,0,1) and appear in order. If diff is more likely to be in one of the states, 0 for example, then the two ternary conditionals could be re-ordered for performance, so the most common case is dealt with first and the second comparison doesnâ€™t have to be executed, but thatâ€™s not the case here.
copysign is presumably from math.h, but the header isnâ€™t included, yet this compiles (g++ 7.3), so Iâ€™m not sure where itâ€™s from. Consider using std::copysign and include cmath if copysign must be used. Otherwise Iâ€™d recommend the ternary operator approach.

In play(...) consider replacing the whole for loop over the zip iterator with std::accumulate:

    auto get_score_from_throws = [](auto p1,
      auto p2)
    {
      auto diff = p1 - p2;
      return (diff<0) ? -1 : ( (diff==0)? 0 : 1 ); 
    };
    total = std::accumulate (
      begin(player1, player2),
      end(player1, player2), 0,
      [&get_score_from_throws](auto sum,
        const auto & it) {
        return sum += 
        get_score_from_throws(it.first,
          it.second);
      }
    );

or use std::inner_product:

		
      total = std::inner_product (
        std::begin(player1), std::end(player1),
        std::begin(player2), 0, std::plus<>(),
        get_score_from_throws );

The accumulate version uses the zip iterator, and the function name is more self explanatory. The inner_product, while confusingly named, achieves the same thing, and doesnâ€™t even need the zip iterator.

Alternatively std::reduce or std::transform_reduce could be used, but neither libstdc++ (the STL used by g++) [2], nor libc++ (the STL used by CLang) [3] have yet to support these C++17 features, so I couldnâ€™t produce sample code.

In play(...) consider making the turns argument unsigned instead of int. You cannot play a negative number of games.
In main(...) turns should be bounds checked

So that should get the right result with the existing program design, but Iâ€™d make some design changes also.

If a zip iterator is still required consider using Boost. ZipIterator instead â€“ it supports the zipping of more than two iterators and is tested and known to work.
zipit::operator== is a potential source of errors. Since there isnâ€™t an explicit comparison operator the inherited one is used. This is typically the correct behaviour, however if the vectors passed to begin(T,T) and end(T,T) are different lengths there will never be a time when the end zipit iterator is reached, since both first and second have to match. Since theyâ€™re incremented together theyâ€™ll never both match when there are different length vectors. Consider tweaking end(T,T) to set .first and .second to the same offset from the beginning of the vectors. The offset should be the length of the shorter vector.

With its current functionality the randomize class isnâ€™t actually needed at all. Assuming no other features are required the following is a drop-in replacement:

    auto generator = std::bind ( 
    std::uniform_int_distribution<unsigned>(1,6),
      std::mt19937(rng_seed) );

For fun, you could also create a loaded die that has double the chance of rolling a six:

		
    auto loadedD6 = std::bind(
      std::discrete_distribution<unsigned>{
        0,1,1,1,1,1,2}, 
      std::mt19937(rng_seed));

Of course, for one player to have the advantage youâ€™d need each player to have different dice types.

With respect to seeding â€“ it might be an idea to make the seed a command line argument, so that you can have reproducible results in testing, something along the lines of:

    void usage(char * execname) { std::cerr <<
      "Usage: " << execname <<
      "[--seed <unsigned integer>]\n"; }
    int main(int argc, char *argv[]) {
      using namespace std::string_literals;
      using rng_t = std::mt19937;
      rng_t::result_type rng_seed =
        std::random_device()();
      if ( argc == 3 ) {
        if("--seed"s != argv[1]) {
          usage(argv[0]); return -1;
        }
        // could throw invalid_exception or 
        // out_of_range
        int seed = std::stol(argv[2]);
        if(seed < 0) throw std::out_of_range(
          "Seed value must be a positive"
          " integer");
        rng_seed = seed;
      }
      else if(argc!=1) {
        usage(argv[0]); return -1;
      }
      std::cout << "RNG seed = " <<
        rng_seed << '\n';
      auto D6 = std::bind(
        std::uniform_int_distribution<unsigned>
        (1,6), rng_t(rng_seed));
    ...

Something to consider is whether we actually need to keep the die throws in memory. No game is dependant on the result of any other, so thereâ€™s no need to store the throws or even the score per round. A loop could make die throws, calculating the score and keeping a running score, without any vectors.
Of course it can be shown that for two N sided fair dice the chance of throwing the same value is 1/N, the chance the first die throws a smaller number is (N-1)/2N, same for the second die. So we could get the game result with the following generator:
```
    const unsigned die_sides = 6;
    auto game_point_generator = std::bind ( \
      std::discrete_distribution<unsigned>{ 1.0, 
    (static_cast<double>(die_sides)-1.0)/2.0, 
    (static_cast<double>(die_sides)-1.0)/2.0 }, \
      std::mt19937(std::random_device()()) \
      );
```
A call to game_point_generator() will generate 0 if the game is a draw, a 1 if player1 wins, and 2 if player 2 wins.
If the die throws do have to be kept in memory then consider a vector of std::pairs to store the results â€“ this way there is no chance of one vector in the zip iterator being smaller than the other. We also don't need the zip iterator then.
zipit could be generalised by using a tuple to allow more than two iterators to be zipped; and thereâ€™s no reason for all iterator types to be the same.
zipit should consider SFINAE/enable_if to enable the operator overloads based on the type of the zipped iterators.
std::iterator_traits<zipit> should be implemented so that zipit can be better integrated into the STL (for example if it happens to model a random access iterator and states so via iterator_traits then std::advance will be optimised).
In the spirit of C++20 zip_it can be implemented with constexpr modifiers.

Points for discussion

Random number generation

Random number generators fall into two categories â€“ deterministic and non-deterministic. Since CPUs are deterministic state machines they cannot, without special hardware or external input, produce true entropy. â€˜Randomâ€™ numbers are therefore typically produced by pseudo-random algorithms that are actually deterministic. If you know the internal state and the algorithm, you can predict the next number exactly. This is how the rand() library call works (thatâ€™s typically a linear congruential generator [4]), as well as mt19937 which is available since C++11. To change the sequence produced every time the program is run it is typical (although perhaps not always correct) to bootstrap the generator from the current time. Since C++11 std::random_device can be used to seed the deterministic generator. std::random_device takes entropy from an implementation dependant source (for example RDRAND on Intel/AMD, or /dev/urandom under linux â€“ both use external events such as interrupts to accumulate entropy) to try and generate non-deterministic random numbers, but the non-determinism is not guaranteed by std::random_device (26.5.6.2 in [5]).

Creating custom iterators

Weâ€™re not so much creating a custom iterator here, as we are an iterator adaptor. And we also cannot assume the underlying iterator type â€“ it may be a forward input, forward output, bidirectional, or random iterator [6]. And then there are the const and reverse variants. But I think the intention of zipit is still to behave like an STL iterator. One thing is for sure though, writing iterators isnâ€™t easy. Thereâ€™s a fair amount of responsibility that goes into writing one.

If we try to model an input iterator then according to [6] zipit is still missing a post-increment operator and an operator->. The post-increment operator isnâ€™t that difficult and can be expressed in terms of the pre-increment operator:

  zipit operator++(int)
  {
    auto temp = *this;
    ++*this;
    return temp;
  }

operator-> makes my head hurt, however. It should allow iter->member ... but how do we do that while managing two iterators?

  zipit<Iter> const * operator->() const
  {
    return this;
  }

works, but then usage is ugly:

  std::vector<std::string> sv1 = {"a","b","c"};
  std::vector<std::string> sv2 = {"c","b","a"};
  auto sv_zip = begin(sv1,sv2);
  std::cout << sv_zip->first->length() << '\n';

and possibly incorrect. Should usage be sv_zip->first.length()? Or sv_zip.first->length()? The latter wonâ€™t call operator-> and is available for free as long as the inheritance from std::pair is public.

Alas itâ€™s hard to know what to return from a zip iterator when pointers or references are involved.

If we want to take it further, and model an output iterator then we must make the iterator dereferenceable as an lvalue (i.e. operator* returns a type convertible to T &). This leads to a similar problem.

Bear in mind also that the iterator weâ€™re storing is not necessarily a pointer, itâ€™s something that behaves like a pointer, but may not be a raw pointer. Case in point â€“ vector<bool>::iterator is a proxy class that accesses the underlying storage through something sufficiently advanced (magic?). Dereferencing it returns a vector<bool>::reference.

Using this for inspiration we can make an attempt at operator* for output iterators:

  // references cannot be stored in std::pair, 
  // so we create a wrapper
  template <typename T> class ref_wrap {
    private:
      T & p;
    public:
      explicit ref_wrap (T & t) : p{t} {}
      operator T & () const noexcept {
        return p; }
      T & operator= (const T & t) const {
        p = t; return p; }
  };
  // convert the iterator type to a reference 
  // type
  template <typename T> struct
    iter_reference_type { typedef typename
    std::iterator_traits<T>::reference type; };
  template <typename T> struct
    iter_reference_type < T * > {
      typedef T & type; };
  // optionally wrap the reference in our 
  // wrapper
  template <typename T> struct
    wrapped_reference_type { typedef T type; };
  template <typename T> struct
    wrapped_reference_type < T & > { typedef 
      ref_wrap< T > type; };
  template <typename Iter> struct
    wrapped_iter_reference_type {
    typedef typename wrapped_reference_type< \
      typename iter_reference_type<Iter>::type \
      >::type type; };
  // a second template parameter is added to 
  // zipit in case the reference deduction 
  // doesn't work for some custom iterator
  template <typename Iter,
    typename wrapped_reference = typename 
  wrapped_iter_reference_type<Iter>::type >
  class zipit : public std::pair<Iter, Iter>
  {
  ...
  public:
    typedef wrapped_reference reference;
    std::pair<reference, reference> operator*()
      const {
      return std::make_pair(
        reference(*this->first), 
        reference(*this->second) );
    }
  ...
  };

Here we have a simple class to wrap a reference, which allows us to store it in an std::pair. std::reference_wrapper cannot be used since it is implicitly convertible from T and any assignment then simply changes the reference, not the referred to value.

The SFINAE is required to extract the true value reference type. T * converts to T &, std::vector<T>::iterator converts to either the proxy object, or if it is a raw pointer to T &. Raw references then get wrapped and packed in a pair. Proxy objects donâ€™t get wrapped, they just get put into a pair.

The same wrapper could be used to implement operator-> but since we have to return a pointer to the std::pair<reference, reference> we might have to allocate the pair object on the heap and return as a unique_ptr. And thatâ€™s really ugly. I want no part of it.

â€¦and of course plenty more features are needed if the iterator is to model one of the other iterator types.

â€¦and donâ€™t forget to specialise std::iterator_traits (deriving from std::iterator was deprecated in C++17).

â€¦and since C++11 the iterators must be swappable through std::swapâ€¦ so specialise that too.

All in all, implementing iterators can lead to bouts of anxiety and heavy drinking. So drop zipit from the code â€“ just use Boost.ZipIterator and accept its semantics, or std::inner_product without a zip iterator, or std::vector<std::pair<int,int>>, or donâ€™t store the dice throws.

References

[1] http://open-std.org/JTC1/SC22/WG21/docs/papers/2017/p0515r0.pdf

[2] https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2017

[3] http://libcxx.llvm.org/cxx1z_status.html

[4] https://en.wikipedia.org/wiki/Linear_congruential_generator

[5] The C++11 Programming Language standard â€“ ISO/IEC 14882:2011(E)

[6] http://www.cplusplus.com/reference/iterator/

Balog Pal

This entry starts with a really good teaser. Dice game that is always drawn? In real life I would ask â€œdid you check your RNG sequence? Iâ€™d bet you either keep generating the same number or the exact same sequence for the two players.â€ Itâ€™s worth a look just to prove my gut right or wrong.

Looking at the code I see some iterator magic that I save for later. Then more magic with the new <random> that Iâ€™m not up to date with. However, in the past I read a lot of problems related to RNG. It reminds me an old story from around the millennium. I had to write an application that used winsock sockets for communication. When it was â€˜readyâ€™ I went to the forums to ask some peersâ€™ opinion. And the people, instead of commenting on the code just pointed me to fetch the â€˜winsock lame listâ€™ and serve myself. As anyone would guess I found a good amount of matchesâ€¦ And random looks like a similar can of worms, maybe someone compiled a similar list? Letâ€™s try a web search for â€˜lame list for rngâ€™. Wow, guess what, first hit is titled â€˜The C++ <random> Lame List â€“ Elbenoâ€™ [1]. It even refers to the ws lame list as the muse. Unfortunately, it lacks the explanations but even just the points can help a lot.

Matching the code against the list I see hits on 6, 7, 11, 14 and we get half-dozen good hints on what to avoid as we try to fix. And especially 14 looks like a killer that may match my first clue. But letâ€™s stash that too, and put the code in a compiler so we can instrument it with some output at interesting points. And while at it, try some static analysis.

In Visual Studio 2017 (15.5.5) default W3 compile gives one warning on total += â€“ conversion double to int. Maybe Wall tells more. Huh, indeed. 900 warnings! Too bad all but 4 come from the standard headers. I wonder if the VS team ever tried the dogfood method that others use to create actually usable stuff. The 3 warnings on our code are int signed/unsigned mismatch that are not very interesting till we not press the bounds. But the int turns that comes directly from user input is just used to size the vectorâ€¦ weâ€™ll try to play with -1 later. <evil grin>

Now letâ€™s see analyze. Weâ€™re warned that begin() and end() can be declared noexcept. Also â€˜The reference argument generator for function play can be marked as constâ€. That in general we treat by adding the const, but for this case it is an indication of a problem: we expect the generator to mutate, if it could be const, we messed up. Must be that missing ref() from point 14.

Lastly, a warning â€œDo not dereference a invalid pointer (lifetimes rule 1). return of += was invalidated by end of scope for allotemp.2â€ that honestly beats me, especially as op -= has identical code without warning.

Clang invoked through the â€˜clang power toolsâ€™ extension also picked that ref should be const [see the clang-tidy check google-runtime-references] and said that in call play() the first arg is not initialized, what is not really true for our case, if we got there, cin reported okay, so it did complete >> properly.

Letâ€™s just run it as is to see those draws. No luck here, we hit an assert: â€œvector iterator not dereferencableâ€ in operator*, debugger shows this->first being null. Wow. While at it, I take a peek into the sequences sitting in player 1: 3 1 3 6, and 2: same. Was I right or was I right? But I really want to see the output, maybe in release build that assert will not stand in the way? Indeed, now the program executes, and I get â€¦ â€˜Player 1 winsâ€™ on all my tries. Letâ€™s hope we didnâ€™t also launch some nukes on another continent.

Originally, I did not want to bother with the code before class randomize at all, but must push back the related rant even further, to at least get UB out of the way. We donâ€™t have too much code before the assert, just the begin() so letâ€™s look at that: guess what, it takes out vectors by value, gets the iterator, stores it, then the copies are destroyed for good leaving us with the invalid iterators. Same story with end(). Letâ€™s add the missing &s and see what happens. Yey, no more assert, and weâ€™re finally drawn for all attempts. And if we use std::ref(generator) in the two generate() calls, then we get various results too.

So, once we have the program working as desired, whatâ€™s left besides to throw it ALL away and create properly? From scratch, as we learnt from the food industry, that if you already mixed the bad things with the good, it is hopeless to separate them later, instead we only add what is good. It does not stop us from reusing what we like from the original source be it idea or a code fragment.

For start we paste the original main(). It looks okay-ish for the original aim. We just need a bounds check on turns before passing it on. Call to play() is fine, just make sure generator is passed by ref. And we will need the random source. Having it on the stack will trigger #11. We might ignore that reasoning it is a single case and we are positive to have enough stack space. Or use some alternative, like making it static or have a dynamic instance with make_unique. This program does not want threads, so we can ignore the related points, but commit them to memory.

Now letâ€™s make a proper random source. While at it, letâ€™s also rename it accordingly, random_source is way better description than randomize. The list approves mt19937 as a good start, I have to look up how to use it. Along with other hints that weâ€™ll need random_device and uniform_int_distribution. The latter on cppreference.com has the code for our case. All together I came up with this:

  class random_source
  {
    static unsigned int get_seed()
    {
      try {
        std::random_device rd;
        return rd();
      }
      catch (std::exception const&)
      {
        return static_cast<unsigned int> (time(
          nullptr));
      }
    }
    std::mt19937 mt{get_seed()};
    std::uniform_int_distribution<> dis{1, 6};
  public:
    int operator()() {
      return static_cast<int>(dis(mt)); }
  };

The description of random_device is pretty vague, most of it is left to the implementation, but it is intended to do the best to summon some entropy. If it canâ€™t and we get the exception the OP should decide what to do, I just dumped in an alternative source that is less lame as the 2nd violin. Time returns unspecified type, but it is numeric and for unsigned anything goes. The other cast in the op() is benign as the number is in the known range. I was thinking to change the return type instead to unsigned, but for the use plain int is the natural thing. Or is it? Maybe byte is more fitting. That can be rearranged later. Moving on to play().

The aim was to do play N rounds, in each make both player throw dice, decide the round outcome and accumulate the result. One way to do that is

  int turn(random_source &generator)
  // returns score for player 1: -1 0 or +1
  {
  // return sign( generator() - generator() );  
  // too bad no stock sign function....
    auto const d1 = generator();
    auto const d2 = generator();
    return d1 < d2 ? -1 : d1 > d2;
  }
 int game(int turns, random_source &generator) 
  // the difference between player 1 and 2's wins
  {
    int total = 0;
    for (int i = 0; i < turns; ++i)
      total += turn(generator);
    return total;
  }

And in the original play() function we just

  auto const total = game(turns, generator);

and print the result. And we are done. For good. No kidding. Fun thing that it now even works with the original main without bound check and prints draw for -1 instead of crash or assert.

Someone may object that we were supposed to criticize or fix zipitâ€¦ Were we actually? It breaks one of my fundamental rule for reviews (and dev process): application programmers are forbidden to write blacklisted stuff, that starts with known timewasters like â€˜logger classâ€™, â€˜string classâ€™ â€˜refcounting pointerâ€™, and also contains â€˜collectionâ€™ and â€˜iteratorâ€™. For the collection I can be convinced in theory if someone has a problem that requires a really new type of collection not covered by either standard or the many libraries we work with. And for iterator if one opens with â€œI have this code full of algorithm use, I needed an iterator for that special collection to drive them.â€ Not the case here. Even for that case Iâ€™d ask whether it is really needed right now or we may wait a little till ranges finally arrive.

If the more advanced version of the program have actual need to keep record of turns, it can be simply done in a pair<int, int> per round, collection of those and running whatever processing with tools that work (WELL!) out of the box.

And those who really need to implement their own, can start with web search for â€˜how to write c++ iteratorâ€™, the first hit [2] is more than promising.

An extra stab at header separation: this simple code does not need a header at all, now, or probably ever. But if some parts are extracted, the header shall be self-containing, have all needed #include lines, not rely on the client.

[Web Editor's Note: Article is continued here.]

Notes:

More fields may be available via dynamicdata ..