Title: Code Critique Competition 112
Author: Bob Schmidt
Date: Tue, 03 July 2018 16:27:33 +01:00
Summary: Set and collated by Roger Orr. A book prize is awarded for the best entry.
Body:
Please note that participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.
Note: If you would rather not have your critique visible online, please inform me. (Email addresses are not publicly visible.)
Last issue’s code
I’m playing around with simple increasing sequences of small numbers and seeing what various different generators produce. I had problems that sometimes the result was wrong – for example if I produce a number that overflows the integer size I’m using – and so I added a postcondition check that the output sequence is increasing. However, I’ve found the check doesn’t always ‘fire’ – it works with gcc with or without optimisation but fails with MSVC unless I turn on optimisation. I’ve produced a really simple example that still shows the same problem, can you help me see what’s wrong?
Expected output, showing the postcondition firing:
1,2,3,4,5,6,7,8,9,10,
1,3,6,10,15,21,28,36,45,55,
1,1,2,3,5,8,13,21,34,55,
1,5,14,30,55,91,140,204,285,385,
Postcondition failed!
1,5,14,30,55,91,-116,-52,29,-127,
Can you help (and perhaps identify one or two other issues en route)? The code is in Listing 1. (Unfortunately, the first few lines of fibonacci() didn’t appear in the printed copy of CVu. We apologise for the confusion this caused.)
#include <algorithm>
#include <functional>
#include <iostream>
#include <iterator>
#include <vector>

class postcondition
{
public:
  postcondition(std::function<bool()> check)
  : check_(check) {}
  ~postcondition()
  {
    if (!check_())
      std::cerr << "Postcondition failed!\n";
  }
private:
  std::function<bool()> check_;
};

template<typename T>
std::vector<T> get_values(int n,
  std::function<T(T)> generator)
{
  std::vector<T> v;
  auto is_increasing = [&v]()
    { return is_sorted(v.begin(), v.end()); };
  postcondition _(is_increasing);
  T j = 0;
  for (int i = 0; i != n; ++i)
  {
    j = generator(j);
    v.push_back(j);
  }
  return v;
}

using out = std::ostream_iterator<int>;

template <typename T>
void sequence()
{
  auto const & result = get_values<T>(10,
    [](T i) { return i + 1; });
  std::copy(result.begin(), result.end(),
    out(std::cout, ","));
  std::cout << std::endl;
}

template <typename T>
void triangular()
{
  T i{};
  auto const & result = get_values<T>(10,
    [&i](T j) { return j + ++i; });
  std::copy(result.begin(), result.end(),
    out(std::cout, ","));
  std::cout << std::endl;
}

template <typename T>
void fibonacci()
{
  T i{1};
  auto const & result = get_values<T>(10,
    [&i](T k) { std::swap(i, k); return i + k; });
  std::copy(result.begin(), result.end(),
    out(std::cout, ","));
  std::cout << std::endl;
}

template <typename T>
void sum_squares()
{
  T i{1};
  auto const & result = get_values<T>(10,
    [&i](T j) { return j + i * i++; });
  std::copy(result.begin(), result.end(),
    out(std::cout, ","));
  std::cout << std::endl;
}

int main()
{
  sequence<int>();
  triangular<int>();
  fibonacci<int>();
  sum_squares<int>();
  sum_squares<char>(); // overflow expected
}

Listing 1
Critiques
Balog Pal
“I produce a number that overflows the integer size I’m using – and so I added a postcondition check that the output sequence is increasing.”
Hmm. Signed integer overflow is undefined behaviour. If we have that in the program, postcondition checks will not help – or reliably detect problems. Let’s look at the code and see whether that is indeed the source of the problems.
First, just a fast read from the top making notes of other problems.
The postcondition class has a converting ctor; we probably want an explicit here. It takes the function by value, probably following some outdated wisdom on passing predicates by value. I learned the hard way that std::function is a fat beast, and I pass it around by const ref. Though for this case we make a copy of what is passed, so the preferred way is probably indeed to take it by value, but then move it in instead of creating yet another copy.
Formally we’re violating the rule of three, but the intended form of use will not lead to problems. The industrial version would probably inherit from nocopy and possibly require noexcept for the check function too. I guess the idea of running the check in the dtor is to cover all exit paths automatically instead of at explicit points. That’s fine, but we need to be extra careful about object dependencies, avoiding access in the check function to objects that have already been destructed.
check_(check) is a form I avoid: I use the same name for the member variable and the parameter, as experience shows this is less error prone in general, unless the compiler has a reliable warning for mistakes like writing check_(check_).
get_values again takes the function by value. What’s worse, it takes std::function for no good reason I can imagine. It’s a template; it should just have another template parameter for the type of the passed generator and take the source without repackaging it. It uses the postcondition with a lambda, and the dependency chain looks fine at a glance. We call the generator with the previous aggregate and just store the result in a vector. The check also looks okay; with all the std:: noise around (which I would not have in my code) it is odd to see the naked is_sorted, but it should work via ADL.
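In code, that suggestion might look something like this (a sketch only, keeping the shape of the original listing; deducing the generator type avoids the std::function wrapper and the hidden conversions it performs):

```cpp
#include <vector>

// Sketch: the generator is a deduced template parameter rather than a
// std::function, so there is no type erasure and no repackaging.
template <typename T, typename Gen>
std::vector<T> get_values(int n, Gen generator)
{
  std::vector<T> v;
  T j{};
  for (int i = 0; i != n; ++i)
  {
    j = generator(j);
    v.push_back(j);
  }
  return v;
}
```

Callers still name T explicitly, e.g. get_values&lt;int&gt;(10, [](int i) { return i + 1; }); the generator's type is deduced.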
Then we have the sample sequence functions, with obvious blatant copy-paste for the printout part. That at least should be extracted into a function, but my preferred way would separate the responsibilities: just return the vector with the sequence, and have an outer framework that uses these functions and arranges the printing or whatever other work.
In the implementation I prefer the initial values to look like i = 0; and i = 1; instead of the {1} and especially the {} forms. I also note that the naming of the locals and lambda parameters would fare well in the IOCCC, so we have some deciphering ahead of us to work out what is really going on. It also shows that the interface is weak on design: it would be so much easier if we got both the sequence number and the aggregate as arguments.
auto const & result = get_values is okay; we get lifetime extension on the result coming from the function. But these days I prefer plain auto const, as copy elision is reliable for construction, and for the & case the compiler is actually allowed to make a copy if it feels like it, making the difference moot.
Let’s check the generator lambdas. For sequence, there’s little chance of getting it wrong. In triangular we increment i, which is correctly captured by reference (and explicitly too).
fibonacci deviates from all traditions that would use two values and roll them; instead it struggles with just one. It appears to work through evil trickery, at least for types covered by std::swap. The call should be switched to using std::swap; followed by an unqualified call, to pick up possible user-defined overloads, not just specialisations.
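The idiom reads like this (the function name roll is my own, purely for illustration):

```cpp
#include <utility>

// Two-step swap idiom: bring std::swap into scope as the fallback,
// then call unqualified so ADL can find a user-defined swap overload.
template <typename T>
void roll(T& a, T& b)
{
  using std::swap;
  swap(a, b);
}
```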
sum_squares has i * i++. Wow. We were looking for UB, but not this one. :) I’m skipping the discussion; if anyone is interested, please read the gigabytes we have typed over the last 30 years about i + i++ and similar expressions. This one is not covered even by the recent refinement of the order of evaluation for C++17 (P0145R3), and we have an explicit example in the execution section. But even if ++ were reliably sequenced at i++, the outcome would still be unspecified, and we may generate the wrong number depending on whether the increment happens before or after reading the first i. The fixed form should create the value first and do the increment afterwards.
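For instance, the repaired step could be written as a free function rather than the original lambda (names here are mine, for illustration):

```cpp
// Create the value first, then increment in a separate, fully
// sequenced statement -- no unsequenced read-and-modify of i.
int sum_squares_step(int j, int& i)
{
  const int k = i;
  ++i;
  return j + k * k;
}
```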
Finally, main leaves nothing to note. After we have fixed the previous problems, the int versions should work fine. The char version is indeed bound to overflow; let’s look where. The lambda in sum_squares would be the obvious candidate, but it is actually okay: the expression has char operands that get promoted to int, the + and * are calculated on int without overflow, and the return type, as we didn’t say otherwise, is int too.
So, UB dodged here; j = generator(j); must be the place where we narrow the int down to char. The MSVC compiler actually flags it for us, but not at that place: the warning is lost in the preceding lines of code, and this line is not flagged anywhere. Ah, and it is right, too: at that spot we call generator(), which is a T(T) function, and we store char as char. The narrowing magic is done inside the dark corners of std::function, where it adapts our int(char) lambda. And with all the type punning around, even the compiler can’t tell us the point.
Anyway, what we have is not an overflow in an expression but a narrowing conversion. Which is not UB, ‘just’ providing an implementation-defined value for the out-of-range case. And the values we saw in the vector suggest we got negative values and CAN expect the postcondition check to fire. I definitely missed something during the code reading.
In MSVC I can usually debug inside lambdas and other odd locations, but no luck in this case: the breakpoint on the postcondition check just never triggers, despite it being called. So I wrote some dump code inside check, and found that in debug mode the vector v is empty, thus passing the sorted check with flying colours...
DOH. My working theory is that we are hit by NRVO/move semantics. In get_values, we return a vector by value, and do that from a local variable named v. At the return point v is considered an xvalue, so returning it via a move is fair game. In that case the content is moved out of v into the temporary that actually gets returned. Then the destructors are called, and the check is executed on the ‘unspecified though valid’ state of the eviscerated v. That happens to look empty, but could be anything really. In the release build, the compiler probably uses NRVO, avoiding the copy/move completely: v never exists as a separate local variable but is constructed at the caller’s location, so the check function sees its content.
As an experiment, I tried return std::move(v); in get_values. That kills NRVO, and with it MSVC release mode became consistent with debug mode: check was looking at an empty vector in both.
Though it made sense the way we saw it, I made an extra check against the standard. And indeed, 9.6.3p3 clearly states:
The copy-initialization of the result of the call is sequenced before the destruction of temporaries at the end of the full-expression established by the operand of the return statement, which, in turn, is sequenced before the destruction of local variables (9.6) of the block enclosing the return statement.
So we had better add an extra bullet to the checklist on destructor dependencies: the move-from-return case, beyond the early-destruction cases.
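A stand-alone illustration of the hazard (all names hypothetical; the explicit std::move forces the behaviour that MSVC’s debug build produced implicitly):

```cpp
#include <iostream>
#include <vector>

std::vector<int> make()
{
  std::vector<int> v{1, 2, 3};
  struct guard
  {
    std::vector<int>& v_;
    ~guard()
    {
      // Runs after the return value has been initialised, so it sees
      // the moved-from (in practice usually empty) local vector.
      std::cout << "size seen by dtor: " << v_.size() << '\n';
    }
  } g{v};
  return std::move(v); // forces the move, killing NRVO
}
```

The caller still receives {1, 2, 3}; it is only the destructor-time view of the local v that is hollowed out.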
James Holland
The problem with the student’s code is associated with the return statement of get_values(). It seems to me that in debug mode the vector v is being affected in some way before the destructor of postcondition is called. It is probable that the contents of v are being moved to the calling function, which would leave v in a valid but unknown state. I suspect that, in this case, v has been left with zero length, and that the destructor of postcondition therefore reads the modified value of v and finds it empty. If this is the case, postcondition will conclude that v is sorted and not produce a warning message. (An empty vector is considered sorted.)
To my mind, this is incorrect behaviour. If the return takes place by moving (as opposed to copying) the contents of v, then the destructor should inspect the location to which the contents of v have been moved; it would then conclude that the vector is not sorted, as expected. Alternatively, moving the contents of v should be prohibited if a reference to it is taken by a destructor.
When the optimiser is enabled, I assume different code is generated that either calls the destructor of postcondition before any manipulation of v or does not move v. Either way, this results in the destructor of postcondition seeing v with its original contents, concluding that the vector is not ordered, and producing a warning message.
It is strange (and somewhat worrying) that the program behaviour is dependent on whether the optimiser is enabled. The conclusion is that Visual Studio is producing incorrect code when the optimiser is disabled, i.e. in debug mode.
Probably the simplest way to ensure that the destructor of postcondition is invoked before the value of v is returned from get_values() is to add a pair of braces as shown below.
template<typename T>
std::vector<T> get_values(int n,
  std::function<T(T)> generator)
{
  std::vector<T> v;
  auto is_increasing = [&v]()
    { return is_sorted(v.begin(), v.end()); };
  { // Enclose postcondition in a new scope.
    postcondition _(is_increasing);
    T j = 0;
    for (int i = 0; i != n; ++i)
    {
      j = generator(j);
      v.push_back(j);
    }
  } // Enclose postcondition in a new scope.
  return v;
}
The destructor of postcondition will be called when the newly added closing brace is encountered, which is just before the return statement, as required. The amendment produces the expected results in both debug mode and release mode.
There is one other important matter to consider: that of undefined behaviour. The function sum_squares() contains a lambda with the code return j + i * i++;. This is problematic because it is not known when, during the evaluation of the expression, the variable i will be incremented, and the result of the expression depends on when that happens. To produce a defined result, the return statement should be split into three separate statements as shown below.
const auto k = i;
++i;
return j + k * k;
Finally, within the templated functions that generate the sequences, the student has defined the variable result as auto const &. This is perfectly valid syntax, but it gives the impression that result is a reference to another variable declared in the source code. This is not the case. What the compiler does, in effect, is create a temporary variable and initialise it with the value returned from get_values(); it then uses the temporary to initialise the reference result. This is permitted because result is const. Had result been defined as non-const, the compiler would have issued an error message, because changing the value of result (which refers to a temporary) does not make sense: the temporary variable will soon disappear. The outcome is that initialising a const & variable has the same effect as initialising a const variable, so I suggest removing the & from the definitions.
Jason Spencer
The headline bug here is caused by a possible optimisation the compiler might perform, which can remove the values from v in get_values before they are checked.
Note that there is actually a typo in the print and PDF versions of this critique: the fibonacci() function is missing its first three lines. It was correct in the mailing-list announcement, on which this response is based.
To understand what is happening in get_values, we need to understand how the compiler generates code to manipulate the stack when a function is called. Typically, space is reserved on the stack for the return value; since we’re returning a vector, let’s call the object in this space vret. Space for the variables local to the function (including v) is then reserved on the stack. The variables are located in this space in the order in which they are declared within the function, and they are constructed in this order too; they are destructed in the opposite order.
When the return statement is executed, v is either copied (by the copy constructor) or, since C++11, moved into vret. It is also possible for the copy/move to be elided and for v actually to occupy the space assigned for vret. This is called Return Value Optimisation (RVO) – specifically Named RVO (NRVO), because we are returning a named variable rather than constructing the return value in the return statement. This copy elision can occur whether or not the copy/move has side effects.
Whether return by copy construction, move construction, or copy elision occurs depends a lot on the compiler, the compiler version, the flags used and the optimisation level, hence the complicated nature of the behaviour the student found.
The student was expecting _ to be destroyed before v, and it will be. But we now face three possible scenarios at the destruction of _:
- v was copied to vret at the return statement, in which case v is still valid and the test behaves as expected.
- v was moved to vret at the return statement, in which case v is now empty and the test always passes, because a vector with fewer than two values still causes is_increasing to return true.
- v is vret due to NRVO, in which case the test behaves as expected.
It is the second scenario that prevents the error from appearing when it is expected.
The details of copy/move/RVO optimisations can be found at [1] and [2].
The other issues I’d comment on are:
- I’m not sure using postcondition to test is a good idea. The destructor, and therefore the test, will also be executed if an exception is thrown and not caught in get_values (thrown by a generator, perhaps). The student may want to test std::current_exception in ~postcondition before calling the test. And of course the test can’t throw anything, so it should probably be noexcept. Consider a templated function as a test that doesn’t rely on RAII.
- Implicit casting of output to int: since out is an alias for std::ostream_iterator<int>, and it is used in the sequence<T>, triangular<T>, fibonacci<T> and sum_squares<T> templates, the output sequence values, which should contain values of type T, are cast to int before being output. So if sum_squares<float>() were included in main(), the values would be truncated and printed as ints.
- The generator functions should do type checking on T, and potentially disable themselves for types that don’t support the operators used, or for types that would produce nonsensical results. For example, fibonacci<std::string>() will work, but fibonacci<std::complex<float>>() won’t. T has to support one or more of operator++ and operator+, and must have a std::swap overload, depending on the generator; T::operator< is also required for the is_increasing test. Consider an enable_if [3] with std::is_arithmetic [4] and std::is_swappable [5] in the simplest case.
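A minimal sketch of that constraint, assuming C++14’s std::enable_if_t (the helper triangular_step and its shape are illustrative, not the student’s code):

```cpp
#include <type_traits>

// Only arithmetic types are accepted; anything else fails substitution
// and drops out of overload resolution (SFINAE).
template <typename T,
          typename = std::enable_if_t<std::is_arithmetic<T>::value>>
T triangular_step(T total, T& level)
{
  total += level;
  ++level;
  return total;
}
```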
In terms of design I’d look more into the overflow checking and the generators.
The issue of robustly checking overflow is a complicated one and often involves tests alongside the operations, rather than testing the result, as was the case here. Overflow detection isn’t that difficult to do in theory – most CPUs have an overflow flag (OF) which is set when the last ALU operation resulted in a value that overflowed the output register. You might also want to check the carry flag (CF) [6]. The problem is that there’s no straightforward way of detecting this in C++. We could write inline assembler functions that perform the arithmetic operation and copy the OF flag to a local variable and act based on the state of that, but this would be architecture specific. It also limits optimisation – for example a typical compiler would swap a multiplication by a constant 8 into a left shift by 3, but this code wouldn’t.
Another approach [7] is to, for each operation type, check that the output variable has enough capacity – this typically involves finding the position of the most significant set bit. An unsigned integer multiplication of variables A and B will never use more than MSB_A+MSB_B bits in the output, where MSB_A is the position of the MSB in A. The unsigned integer addition of A and B will never use more bits than MAX(MSB_A,MSB_B)+1. And so on.
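A sketch of this pre-check for unsigned 32-bit multiplication (conservative: it may reject some products that would in fact fit, but it never accepts one that overflows; names are mine):

```cpp
#include <cstdint>

// 1-based position of the most significant set bit; 0 when x == 0.
unsigned msb_pos(std::uint32_t x)
{
  unsigned pos = 0;
  while (x) { ++pos; x >>= 1; }
  return pos;
}

// An a-bit value times a b-bit value needs at most a+b bits, so if
// a+b <= 32 the uint32_t product cannot overflow. (a+b == 33 might
// still fit, so this test errs on the safe side.)
bool product_definitely_fits(std::uint32_t a, std::uint32_t b)
{
  return msb_pos(a) + msb_pos(b) <= 32;
}
```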
Most compilers also have options to trap on overflow for arithmetic, but it can be complicated to handle this at runtime, and is compiler specific.
There are some mathematical functions in C++ that do support overflow detection in a portable and robust way: for example, std::math_errhandling [8] can be used to detect some overflow and other conditions for floating-point types. std::overflow_error, while sounding promising, is only thrown by the std::bitset conversion functions and not by general-purpose arithmetic, but there’s no reason not to use it in bespoke overflow-testing code.
Another approach is to create a special type that wraps the arithmetic type, overloads the arithmetic operators, and checks for overflow after every operation. For example [9].
Yet another approach is to perform the operation on a larger (similarly signed) data type and cast to the requested type at the end, checking for truncation after the cast. This might not however catch all cases of overflow at every intermediary operation (or in fact the overflow may be required behaviour – an implicit modulo).
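That approach might be sketched as follows for a single int8_t addition (checked_add is a hypothetical helper; C++17’s std::optional reports failure):

```cpp
#include <cstdint>
#include <optional>

// Compute in a wider type, then verify the result survives the
// narrowing back to the requested type.
std::optional<std::int8_t> checked_add(std::int8_t a, std::int8_t b)
{
  const std::int32_t wide =
      static_cast<std::int32_t>(a) + static_cast<std::int32_t>(b);
  if (wide < INT8_MIN || wide > INT8_MAX)
    return std::nullopt; // would overflow the narrow type
  return static_cast<std::int8_t>(wide);
}
```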
I’m not a fan of is_increasing because, although the program spec says the sequences are increasing, it is possible for aliasing to occur. For example, in a sum_squares<uint8_t> sequence, at the sixteenth term 256 is added to the running total; this overflows, but the total remains the same, and that term at least would pass the test (although earlier terms might not). I’m sure there are many other such examples.
Regarding the generators: they’re not very versatile as written. I’d consider implementing them either as iterators or as generator functors. In the current design the state of the sequence is stored in the variable i, which is a nice design, but if it were stored in an object then the object could be copied, reset, and so on. Something like this:
template <typename T>
class arithmetic_progression_
{
  T n;
  T step;
public:
  typedef T result_type;
  arithmetic_progression_(const T n = 0, const T step = 1)
    : n(n), step(step) {}
  T operator()() { n += step; return n; }
  T get_last() { return n; }
};

template <typename T>
class triangular_
{
  T n;
  T level;
public:
  typedef T result_type;
  triangular_(const T n = 0, const T level = 1)
    : n(n), level(level) {}
  T operator()() { n += level++; return n; }
  T get_last() { return n; }
};

template <typename T>
class sum_squares_
{
  T n;
  T level;
public:
  typedef T result_type;
  sum_squares_(const T n = 0, const T level = 1)
    : n(n), level(level) {}
  // Note: the increment is sequenced separately to avoid the
  // level * level++ undefined behaviour discussed above.
  T operator()() { n += level * level; ++level; return n; }
  T get_last() { return n; }
};
This is much more versatile because you can now generate as many values as you like (25 in the example below):
std::vector<int> v;
std::generate_n(std::back_inserter(v), 25,
  triangular_<int>());
std::copy(std::begin(v), std::end(v),
  std::ostream_iterator<int>(std::cout, ","));
std::cout << '\n';
Or you can use an iterator wrapper (such as the Boost Generator Iterator Adaptor [10]) to print the sequence directly to any output stream:
arithmetic_progression_<int> a_p;
std::copy_n(boost::make_generator_iterator(a_p), 15,
  std::ostream_iterator<decltype(a_p)::result_type>(
    std::cout, ","));
std::cout << '\n';
Something else to consider is whether it would be beneficial to generate the sequences at compile time. In the past this meant using templates [11], but there are now more options with constexpr [12]:
constexpr int factorial(int n)
{
  return n <= 1 ? 1 : (n * factorial(n - 1));
}
When it comes to generators, inspiration can be taken from C# and Python. Paolo Severini [13] has an extensive write-up comparing C# and C++ generators and their implementation; he also looks at coroutine approaches.
References
[1] https://en.cppreference.com/w/cpp/language/copy_elision
[2] https://shaharmike.com/cpp/rvo/
[3] https://eli.thegreenplace.net/2014/sfinae-and-enable_if/
[4] http://en.cppreference.com/w/cpp/types/is_arithmetic
[5] http://en.cppreference.com/w/cpp/types/is_swappable
[6] http://teaching.idallen.com/dat2343/10f/notes/040_overflow.txt
[7] https://stackoverflow.com/questions/199333/how-to-detect-integer-overflow
[8] http://en.cppreference.com/w/cpp/numeric/math/math_errhandling
[9] https://accu.org/index.php/journals/324
[10] https://www.boost.org/doc/libs/1_62_0/libs/utility/generator_iterator.htm
[11] https://www.jasonspencer.org/articles/series/
[12] https://en.cppreference.com/w/cpp/language/constexpr
[13] https://paoloseverini.wordpress.com/2014/06/09/generator-functions-in-c/
Commentary
As all three entries explain, the cause of the differences in behaviour is an interaction between destructors and the generation of the return value. The ‘feature’ arrived in C++11, when move semantics were added to the language and returning a named local variable was allowed to move the value. I do not know at what point in the process anyone realised that this change could cause the difficulties that this critique demonstrates.
I think that, between them, the critiques covered most of the points in the code.
The winner of CC 111
The critiques all detected the problem with the unpredictable behaviour, but James additionally provided a solution to the troublesome behaviour – using an extra scope to ensure the destruction occurs before the return.
As Pal stated, the problem with signed overflow is that this produces undefined behaviour and attempting to detect this can be troublesome. This is an active area of discussion on the C++ committee...
Using destruction to verify post-conditions can have problems, as Jason points out, because of the danger of throwing an exception while unwinding from an exception, which will simply terminate the program.
Both Pal and James pointed out the UB in sum_squares in the expression j + i * i++, but James additionally provided code for a correct replacement.
All three critiques gave the code a good look, but on balance I consider James’ critique was the most helpful to the person with the problematic code so I have awarded him the prize for this issue.
Code Critique 112
(Submissions to scc@accu.org by Aug 1st)
Further to articles introducing D, I am attempting to use the event-driven Vibe.d server library, which I understand to be like a native version of Python’s Tornado and potentially a ‘killer app’ of D as I haven’t seen any other mature event-driven server library in a compiled language.
I would like to build a back-end server that performs some processing on the body of the HTTP request it receives. To begin with, I would like it to simply echo the request body back to the client.
My code works but there are three problems: (1) when I issue a POST request with lynx, 3 spaces are added to the start of the response text, (2) I cannot test with nc because it just says 404 and doesn’t log anything, and (3) I’m worried that reading and writing just one byte at a time is inefficient, but I can’t see how else to do it: I raised a ‘more documentation needed’ bug at https://github.com/vibe-d/vibe.d/issues/2139 but at the time of writing nobody has replied (should I have used a different forum?)
The code is in Listing 2.
import vibe.d;

void main()
{
  auto settings = new HTTPServerSettings;
  settings.port = 8080;
  listenHTTP(settings, (req, res) {
    ubyte[1] s;
    while (!req.bodyReader.empty())
    {
      req.bodyReader.read(s);
      res.bodyWriter.write(s);
    }
  });
  runApplication();
}

Listing 2
You can also get the current problem from the accu-general mail list (next entry is posted around the last issue’s deadline) or from the ACCU website (http://accu.org/index.php/journal). This particularly helps overseas members who typically get the magazine much later than members in the UK and Europe.