Journal Articles
Browse in : |
All
> Journals
> CVu
> 273
(12)
All > Journal Columns > Code Critique (70) Any of these categories - All of these categories |
Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.
Title: Code Critique Competition 94
Author: Martin Moene
Date: 05 July 2015 21:56:30 +01:00 or Sun, 05 July 2015 21:56:30 +01:00
Summary: Set and collated by Roger Orr. A book prize is awarded for the best entry.
Body:
Participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.
Note: If you would rather not have your critique visible online, please inform me. (We will remove email addresses!)
Last issue’s code
I’m trying to write a simple program to demonstrate the Mandelbrot set and I’m hoping to get example output like this Figure 1, but I just get 6 lines of stars and the rest blank. Can you help me get this program working?
Can you find out what is wrong and help this programmer to fix (and improve) their simple test program? The code is in Listing 1.
* **** **** **** * ********** ****************** ***************** ***************** ******************* ********************** ********************* * ** ********************** ******* ********************** ********* ********************** ********* ********************** * ********* ********************* ********************************************* * ********* ********************* ********* ********************** ********* ********************** ******* ********************** * ** ********************** ********************* ********************** ******************* ***************** ***************** ****************** * ********** **** **** **** * |
Figure 1 |
#include <array> #include <complex> #include <iostream> using row = std::array<bool, 60>; using matrix = std::array<row, 40>; std::ostream& operator<<(std::ostream &os, row const &rhs) { for (auto const &v : rhs) os << "* "[v]; return os; } std::ostream& operator<<(std::ostream &os, matrix const &rhs) { for (auto const &v : rhs) os << v << '\n'; return os; } class Mandelbrot { matrix data = {}; std::complex<double> origin = (-0.5); double scale = 20.0; public: matrix const & operator()() { for (int y(0); y != data.size(); ++y) { for (int x(0); x != data[y].size(); ++x) { std::complex<double> c = (xcoord(x), ycoord(y)), z = c; int k = 0; for (; k < 200; ++k) { z = z*z + c; if (abs(z) >= 2) { data[y][x] = true; break; } } } } return data; } private: double xcoord(int xpos) { return origin.real() + (xpos - data[0].size()/2) / scale; } double ycoord(int ypos) { return origin.imag() + (data.size()/2 - ypos) / scale; } }; int main() { Mandelbrot set; std::cout << set() << std::endl; } |
Listing 1 |
Critiques
Paul Floyd
It took some debugging to get to the causes of the errors.
First impressions. I didn’t like the uncommented use of indexing a string literal in the row operator<<
. I understood it, but not all developers would. As it’s a relatively rare idiom, it needs an explanatory comment. Just to make sure that there was no problem with the output, I changed the literal to *.
so that the ‘true’ values printed something more visible. And indeed there was no problem there. I had a quick check that the unqualified call to abs
was indeed calling the complex version rather than the int
one. Lastly I didn’t like the gratuitous use of operator()
, defined in the class body to boot. I’d prefer a named function like calculate
. If calculate
is always called after construction, then I’d call it from the constructor instead.
Next, compiling the code. I tried with
- Solaris Studio 12.4 – no warnings
- clang++ 3.2 – complained about sign comparison in the
x
andy
for loops - g++ 4.8.2 – same
x
andy
for loop sign comparison, and also about missing field initializers forMandelbrot::data
.
So I started adding some debug traces. I noticed fairly quickly that the values of complex c
and z
always had zero imaginary parts. I looked at initializer expression for c
std::complex<double> c = (xcoord(x), ycoord(y));
and saw the first light. That doesn’t initialize a std::complex
with two fields. It’s an operator, expression in parentheses. The result of the expression is a double, and there’s a std::complex
constructor overload that takes just a double, so it can be converted to std::complex
.
In C++11 uniform initialization style, this should be
std::complex<double> c = {xcoord(x), ycoord(y)};
The next thing that I noticed were some excessively high values coming from xcoord(x)
. These looked suspiciously like INT_MAX
. I added traces for all of the values used in xcoord()
, and they all looked reasonable.
Then I looked a bit more closely at the expression:
return origin.real() + (xpos - data[0].size()/2) / scale;
In terms of types, that is
double + (int - size_t/int) / double
If it were the case that the size_t
in the middle were converted to int
, then all would be well. But it isn’t. If I look at the typeid of data[0].size()/2
(and even demangle it, then it is unsigned int
on a 32bit build and unsigned long
on a 64bit build. The same is true for xpos - data[0].size()/2
. What this leads to is that it isn’t the size_t
that is converted to int
, it is the int
that is converted to size_t
. xpos
ranges from 0 to 59, and data[0].size()/2
is 30. So when xpos
is less than 30, the result is negative. Or it would be if the expression type were not size_t
, which has no negative values. Instead it wraps around to a very high positive number.
The fix for this is quite simple. Instead of using literal 2
in the expression, use 2.0
, thus
return origin.real() + (xpos - data[0].size()/2.0) / scale;
the types this time are
double + (int - size_t/double) / double
This time, in the subexpression size_t/double
, the size_t
gets converted to double
, which again happens for (int - double)
. Since double
has no problem representing negative numbers, this works as intended. The same change needs to be made in ycoord
.
This is one of my pet peeves, people using literals of the wrong type in expressions. I’ve even seen stuff like (double)2)
rather than the more obvious 2.0
.
Lastly, how could these have been avoided? Well, both mistakes were well formed C++, so the compiler was of no help. The huge numbers produced by xcoord
could have been detected fairly easily with a bit of analysis of what the expected upper and lower bounds are, and a suitable assert
.
Finally, the initialization expression for c
. As 0.0
is perfectly valid for c.imag()
, an assert
won’t help.
A unit test could be used, but that would require considerable changes to break up Mandelbrot::operator()
, which I think would be excessive, so that leaves good old fashioned functional testing and debugging.
Jim Segrave
Problems with this code:
The member functions xcoord
and ycoord
don’t work as intended. The sub-expressions involved will be evaluated as integers and conversion to double
comes too late, their fractional part will have been truncated. A simple fix is to change the divisor from 2
(an integral value) to 2.0L
, a double
value. Then the parenthesised sub-expressions will evaluate as doubles and the results are what is expected.
The second major problem is in the statement:
std::complex<double> c = (xcoord(x), ycoord(y)), z = c;
This does not call the constructor for a complex double
with a real
and an imaginary part. Instead, the parenthesised expression is treated as a pair of expressions separated by a comma – the ycoord()
result is discarded and the complex double
is constructed using the single argument constructor which sets the imaginary component to 0
and only initialises the real component. Changing the declaration to use a braced initialiser fixes this problem.
The compiler, with sufficient warnings enabled, should complain about mixing signed and unsigned values in the comparisons in the two for statements which iterate over x
and y
. Changing the types for x
and y
to size_t
s (and similarly changing the xcoord()
and ycoord()
functions to expect size_t
arguments, clears this warning.
Finally, the compiler may complain about the initialisation of the variable data
. This variable is an C++ array of C-arrays of rows, where a row is a C++ array of a C array
of bools. To initialise this, data needs to initialise the C++ array of rows, so it needs a pair of curly braces. Within that pair, you need an initialiser for a C array of rows, hence another pair of curly braces within the first one, which should contain an initialiser for a row. That calls for a third set of curly braces, which should contain an initialiser for a C array of bools. That needs a fourth set of curly braces around a false value:
matrix data{ { { { 0 }} }} 1 2 3 4
1 is the initialiser list for data
2 is the initialiser list for a C array of rows
3 is the initialiser list for a row
4 is the initialiser list for a C array of bools
It’s ugly, but that’s what g++ 4.8 and clang 3.5.1 want to see.
With those corrections, the program compiles without warnings or errors and produces the expected output. However, it can be improved:
A Mandelbrot set is symmetric around the X axis, so it seems the program should produce a symmetric pattern. It also seems reasonable to expect that the origin point (-0.5, 0) should be plotted. But with an even number of output lines (40 in this case), there are 20 lines printed above the X axis and 19 below the X axis. For the same reasons, there are 30 points plotted to the left of the Y axis and 29 to the right. I changed the number of rows and columns to 41 and 61, which makes the output symmetrical around the origin point, but the origin point itself is no longer plotted. I altered xcoord
and ycoord
so that they will plot an equal number of rows above and below the origin and an equal number of columns left and right of the origin, while still plotting the X row and Y column passing through the origin.
A final change is one of my pet obsessions – there are if
statements in this code which control a single statement and omit any braces to make the controlled statement a compound one. Omitting the braces is a recipe for introducing bugs into code in the future. It is my firm belief that no if
, for
, do
or while
statement should ever omit the braces to make any controlled code (even an empty statement) be a compound statement.
Then anyone adding statements later will be forced to notice and respect the statement blocks.
The updated code is:
#include <array> #include <complex> #include <iostream> using row = std::array<bool, 61>; using matrix = std::array<row, 41>; std::ostream& operator<<(std::ostream &os, row const &rhs) { for (auto const &v : rhs) { os << "* "[v]; } return os; } std::ostream& operator<<(std::ostream &os, matrix const &rhs) { for (auto const &v : rhs) { os << v << '\n'; } return os; } class Mandelbrot { matrix data = {{ {{ 0 }} }}; std::complex<double> origin = (-0.5); double scale = 20.0; public: matrix const & operator()() { for (size_t y(0); y != data.size(); ++y) { for (size_t x(0); x != data[y].size(); ++x) { // use braced initialiser list for c std::complex<double> c{xcoord(x), ycoord(y)}, z = c; int k = 0; for (; k < 200; ++k) { z = z*z + c; if (abs(z) >= 2) { data[y][x] = true; break; } } } } return data; } private: double xcoord(size_t xpos) { return origin.real() + (xpos - (data[0].size() - 1)/2.0L) / scale; } double ycoord(size_t ypos) { return origin.imag() + ((data.size() - 1)/2.0L - ypos) / scale; } }; int main() { Mandelbrot set; std::cout << set() << std::endl; }
James Holland
The first thing to do is to clear the decks by getting rid of any compiler warnings. My compiler issues warnings about the comparisons between signed and unsigned integer expressions that occur in the two outer for
statements of operator()();
The problem is that data[y].size()
and data.size()
return unsigned types while the variables used in the comparison, x
and y
, are signed types. It would be convenient to ask the std::array
what type it uses for representing sizes and to use that. This is achieved by referring to the std::array
’s public definition of size_type
as shown below.
for (row::size_type y(0); y != data.size(); ++y) for (matrix::size_type x(0); x != data[y].size(); ++x)
Unfortunately, the software still behaves in the same unexpected way. After quite a bit of effort it was discovered that xcoord()
and ycoord()
are not returning correct values. Investigations revealed that using both signed and unsigned types in the return expression are to blame. As noted above, data[0].size()
and data.size()
return unsigned types while xpos
and ypos
are of type unsigned int
. C++ has set of rules it uses to determine how to handle situations like this. It turns out that, in this case, signed values are converted to unsigned types. The result of the expression is also unsigned. This has the consequence that what was thought to be small negative numbers are manipulated as if they were large positive numbers. This accounts for xcoord()
and ycoord()
returning unexpected values.
In our situation, there are two ways in which the problem of signed values can be resolved. Probably the most direct way is to cast the unsigned value to signed values as shown below.
double xcoord(int xpos) { return origin.real() + (xpos - static_cast<signed int> (data[0].size())/2) / scale; } double ycoord(int ypos) { return origin.imag() + (static_cast<signed int>(data.size())/2 - ypos) / scale; }
This works well where the sizes of the dimensions of data are even numbers. Dividing such numbers by 2 (as is done in xcoord()
and ycoord()
) results in no loss of information. Had the sizes been represented by odd numbers, dividing by 2 would lose the fractional part of the result. In such cases it would, perhaps, be simpler to make the divisor a double
instead of a signed int
. This can be achieved by simply replacing the 2
in xcoord()
and ycoord()
with 2.0
. Dividing an unsigned type by a double
will result in a double
which is inherently signed and will preserve any remainder. For the problem at hand, I have chosen the casting method as it makes clear the intentions of the programmer and, as the size()
value is even, dividing by a double
is not necessary.
Having corrected xcoord()
and ycoord()
it is disappointing to discover that the program still does not produce the expected result. Further investigation is required. It turns out that the initialisation of c
is at fault. C++ allows variables to be correctly initialised in a variety of ways but there are a few cases where what seems obvious are inappropriate. This is one of them. In this case, the values within the parenthesise of the initialisation of c
are being interpreted as arguments of a comma operator. The result is that the first value (xcoord(x)
) is ignored and the second value (ycoord(y)
) is copied to the real part of c
. The imaginary part is automatically set to zero. This is not what is required. Correct initialisation of c
is achieved by any of the following statements.
std::complex<double> c{xcoord(x), ycoord(y)}; std::complex<double> c = {xcoord(x), ycoord(y)}; std::complex<double> c(xcoord(x), ycoord(y));
The use of parentheses ((
and )
) and braces ({
and }
) both have their advantages and disadvantages so it is largely a matter of choice as to which is used. It is probably best to be consistent throughout the program, however.
Having resolved the initialisation problem, the program should work as expected with the display of the Mandelbrot set. There are, however, a couple of amendments that could be made to improve the appearance of the source code. As mentioned above, a consistent initialisation approach would be beneficial. The initialisation of data, origin and scale uses three different syntaxes. The initialisation of x
and y
uses yet another. I suggest, for no particularly strong reason, that brace initialisation is used throughout. Another slightly curious feature is the declaration of k
just before the for loop. k
would be better declared as part of the for
loop as it is not required outside the scope of the loop.
For completeness, I provide the corrected program.
#include <array> #include <complex> #include <iostream> using row = std::array<bool, 60>; using matrix = std::array<row, 40>; std::ostream & operator<<(std::ostream &os, row const &rhs) { for (auto const &v : rhs) os << "* "[v]; return os; } std::ostream & operator<<(std::ostream &os, matrix const &rhs) { for (auto const &v : rhs) os << v << '\n'; return os; } class Mandelbrot { matrix data{}; std::complex<double> origin{-0.5}; double scale{20.0}; public: matrix const & operator()() { for (row::size_type y{0}; y != data.size(); ++y) { for (matrix::size_type x{0}; x != data[y].size(); ++x) { std::complex<double> c{xcoord(x), ycoord(y)}, z = c; for (int k{0}; k < 200; ++k) { z = z * z + c; if (abs(z) >= 2) { data[y][x] = true; break; } } } } return data; } private: double xcoord(int xpos) { return origin.real() + (xpos - static_cast<signed int>(data[0].size()) /2) / scale; } double ycoord(int ypos) { return origin.imag() + (static_cast<signed int>(data.size())/2 - ypos) / scale; } }; int main() { Mandelbrot set; std::cout << set() << std::endl; }
Commentary
This problem was, it seems, a little frustrating to analyse. This is partly because there were two unrelated bugs and finding and fixing just one of them was not enough to produce sensible output.
Fortunately many of our debugging tasks involve a single root cause of the problem, but it is worth remembering that this is not always the case, even in short code fragments like this one.
I think between them the critiques covered most of the issues. One issue no-one covered is that the operator<<
defined for row and matrix are in the default namespace and use types from the std
namespace so there is potential for a violation of the One Definition Rule if another source file in the final executable were to define the same operator on one of the types.
The odd placement of k
outside the for
loop was historical – the original program used the final value of k
to produce a more flexible output than just a simple boolean value.
The winner of CC93
There were good points made by all entrants. I also note that all three entries referred to removing compiler warnings: I take this as a good sign that warnings are getting better and/or programmers are learning to use the inbuilt static detection provided by the compiler. Paul did not actually remove the warnings though – it might have been useful for a critique to have explained why they weren’t directly relevant in this case.
There are many problems that can be caused by mixing types, particularly signed and unsigned integers. In this case dividing by 2.0 rather than 2 fixed the problem, but primarily because double is signed rather than because of truncation as Jim suggested – but doing this does avoid problems if the program changes in the future.
There are now many different ways to initialise variables in C++ and James’ preference for use of a consistent initialisation style in a single source file is a good idea and one likely to reduce errors. (Scott Meyers’ recommendation in Item 7 of Effective Modern C++ is to use brace initialisation where possible.)
I liked Paul’s final reflective question “Lastly, how could these have been avoided†– even though in this case there are few quick answers – and overall his critique wins the prize by a short head.
Code Critique 94
(Submissions to scc@accu.org by August 1st)
I had some code that used an anonymous structure within an anonymous union to easily refer to the high and low 32-bit parts of a 64-bit pointer. However, I get a warning that this is non-portable (I'm not quite sure why - MSVC and g++ both accept it) but after googling around for a solution I found one that uses #define. It all compiles without warnings now so I think it's fixed.
Can you give some advice to help this programmer? The code is in Listing 2.
--- address.h --- #pragma once typedef struct { union { struct { int32_t offsetLo; int32_t offsetHi; } s; void *pointer; }; } address; // simulate anonymous structs with #define #define offsetLo s.offsetLo #define offsetHi s.offsetHi --- test program --- #include <cstdint> #include <iostream> #include "address.h" int main() { int var = 12; address a; a.pointer = &var; std::cout << "Address = " << std::hex << a.offsetHi << "/" << a.offsetLo << std::endl; } |
Listing 2 |
You can also get the current problem from the accu-general mail list (next entry is posted around the last issue’s deadline) or from the ACCU website (http://www.accu.org/journals/). This particularly helps overseas members who typically get the magazine much later than members in the UK and Europe.
Notes:
More fields may be available via dynamicdata ..