Journal Articles
Browse in : |
All
> Journals
> CVu
> 154
(12)
All > Journal Columns > Code Critique (70) Any of these categories - All of these categories |
Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.
Title: Student Code Critique Competition
Author: Administrator
Date: 07 August 2003 13:15:59 +01:00 or Thu, 07 August 2003 13:15:59 +01:00
Summary:
Body:
Please look at the following code that is just playing with container of objects.
#include<iostream> #include<fstream> #include<vector> using namespace std; class A { public: A(int p1= 0) : x(p1) { cout<<"CTOR"<<endl; } A(const A& a) { x = a.x; cout<<"COPY CTOR"<<endl; } ~A() { cout << "DTOR: address = " << (long)this <<endl; } friend ostream& operator<<(ostream& out, const A& obj); private: int x; }; ostream& operator<< (ostream& out, const A& obj) { out << "x = "<<(obj.x)<<endl; return out; } int main() { vector<A> v; // Be inefficient, not production code for(int i=0; i<3; i++) v.push_back(A(i)); // Lo behold, container of objects copy(v.begin(), v.end(), ostream_iterator<A>(cout, "")); cout << endl; A * pa = &(v.back()); // pa points to address of reference // to the last element cout << (*pa); // info cout << "pa = " << (long)pa << endl; // address cout << "call pop_back()" <<endl; v.pop_back(); // Line A // destructor WILL destroy last element // Now pa points to deleted memory, should crash cout << (*pa); // Line B cout << "pa = " << (long)pa << endl; // address delete pa; // Line C return 0; }
Lines A and C delete the same memory, which shouldn't be allowed. But program runs fine allowing dereferencing at line B. Why?
From Matthew Towler <<towler@ccdc.cam.ac.uk>>
There are a few misconceptions that lead to the behaviour seen, and a number of small style issues in this code. I will begin with the more important, larger issues.
The first comment in main "be inefficient, not production code" implies that test code should be deliberately inefficient, and on changing to production code everything should be rewritten with efficiency first. Code should be written to be simple and easily understood first, and then optimised if this is shown by measurement to be necessary.
The comment on lines 10 and 11 of main is misleading. Taking the address of a reference yields a pointer to the referenced item, rather than the "address of reference", which has no meaning in C++. A more accurate comment would be
// pa points to the last element
The next issue is on line A, but first an aside.
Objects are created in two stages, first raw memory is allocated, either on the stack or the heap (with new) - at this point it is just a collection of bytes. An object is then constructed into this memory by a constructor - this initialises the memory to create an object of the desired type.
Destroying objects is the reverse; first a destructor removes the object from the memory leaving again a block of raw bytes, which are then returned to the system. deleteing a pointer performs both of these steps, but note that the steps do not have to be combined.
the object's destructor, but does not imply releasing the memory. This last stage depends on the implementation of the container, and is not normally of concern to the programmer. During this call either the container reallocates its storage, leaving pa pointing at now unused memory; or (and most likely given the observed behaviour) the memory is unaltered, so pa now points at uninitialised memory. Note that for vectors, adding or removing elements invalidates all interators and pointers into the container. When using any container you should be aware of what operations invalidate iterators.
This means the second line of the comment after line A is incorrect, the memory may or may not have been deleted.
Line B is then accessing uninitialised memory, which will give what is referred to as 'undefined behaviour'. Undefined means exactly that, it could do anything ranging from nothing, to (as some are fond of stating) reformatting your hard disk, and anything in between. Sometimes the code will crash (usually with a core dump or halt in the debugger) but it is wrong to expect a noticeable failure.
The observed behaviour suggests the uninitialised memory that pa points to was not altered by the destruction, so the access to the memory works as if the object still existed - perfectly in line with 'undefined behaviour'
The next line works correctly as pa has not been altered.
Finally line C attempts to destroy the object referred to by pa. This will first call the destructor, which will do something undefined (in this case the trace output makes it appear to work); it then calls operator delete. Deleting a pointer not allocated with new (such as pa) results in further undefined behaviour. In this case the particular implementation of delete presumably realises this and does nothing observable.
As regards the programmer's perceived problem - deleting memory already deleted has undefined behaviour, there is nothing in the language that detects and disallows such attempts.
Fundamentally, the programmer needs to understand that a vector<A> is storing the A by value i.e. by storing copies of A objects. Not by allocating As on the heap. The container deals with the memory management aspects automatically.
Now the smaller coding style issues, IMHO all are generally considered bad practice and can either be taken as read or the explanations found elsewhere.
-
Use of a default value in the constructor argument rather than a separate default constructor
-
Lack of explicit keyword on a single argument constructor, especially one taking an int.
-
Use of direct initialisation in A's copy constructor, rather than an initialiser list.
-
C style cast of this pointer to long when outputting. This is not portable, and is unnecessary as there will be a system supplied output operator for pointers.
-
Use of endl everywhere rather than simply '\n', this is not crucial for test code, but it is a bad habit to get in to.
-
A has a copy constructor but no assignment operator, these functions should always appear in pairs.
-
The second comment in main does not add anything to the understanding of the code. Comments should only be used to explain the higher level intent of the code or unusual or difficult to understand passages.
-
Use of i++ rather than the preferred ++i in a loop, this is not so important for ints but it is a good habit to develop.
-
It is worth making use of the separator argument in ostream_iterator ("" does nothing) otherwise the output runs together e.g. 3 elements with values 12, 3, 4 would be output as "1234".
-
Calling code which might throw an exception (such as output operators) in a destructor is dangerous, it should be surrounded by a try..catch block to stop any exceptions escaping.
From Ruurd Pels <<ruurd@tiscali.nl>>
Oh, well. I might as well try this. Here it goes.
-
Layout
Apart from the obvious editing to fit the source in the columns, I would like to implore anyone that source code is a document. And documents should be readable, preferably to others. The code under scrutiny is sloppy in layout and lacks to convey what the intention is. More tongue-in-cheek, I think we can leave the time that there were language implementations that only allowed one-character variable names safely behind us. In my opinion, verbosity is not a bad thing, on the other hand, sometimes I myself go overboard.
A few style pointers are in order:
-
Use type names and variable names that make clear what they represent, so, here, class Integer instead of class A.
-
If code is documented inline, I like to see the comment indented at the same level as the code that follows. Not doing so tends to blur the outline of the code and makes it harder to read.
-
Last, but not least, some general remarks regarding the purpose of the program would be in order.
-
-
Idiom.
In 'Advanced C++ Styles And Idioms', Coplien tells us that if and when we do not intend to create types that are second grade citizens, we must adhere to the 'Orthodox Canonical Form'. This means that a default constructor, a copy constructor, an assignment operator and a destructor must be implemented. It is my tendency to explicitly specify all of them instead of relying on the compiler to generate the default ones for me, so I added the assignment operator. As for the destructor, I almost invariably make them virtual, just to prevent rude surprises if I derive another class from a class I created. Only in very special circumstances and duly commented in the source code I use a static destructor.
The main function usually is defined as:
int main(int argc, char* argv[])
but when not using those arguments, I would prefer to see:
int main(int, char*[])
but certainly not:
#pragma argsused int main(int argc, char* argv[])
In the same vein, returning 0 at the end of main should be replaced with the more portable EXIT_SUCCESS.
[Please note that I would argue with almost all the above. Do not provide a virtual destructor unless you intend the class to be a base class - and think very carefully before deriving from a non-abstract base. I would prefer the signature int main() if command line arguments are not used because it is the long hallowed idiom to express that intent and return 0 is just as portable as return EXIT_SUCCESS. FG]
With respect to the ostream friend, I have a few remarks. Even in the eye of friendship, I tend to use accessor methods to obtain the inner value from the object to be streamed, but there is another little idiom I would like to present in this matter. If class A would be the base class for other classes, it is beneficial to separate the actual streaming from the body of the friend function and add that functionality to a protected virtual streamOn method [but why not make it virtual? FG] This makes the task of streaming much easier, because it is only necessary to overload the streamOn function in derived classes instead of rewriting the ostream friend boilerplate over and over again, so:
void Integer::streamOn(ostream& ostr) { ostr << getValue(); } ostream& operator<<(ostream& ostr, Integer const & obj) { obj.streamOn(ostr); return ostr; }
Last but not least, if streaming an object to an ostream, adding text or an end-of-line to the streaming operation is generally a bad plan, because the user of the class then cannot decide how he will do output formatting. Every time he uses the streaming friend, he gets the text and the newline 'for free', possibly at a time that is inconvenient to him. Furthermore, it is very easy to forget to consume the text and the newline when at a certain time a corresponding input stream operator is written. Realizing that an ostream does not necessarily mean that the content of an object is printed on a console is important in this matter.
In the main function, a template is instantiated. I know that this is trivial code in some respect, but I prefer to use typedefs to create 'new' types based on templates, and while I am at it, typedef the iterators as well. So instead of:
vector<A> v;
I would prefer to see:
typedef vector<Integer> IntegerVector; typedef IntegerVector::iterator IntegerVectorIterator; typedef IntegerVector::const_iterator IntegerVectorConstIterator;
[If you are going to use those names I can see very little benefit from using typedefs. In my opinion a typedef should be used to create a more descriptive name for a type. Yours do not do that so skip them. FG]
and subsequently:
IntegerVector integerVector;
In a more general case, this even collapses to for example:
typedef vector<Rule> Rules; typedef Rules::iterator RulesIterator; typedef Rules::const_iterator RulesConstIterator;
[Now I agree about the first, but again I think the other two add nothing. FG]
in which case it is quite easy to change the container type to something different, for example, if rules are all of a sudden uniquely identified, into:
typedef set<Rule> Rules;
without having to change much in any code using the Rules type.
[and that is a good reason for the first typedef but not for the typedef applied to the iterators. FG]
Call them style idiosyncracies, I have developed some I guess. I have the tendency to use 'that' for the parameter name if I need to refer to another value of the same type, at least in the copy constructors and the assignment operator. Also, in streaming friends, I tend to name the ostream& parameter as ostr.
[Yes we all have our personal styles but we should always examine naming conventions and ask if they add to the readability of the code. FG]
-
Intent.
From what I gather from the source code, this is an exercise in reconnoitring the innards of a vector and the memory allocation strategies of the underlying operating system. I don't have a problem with stretching computers and operating systems. However, this type of code could be an indication that the student does not properly understand the true nature of object oriented programming at large or STL container classes in particular. Halfway through the main function the line between the concept of container classes and the technique employed by it is crossed. At a certain point in time, he seems to want to treat the vector container as an array of Integer objects, misusing the fact that vector::back() returns a const reference to the object in the container in order to create a pointer to that object.
At a certain point, the student wants to inspect the address of the object and uses a cast to output the address pointed to by the pointer to A. The cast is superfluous, and also wrong. If the intent was to convert the address to an integer number, on 32-bit architectures at least the cast would have to be towards an unsigned long, but since the cast is supefluous, simply streaming out the pointer would have yielded the address pointed to in hexadecimal notation. The simple:
cout << pa << endl;
would have sufficed. This leads me to believe that the student should again review how pointers are treated in C++, especially in the face of passing them to an ostream object.
The statement:
A* pa = &(v.back());
leads me to believe that the student must be taught the concept of ownership. It should be made clear to him that an STL container has ownership of the objects stored in it and that access to the elements of the container should be done through the methods the container provides. Awareness of ownership problems will make him a better software engineer.
Last but not least, the question why the program does not crash on line B and why the double deletion seems to be allowed. The case probably is that a vector allocates an array of objects to accommodate its elements. This means that vector<T>::pop_back() indeed calls the destructor of the element, but does not remove the storage space associated with it. In combination with the fact that we are dealing with a very simple class stored in the container, this could mean that even if the destructor is called, the memory is still there and is still used by the container. If the stored object was more complex, for example referring to dynamically allocated memory, the dereferencing probably would have gone awry as well as the double deletion of the object. For what it is worth, I tried this under gcc 3.2, and the second time the destructor is called through delete pa it is not even executed.
-
Finally.
I sort of tried to recreate the same problem here. The main code is not very much better in terms of properly handling the container, however, I adjusted a number of stylish and idiomatic flaws:
// Student Code Critique 22, Critiqued #include <iostream> #include <fstream> #include <vector> using namespace std; class Integer { public: Integer(int theValue = 0); Integer(const Integer& that); Integer& operator=(const Integer& that); virtual ~Integer(); int getValue() const; friend ostream& operator<<(ostream& ostr, const Integer& obj); protected: virtual void streamOn(ostream& ostr) const; private: int value; }; Integer::Integer(int theValue) : value(theValue){ cout << "ctor(" << value << ")" << endl; } Integer::Integer(const Integer& that) : value(that.value){ cout << "cctr(" << value << ")" << endl; } Integer& Integer::operator=(const Integer& that){ value = that.value; cout << "op=(" << value << ")" << endl; return *this; } Integer::~Integer() { cout << "dtor(" << this << ")" << endl; } int Integer::getValue() const { return value; } void Integer::streamOn(ostream& ostr) const { ostr << getValue(); } ostream& operator<<(ostream& ostr, const Integer& obj) { obj.printOn(ostr); return ostr; } typedef vector<Integer> IntegerVector; typedef IntegerVector::iterator IntegerVectorIterator; typedef IntegerVector::const_iterator IntegerVectorConstIterator; int main() { IntegerVector integerVector; for(int i = 0; i < 3; i++) integerVector.push_back(Integer(i)); Integer* integerpointer = &(integerVector.back()); cout << "integer value = " << *integerpointer << endl; cout << "integer address = " << integerpointer << endl; cout << "call integerVector.pop_back()" << endl; integerVector.pop_back(); cout << "integer value = " << *integerpointer << endl; cout << "integer address = " << integerpointer << endl; delete integerpointer; return EXIT_SUCCESS; }
[I made some minor corrections to Ruurd's English and added rather more comments than I normally do somewhat in the style I would use were I conducting a seminar. I am aware that naming style is both individual and varies from one national group to another. However names should add value and, in my opinion, overly long names work to destroy the shape of code. It is no accident that mathematicians prefer very short names. The skill is in finding the point of balance.
My final point is that while I agree with the provision of streaming functionality through a member function, I would make it public and abandon the friend declaration of the streaming operator <<. There is still a tendency to overuse friend declarations. FG]
The editor's choice is: Matthew Towler
Please email <francis@robinton.demon.co.uk> to arrange for your prize.
(Submissions to <francis@robinton.demon.co.uk> by Sept. 6th)
It is very easy for more experienced programmers to forget how completely confused a novice can become when asked to write a program for some simple task. I have picked the following out of my slush pile because it illustrates the problem at many different levels. It starts with what is a clear confusion (if you can have such an oxymoron) about the problem itself and then goes downhill from there.
What would you do to help this student with his programming? Note that this time the student knows enough about C++ to write a suitable program, so the problem is in the mind to the extent that I think he is even confused by what is needed as output.
I am working on a program that will tell someone how old they are in years, days and months. I.e. You are 27 years, 200 months, and 1500 days old.
#include <iostream> using namespace std; void main() { int currentyear = 2003; int dob = 0; int result = 0; int day = 0; int month = 0; cout << "Enter your year of birth: "; cin >> dob; result = dob - 2003; cout << "You are " << result << " years old" << endl; cout << "Enther you month of birth: "; cin >> month; for(result=0; result <=12; result++) if(result <= 12) result = month+12; cout << "You are " << result << " months old"<<endl; cout << "Enter you day of birth: "; cin >> day; for(result=0; result <= 365; result++) if(result<=365) result = day+365; cout <<"You are " << result << " days old"<<endl; }Of course remember to comment on the program defects but your main focus should be how to explain the need for clear thought when producing a solution.
Notes:
More fields may be available via dynamicdata ..