My intention in writing this item is to start you thinking about the many places where programmers have strong opinions and disagreements. Sometimes these concern actual solutions to problems but more often than not they focus on details such as layout, naming conventions and apparently low level detail.
Over the last few weeks several items have directed my attention to how widespread these issues are even among genuinely expert programmers. There is an article in the current issue of 'The C/C++ Users Journal' about designing and implementing a template class for mathematical vectors. Then I attended a presentation by Stan Lippman on class design at the recent SIGS conference in London. Several things that Stan was saying were wildly at variance with what I had heard from such other industry gurus as Scott Meyers and Kevlin Henney. At the same event I heard an excellent presentation from Herb Sutter (yes, I have booked both him and Stan for future JACC events) on writing exception safe code. Herb reiterated the requirement that destructors do not throw exceptions. Now a few issues ago we had a case being made to the contrary.
In every case the people involved are of at least expert rank and most are life masters (to borrow a rank from the World of Contract Bridge - now there is an idea, what about ranking programmers in a similar way?) of programming.
I am going to address several issues in the rest of this article. Please feel encouraged to contribute items for future inclusion (so that they can be co-ordinated, send them direct to me: editor@accu.org). The object of the exercise is not to determine winners but to understand why different programmers come up with opposing views. Sometimes the differences may be because someone has not understood all the implications of their preferred approach, but often it will be because different contexts require different mechanisms. I hope that contributors will provide reasonably detailed rationales for their views; even better, they will attempt to provide several different ones.
I do not want to say very much about comments but there are several points that should be kept in mind. The primary purpose of a comment is to assist readers of the code to understand it. I personally hate comment that is invasive of the code. In my days as a Forth programmer I greatly appreciated a mechanism in that language that allowed me to separate code and comment into different units (called screens in Forth). While there is a good case to be made for recording the development and maintenance history of source code somewhere, I question whether it is correct to place that in the same file (even worse, sprinkled through the source code). I have never seen this done, but perhaps those comment headers that some programmers are so fond of should be hived off into included files. I wonder if we could persuade some implementor to provide a special switch that pulls in and displays comment files or not as the programmer wishes; sort of like the habit I have seen some programmers use of making comments white on white when they want to focus on the source.
Unfortunately the comments that most of us were first exposed to were intended to help us learn a programming language and had little to do with the needs of production code. A line such as:
i++; // increments i
is fine in a chapter that is introducing increment/decrement operators yet it rapidly becomes silly. Unfortunately many authors fail to make this kind of special use of comment clear. They tell you the syntax of commenting but never explain the style/semantics of comments.
Let me give you a slightly more extended example. At novice level you might see:
Mytype(); // declaration of explicit default constructor
However at expert level you might see:
Mytype(){};   // required to reinstate a default ctor because
              // a copy ctor has been declared
The comment for a novice is vacuous for an expert, while the expert example would be deeply confusing for a novice.
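The language rule that the expert-level comment relies on can be checked mechanically. What follows is only a minimal sketch, assuming a C++11 or later compiler for `<type_traits>` (which postdates this article); the class names `NoDefault` and `Reinstated` and the function `default_ctor_rule_holds` are illustrative and not from any real code base.

```cpp
#include <cassert>
#include <type_traits>

// Declaring any constructor, including a copy ctor, suppresses the
// implicitly declared default ctor.
class NoDefault {
public:
    NoDefault(const NoDefault &) {}
};

// Writing the default ctor explicitly reinstates it.
class Reinstated {
public:
    Reinstated() {}                      // reinstates the default ctor
    Reinstated(const Reinstated &) {}    // user-declared copy ctor
};

// Returns true when the two classes behave as the expert comment claims.
bool default_ctor_rule_holds() {
    return !std::is_default_constructible<NoDefault>::value
        &&  std::is_default_constructible<Reinstated>::value;
}

// Illustrative self-check of the rule.
void check_default_ctor_rule() {
    assert(default_ctor_rule_holds());
}
```

A reader who already knows this rule needs only the short expert comment; a novice needs the rule itself explained somewhere other than in the source.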
If code is well written, it needs little in the way of comment. Programmers who insist on commenting the purpose of a variable are almost certainly choosing poor names for their variables. The use of such comments actually serves to perpetuate poorly written code because it acts as a crutch to minimise the damage done by a poor coding style. Many commenting styles are akin to the way that formal mathematical proofs are presented. But any likeness is purely superficial. The reasons part of a mathematical proof is essential because the writer is often making large leaps based on insights that may not be obvious to others; source code is not like that.
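The point about variable names can be illustrated with a small sketch. Both functions below are hypothetical examples written for this purpose; they compute the same thing, and only the second can be read without its comment.

```cpp
#include <cassert>

// The comment props up names that say nothing.
int convert_commented(int m) {
    int t = m * 60;   // t is the elapsed time in seconds
    return t;
}

// The names carry the meaning, so the comment would be redundant.
int minutes_to_seconds(int elapsed_minutes) {
    int elapsed_seconds = elapsed_minutes * 60;
    return elapsed_seconds;
}

// Both versions compute the same result; only readability differs.
void check_equivalence() {
    assert(convert_commented(3) == minutes_to_seconds(3));
}
```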
Go and study the commenting styles of several reputable authors. Why do you think that Scott Meyers (Effective C++ and More Effective C++) uses a lot of end of line comments while Matt Austern (Generic Programming and the STL) hardly ever does?
Try to avoid adopting some author's commenting style until you understand it. At that stage you may come to understand that it is inappropriate for your purposes. Comments are written in a context that is much wider than just the source code they are attached to. Almost all the religious wars about comments arise out of a failure to grasp this fundamental point. Perhaps student code does need masses of comments to help the instructor follow the sometimes contorted logic of the novice, but the instructor that is more concerned with quantity than quality is missing the point (and doing a grave disservice to his clients). Learn to distinguish between commenting for study and commenting for maintenance: they meet totally different objectives.
No, I am not going to waste your time by raking over the well trodden territory of code layout. But it is worth thinking about how the different popular formats work in different contexts. The traditional K&R style is reasonably economical when placed on the printed page. Some of the other currently popular formats seem to emphasise the pattern of the code at the cost of space. Unfortunately many coders then hide the code pattern with excessive commenting in the body of the code. Actually I like formats that provide me with plenty of whitespace (and blank lines) if I am going to print the code out so that I can work on it with a traditional writing tool (pen or pencil). On the other hand I feel uncomfortable with layout rules that result in functions exceeding one screen, or that result in lines wrapping. Code should be readable and the layout should encourage reading.
One of the odd features of the guideline that data should be private (often elevated to a divinely inspired rule) is that there is no way of applying it in several of the common OO programming languages. This may explain why some writers have so much difficulty with it. If you are doing plain procedural programming it may seem a silly burden to impose. In addition it may get in the way when you are writing a text on some particular type of programming. For example, if you are writing about 2D graphics programming it would seem to be reasonable to use:
struct coordinates { double x, y; };
even in a book where the programming language is C++. You want to focus on algorithms for 2D graphics. However if you are developing code for a commercial 2D graphics library you might be better advised to write:
class coordinates {
public:
    double x() const;
    coordinates& x(double);
    double y() const;
    coordinates& y(double);
    coordinates(double x = 0.0, double y = 0.0);
private:
    double x_m, y_m;
};
And before you start picking at that code, remember the context here is one of highlighting style in the context of why the code is written. While I am writing about this specific topic, note that we should distinguish between co-ordinates and mathematical vectors. They have many similarities but there are also differences. Both rely on ordered pairs, but it makes sense to add two vectors but not to add two sets of co-ordinates. You can even add a vector to a set of co-ordinates (and get a set of co-ordinates returned). Does this mean that both should be derived from a common base class? In a context where even the slightest element of code bloat hurts the answer is probably yes but I would argue for no in general. If you need an ordered pair neither a 2D vector nor a pair of co-ordinates will do. They may be very similar but they are not, to my mind, substitutable.
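The distinction between co-ordinates and vectors can be expressed directly in the type system. The sketch below is one possible arrangement, not a recommendation: the names `point2d` and `vector2d` are hypothetical, plain structs are used only to keep it short, and the two types deliberately share no base class.

```cpp
#include <cassert>

// A displacement: adding two vectors yields another vector.
struct vector2d {
    double dx, dy;
};

// A position: deliberately unrelated to vector2d (no common base class).
struct point2d {
    double x, y;
};

// vector + vector -> vector
inline vector2d operator+(vector2d a, vector2d b) {
    vector2d r = { a.dx + b.dx, a.dy + b.dy };
    return r;
}

// point + vector -> point (moving a position by a displacement)
inline point2d operator+(point2d p, vector2d v) {
    point2d r = { p.x + v.dx, p.y + v.dy };
    return r;
}

// point + point is intentionally not declared, so an attempt to add
// two sets of co-ordinates is rejected at compile time.

// Illustrative use: move a point by the sum of two displacements.
void check_point_arithmetic() {
    point2d origin = { 0.0, 0.0 };
    vector2d step = { 1.0, 2.0 };
    point2d moved = origin + (step + step);
    assert(moved.x == 2.0 && moved.y == 4.0);
}
```

Because neither type can stand in for the other, the compiler enforces the lack of substitutability argued for above.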
There are some who would argue that the data in this example should be protected so that derived classes have free access to the base class data. Indeed if you come from Smalltalk you might wonder what the problem is.
In the above example there would seem to be little cause to waste time in debating the issue because you have already granted complete access via access functions that expose the underlying data types. How often do you see authors raise this question when dealing with issues of access and data hiding? If you want to preserve your rights to change data structures as a class designer you need to do something like:
class coordinates {
public:
    typedef double value_t;
    value_t x() const;
    coordinates& x(value_t);
    value_t y() const;
    coordinates& y(value_t);
    coordinates(value_t x = 0.0, value_t y = 0.0);
private:
    value_t x_m, y_m;
};
Much more important than knowing 'data should always be private' is understanding the implications of what you write. Sound bites do not transmit wisdom they only act as convenient tags to recover wisdom.
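To see what the typedef buys, consider client code written against `coordinates::value_t` rather than `double`. This is a sketch only: the member functions are defined in the class purely to keep the example self-contained, and the helper `x_distance` is a hypothetical client function invented for illustration.

```cpp
#include <cassert>

class coordinates {
public:
    typedef double value_t;   // the only place the representation is named
    coordinates(value_t x = 0.0, value_t y = 0.0) : x_m(x), y_m(y) {}
    value_t x() const { return x_m; }
    coordinates & x(value_t v) { x_m = v; return *this; }
    value_t y() const { return y_m; }
    coordinates & y(value_t v) { y_m = v; return *this; }
private:
    value_t x_m, y_m;
};

// Client code that names coordinates::value_t instead of double keeps
// compiling unchanged if the class designer later alters the typedef.
coordinates::value_t x_distance(const coordinates & a, const coordinates & b) {
    coordinates::value_t d = b.x() - a.x();
    return d < 0 ? -d : d;
}

// Illustrative use, including the chained setters.
void check_coordinates() {
    coordinates a(1.0, 2.0), b;
    b.x(4.0).y(6.0);
    assert(x_distance(a, b) == 3.0);
}
```

Clients that write `double` directly still compile today, which is exactly why the exposure goes unnoticed until the representation changes.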
There is a considerable body of experts who will teach you to avoid inline (and more specifically defining member functions in the class interface) as if it was some serious disease. On the other hand, Stan Lippman would criticise the above coordinates class because the member functions had only been declared and not defined. In the context of the kind of intensive graphics programming that he is accustomed to (Disney is a client of the company he works for) I think he is quite correct. However in the context of many other types of work he is probably wrong. What is wrong is making such decisions about such fundamental classes without understanding the implications.
Another issue that must be understood here is the interaction between inline and virtual. This does not mean that it is wrong to define a virtual function as inline, but you need to understand that the inline feature will only apply when the compiler can apply static binding.
For example:
#include <iostream>
using std::cout;
using std::ostream;

class Base {
public:
    virtual void printon(ostream & out = cout) const {
        out << "Base class\n";
    }
};

Base & b_g() {
    static Base * b = new Base;
    return *b;
}

int main() {
    Base b;
    b.printon();            // OK can place code inline
    Base * b_ptr = new Base;
    b_ptr -> printon();     // OK can place code inline
    b_g().printon();        // will almost certainly require indirection
    return 0;
}
Note that in this example you will get code bloat unless the compiler stores a single copy of the message (string literal) and uses that via a pointer everywhere it is needed. I would not bet that compilers will be that smart (remember that the class would normally be in a header file that was included into several translation units).
There are a lot of things that compilers could do to help us but too often they do not; instead they cater to the needs of the implementer's marketing department. An example in the above is that local static. For it to work reliably in a multi-threaded environment such static objects must be created atomically. In other words all other calls to b_g() must be locked out until the Base object has been constructed.
Actually, in this case, any subsequent write access to the reference returned must also lock against other accesses until completed. In fact just about all kinds of static variable are a menace in multi-threaded code. Commercial components and libraries MUST start certifying whether they are thread safe in general and specifically whether they are thread safe for multiple accesses to the same variable.
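Both requirements can be sketched in code, with one important assumption: the example needs a C++11 or later compiler, which postdates this article. From C++11 onwards the language requires that a local static be initialised exactly once even when several threads call the function concurrently, and `std::mutex` is available to guard later writes. The class `counter` and the function `shared_counter` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <mutex>

// A hypothetical shared object whose accesses are serialised.
class counter {
public:
    counter() : value_m(0) {}
    // Every write locks out other accesses until it completes.
    int increment() {
        std::lock_guard<std::mutex> lock(guard_m);
        return ++value_m;
    }
    int value() const {
        std::lock_guard<std::mutex> lock(guard_m);
        return value_m;
    }
private:
    mutable std::mutex guard_m;   // mutable so that value() can lock it
    int value_m;
};

// Assumes C++11 or later: concurrent callers wait until the one
// initialisation of c has completed.
counter & shared_counter() {
    static counter c;
    return c;
}

// Illustrative single-threaded check that every call sees one object.
void check_shared_counter() {
    int before = shared_counter().value();
    shared_counter().increment();
    assert(shared_counter().value() == before + 1);
}
```

The language guarantee covers only the construction of the static; the mutex inside `counter` is what protects the subsequent accesses, and a library still has to document which of the two it provides.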
I could write for most of a week on items such as the above but I have not the time and your editor has not got the space. More than that, I would be taking away the opportunity for more of you to contribute.
To my mind a religious issue is one where experts entrench themselves in lines of battle and start talking in terms of absolutes. You get a great deal of noise (sound bites shouted ever more forcibly) and hardly any light. What we need is more light. I have no problem with Stan Lippman advocating a considerable amount of inlining (and even highlighting that one cost of using the virtual mechanism is that you will normally lose such potential efficiency gains). What I do have a problem with is advocating it without pointing out the possible costs of doing so.
To some extent I do not care what you do as long as you understand the costs and benefits. Religious beliefs are held without reason (even though the originators may have had good reasons). Those holding them often have closed minds. Unfortunately those attacking religious beliefs often have equally closed minds on the other side. What we need is understanding.
Engineers are often pragmatists, which is fine as long as you have a clear understanding that what works for water pipes may not be sensible for gas ones. When you use the nearest pipe as an earth, do you stop to ask what kind of pipe it is? Do you know if it makes any difference?
Like idioms and patterns, guidelines are a mechanism for encapsulating the hard won experience of experts. Like the former we must understand the context and what we are seeking to achieve.
Now pick your item and write about the rationale behind it. You might be surprised at how often the most widely accepted statements lack the universality they claim. On the other hand, once you understand guidelines such as 'do not use goto' you may discover that they have wider applicability.
If you put your minds to it there is sufficient material to run a column this size for at least the next five years.
Overload Journal #33 - Aug 1999 + Project Management