CVu Journal Vol 11, #2 - Feb 1999

Title: More About Names

Author: Administrator

Date: Wed, 03 February 1999 13:15:29 +00:00

Our ancestors believed that knowing the true name of an object or person gave you power over it. They distinguished between a public name and the true name. The former could be anything but the latter was in some sense special and unique. While we might not agree with the mysticism associated with this world-view, our ancestors had a large measure of truth in their belief. For example, knowing a person's name does indeed give you power over them in many ways. You can call for their attention, you can locate them in a directory, you can identify them to the police, etc. None of these things work if you do not know their true name but only an alias or nickname (though both give you a bit more power than if you have no identifier).

In programming, good names for identifiers add to our power. Well-defined concept names help in communication. Do you get as tired as I do of people who throw around terms such as 'object-oriented' or 'polymorphic' as if there were a single universally accepted meaning for these words? Such an assumption leads to a great deal of confusion and talking at cross-purposes.

If you track comp.lang.c++.moderated you will have seen Francis' recent postings on the subject of 'value type'. Francis has a very simple view: if something can be passed by value and assigned to, then it is a value type. Not everyone agrees but at least he has a rigid and verifiable criterion. By contrast, Francis maintains that although something that does not meet those requirements is not a value type, it is not necessarily an object type (it might be an abstract type, some kind of interface type, etc.). From Francis' perspective function and array types are not value types in C while everything else is. In both C and C++ enums must be value types.

In Java something is a value type if and only if it is a built-in type. This is an essential property of Java types and no amount of hacking will change that. However much we might wish it otherwise a Java String is not a value type.

Now this is rather by the way, because I set out to write a little more on the subject of choosing identifiers. Laziness in choosing names leads to hard-to-read source code. A name should carry a strong hint as to what you are doing. One of the banes of my life is the habit that some programmers have of sticking in 'get', 'set' or 'put' as a prefix for a function name (I do it myself sometimes but only in moderation).

What do you think get_int() should do? If you think that is obvious (I am not so sure), what about get_month()? If you were a C programmer you would probably expect that function to extract a month from some source stream, but a C++ programmer probably expects it to be a member function of a Date class. These different expectations make reading mixed code more difficult than it need be. Part of the problem is the oft-quoted guideline that function names should be verbs or verbal phrases. Frankly, I think that is silly. Certainly functions that do things should follow that guideline. I have no problem with print(), printOn(), scan() or scanFrom(), but I have a very big problem with a get_month() that does nothing but return a value. The simple name month() would seem appropriate for a member function in such an instance. And while we are on the subject, I would be deeply suspicious of a member function called set_month(). Apart from anything else it suggests that there is an actual data member representing the month. Consider:

class Date {
// private implementation
public:
  Date(Day = 1, Month = 1, Year = 1900);
  Day day();
  Month month();
  Year year();
// other interface elements
};

(This assumes that I have provided Day, Month and Year types; at the simplest these could be typedefs for some integer type, preferably at namespace scope.) If I want to change the month I can write something like:

bool testing() throw(){
  try {
    Date d(2,3,1998);
    d = Date (d.day(), 5, d.year());
  }
  catch(...) { return false; }
  return true;
}

You may think that mechanism will be rather inefficient. You are probably right from a runtime consideration but how often do you think you should change a single attribute of a date? By comparison you have far less code to maintain in the Date class and so will save a lot of your time.

Every time you find yourself tempted to write a 'set' member function, stop and ask yourself whether it is worth the development and maintenance cost.

Let me get back to the concept of a month() member function. I often see people argue that it is OK to return a const reference to a private data member. I think this is definitely not OK. Let me explain.

One reason for private data is to support the concept of data-hiding, an essential part of the development of abstract data types. As soon as you allow a const & return you have broken your data encapsulation. Suppose that your initial design of a Date class had discrete data members for the different attributes of a date (day, month and year) and you chose to make your month() function return a Month const &. At that stage there is no problem because there is a real piece of data to which the return value can be bound.

When you later decide that it would be better to handle the representation of Date data with a Julian day you are in deep trouble. To what are you going to bind the return value? In this case you cannot even use the typedef trick by which you can allow for late changes of attribute types because the problem is that there is no data member to bind to the reference.

Returning a const & results in a long-term commitment to maintain an appropriate data member. If you are happy to make that commitment, fine. However experience will almost certainly teach you not to enter into such a commitment lightly. Attributes should nearly always be returned by value.
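As a rough sketch of the point, here is a Date whose only data member is a day count, with the attributes computed on demand and returned by value. The int typedefs, the day_number() accessor, the helper names and the choice of 1 January 1970 as day zero are all assumptions made for illustration (a true Julian day number uses a different origin), and the conversion arithmetic is a standard proleptic Gregorian calculation rather than anything prescribed above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Day, Month and Year types.
typedef int Day;
typedef int Month;
typedef int Year;

class Date {
public:
    Date(Day d = 1, Month m = 1, Year y = 1900)
        : days_(to_day_number(d, m, y)) {
        assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
    }

    // Returned by value: there is no Day, Month or Year member to refer to.
    Day day() const { return to_civil(days_).d; }
    Month month() const { return to_civil(days_).m; }
    Year year() const { return to_civil(days_).y; }

    // Illustrative accessor: days since 1 January 1970 (assumed origin).
    long day_number() const { return days_; }

private:
    struct Civil { Day d; Month m; Year y; };

    // Day, month, year to a day count (proleptic Gregorian calendar).
    static long to_day_number(Day d, Month m, Year y) {
        long yy = (m <= 2) ? y - 1L : y;  // count Jan and Feb with the previous year
        long era = (yy >= 0 ? yy : yy - 399) / 400;
        long yoe = yy - era * 400;                                    // year of era
        long doy = (153L * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // day of year, from 1 March
        long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;             // day of era
        return era * 146097 + doe - 719468;
    }

    // Inverse of to_day_number().
    static Civil to_civil(long z) {
        z += 719468;
        long era = (z >= 0 ? z : z - 146096) / 146097;
        long doe = z - era * 146097;
        long yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
        long doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
        long mp = (5 * doy + 2) / 153;  // month index counted from March
        Civil c;
        c.d = static_cast<Day>(doy - (153 * mp + 2) / 5 + 1);
        c.m = static_cast<Month>(mp < 10 ? mp + 3 : mp - 9);
        c.y = static_cast<Year>(yoe + era * 400 + (c.m <= 2 ? 1 : 0));
        return c;
    }

    long days_;  // the only data member
};
```

Code written against day(), month() and year() returning by value would not need to change if the representation were switched back to three discrete members. A month() declared to return Month const & could not be implemented honestly with this representation at all.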

Global Functions

Most of the above is C++ specific but I think C programmers should also give more consideration to what they call things. Which of the following is more descriptive:

int value = get_int(stdin);

int value = read_int_from(stdin);

int value; if (!assign_int_from(&value, stdin)) { /* handle the failure */ }

In a sense I am not too worried by your answer as long as you recognise that names of functions are not arbitrary.
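To make the third alternative concrete, here is one possible reading of what such a name promises, sketched in C++. The signature and behaviour are assumptions: the function reports success through its return value and only touches the caller's variable when a whole integer was read. A std::istream stands in for the FILE* of the C fragment so that the sketch can be exercised with a string stream.

```cpp
#include <cassert>
#include <istream>
#include <sstream>  // std::istringstream, convenient for trying the function out

// Hypothetical function: on success stores the integer in *value and
// returns true; on failure returns false and leaves *value unchanged.
bool assign_int_from(int* value, std::istream& source) {
    int parsed = 0;
    if (!(source >> parsed)) {
        return false;  // nothing assigned, as the name implies
    }
    *value = parsed;
    return true;
}
```

At the call site the name reads as a sentence: assign an int from this source, and tell me whether it worked.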

Every time you name a function you should be asking yourself what the name will communicate to the user. Of course there are various conventions to keep names under control. Not all the conventions are sensible. For example, printf() in C could just as well have been called print(). The fact that the function does formatted printing is unimportant; printing is always formatted. We do not have a print-unformatted function. On the other hand the f prefix to fprintf() is useful, not least because the output stream (file) is the first parameter. We quickly learn that the prefix letter in these families of function names identifies the first parameter type. I do not think I would endorse such terseness in modern environments where compilers/linkers can handle longer identifiers, but as long as the convention is well understood it does little harm.

There should always be a clear indication in functions that change things that this is what they do. I do not accept that the C habit of relying on the visibility of taking an address is sufficient in such cases. When I write a function that can mutate an array, how do I know this just by reading the code? Words such as 'change', 'update', 'modify' etc. should be part of a mutating function's name. We have the resources to do better and we should use them.
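A small sketch of the difference, with both function names invented for illustration: the first only inspects the array and is named for the value it returns, while the second changes the array and says so in its name.

```cpp
#include <cassert>
#include <cstddef>

// Non-mutating: named for what it returns, and the array is const.
int sum_of(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Mutating: 'modify' in the name warns the reader that the array changes.
void modify_by_scaling(int* values, std::size_t count, int factor) {
    for (std::size_t i = 0; i < count; ++i) {
        values[i] *= factor;
    }
}
```

A call such as modify_by_scaling(data, 3, 2) passes the array without any visible address-of operator, so the name is the only signal at the point of use that data will be different afterwards.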

Conclusion

Think about what names tell you. They are a vital element in making code readable. If you think it necessary to add a comment to a function's declaration to explain what it does then the name is no use. We want to know, at a glance, what a function does at every point of use. Comments at the point of declaration are of little value and at the point of definition they are useless.

Functions with good names describe themselves.

Unfortunately there are so many counter pressures to using good names. One of these is that the majority of programmers are men and men are rather bad at words (that is not sexist, it is a measurable difference between men and women, though there will be exceptions). Men find it difficult to think of suitable names. However even this handicap can be overcome with practice. Try reading your code aloud (another reason to abhor weird prefix notations). Does it make sense? It should do. You should be able to put your source code through a speech synthesiser, sit back, close your eyes and listen to it.

That gives me a final thought to finish on. It would be nice to have a speech synthesiser that would read source code sensibly (e.g. read // as 'comment'). Now that would make a worthwhile student project as it requires understanding of a number of different things. In addition the end result would be useful to those with defective eyesight.
