Title: Naming Conventions for Spoken Readability
Author: Administrator
Date: Wed, 03 February 1999 13:15:29 +00:00
If you wrote a sentence in a serious document and realised that it made a silly rhyme, would you want to change it? Of course you would, because otherwise the rhyme would distract from what you are saying. But what about the following code?
Colour colour = new Colour (Colour::red);
This might be read as "colour colour equals new colour colour red". Most of the information here is conveyed in the symbols and the case, rather than in the words. It uses a common naming convention that class names should begin with a capital letter and instance names with a lower-case one, and an all-too-common practice of calling an instance the same thing as its class, but for a case difference.
Naming conventions are useful for avoiding confusion. For example, two ways of avoiding confusion between parameters and member data are:
Method 1: Prefix "the" to all member data
MyObject::MyObject(int someData) { theSomeData=someData; }
Method 2: Prefix "in_" to all parameters
MyObject::MyObject(int in_someData) { someData=in_someData; }
Of course, you can use 'this->someData=someData', but that is more confusing, especially if you try the equivalent trick in the constructor initialiser list:
MyObject::MyObject(int someData) : someData(someData) {}
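To make the two conventions concrete, here is a minimal sketch (the class names here are invented purely for illustration) of what each prefix style looks like in a complete class:
class Widget1 {                      // Method 1: "the" prefix on member data
public:
    Widget1(int someData) { theSomeData = someData; }
private:
    int theSomeData;
};
class Widget2 {                      // Method 2: "in_" prefix on parameters
public:
    Widget2(int in_someData) { someData = in_someData; }
private:
    int someData;
};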
The point: It helps if you have a naming convention, but some naming conventions are better than others. Hungarian notation is a naming convention and I think I'd better not get into that debate (if I had any bricks through my window I'd be liable ☺).
The colour example is confusing to blind people working with speech synthesisers (which are not ideal interfaces for working with code - Braille is much better but not so widely available), and I'm considering writing some kind of filter to sort people's code out for them, although I'm not sure about the feasibility of this in the general case. You might be able to tell the difference between Colour and colour by context (or get your synth to tediously read out the case of every letter), but "possible" is not the same as "ideal".
[I think you need a context-sensitive speech synthesiser. Now that would be an interesting project. - Francis]
If I tried to persuade everyone to adapt their naming conventions for blind people, I'd just get accused of being extremist. But speech synthesis is not the only time you might want to read code. When discussing code (especially over the telephone) you need to convey it orally, preferably in a reasonably efficient and easy-to-understand way. Many people think verbal thoughts while reading to themselves, and if code is easy to read verbally then this can help. So I think that, where possible, consideration should be given to making code verbally readable.
As an example of this, compare the following:
shape.draw(screen);                  shape.drawTo(theScreen);
something.writstat(true);            something.setWritingStatusTo(true);
something.writstat();                something.getWritingStatus();
Colour colour = new Colour();        Colour drawing_col = new Colour();
I don't want to lay down any firm rules, since every piece of code is different. If experienced code authors make even a vague effort to keep their code verbally readable, they should have little problem doing so (in whatever language, provided the language itself allows readable code), and I'm convinced it can only be a good thing. I do, though, tend to prefer beginning 'set' and 'get' methods with 'set' and 'get', so that it's obvious you're talking about a method and not a field.
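As a sketch of that preference (the class and method names are invented here for illustration), accessors beginning with 'set' and 'get' read aloud as obvious actions on a field:
class Printer {
public:
    bool getWritingStatus() const { return writingStatus; }
    void setWritingStatusTo(bool in_status) { writingStatus = in_status; }
private:
    bool writingStatus;              // the field itself stays a plain noun
};
Read over the telephone, "printer dot set writing status to true" leaves little doubt about what the code does.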
What about identifiers with many words in them? One common convention is to begin each new word with a capital letter, and this is the one that Java uses. Fortunately some speech synths can be told to begin a new word at every capital, so the pronunciation is not garbled. If you are reading the code visually though then it can depend on the font you're using - in particular, the relative dimensions of capital and lower case letters. If it is not obvious where the capitals are then you have to look at the identifier for longer to find them before you can split it into words. I can change the font, but fully sighted people have a general tendency to not know much about fonts. But whether it is a good thing or not can depend on the identifier - are the words long or short? Are there more than two or three words in the same identifier? (If so then it can be a problem!) Is one word ending with a letter that looks similar to the capital of the next? And so on.
The alternative is to use underlines (_) to separate the words. This can make for longer identifiers and slower visual reading (because you tend to 'notice' the underlines even if you don't say 'underline') and can also slow down your typing (touch-typing an underline can take a significant fraction of a second), but there should be no problem in picking out the words. Speech synths can be told to treat underline as whitespace.
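For example (invented names, purely for illustration), the same identifier in the two styles:
int maxRetryCount;                   // capitals mark the word boundaries
int max_retry_count;                 // underlines mark them instead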
If you can't touch-type (and I'm surprised at the number of programmers who can't), or you have one of those badly-designed laptop keyboards or a Macintosh (which puts the tactile markings in confusingly non-standard places), you may still be able to write long identifiers quickly by using one of those editors that let you type in part of an identifier and complete it for you.
Some people find identifiers written entirely in capital letters hard to read, and such names are sometimes used for pre-processor macros. It can depend on their frequency of use though - if one is only ever used about twice in a large program then it is more acceptable than one that appears every other line.
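A small sketch of the difference (the macro name is invented): an all-capitals name defined once and used a couple of times is easy to live with, whereas the same shouting style on every other line soon becomes tiring, both to look at and to listen to.
#define MAX_BUFFER_SIZE 4096         // defined once, read aloud rarely
char buffer[MAX_BUFFER_SIZE];        // one quiet use is quite acceptable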
Of course, some programmers will find that their English is not yet good enough to judge how readable a piece of code with English identifiers is. Some "localised" compilers let you write identifiers in languages whose characters fall outside the ASCII set. For example, I recently came across someone who had learned C++ with a "Chinese compiler" that let you have Chinese identifiers. The problem with this is that it ties you to one particular compiler and compromises the portability of your C++. What happens to your code if that compiler is no longer supported in the next decade, and the machines it runs on are ancient and no longer used by anyone? What if you want to distribute your code in source form, or port it to some other platform? If you must write identifiers in these languages then it would be better to use their romanised forms (Pinyin, Romaji etc) - Pinyin with tone numbers, written without brackets, will compile OK.
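For instance (an invented identifier, assuming the tone numbers are written inline), the romanised form is plain ASCII and compiles with any conforming compiler:
int yan2se4 = 0;                     // "colour" spelt in numbered-tone Pinyin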
I thought of hacking out a converter for Chinese C++ that generates ASCII identifiers to replace the Chinese characters (and substitutes hex sequences in string literals), so that the code can be compiled with any other compiler. However, I then found out about one localised compiler that not only allows Chinese characters in identifiers but also re-defines the behaviour of the string functions to cope with them. This is so horribly non-standard that it is almost impossible to port code between that compiler and any other in any reasonable length of time. How can a compiler writer know that, whenever a programmer writes strlen(), she means "number of characters in the string"? I could write strlen() when I actually wanted to know the number of bytes. What these localised compilers are effectively doing is creating another language, superficially like C or C++ but too proprietary to interact fully with it. It may still be possible to write a converter from the 'localised' version to the real language, but you would have to re-write half the standard library at the same time.
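The ambiguity is easy to demonstrate. The sketch below assumes the string is UTF-8 encoded (the localised compiler in question may well use a different encoding); standard strlen() reports bytes, not characters:
#include <cstring>
#include <iostream>

int main()
{
    // Two Chinese characters ("colour"), hand-encoded as UTF-8 bytes.
    const char* s = "\xE9\xA2\x9C\xE8\x89\xB2";

    std::cout << std::strlen(s) << '\n';   // prints 6 (bytes), not 2 (characters)
    return 0;
}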
Localised versions of languages may increase readability for some people, but I think they defeat the original point of those languages. If you want to create a new language, go ahead, but don't pretend it's interoperable with the existing ones when it isn't.
I think we need to examine the tools that we use for code preparation. We already have quite effective speech-to-text tools but cannot capitalise on them for writing code. Clearly colour can help with our understanding of a piece of code. However, in both cases the tools need to understand context. We need context-sensitive tools. We know what can be achieved and we know that many could benefit, so why doesn't someone get on and do it? An IDE that allowed me to dictate my code, could read it back to me and could use colour to highlight selected features would, I think, have substantial sales. Start telling your tool vendors (and in the USA remind them that they are supposed to actively consider the needs of the disabled).