CVu Journal Vol 11, #5 - Aug 1999

Title: Being Idiomatic

Author: Administrator

Date: 03 August 1999 13:15:32 +01:00

The art of writing requires far more than knowledge of the grammar and vocabulary of a language. If you have the slightest doubt about this, just take a few minutes surfing the net. You will find a wealth of rubbish that is grammatically correct and even correctly spelt. The art of programming likewise requires much more than a working knowledge of a computer language. I sometimes see a phrase such as 'x has immense expressive power' applied to a computer language. Expressive power is important, and one of the most important tools for expression is the selection of names. Some languages, such as the older dialects of FORTRAN, are very restrictive in the names they allow. Actually, the target users probably thought FORTRAN was very liberal, as mathematicians and engineers are great exponents of the art of conciseness.

Let me take a little time reflecting on simple high-school algebra and simple scientific formulae. Which of the following looks strange to you:

age x age + 4 = -4 x age

or

x² + 4x + 4 = 0

Of course they contain almost exactly the same information. The difference is that the second uses the conventional mathematical abstraction and idiomatic form, which lets you quickly identify the algebraic content divorced from any real-world context. The first clutters up the form with a variable name that indicates a context (in which the solution is probably unexpected) and so confuses further. One of the difficult steps that algebra demands of us is the ability to abstract from real-world problems to stylised mathematical forms. I am sure that some of my readers can appreciate what a struggle that can be, whilst for others mathematical training makes the procedure obvious.

I find it curious that one of the very early style rules for programming is 'use meaningful variable names.' Yes, please do. Remember that meaningful is governed by context. In the context of mathematics, single-letter variable names are normal and anything else is not.

While I am on the subject, note how mathematics uses an '=' sign with two distinct meanings. In the above equations the '=' is used as a way of stating a requirement. However when I write 'x = -2' I usually expect the reader to understand that as a purely factual statement (the value of x is negative two).

Learning the idioms of mathematics is important (and some of my maths teachers did not seem to realise that many problems can be solved correctly by unconventional, non-idiomatic methods). Let me give you a little example. Given the following simple arithmetic problem:

1 1/2 ÷ 2/3

Which of the following is a better way of obtaining an answer:

3/2 ÷ 2/3 = 3/2 x 3/2 = 9/4

or

3/2 ÷ 2/3 = (3 x 3)/6 ÷ (2 x 2)/6 = 9/4

Those who have been rigidly drilled in the standard way to handle fractional division may take some time to realise that the second form actually applies understanding and is consistent with the standard mechanism for addition and subtraction of fractions. Go one step further and muse on your reaction to the following:

1 1/2 x 2/3 = 3/2 ÷ 3/2 = 1

Be honest, if you saw your child do that in their homework your first reaction would be that they did not understand what they were doing and that the method was wrong. It may be unconventional but it is entirely correct. The trouble is that such unconventional methods make it harder for others to follow your reasoning, doubly so when you arrive at the wrong answer through some error in your calculation even though the method is right.

As a parent you would rightly expect your child's teacher to give full credit for the correctness of the method. However you might have cause for concern if the child was never directed onto the idiomatic path. By the way that is one of the fundamental flaws in the concept of the discovery teaching style. Children need not only to learn to solve problems but also to communicate their methods to others. Using common idioms is part of the process of communication. An educational system that teaches idioms by rote and never helps its students to grasp the value of abstraction is a failure, but a system that never teaches idioms and ignores abstraction is just as much a failure.

And So to Programming

Now let me relate this to programming. What is a correct program? The pragmatic answer is one that produces the correct output in response to given input. Unfortunately, too many of our academic institutions treat that as the only criterion by which to assess their students' work. Some add a refinement that individuals must produce programs that are different from each other. How many comment on the use of idiom? (Actually, how many of the tutors have any idea as to what the idioms are?) How many take any time to discuss the choice of variable names? By the way, just because a language such as C++ allows very long variable names is no justification for using long names. I know of no other discipline where those teaching so easily accept crude solutions as being of equal merit to well-honed ones. What would you think of a woodwork teacher who considered satisfactory a box made by gluing four pieces of wood to a base and hammering a few nails in to join the sides? Even more, how would you feel about such a teacher if they gave such an effort equal marks to one where the sides had been carefully dovetailed together?

Consider the following code snippet (validating some input) from a student:

if ( bar(x) ) printf("%d", foo(x));
else printf("Error");

When challenged, the student defended the code on the grounds that foo and bar were standard names for functions (the books he had read were full of their uses). The tutor objected to the control expression on the grounds that it was not a boolean one. To which the student promptly replied that of course it was, bar returned a bool (note that this code was in C).

When pushed, this was the tutor's code:

 if ( check(x) == TRUE) printf("%d", calculate(x));
else printf("Error");

Do you think that was an improvement? (Forget about layout wars for the moment)

It is now worse. The function names are no more informative than they were before, and the boolean expression will now evaluate to false unless the value returned from check is exactly the one that represents TRUE. You can patch that last problem up with:

 if ( check(x) != FALSE) printf("%d", calculate(x));
else printf("Error");

Now we have a clumsy boolean expression which requires thought to see that it is correct. Remember that most people are very poor at writing correct logic statements.

Now consider my version for the same code:

 if ( not_negative(x)) printf("%d", square_root(x));
else printf("Error: %d is negative and out of range for a square-root function", x);

I have backed off from full abstraction by using descriptive function names. This is not mathematics, and the criteria for value-added abstraction are different. One immediate result is that the first printf becomes suspect: surely a square_root function will return a double?
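To see why the descriptive name exposes the problem, here is a minimal sketch of what the two helpers might look like. Both definitions are assumptions (the student's code never showed them), x is assumed to be an int, and square_root uses a simple Newton iteration as a stand-in for the library's sqrt so that the sketch needs nothing beyond <stdio.h>. The helper format_root is likewise hypothetical.

```c
#include <stdio.h>

/* Hypothetical predicate: nonzero when x is zero or positive. */
static int not_negative(int x)
{
    return x >= 0;
}

/* Hypothetical stand-in for sqrt: Newton's iteration on a non-negative int.
   The point here is the return type, which has to be double. */
static double square_root(int x)
{
    double guess;
    int i;

    if (x == 0)
        return 0.0;
    guess = x;                      /* any positive start converges */
    for (i = 0; i < 40; ++i)
        guess = 0.5 * (guess + x / guess);
    return guess;
}

/* Writes the root of x, or the error message, into buf. */
static void format_root(char *buf, size_t size, int x)
{
    if (not_negative(x))
        snprintf(buf, size, "%f", square_root(x));   /* %f matches double */
    else
        snprintf(buf, size,
                 "Error: %d is negative and out of range for a square-root function", x);
}
```

Once the return type is written down, "%d" is plainly the wrong conversion for the result and "%f" is the one that matches a double.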

Despite all that has been said to the contrary, code that compares a return value to TRUE or FALSE almost always contains hidden flaws. Writing your code without such comparisons will be no less robust and frequently more so.
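The hidden flaw in comparing against TRUE can be reproduced with a small sketch. It assumes TRUE and FALSE are the usual macros for 1 and 0, and it uses a hypothetical predicate, is_flag_set, that follows the common C convention of returning "some nonzero value" for success instead of exactly 1. Neither the macros nor the function come from the tutor's code.

```c
#include <stdio.h>

#define FALSE 0
#define TRUE  1

/* Hypothetical predicate: returns the masked bits, so a successful
   test yields a nonzero value that is not necessarily 1. */
static int is_flag_set(unsigned value, unsigned mask)
{
    return (int)(value & mask);
}

/* Prints how each style of test treats the same return value. */
static void show_comparisons(unsigned value, unsigned mask)
{
    printf("== TRUE  : %s\n",
           is_flag_set(value, mask) == TRUE  ? "accepted" : "rejected");
    printf("!= FALSE : %s\n",
           is_flag_set(value, mask) != FALSE ? "accepted" : "rejected");
    printf("bare test: %s\n",
           is_flag_set(value, mask)          ? "accepted" : "rejected");
}
```

Calling show_comparisons(6u, 4u) makes is_flag_set return 4, so the comparison with TRUE rejects a value whose flag is set, while the comparison with FALSE and the bare test both accept it. The bare test is also the shortest of the three to read.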

Be wary of too much abstraction. Function names should clearly identify what they do. foo and bar are fine when the focus is on issues of syntax but they have no place in code that is going to be compiled in order to produce output.

Even my version of the code raises several points and I would be quite happy to have a tutor argue that it would be better to write:

 if ( !(x < 0)) printf("%f", sqrt(x));
else printf("Error: %d is out of range for a square-root function", x);

What I am not happy with is the idea that thoughtless idioms and over-abstract function names should be let go without comment. Good function names, along with good variable names, are a prerequisite for good programming. Choosing them can take time (though it gets easier with practice). The pay-off comes in producing readable code that expresses your intentions both to the compiler and to other programmers.

Now, as always, I would love to hear your opinions.
