Journal Articles

CVu Journal Vol 15, #6 - Dec 2003 + Programming Topics

Title: BRACKETS OFF!

Author: Administrator

Date: 03 December 2003 13:16:01 +00:00

The mathematical formula:

              v = u + at

calculates the speed, v, of an object, with initial speed u and constant acceleration a, after time t. Placing the "a" next to the "t" is a convenient shorthand for "multiply a by t", which also makes it apparent that the multiplication must be done before the addition.

When the same formula is written in C, the multiplication operator needs explicit representation:

Example 0)    v = u + a * t

The layout of this expression no longer makes it clear that the multiplication should be done before the addition, so a programmer might choose to parenthesise:

              v = u + (a * t)

Are these parentheses required to guarantee correct evaluation of v? If not, should they be included anyway, to help convey the meaning of the expression? How can coding standards help with such choices?

This article aims to answer these questions. It first presents some examples of the operator precedence and associativity rules in action, then offers some guidelines on when to parenthesise expressions, and finally argues that these guidelines should be replaced by a single rule.

More Examples

Example 1)    x = 8 - 4 - 2
Example 2)    r = h << 4 + 1
Example 3)    str += ((errors == 0) ? "succeeded" : "failed")
Example 4)    *utf++ = 0x80 | ucs >> 6 & 0x6f

The expression presented in Example 0 contains three operators: assignment, addition and multiplication. These operators - indeed all operators - follow a strict precedence which defines how operands are grouped, and hence the order in which the operations are applied. Since multiplication has higher precedence than addition, which in turn has higher precedence than assignment, the expression is equivalent to:

              v = (u + (a * t))

This means the compiler can be trusted with the expression as first presented. No parentheses are required. Good, the language does what we expect.
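As a minimal sketch, the grouping can be checked directly in C. The function names below are hypothetical, chosen only for illustration; the third function shows what the formula would compute if the addition were done first.

```c
/* Example 0 exactly as written, with no parentheses. */
double speed_plain(double u, double a, double t)
{
    return u + a * t;
}

/* The same formula with the grouping spelled out. */
double speed_grouped(double u, double a, double t)
{
    return u + (a * t);
}

/* For contrast: the result if addition were applied before multiplication. */
double speed_addition_first(double u, double a, double t)
{
    return (u + a) * t;
}
```

With u = 2, a = 3 and t = 4, the first two functions both return 14, while the third returns 20.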

In Example 1, subtraction binds more tightly than (i.e. has higher precedence than) assignment, so the subtractions are performed first. Since the binary arithmetic operators all associate left to right, the expression is equivalent to:

              x = ((8 - 4) - 2)
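Associativity matters here because the two possible groupings give different answers. A small sketch, using hypothetical function names, makes the difference visible:

```c
/* Subtraction as written in Example 1: groups as (a - b) - c. */
int subtract_plain(int a, int b, int c)
{
    return a - b - c;
}

/* For contrast: the grouping that right-to-left associativity would give. */
int subtract_right_first(int a, int b, int c)
{
    return a - (b - c);
}
```

Called with 8, 4 and 2, the first function returns 2 and the second returns 6.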

In Example 2, arithmetic operators bind more tightly than shift operators, so the expression is equivalent to:

              r = (h << (4 + 1))

Why did the programmer not write r = h << 5 ? Probably because he really meant:

              r = (h << 4) + 1

but bit shifting (like, say, finding the address of something, or subscripting an array) somehow seems closer to the machine and feels as if it ought to be of higher precedence than addition, so the crucial parentheses were missed^[1].
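The two readings of Example 2 can be compared with a short sketch. The function names are hypothetical, and unsigned is assumed for h simply to keep the shifts well defined.

```c
/* Example 2 as written: the addition is done first, so this is h << 5. */
unsigned shift_as_written(unsigned h)
{
    return h << 4 + 1;
}

/* What the programmer probably meant: shift, then add one. */
unsigned shift_as_intended(unsigned h)
{
    return (h << 4) + 1;
}
```

For h = 3, the expression as written yields 96 (3 shifted left by 5), whereas the intended expression yields 49.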

In Example 3, the parentheses are unnecessary, since the comparison operators bind more tightly than the conditional operator, which in turn binds more tightly than the assignment operators. Do the parentheses help you understand the meaning of this expression? Would you have left them out - and if so, would one of your team-mates have complained?

How should the fourth example be parenthesised, to make its meaning clear? It is equivalent to:

              *(utf++) = (0x80 | ((ucs >> 6) & 0x6f))

which shows how complicated an expression looks when parentheses are added indiscriminately.
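To confirm that the bare and fully parenthesised forms really do agree, here is a sketch that wraps each in a function. The function names, the unsigned char buffer type and the unsigned type of ucs are all assumptions made for illustration; a third variant uses a different grouping to show that the choice matters.

```c
/* Example 4 exactly as written; returns the advanced pointer. */
unsigned char *put_plain(unsigned char *utf, unsigned ucs)
{
    *utf++ = 0x80 | ucs >> 6 & 0x6f;
    return utf;
}

/* The fully parenthesised equivalent. */
unsigned char *put_grouped(unsigned char *utf, unsigned ucs)
{
    *(utf++) = (0x80 | ((ucs >> 6) & 0x6f));
    return utf;
}

/* For contrast: applying the mask last clears the 0x80 bit again. */
unsigned char *put_mask_last(unsigned char *utf, unsigned ucs)
{
    *utf++ = (0x80 | ucs >> 6) & 0x6f;
    return utf;
}
```

With ucs = 0x3C0, the first two functions store 0x8f and advance the pointer by one; the third stores 0x0f.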

Coding Standards and Guidelines

In general - at least, in my experience - coding standards do not provide rules on how to parenthesise expressions. I suspect this is for two reasons. Firstly, although all programmers use parentheses to clarify the meaning of expressions, they may well disagree on what makes an expression clear. Clarity seems a matter of taste. While programmers in a team may agree (to differ) on whether tabs or spaces are to be used for indentation, their coding standard leaves them free to rewrite Example 3 as:

              str += errors == 0 ? "succeeded" : "failed"

And secondly, if a coding standard were to rule on how to parenthesise, it would be difficult to find a middle ground. This leaves as candidate rules the two extremes:

parenthesise everything
never parenthesise

The first quickly leads to unreadable code, and the second seems overly proscriptive. In the absence of a hard rule, here are some guidelines, which I hope are non-contentious, and which may help us reach a conclusion:

have the operator precedence tables to hand and understand how to interpret expressions using them,
understand the logic behind the operator precedence tables, but be aware of the traps and pitfalls,

remember, parentheses are not the only way to make order of evaluation clear. For example, Example 4 could be rewritten:

  *utf++ = 0x80 |
                ucs >> 6 &
                0x6f

or even:

  *utf = ucs >> 6;
  *utf &= 0x6f;
  *utf |= 0x80;
  ++utf;

if an expression is hard to understand, break it down into simpler steps, or extract it out as a function with a meaningful name,
trust the compiler: it might not implement partial template specialisation correctly, but it will get operator precedence right every time,
never use parentheses simply because you aren't sure how an expression will be evaluated without them: treat doubt as an opportunity to learn,
all macro arguments must be parenthesised.
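The last guideline is the one place where extra parentheses are not optional: a macro's author cannot know which operators will appear in the argument text, so the surrounding expression may regroup it. A minimal sketch, with hypothetical macro and function names:

```c
/* Arguments not parenthesised: the argument text is pasted in as-is. */
#define SQUARE_BARE(x) x * x

/* Arguments (and the whole result) parenthesised. */
#define SQUARE_SAFE(x) ((x) * (x))

/* Expands to a + b * a + b, which is not the square of the sum. */
int square_bare_of_sum(int a, int b)
{
    return SQUARE_BARE(a + b);
}

/* Expands to ((a + b) * (a + b)). */
int square_safe_of_sum(int a, int b)
{
    return SQUARE_SAFE(a + b);
}
```

For a = 1 and b = 2, the bare macro gives 1 + 2 * 1 + 2 = 5, while the parenthesised macro gives 9.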

Concluding Thoughts

Any effort put into becoming familiar with precedence tables is likely to pay off across a range of languages. For example, although C++ introduces several new operators over C, there are no surprises. The precedence rules remain in force even if the operators have been overloaded (but that's the subject of another article). Java operator precedence is almost a subset of C's. Similarly, scripting languages are generally compatible with C, even where C's precedence rules are slightly screwy^[2]. So, while Perl introduces lower precedence versions of the logical operators not, and, and or, it ensures that not binds more tightly than and which in turn binds more tightly than or^[3]. Interestingly, in Python, where whitespace is syntactically significant, parentheses can be used not just to indicate order of evaluation, but also to wrap lengthy expressions over several lines.

The more experienced I become as a programmer, the fewer parentheses I use. Coming from a mathematical background, it was several months into my first job before I dared use the conditional operator - and when I finally did start using it, I parenthesised all the sub-expressions for safety. Later on in my career, when I first found myself working with the bitwise operators, again, I enclosed sub-expressions with brackets. As my confidence has increased, the brackets have peeled away.

This, though, is simply evolution. Familiarity with the languages you use makes it easier to read expressions without the unnecessary noise of parentheses. Evolving in this way, however, leaves a programmer vulnerable when working on code written by a more experienced teammate, unless the experienced programmer writes to a lowest common denominator.

Surely it would be better for everyone to program to a highest common denominator. The operator precedence tables are a fundamental part of the language. The rules for using them are simple. Although there are many precedence levels, the operators do group logically. Update your Coding Standards. Prohibit unnecessary parentheses. Brackets off!

References

[Koenig] Andrew Koenig, 1989, C Traps and Pitfalls, Addison-Wesley, ISBN 0-201-17928-8

^[1] This example is lifted straight from [Koenig], from the section headed: "Operators do not always have the precedence you want".

^[2] According to [Koenig], some of C's peculiarities can be blamed on its heritage: "The precedence of the C logical operators comes about for historical reasons. B, the predecessor of C, had logical operators that corresponded roughly to C's & and | operators. Although they were defined to act on bits, the compiler would treat them as && and || if they were in a conditional context. When the two usages were split apart in C, it was deemed too dangerous to change the precedence much."

^[3] This contrasts with C/C++, where not, and and or, if available, are equivalent to !, && and || respectively.
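In C, these spellings are made available as macros by the standard header <iso646.h>. The sketch below, with hypothetical function names, shows that the spelled-out form groups exactly as the symbolic one does:

```c
#include <iso646.h> /* defines and, or, not as macros for &&, || and ! */

/* Written with the alternative spellings. */
int spelled_out(int a, int b, int c)
{
    return not a and b or c;
}

/* Written with the usual symbols. */
int symbolic(int a, int b, int c)
{
    return !a && b || c;
}

/* The grouping both forms share. */
int grouped(int a, int b, int c)
{
    return ((!a) && b) || c;
}
```

All three functions return the same result for every combination of zero and non-zero arguments.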
