Title: Reading C & C++ Variable Declarations
Author: Administrator
Date: 03 July 2000 13:15:38 +01:00 or Mon, 03 July 2000 13:15:38 +01:00
I was never formally taught the C language. In college in 1979 the main teaching language was Pascal. I learnt C by looking over the shoulders of more knowledgeable colleagues in my first job. By studying their program listings it was fairly obvious that int i; declared that i was an integer, that char *cp; declared a pointer to a character and that char buf[100]; declared an array of 100 characters. I soon came to understand that char *argv[] was an array of pointers to characters. But that was about as far as I got. I found more complex declarations puzzling. I was reminded of this recently while reading an interview with Bjarne Stroustrup [Stroustrup2000]:
[Interviewer:] In another interview, you defined the C declarator syntax as an experiment that failed. However, this syntactic construct has been around for 27 years and perhaps more; why do you consider it problematic (except for its cumbersome syntax)?
[Stroustrup's reply:] I don't consider it problematic except for its cumbersome syntax. It is good and necessary to be able to express ideas such as "p is a pointer to an array of 10 elements that are pointers to functions taking two integer arguments and returning a bool." However,
bool (*(*p)[10])(int,int);
is not an obvious way of saying that. In real life, I'd have to use a typedef to get it right:
typedef bool (*Comparison)(int,int);
Comparison (*p)[10];
…
I find it somewhat reassuring that Stroustrup finds declarations difficult to get right. I, too, use his technique of breaking complex declarations down into more manageable stages using typedefs. I think his example is a good one: I can grasp the declaration of p using typedefs much more quickly than I can understand the one-line declaration. And I could not have understood the latter at all before I learnt the "Right-Left" rule, more on which in a moment.
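As a small check that the two forms in the interview really do say the same thing, the sketch below declares both and asks the compiler to compare their types. The variable names p_direct and p_staged are made up for this illustration so that both declarations can live in one file.

```cpp
#include <type_traits>

// One-line form, as quoted in the interview.
bool (*(*p_direct)[10])(int, int);

// Staged form: name the function-pointer type first, then use it.
typedef bool (*Comparison)(int, int);
Comparison (*p_staged)[10];

// Compilation fails here if the two declarations disagree.
static_assert(std::is_same<decltype(p_direct), decltype(p_staged)>::value,
              "the typedef form declares the same type as the one-liner");
```

Because the check is a static_assert, simply compiling the file is the test; nothing needs to run.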
"Declarations specify the interpretation given to each identifier…"
If you really want to understand C declarations, I suggest the best place to look is section A8 of The C Programming Language [KandR1988], from which the above quote is taken. There you will find a dozen pages describing the syntax and meaning of declarations. For a concise summary I would turn to section 4.9.1 of The C++ Programming Language [Stroustrup1997], from which the following extract is taken:
A declaration consists of four parts: an optional "specifier," a base type, a declarator, and an optional initializer. Except for function and namespace definitions, a declaration is terminated by a semicolon. For example:
char* kings[] = {"Antigonus", "Seleucus", "Ptolemy"};
Here, the base type is char, the declarator is *kings[], and the initializer is ={…}.
A specifier is an initial keyword, such as virtual and extern, that specifies some non-type attribute of what is being declared.
A declarator is composed of a name and optionally some declarator operators. The most common declarator operators are:
* | pointer | prefix |
*const | constant pointer | prefix |
& | reference | prefix |
[] | array | postfix |
() | function | postfix |
Their use would be simple if they were all either prefix or postfix. However, *, [], and () were designed to mirror their use in expressions. Thus, * is prefix and [] and () are postfix. The postfix declarator operators bind tighter than the prefix ones. Consequently, *kings[] is a vector of pointers to something, and we have to use parentheses to express types such as "pointer to function."
So to turn Stroustrup's kings declarator example from a vector (array) of pointers to something into a pointer to a vector of something, we would use parentheses like so: (*kings)[]. It is with this mixture of prefix and postfix operators that the Right-Left rule can help.
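The effect of the parentheses can be checked with the compiler. The sketch below makes two adjustments that are assumptions of this example rather than part of the quoted text: it uses const char, because modern C++ no longer allows a string literal to initialise a plain char*, and it gives both arrays an explicit bound of 3. The name king_ptr is hypothetical.

```cpp
#include <type_traits>

// Postfix [] binds tighter than prefix *: an array of 3 pointers to const char.
const char* kings[3] = {"Antigonus", "Seleucus", "Ptolemy"};

// Parentheses make * apply first: a pointer to an array of 3 const char.
const char (*king_ptr)[3] = nullptr;

static_assert(std::is_array<decltype(kings)>::value, "kings is an array");
static_assert(std::is_same<std::remove_extent<decltype(kings)>::type,
                           const char*>::value,
              "each element of kings is a pointer");

static_assert(std::is_pointer<decltype(king_ptr)>::value,
              "king_ptr is a pointer");
static_assert(std::is_same<std::remove_pointer<decltype(king_ptr)>::type,
                           const char[3]>::value,
              "king_ptr points to a whole array");
```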
The rule is described in various places on the web, which is where I first came across it. But I do not know who first coined the term. The Right-Left rule for reading declarations is:
1. Start with the identifier. Say, "identifier is."
2. Go right, interpreting the operators you find according to the table below. If you encounter a right parenthesis, or there are no more operators, go left from the identifier (or from where you stopped going left last time).
3. When going left, interpret the operators according to the table below. If you encounter the base type, just say it. If you encounter a left parenthesis, or there are no more operators, go right from where you stopped going right last time.
Repeat rules 2 and 3 until there are no more operators to interpret.
* | "pointer to" |
& | "reference to" |
[] | "array of" |
[n] | "array of n" |
() | "function returning" |
(arg) | "function taking arg and returning" |
Using this rule to work through the example Stroustrup quotes in the above interview:
bool (*(*p)[10])(int, int);
Rule 1, locate the identifier: "p is"
bool (*(*p)[10])(int, int);
Rule 2, go right, encounter right parenthesis, go left.
bool (*(*p)[10])(int, int);
Rule 3, go left, encounter *, look up in table: "p is pointer to"
bool (*(*p)[10])(int, int);
Rule 3, go left, encounter left parenthesis, go right.
bool (*(*p)[10])(int, int);
Rule 2, go right, encounter [10], look up in table: "p is pointer to array of 10"
bool (*(*p)[10])(int, int);
Rule 2, go right, encounter right parenthesis, go left.
bool (*(*p)[10])(int, int);
Rule 3, go left, encounter *, look up in table: "p is pointer to array of 10 pointers to"
bool (*(*p)[10])(int, int);
Rule 3, go left, encounter left parenthesis, go right.
bool (*(*p)[10])(int, int);
Rule 2, go right, encounter (int, int), look up in table: "p is pointer to array of 10 pointers to function taking args 'int, int' and returning"
bool (*(*p)[10])(int, int);
Rule 2, go right, no more operators, go left.
bool (*(*p)[10])(int,int);
Rule 3, go left, and encounter base type: "p is pointer to array of 10 pointers to function taking args 'int, int' and returning bool"
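The English reading can be cross-checked against the compiler by peeling the type apart one layer at a time, in the same order as the phrases were produced. This is only a sketch using the standard type traits; the alias names P, Arr, Elem and Fn are invented for the illustration.

```cpp
#include <type_traits>

bool (*(*p)[10])(int, int);

// "p is pointer to ..."
using P = decltype(p);
static_assert(std::is_pointer<P>::value, "p is a pointer");

// "... array of 10 ..."
using Arr = std::remove_pointer<P>::type;
static_assert(std::is_array<Arr>::value && std::extent<Arr>::value == 10,
              "the pointee is an array of 10");

// "... pointers to ..."
using Elem = std::remove_extent<Arr>::type;
static_assert(std::is_pointer<Elem>::value, "each element is a pointer");

// "... function taking (int, int) and returning bool"
using Fn = std::remove_pointer<Elem>::type;
static_assert(std::is_same<Fn, bool(int, int)>::value,
              "each element points to bool(int, int)");
```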
As you can see, this simple mechanical process has produced a fairly clear description of the given declaration. It is worth noting that the Right-Left rule will turn a declaration into English whether or not that declaration is legal C. For example, the Right-Left rule will merrily read
int (*fn)()[7];
as "fn is a pointer to a function returning array of 7 ints," which, as you know, is not permitted in C.
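For comparison, a legal near neighbour of that declaration has the function return a pointer to the array instead of the array itself. The sketch below is one such variant, chosen for this illustration; the rule reads it as "fn is pointer to function returning pointer to array of 7 int". The alias name Callee is hypothetical.

```cpp
#include <type_traits>

// int (*fn)()[7];   // rejected by the compiler: a function cannot return an array

// Legal variant: the function returns a pointer to an array of 7 int.
int (*(*fn)())[7];

// What fn points to is a function type ...
using Callee = std::remove_pointer<decltype(fn)>::type;
static_assert(std::is_function<Callee>::value, "fn points to a function");

// ... and calling through fn would yield a pointer to an array of 7 int.
// decltype does not evaluate the call, so no function is actually invoked.
static_assert(std::is_same<decltype(fn()), int (*)[7]>::value,
              "the call yields a pointer to an array of 7 int");
```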
Not everyone will find this mechanical approach to reading declarations necessary or useful. But personally I do. Some years ago I saw a program that, given a C declaration, would spit out the equivalent English description. I thought this was quite magic at the time, but now I can see it would not be too hard to encode the Right-Left rule.
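To give an idea of how little is needed, here is a minimal sketch of the rule in code. It is not the program mentioned above, and it rests on several assumptions: the caller supplies the identifier rather than the code finding it, only *, &, [], () and grouping parentheses are understood, qualifiers such as const between operators are not handled, and no attempt is made to reject illegal declarations. The function names describe, find_name and matching_paren are made up for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Position of `name` in `decl` as a whole word, or npos if it is absent.
std::size_t find_name(const std::string& decl, const std::string& name) {
    if (name.empty()) return std::string::npos;
    for (std::size_t pos = decl.find(name); pos != std::string::npos;
         pos = decl.find(name, pos + 1)) {
        const std::size_t end = pos + name.size();
        const bool left_ok = pos == 0 || !is_ident_char(decl[pos - 1]);
        const bool right_ok = end == decl.size() || !is_ident_char(decl[end]);
        if (left_ok && right_ok) return pos;
    }
    return std::string::npos;
}

// Index of the ')' matching the '(' at `open`, or decl.size() if unmatched.
std::size_t matching_paren(const std::string& decl, std::size_t open) {
    int depth = 0;
    for (std::size_t i = open; i < decl.size(); ++i) {
        if (decl[i] == '(') ++depth;
        if (decl[i] == ')' && --depth == 0) return i;
    }
    return decl.size();
}

// Read `decl` aloud using the Right-Left rule, starting from `name`.
std::string describe(const std::string& decl, const std::string& name) {
    const std::size_t at = find_name(decl, name);
    if (at == std::string::npos) return "";

    std::string out = name + " is ";               // Rule 1
    std::size_t left = at;                         // decl[0, left) is unread
    std::size_t right = at + name.size();          // decl[right, ...) is unread

    for (;;) {
        // Rule 2: go right until a ')' or the end of the declarator.
        while (right < decl.size()) {
            const char c = decl[right];
            if (is_space(c)) { ++right; continue; }
            if (c == '[') {
                std::size_t close = decl.find(']', right);
                if (close == std::string::npos) close = decl.size();
                const std::string n = decl.substr(right + 1, close - right - 1);
                out += "array of ";
                if (!n.empty()) out += n + " ";
                right = close + 1;
            } else if (c == '(') {
                const std::size_t close = matching_paren(decl, right);
                const std::string args = decl.substr(right + 1, close - right - 1);
                if (args.empty()) out += "function returning ";
                else out += "function taking (" + args + ") and returning ";
                right = close + 1;
            } else {
                break;                             // ')' or ';' or '=' ...
            }
        }

        // Rule 3: go left until a '(' or the base type.
        bool hit_paren = false;
        while (left > 0 && !hit_paren) {
            const char c = decl[left - 1];
            if (c == '*') out += "pointer to ";
            else if (c == '&') out += "reference to ";
            else if (c == '(') {
                hit_paren = true;                  // step over the matching ')'
                if (right < decl.size() && decl[right] == ')') ++right;
            } else if (!is_space(c)) {
                return out + decl.substr(0, left); // everything left is base type
            }
            --left;
        }
        if (!hit_paren) return out;                // nothing left to read
    }
}
```

Called as describe("bool (*(*p)[10])(int, int);", "p"), this sketch produces "p is pointer to array of 10 pointer to function taking (int, int) and returning bool". The wording comes straight from the table, so it does not pluralise "pointer" as a human reader would.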
[KandR1988] Brian Kernighan & Dennis Ritchie, The C Programming Language, Second Edition, Prentice Hall, 1988.
[Stroustrup1997] Bjarne Stroustrup, The C++ Programming Language, Third Edition, Addison Wesley, 1997.
[Stroustrup2000] Interview in Visual C Developers Journal and reproduced at: http://www.devx.com/upload/free/features/vcdj/2000/05may00/ens0500/ens0500-1.asp