Journal Articles

CVu Journal Vol 16, #1 - Feb 2004 + Programming Topics + Student Code Critiques from CVu journal.
Browse in : All > Journals > CVu > 161 (8)
All > Topics > Programming (877)
All > Journal Columns > Code Critique (70)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Student Code Critique Competition 26

Author: Site Administrator

Date: 01 February 2004 01:21:34 +00:00 or Sun, 01 February 2004 01:21:34 +00:00

Summary: This item is part of the Dialogue section of C Vu, which is intended to designate it as an item where reader interaction is particularly important. Readers’ comments and criticisms of published entries are always welcome.

Body: 

Before We Start

I have become accustomed to an ever growing stack of books waiting for me to get round to reading them (for the first time forty-five years I have failed to read a serial in Analog as soon as the last part was published). What came as a surprise was to realise that writing that I am committed to is going the same way - the stack is growing faster than I can reduce it. Some things have to be done by me but other things could be done by others if they had the will to do so.

This column is one of the latter. Its contents are largely written by others. Putting it together takes roughly a day's work. Is there anyone reading this who would be willing to take it over? I can continue to manage the supply of prizes but freeing up another day would be much appreciated.

Student Code Critique 24 revisited

For some reason some of my email has been disappearing before it reaches me (and as my ISP does not yet apply any filters, though they will soon, it isn't caused by some form of false positive for spam). I keep complete logs of all mail that reaches my mailbox and can identify several places where people have clearly sent me things that have never arrived (what worries me is the cases I do not know of, but note that I always acknowledge email with content).

It seems that two entries for SCC 24 were consumed by this email eating demon. So here they are but without the restatement of the problem.

From Matthew Towler

This program is simply outputting text so first of all I will suggest coding purely in standard C++ and removing the MSDOS specifics. These are clrscr() (more on this in a moment) and getch() which I think is being used to wait for a character to be pressed between groups of output. Removing these means we can also remove <conio.h>.

The first line of main() is void clrscr(); I guess this is intended to call an MSDOS function to clear the screen, but in fact it declares a function named clrscr taking no parameters and returning nothing - but does not call this function.

The simplest way to get the output into a text file would be to redirect the output. Assuming the code compiles to an executable named Sum.exe, then this would be achieved from the command line (from the directory containing Sum.exe) as follows:

Sum > output.txt

Then the following should open the complete output in notepad

notepad output.txt

The output might be enhanced by adding some separators e.g.

std::cout << x << '\t' << m << '\t';

would output the values of x and m with a tab character in between.

The headers <iostream.h> and <iomanip.h> are not standard, they are old pre-standard headers and should be replaced by <iostream> and <iomanip>. This will mean that cout and setw will need to be additionally qualified with std::. <math.h> is standard but deprecated, you should really use <cmath> which would place the pow() function in the std namespace. However in practice compiler support for <cmath> is patchy, and all will provide <math.h> for C compatibility so I usually stick with <math.h>.

My next point is that of named constants. The values 1, 11, 25000, 35000, 5000 and 2 appear several times each in the code. I am not sure what the code is calculating but it is likely the meaning would be much clearer if the constants were named to indicate their meaning or purpose. This would emphasise the connections between the individual loops and the data they are handling and just as importantly it would reduce the likelihood of a single typo breaking the code.

Some more informative naming of variables would also help. i, r, j, m and x leave the reader in the dark about what sort of calculation is being performed, and also give a higher probability of errors due to typos.

Sum is declared static, which for a program consisting of a single non-recursive function does not significantly alter the meaning. The only advantage of this is that it will not allocate 25000 ints on the stack; MSDOS often has a fairly small stack and this could be important. Sum is not declared with a type, so it is using the deprecated implicit int C feature. Most modern compilers will now warn about this and a few will refuse to compile the code.

An alternative to this two dimensional array would be a std:vector< std::vector< int> >. This would allow the access syntax to be changed from Sum[i][j] to Sum.at(i).at(j) which will check all accesses are within limits at run time. This could be a helpful debugging aid in a simple program such as this where efficiency is not paramount. The equivalent declaration of Sum is as follows.

std::vector< std::vector<int> >
    Sum(5000, std::vector<int>(5000));

Use of std::vector would also mean that all the integer values in Sum would be automatically initialised to zero, as opposed to the static data which will not do this [actually, you are mistaken, static data is zero-initialised by default - Francis].

It usually aids readability if variables are declared to exist for the shortest time; and initialised as they are declared. For instance r is only used within the first for-loop and initialised at the end of the loop; it could be declared and initialised at the top of this loop. The advantage of this is that when reading code there are less lines to search for the variable. r is currently being used uninitialised; but I did have to think for a long time to work out (and changed my mind several times) whether or not the loop bounds in the nested loops conspired to initialise it before first use. It is exactly this sort of confusion that this guideline is intended to avoid.

Similarly the loop variables could be declared within the loops e.g.

for(int i =1; etc.

The code inside the first pair of nested for loops would be simpler if the if(i == 1) block were moved above the inner loops and the inner loop started at 2. This saves an if() per loop, and far more importantly clarifies the intent that this code is setting up the calculation for the subsequent iteration.

Finally, I am not sure about the mixing of floating point and integers in the loop conditions; namely

i <= pow(2, j)

in the first loops and similarly for x and m in the second loops.

In math.h, pow() is declared as

double pow(double, double);

Standard C++ (but not C) also provides an overload (according to Stroustrup 3rd Ed).

double pow(double, int).

Which overload is called is irrelevant as the problem lies with the double return value. For example if i = 4 and j = 2 the expression resolves to 4 <= 4.0, which may not compare as you expect. To perform the comparison the inton the left will first be promoted to a double, which will result in an approximation of the integer value such as 3.999999. This small error can in some cases change the result of the comparison [True in theory but all compilers I know of correctly and exactly represent the floating point equivalents of whole numbers provided as constants - it is the computed value which is suspect not the conversion from int to double. Francis].

The solution to this is to write a function to call pow() and perform the comparison with a tolerance to take account of any approximation errors.

[Sorry, but I really do not have time to go through my email archives looking for names of authors. If you send me a file I save it for future processing. If you leave your name out of the file, your work will be published anonymously. Francis]

My comments to Student Code Critique 24 are split into two parts. The first part describes the changes needed to get the required behaviour of the program. the second shows a number of ways the code could generally be improved.

Solving the problem

Rewriting the program to make it write data to a file requires very little. A few changes to existing code and a couple of new lines does the job:

Taken from the top the changes are:

Add a new #include statement:

#include <fstream> //file handling

The line:

void main(){

should be changed into:

int main( int argc, char* argv[] ){

The standard mandates a return type of int, the 2 parameters gives us access the command line parameters, which we will use to name the file to write the data to. Insert the following to achieve this. [If you choose that mechanism you should first check that there is a second command line argument and check that the file is successfully opened.]

// open an output file named by the first
// argument to the program.
std::ofstream output(argv[1]);

Change the line:

cout << x << m << setw(10)
     << Sum[x][m] << '\n';

to

output << x << m << setw(10)
       << Sum[x][m] << '\n';

and the data will be written to the file named as the first argument to the program on the command line.

This makes the program write data to a named file instead of stdout. [Note I edited the submitter's std::out - there is no such thing.]

Improving the code

The code contains a lot of unnecessary noise. Removing this noise can lead to a program that is significantly easier to read and understand. Removing the noise also increases the programmer's trust in the output of the program.

I have grouped the noise into categories, trying to isolate the core of each problem. My hope is that it will help recognizing the type of problems in the future.

#includes should follow the C++ standard and use the include files without the .h extension. The C header for math should be replaced by cmath, which includes the math definitions in the std namespace.

Some of the #include statements in the program are not necessary, and they should be removed from the code. Include statements which are not used add false dependencies.

For large programs with a lot of header files, the false dependencies can forces a recompilation of an otherwise unchanged compilation unit. On a heavily loaded machine the reading and parsing of the include files can be a large part of the total compilation time.

Forward declarations:

Forward declarations of functions and variables should be deferred to the latest possible time. This ensures that the declaration is made as close to point where it is used as possible. A consequence of this is the removal of declarations that are not used.

Following this advice, we remove the lines:

void clrscr();
int i, r, j, m, x;

One should try to only use arrays of the needed size. Allocating more memory than we need might be masking that we do not really know how memory is used. It could be hiding an error in indexing, which could be hard to spot, even when using memory bounds checkers.

A simple scan of the code that assigns to the array Sum shows that the variable j, which is used as the second index into the array, only indexes the range [1:11]. If we change the declaration of the array Sum to reflect this usage, we reduce the amount of memory required by a factor of 454.

The first index i runs in the interval [1:2048], which leads to a reduction of the needed memory by a factor of approximately 2.5, all in all we are able to reduce the memory need of the program by a factor of over 1100.

The indexing into the array is done from 1. The C and C++ arrays are normally indexed from 0, but adapting an existing algorithm that used indexing from 1 can lead to problems.

This is especially important if the code might be used by others, or if the code will be used again and again over a period of years. It will be a lot easier to see the connection between the algorithm and the code if we keep indexing from 1.

The declaration of the array should look like this:

long Sum[2049][12]

The keyword static that was previously used is not necessary [I think the writer has missed the significance of using static to both ensure default initialisation and to use 'static' memory provided at load time rather than, possibly precious, stack memory.] The size of the array has taken into account that the indexing starts at 1.

The for-loops should declare the type of the counting variable it uses inside the for-statement. This makes sure that the counting variable is only visible inside the for-loop [As this was for an MSDos platform, that might not be true].

The inner for-loop contains the construction:

for(int i = 1; i < pow(2,j); i++) {
// code //
  i = i + 1;
}

Because the code block spans several lines, it is not immediately obvious that the for-loop step size is two and not one. This should be written as:

for( int i = 1; i < pow( 2, j ); i =+ 2 ) {
// code //
}

The innermost loop has the following construction:

for(i ......) {
  if(i == 1) {
  // code //
  }
  if(i > 1) {
// Code using variable r
    r = r + 1
// code not using variable r
  }
  r = 2;
}

From the above it should be obvious that every time r is used, it has the value 2, and we might as well change the single usage of r to using a properly named variable, like

const long magic_constant = 2;

Making it obvious that the code depends on a magic constant.

To ease the reading of the for-loops writing the data, these loops should use the same variables as was used to generate the data:

for(int j = 1; j <= 11; ++j) {
  for(int i = 1; i <= pow(2.0, j ); i++) {
    output << i << " " << j
           << setw(10) << Sum[i][j]
           << '\n';
    output << i << " " << j
           << setw(10) << Sum[i][j]
           << '\n';
  }
}

The innermost loop for writing the data two times could be changed to two lines, writing the same output. This makes it more obvious that the data is written twice, at the cost of making all changes to the writing code in both places, but the eye is quite good at catching differences in lines that should look the same.

The post increment operator used in the forloops, should be changed to the pre increment operator, if not the code will generate a temporary to hold the value before incrementing, that will never be used.

Readability:

The use of space characters in the code, can significantly increase readability, especially if used in a consistent way, like always enclose operators like +, -, *, /, =, ==, <= and so on with an equal amount of space before and after. [However placing spaces after any form of open bracket and before any form of close bracket often reduces readability. In addition there are good readability arguments for not placing spaces either side of a strongly binding operator such as * or /.]

The rule could read something like: Binary operators have an equal amount of one or more space characters on both sides.

The difference is clearly seen comparing two versions of the first for statement:

for(j=2;j<=11;j++)

and

for ( j = 2; j <= 11; j++ )

[Yet is not:

for(j = 2; j <= 11; j++)

slightly more readable? Too much space can be as bad as too little.]

All in all the code ends up looking like this:

[snipped]

This code generates the same output as the sample output, except that a space is inserted between the writing of the two array indexes.

Student Code Critique 25 entries

The problems

I am little confused about return values of sizeof operator.

Here is a simple C program which is putting me in doubt:

#include <stdio.h>
int main() {
  int a[10];
  int *p =(int *) malloc (10);
  printf("Size of a = %d \n",sizeof(a));
  printf("Size of p = %d \n",sizeof(p));
}

Output is :

Size of a = 40
Size of p = 4

My understanding says even array name is a pointer. If so why it does not show sizeof(a) as 4? Or if sizeof shows the total allocated memory then why sizeof(p) does not show 10?

There are numerous errors in both the code and the student's understanding. Please address these comprehensively, perhaps including places where your explanation would be different if this were a C++ program.

I am getting an error linker error. Here is the code:

// Define Libraries
#include <stdio.h>
#include <math.h>
//Start Of Main Program
main() {
  double hypo, base, height;
/* Enter base and height */
  printf("Enter base:");
  scanf("%f", &base);
  printf("Enter height:");
  scanf("%f", &height);
  hypo = sqrt(pow(base, 2)+pow(height, 2));
  printf("hypotenuse is %f", &hypotenuse);
}

Ignore the question asked by the student and address the serious problems with the code itself.

From Annamalai Gurusami

Provide Prototypes

It is a good practice to include all the relevant header files (meaning, provide the proper prototypes), even though it is not mandatory as far as the C programming language is concerned. (It is compulsory in C++). For using malloc, include the stdlib.h header file, so that the necessary prototypes are provided to the compiler.

In the absence of an appropriate prototype, the compiler would assume that the return type is an integer (while malloc function actually returns a void pointer). This is fine in a 32-bit system, but will be a problem on a 64-bit system. The reason is that in a 32-bit system, an integer and a pointer occupy the same amount of memory space (four bytes), whereas in a 64- bit system, an integer occupies 4 bytes, but the pointer occupies 8 bytes. So provide prototypes. It might save you a lot of trouble later.

[The above paragraph contains a number of misconceptions. There is no requirement on any system that any integer type have the same amount of storage as any pointer type. Note that not only does both C & C++ allow pointers to different fundamental types to be different in both size and layout, but some implementations actually use this license. The requirements are that void* and char* be identical in size and layout; void* must be able to store the value of any data pointer without loss of information and that all pointers to struct (and, in the case of C++, class) only need a declaration of the type name. Different types of pointer can, and sometimes are, different sizes. Pointers can, and sometimes are laid out differently to the layout of an int.

Lastly, as of the 1999 release of C, implicit int and implicit function declarations are no longer supported. Francis]

Array Name Is Not A Pointer

An array name is not exactly a pointer [1], even though in many circumstances it degenerates into one. An array is a single entity, in the sense that its name and its memory location are inseparable. When we declare a variable like int a[10]; we have an object that can hold 10 integers and it is named a. There is no other memory associated with that object. When we just use an array name where a pointer is expected (type being appropriate), it is interpreted as a pointer to the first element of the array. In our case, if we use a, where an int* is expected, then it is evaluated as &a[0]. This dirty work is done by the compiler. Also note that this association cannot be changed. For example, we cannot do something like

int a[10];
a = malloc (10);
// *ERROR* cannot reassign another object to a
// a is not a pointer variable.

This is why sizeof(a) gives the size of the array object named a. The size of a in our case is 10*sizeof(int) which is often equal to 40.

The confusion mainly comes because of the applicability of the subscript operator to an array and a pointer pointing to a malloc'ed memory location. For example,

int a[10];
int *p = malloc(10*sizeof(int));
// initialize a[0]...a[9] and p[0]...p[9]
for(int i=0; i < 10; ++i) {
  printf("p[%d]=%d\n", i, p[i]);
  printf("a[%d]=%d\n", i, a[i]);
}

The similarity between these two usages is the main reason for the confusion that an array name is a pointer. But one should remember that the "array" p is not a single entity. There are two entities involved - one is the pointer variable p, and another is the allocated memory that can hold 10 integers. Also the association between the allocated memory and the pointer variable is only temporary, in the sense that the pointer variable can be made to point to another memory location. For example:

int a[10];
int *p = malloc(10*sizeof(int));
// initialize appropriately
p = a; // correct: evaluated as &a[0]
a = p; // error: a cannot be re-assigned.

I leave it as an exercise to the reader to convince himself/herself that an array name is not a const pointer.

The behaviour of sizeof operator

The sizeof operator is evaluated at compile time. And it gives the size of the object that is given as an operand. So:

int a[10];
sizeof(a);
// evaluates to the size of 10 integers
char *p = malloc(n);
sizeof(p);
// evaluates to the size of the pointer p
// which is independent of n.

Note that the type of the result of sizeof operator is size_t (which is unsigned int or unsigned long [Or some other unsigned integer type]). So check for the appropriate format specifiers in your system and use it in the printf statements.

The argument to malloc

In the provided program, the argument to malloc is 10 (which is in unsigned char). The size of an integer [on this system] is 4 bytes. So the value provided to malloc is not in multiples of the size of an int. This most probably will result in hard-to-find bugs (because of memory alignment problems. [Not likely, malloc is required to return suitably aligned memory for any type]) Also the behaviour might depend on the implementation of malloc. Whatever the after effects, the provided code doesn't seem to reflect what the user wants. If the user wants to store 10 integers, then (s)he has to calculate the required size manually, by doing 10*sizeof(int), and passing that to malloc.

If it was C++, things are more clear. For an array of 10 integers, we can just write

int *p = new int[10];

Casting the void* returned by malloc

Since the function malloc returns a pointer of type void (a void*), there is no need to cast it explicitly. A void* can be assigned to other pointers without any explicit cast [In C but not in C++]. In C++, it is recommended to use the new for allocating memory, which returns a pointer to the type asked for. operator [Not really, the memory allocation tool in C++ is operator new, the creation of dynamic objects is done with a new expression] Since no generic pointer types are involved, there is no question of casting.

Compilation Errors

The use of hypotenuse instead of hypo in the last printf statement will result in compilation error. Also if it was C++, the compilation would fail because the return type of main() is omitted. [Also true in C99]

Comment on Comments

Comments are normally used to explain what a particular piece of code does (if that cannot be easily understood, just by reading it) and why. Comments like // Start of Main Program are really superfluous, and do not add value. Also wrong comments can be very harmful. For example, the comment // Define Libraries is not correct. The header files just provide declarations. Any declared functions are defined elsewhere.

Flushing the standard output

When a prompt needs to be displayed to the user, it is always prudent to flush the standard output (because a prompt normally doesn't terminate with a new line). If this is not done, then it is not guaranteed to be displayed on the terminal, because most of the terminal drivers are line buffered. [But most, if not all, flush stdout before collecting input from stdin.]

Input Validation And Error Checking

There is no validation done on the provided input. The following checks can be performed on the input. Check if they are legal numbers. Check if they are greater than zero.

Check if there has been any error while calling the various functions (at least for scanf, pow and sqrt.) For example, while calling the pow function, it is possible that an overflow occurred. This can be determined by checking the errno variable. Such validations would help to determine the precise cause of a failure.

Also note that scanf doesn't provide much error checking. For example, if a value overflows, scanf doesn't report it. For this reason, a combination of fgets and strtod is preferred. The strtod function returns the error status through the errno variable.

The format specifier in scanf

For a double, scanf requires that %lf be specified as the format. [2]

Linker Error

Including a header file doesn't "define" the functions of a library. They just declare them. The function definitions of a library are probably available elsewhere in the system as shared/static libraries. This information should be provided to the compiler (or more precisely, to the linker). Normally the option is '-l' (l as in Link). For more details on this, look at the compiler/linker documentation and man pages.

From ?? [see my comment earlier]

I'll first answer the main concern of the student, and then address some other issues. Certainly, there's a misunderstanding about arrays and pointers. The first definition int a[10]; creates an array of ten ints, that is, it reserves space somewhere for ten consecutive ints. That 'somewhere' can be identified as a. On the other hand, int *p = (int *) malloc(10); creates a pointer to an int, named p. Where does it point? To the memory location [for a block of ten unsigned char] returned by malloc(), or to NULL, in case the request couldn't have been granted. So now let's tackle the size issue: sizeof() is an operator which returns the number of bytes of an expression or a type passed as argument. So if sizeof(int) is 4 in your environment (which happens to be a common size on most 32-bit architectures) then sizeof(a) will be 40 (10 integers each occupying 4 bytes) In the second case, you get the size of a pointer to int, which in your system turns out to be 4 bytes.

Now with this information in hand, we have a better understanding to answer to your questions:

Q:

"My understanding says even array name is a pointer. If so why it does not show sizeof(a) as 4?"

A:

Because an array is not a pointer! In many cases, its name is converted to a pointer to the first element (this is generally called decay), but the object itself remains being an array! The only exceptions when this decay doesn't take place are when being an operand of the sizeof operator (which is our case), the & unary operator (address operator).

Q:

"Or if sizeof shows the total allocated memory then why sizeof(p) does not show 10?"

A:

Because sizeof just returns the size in bytes of the operand passed, in this case an identifier for a pointer to int.

Now let's see briefly a couple of errors in your code. Let's see malloc() usage. First of all, the cast is unnecessary, since a void* is automatically converted to any other pointer type. That alone isn't a big issue by itself, but your code shows a good reason not to do it: you forgot #include <stdlib.h> which is where malloc() prototype is declared, thus letting the compiler assume the function returns an int. With this assumption, assigning the int returned by malloc() to the pointer to int, would cause a diagnostic unless you force the conversion, which is done by a cast, which is exactly what you're doing. See how this innocent-looking cast turns out to hide a potential bug in your program. So the advice here is not to cast but #include the appropriate header.

Another error is the omission of a return-statement, given that main() must return an int, as your definition clearly states.

The last error has to do with the data type used to represent the size of an object. sizeofyields a size_t value, which is a typedef to a basic type, such as unsigned int or unsigned long int. So what specifier (%u, %lu) should we use? Well, it depends on the system you're working on, depending on sizeof definition (check stddef.h) On the other hand, in C99 you can use %z for size_t, thus avoiding the above mess. Lastly, I suppose your intention is allocating space for 10 integers, so if that were the case, you should do:

int *p = malloc(10 * sizeof(int));

or

int *p = malloc(10 * sizeof *p);

that is, reserve space for 10 objects of the type pointed by p, which in this case is an int. Note that even though both statements are valid, the second one is preferred, because the data type is not fixed, but expressed by the content of the pointer. So if you later happen to change of pointer type, you won't run into any problem.

So our program would look like this:

#include <stdio.h>
#include <stdlib.h>

int main() {
  int a[10];
  int *p = malloc(10 * sizeof *p);
  printf("Size of a = %lu\n", sizeof a);
  printf("Size of p = %lu\n", sizeof p);
  return 0;
}

Note that sizeof's operand doesn't need to be put inside parentheses, when it isn't a data type.

Now I'll show you the same code in C++, and briefly discuss some useful facilities the language provides us.

#include <iostream>
using std::cout;

int main() {
  int a[10];
  int *p = new int[10];
  cout << "Size of a = " << sizeof a << '\n';
  cout << "Size of p = " << sizeof p << '\n';
}

The two most important differences relate to memory allocation and printing output to stdout. First note that using new, you don't have to think in terms of raw bytes as you do with malloc(). You just ask memory for n objects of type T. Also note that you don't need to use bizarre specifiers such as %d, %u, etc., you just specify what you want to output and there you go. One minor detail here is that the compiler inserts automatically a 'return 0' if you don't provide it, but note that this is done only in main().

[However note that this program, like the C one, forgets to release the memory obtained dynamically.]

This problem has to do with misuse of conversion specifiers. Unfortunately, the same specifier (%f) means different things in these two generally interrelated functions. In the case of printf(), '%f' specifies a double argument, and applying the length modifier 'l' doesn't affect its meaning at all. So in our case, using '%f' or '%lf' hold the same meaning [*] So what if you want to print a single precision floating point number? (a float, to be precise). You have no other choice other than using %f! The key issue here is automatic conversion: when you pass a float to printf() (and to any variadic function, the compiler automatically promotes it to a double. Thus printf() always receives doubles! Note there isn't any problem with this, as we're not losing precision. But in the case of scanf(), '%f' and '%lf' are two different things. In the first case you ask for a float, in the second for a double. Why the distinction here? Because scanf() arguments are pointers, which are not promoted.

This is also a good occasion to point out a better way to receive input from a user. Actually, scanf() is targeted to read formatted input (for example, from a configuration file, table, etc) rather than from an unforeseeable input source as a human being. The problem with scanf() relates to its poor support of error detection, among other things. Instead you should use something like the fgets() and strtod() in combination:

fgets(buffer, sizeof buffer, stdin);
strtod(buffer, NULL);

You should also add some error checking to make sure you get what you're expecting. fgets() makes it easy; returning a NULL pointer in case of error. strtod() is a little more complicated, because it returns you the value parsed, or 0 if there was any problem. But as you might have already guessed, 0 is also a valid value! So to set apart an error from a legitimate input, strtod() makes use of the infamous errno. Fortunately for us, we won't deal with that mess. In our problem domain, we're dealing with sides of a triangle, which means that positive values are the only valid values we're looking for! [What about degenerate triangles?]

This simplifies a lot our function, at the expense of getting more detailed error messages:

double read_side(void) {
  char buf[32];
  if(fgets(buf, sizeof buf, stdin) != NULL)
    return strtod(buf, NULL);
  else
    return 0;
}

Now let's see some other problems, in a 'top to bottom' approach:

// Define Libraries
#include <stdio.h>
#include <math.h>

This is wrong. You're not defining any libraries at all. You're simply including declarations to be used by your program.

main() {

main() returns an int by definition, so you should state int main() instead, and place the corresponding return 0 at the end of the function.

printf("Enter base:");
scanf("%f", &base);

You need to add fflush(stdout) to ensure text is displayed before the scanf(). [I cannot remember ever finding that necessary.]

printf("hypotenuse is %f", &hypotenuse);

You're passing the address of a float variable to printf(), instead of the value expected. Besides, the identifier you're passing doesn't even exist. In case you meant hypo, you should pass the object itself, not its address, to comply the requirement imposed by the %f specifier. You should also add fflush(stdout), or suffix a \n to ensure the string is flushed. [They do entirely different things, the first forcibly flushes stdout the second appends a new-line to the output buffer. ]

Lastly, not an error but a better practice. In this case, it's better to manually square the numbers you received, that is, use something like sqrt(base * base, height * height); which will surely give you better performance.

This is how a possible implementation might look like:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double read_side(void) {
  char buf[32];
  if(fgets(buf, sizeof buf, stdin) != NULL)
    return strtod(buf, NULL);
  else return 0;
}

int main(void) {
  double hypo, base, height;
/* Enter base and height */
  printf("Enter base: ");
  fflush(stdout);
  base = read_side();
  printf("Enter height: ");
  fflush(stdout);
  height = read_side();
  if(base > 0 && height > 0) {
    hypo = sqrt(base*base + height*height);
    printf("Hypotenuse is %f\n", hypo);
    return EXIT_SUCCESS;
  }
  else {
    fprintf(stderr,
        "Both sides must be positive\n");
    return EXIT_FAILURE;
  }
}

[*] This only applies to C99. In C89, the behaviour is undefined, so we'd better use %f for maximum compatibility.

From Catriona O'Connell

The program needs to include stdlib.h for the declaration of malloc(). Without this declaration it is assumed to return an int rather than a pointer to void.

If the student is trying to dynamically allocate the same amount of storage as a ten element array, then he/she should replace the argument 10 in the call to malloc() with 10 * sizeof(int).

The cast to (int *) of the return type from the call to malloc is not required since the coercion of (void *) to anytype * is automatic. It may be harmful to do so if malloc() is not declared to return void*, as in this case because of the missing header file. The explicit case was necessary in pre-ANSI C and is still necessary in C++. In the case here, the cast may be erroneous because only 10 bytes have been allocated. Minor stylistic point: No check is made on the return code of malloc(). malloc() returns NULL if the memory cannot be allocated. In production code this should be checked.

The function sizeof() [actually it is an operator not a function] produces an unsigned integer object of type size_t (declared in stddef.h or cstddef for C++). This header should be included for completeness.

The C Standard (ISO/IEC 9899:1999(E)) states in 6.5.3.4/3

"When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1. When applied to an operand that has array type, the result is the total number of bytes in the array. When applied to an operand that has structure or union type, the result is the total number of bytes in such an object, including internal and trailing padding."

The C Standard (ISO/IEC 9899:1999(E)) states in 6.3.2.1/3

"Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined."

In this case the student is asking for the size of the array "a" (40 bytes) and the pointer p (4 bytes) - not the size of the object to which p points. The program is therefore reporting the correct values.

The C++ Standards (ISO/IEC 14882:1998(E)) states in 5.3.3/2

"When applied to a reference or a reference type, the result is the size of the referenced type. When applied to a class, the result is the number of bytes in an object of that class including any padding required for placing objects of that type in an array. The size of a most derived class shall be greater than zero (1.8). The result of applying sizeof to a base class subobject is the size of the base class type.) When applied to an array, the result is the total number of bytes in the array. This implies that the size of an array of n elements is n times the size of an element."

You have to be careful about using sizeof with arrays. You have to be sure that you use it only at a point in the program where you are dealing with the actual array and not a pointer to its first element. In many, but not all, contexts an array T a[N] will implicitly convert to a pointer to its first element T *p = a. This is not the same as saying that an array is a pointer.

In C++PL (Special Edition) B.2.1 Bjarne Stroustrup notes that in C the sizeof a character constant and of an enumeration equals sizeof(int). In C++, sizeof('a') equals sizeof(char) and a C++ implementation is allowed to choose whatever size is most appropriate for an enumeration.

The final printf() should refer to hypo not &hypotenuse (the address of a non-existent variable).

There is a missing return in main().

The function scanf() can return a double into its argument, but the format flag should have been specified as %lf. By specifying a float, the results will be unpredictable.

scanf() is evil. As pointed out in the responses to SCC21, calling scanf() multiple times in the same module can lead to excess characters being left in the input buffer. Everything after the valid number (including the crlf) is left in the buffer. The second call to scanf() tries to interpret that according to the input format. The student would either have to program in assignment to a string for the remainder of the buffer or be introduced to the delights of non-assigning scanf() calls such as

scanf("%*[^\n]"); // skip to end of line.
scanf("%*1[\n]"); // skip a newline character.

to clear the input buffer.

If the student is determined to use scanf() then he/she should have checked the return code which specifies how many variables have been assigned.

The call to pow() to create a square is overkill and can be replaced by simple multiplication.

The attached code sample is a minimally improved version using nonassigning scanf() calls and another with the data gathering moved to a separate function.

#include <stdio.h>
#include <math.h>
#define BUF_SIZE 100
double getDouble(char * prompt);

int main() {
  double hypo, base, height;
  base = getDouble("Enter base");
  height = getDouble("Enter height");
  hypo = sqrt(base*base + height*height);
  printf("hypotenuse is %lf", hypo);
  return 0;
}

double getDouble(char * prompt) {
  double num;
  char buffer[BUF_SIZE];
  int gotDouble = 0;
  while(!gotDouble) {
    printf("%s :", prompt);
    if(fgets(buffer, BUF_SIZE, stdin)) {
      if(sscanf(buffer, "%lf", &num) == 1) {
        gotDouble = 1;
      }
      else {
        printf("Error in sscanf()\n");
      }
    }
    else printf("Error in fgets()\n");
  }
  return num;
}

The Winner of SCC 25

The editor's choice is: Annamalai Gurusami

Please email to arrange for your prize.

Student Code Critique 26

(Submissions to by March 10th)

It is generally worth sending in a late entry, depending on my work load it may or may not be considered for the prize but it will eventually get published. However this time please make an extra effort to get entries in on time as I would like to get this column done before the ECMA TG5 (C++/CLI binding) in Melbourne.

This time the problem is still about some student code however it is code that works but how do you help the student to move forward? Yes, I know the simple answer but I am looking for something more.

The following program (below my question) produces the results I want, but I am trying to create a for-loop to replace all of the cout statements.

What I have so far is:

for(int i = 0; i <= len; i++)
  cout << str[i + 1];

This will produce: "eed help with C++."

Could someone clue me in on how to create the for-loop? Should I be using a nested for loop? Any help would be appreciated.

Student's Program:

#include<iostream>
#include<cstring>

using namespace std;

void main (){
  const int arraySize = 20;
  char str[arraySize] = "Need help with C++.";
  int len = strlen(str);
  cout << "The sentence \"Need
           help with C++.\" has "
       << len << " characters."
       << endl << endl;
  cout << str << endl;
  cout << str + 1 << endl;
  cout << str + 2 << endl;
  cout << str + 3 << endl;
  // 13 similar statements snipped
  cout << str + 16 << endl;
  cout << str + 17 << endl;
  cout << str + 18 << endl;
}

Notes: 

More fields may be available via dynamicdata ..