Journal Articles
Browse in : |
All
> Journals
> CVu
> 123
(22)
All > Journal Columns > Code Critique (70) Any of these categories - All of these categories |
Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.
Title: Student Code Critique Competition
Author: Administrator
Date: 06 May 2000 13:15:37 +01:00 or Sat, 06 May 2000 13:15:37 +01:00
Summary:
Body:
This is another column that I would be happy to hand on to someone else. If you are reluctant to volunteer because you do not think you have a ready source of student code for review, fear not because I can provide material when needed.
Let me start this time with a reworking of January's problem. I was delighted to find someone responding to my suggestion to try it in STL. I think the result shows why I am in the camp that believes newcomers should start with C++ and then progress to C rather than the other way round.
An STL Solution from Roger Orr <rogero@howzatt.demon.co.uk>
January's C Vu 'Student code critique' discussed a proposed C solution to a fairly simple problem. The program read a file and produced a list of unique items. The first item in the file was the number of remaining items and the rest of the file contains the space-separated items.
In the editorial notes the question was asked: "Anyone like to provide an answer based on an STL set?" I was struck when thinking about this by the different approach to the problem which using the STL gives compared to a standard C solution. Let us look at a possible solution to the problem:
#include <set> #include <string> #include <algorithm> #include <iterator> #include <iostream> #include <fstream> using namespace std; // for simplicity int main() { set<string> tokens; ifstream infile( "land.dat", ios::in ); int count; infile >> count; // we ignore the count copy(istream_iterator<string>(infile), istream_iterator<string>(), inserter( tokens, tokens.begin() ) ); copy( tokens.begin(), tokens.end(), ostream_iterator<string>(cout, " " )); }
Now what is different compared to a "traditional" C solution? Firstly there are only three lines of code (!) - a definite simplification. This is of course because we are pulling in a lot of code from the standard headers.
Hence, there are many header files included - there were only three in the C solution. We include everything but the kitchen sink. In fact, I got stuck when I first typed this code because I omitted #include <string> and the compiler I was using gave me a hard to decipher error message about a missing operator - although the code worked unchanged with a different compiler!
Secondly we have written no memory management code. In a C solution there would typically be some #defines - to specify the maximum string length and maybe even the maximum number of strings. These are to prevent over-running fixed C arrays, or if using dynamic memory to make it easier to program.
In the STL solution we are automatically freed from worrying about these issues because the STL containers manage their own memory. In fact, if the specification had, for example, required strings over a certain length to be rejected or required all strings to be the same length we would have to add code to our STL solution to detect this. The code written with the STL tends to be more generic, simply because the STL is generic.
In fact I could replace the three occurrences of <string> with a different data type and handle a similar problem with, for example, integers - or even use a typedef for set<string> and then use the value_type typedef. It would be quite hard to make an equivalent change with C code.
Next, once I fixed the missing #include, this code worked first time (amazing) - and I do not expect there are any obscure memory problems lurking to trap me later.
Lastly, I do not have to remember to close or delete anything - the objects all take care of their own tidying up. None of these are particularly revolutionary comments, but I was struck by them more forcefully when reflecting on the process of trying to re-write some simple C code using the STL.
I have to process a set of files, one per employee, that contains data about an employees working times. A typical file would be:
Kaiser 18.12.1999 6.0 K1001 Praktikum GIP 17.12.1999 4.5 K0002 Fachbereichssitzung 19.12.1999 2.0 K1001 Vorlesung GIP 19.12.1996 3.0 K0001 Weihnachtsfeier 01.01.2000 2.5 K3001 Klausur GIP erstellen
I am required to read the data into a structure and sort it by a bubblesort on the dates. I should also be able to generate a list for a specific time period (e.g. provide a list from 19.12.1999 till 30.01.2000)
Here is the C code. As I am German, the identifiers are in that language.
This is my structure in the header file:
/*** Datenstrukturen ***/ struct stunden { float std; char kostenstelle[6]; char datum[11]; char name[21]; char arbeit[41]; }; /*** Globale Variablen ***/ extern struct stunden std[100]; extern int anz_daten;
Here I will read my data from a file into the structure:
int datei_einlesen(char datei[13]) { FILE *pf; char name[21]; pf = fopen(datei, "r"); if(!pf) return 0; fscanf(pf, "%s", name); while(!feof(pf)){ fscanf(pf, "%s %f %s", std[anz_daten].datum, &std[anz_daten].std, std[anz_daten].kostenstelle); fgets(std[anz_daten].arbeit, 40, pf); if(strpbrk(std[anz_daten].arbeit, "\n")) std[anz_daten].arbeit[strlen( std[anz_daten].arbeit)- 1] = 0; strcpy(std[anz_daten].name, name); anz_daten++; } fclose(pf); return 1; }
And here I want to list a period of time:
void datum_zerlegen(char datum[11], int *tag , int *monat, int *jahr) { char t[3], m[3], j[5]; strncpy(t, datum, 2); t[2] = 0; strncpy(m , datum + 3, 2); m[2] = 0; strncpy(j, datum + 6, 4); j[4] = 0; *tag = atoi(t); *monat = atoi(m); *jahr = atoi(j); } void zeitraum_ausgeben(char von[11], char bis[11]) { int x, vtag, vmonat, vjahr; int btag, bmonat, bjah, xtag, xmonat, xjahr; datum_zerlegen(von, &vtag, &vmonat, &vjahr); datum_zerlegen(bis, &btag, &bmonat, &bjahr); for(x = 0; x < anz_daten; x++){ datum_zerlegen(std[x].datum, &xtag, &xmonat, &xjahr); if( (xjahr >= vjahr ) && (xjahr <= bjahr ) &&(xmonat >= vmonat) && (xmonat <= bmonat) &&(xtag >= vtag) && (xtag <= btag )){ printf("%s %5.2f %s %-10s %s\n", std[x].datum, std[x].std, std[x].kostenstelle, std[x].name, std[x].arbeit); } } printf("\n"); }
The student is not that far from getting a working program. There are, however, one or two areas where assistance is required. I have tried to keep the same general style adopted by the student and to modify the program as little as possible. I have, of course, corrected faulty code and provided comments and suggestions where necessary.
The first problem I had is working out what the program is supposed to do, as my knowledge of German is non-existent. This turned out not to be such a great problem after all. I have kept the German variable names as much as possible but where I have added variables and functions I have used English names. This has resulted in rather strange mixture of German and English, but I am sure that the student is better at English than I am at German.
I have organised the various files in the following way. The file student.c contains main() that provides a means of testing the functions provided by the student. The student code is contained in unit1.c with declarations in unit1.h.
The variables (in unit1.c) must not be declared as extern. The keyword extern indicates that the declaration of std and anz_daten is just a declaration and not a definition. The variables need to be defined here and so the extern keyword is not required. As the two variables are defined in unit1.c they do not have global scope. It is only required for them to be visible in unit1.c. Unit1.h only contains references required by student.c. This keeps the scope on variables and functions as local as possible.
The job of function datei_einlesen() is to read employee data from a file and to load it into the std array. If the file fails to open correctly the function returns a 0. There is nothing wrong with returning a 0 to represent failure but it is more usual to return a 1 for failure and 0 for success.
The function datei_einlesen() contains a while loop to detect when the end of file has occurred and, therefore, when to stop reading data from the file. Unfortunately, feof is set when attempting to read beyond the last record. As the student's code stands, the body of the while loop will be executed one too many times. To prevent this I have made use of the comma operator. I have added the statement (fgets() in my case) to the test of the while loop. Now, when a record is read, a test is made to see if it is the end of file before the record is processed within the body of the while loop. In this way the correct number of records is read.
The modified while loop requires that a complete record be read from the file. I have chosen to store the record in a char array called buffer. The individual fields now have to be extracted from a char array rather than from the file stream. The fscan() function will, therefore, have to be changed to sscanf() and its first parameter changes from pf to buffer. The fgets() function will also have to be changed. In this case strncpy() will suffice. If the arbeit data is longer than 40 characters the null terminator character will not be copied by strncpy() and so an additional line is added to ensure std[anz_daten].arbeit is properly terminated.
Formatted input is never an easy business and there will be many ways to achieve the same result. I have not tried to make the file reading routing fool proof. It relies on the data file having the correct format. Also no attempt has been made to ensure the std array is not over written. At present, if there are more that 100 records in the data file this will happen.
The way the student removes any new line characters (\n) is unnecessarily complicated. The function strpbrk() returns a pointer to the \n character if one is found, NULL otherwise. So it is simply a matter of testing the pointer to see if it is NULL and if it is not assign a \0 character to the location pointed to by the pointer.
Now we come to the routine that prints records within a certain date range. Unfortunately, the method is more complicated than the student suggests. It is not sufficient to ensure that the year, month and day are all within their respective ranges. It is a bit more involved. The method I have adopted is as follows. First work out if the record date is equal to or greater than the start date. This is achieved by considering three conditions.
-
If the record year is greater than the start year, then the record is greater than the start date.
-
If the record year is equal to the start year and the month is greater than the start month then the record is greater that the start date.
-
If the record year and the record month is equal to the start year and start month respectively and the record day is equal or greater than the start day then the record date is equal to or greater than the start date.
If any of these tests are true then the record date is equal to or greater than the start time. A similar series of tests is conducted on the record date and the end date, but this time testing to see if the record date is less than or equal to the end date. If both conditions hold true, then the record is within range. This idea is converted to C code as shown in the revised zeitraum_ausgeben() function. Incidentally, the declaration of bjahr is missing the 'r' in the student's code.
The student states that the records are to be sorted by means of a bubble sort. For completeness, I provide a simple bubble sort routine. It should be noted that the bubble sort is not very efficient and that the C library provides a much better sort routine. I give an example of how qsort() is used to sort the employee records in descending order. To sort in ascending order simply swap the parameters in the compare() function.
When defining functions with arrays as parameters the student includes the size of the array. It is not necessary to state the array size as the compiler will ignore it in any case. There is an implicit conversion from an array to a pointer which means that the size of the array is lost to the called function. To avoid the impression that the function knows the size of the array it is best not to mention the size at all.
It is perhaps unfortunate that the student has chosen std as the name of an array. If the program was compiled under a C++ compiler the name will clash with the C++ standard library namespace.
I hope this is of some help and I include a modified version of the student's code.
[Please note that though this is the prize winner there is certainly room to do a code review on the code.]
#include "Unit1.h" #include <stdio.h> #include <conio.h> #include <stdlib.h> int main(void) { if (datei_einlesen("student.dat")) { puts("The file could not be" "opened.\nProgram terminated."); getch(); return 1; } zeitraum_ausgeben("19.12.1999", "30.01.2000"); zeitraum_ausgeben("01.01.1900", "01.01.2100"); bouble_sort(); zeitraum_ausgeben("01.01.1900", "01.01.2100"); quick_sort(); zeitraum_ausgeben("01.01.1900", "01.01.2100"); getch(); return 0; }
Unit.h:
#ifndef UNIT_H #define UNIT_H int datei_einlesen(char datei[]); void zeitraum_ausgeben(char von[], char bis[]); void bouble_sort(void); void quick_sort(void); #endif
Unit.c
/* Program modified by James Holland */ #include <stdio.h> #include <string.h> #include <stdlib.h> struct studen { float std; char kostenstelle[6]; char datum[11]; char name[21]; char arbeit[41]; } std[100]; int anz_daten; int greater(char const *, char const *); int compare(void const *, void const *); void datum_zerlegen(const char datum[], int * tag, int * monat, int * jahr); int datei_einlesen(char datei[]) { FILE * pf; char buffer[100]; char name[21]; char * p; pf = fopen(datei, "r"); if (!pf) return 1; fgets(buffer, 100, pf); sscanf(buffer, "%s", name); while (fgets(buffer, 100, pf), !feof(pf)) { sscanf(buffer, "%s %f %s", std[anz_daten].datum, &std[anz_daten].std, std[anz_daten].kostenstelle); strncpy(std[anz_daten].arbeit, buffer + 21, 40); std[anz_daten].arbeit[40] = '\0'; p = strpbrk(std[anz_daten].arbeit, "\n"); if (p) *p = '\0'; strcpy(std[anz_daten].name, name); anz_daten++; } fclose(pf); return 0; } void datum_zerlegen(char const datum[], int * tag, int * monat, int * jahr) { char t[3], m[3], j[5]; strncpy(t, datum, 2); t[2] = 0; strncpy(m, datum + 3, 2); m[2] = 0; strncpy(j , datum + 6, 4); j[4] = 0; *tag = atoi(t); *monat = atoi(m); *jahr = atoi(j); } void zeitraum_ausgeben(char von[], char bis[]){ int x, vtag, vmonat, vjahr; int btag, bmonat, bjahr, xtag, xmonat, xjahr; int a, b, c, d, e, f, g, h, i, j; /* Terms in the boolean expresion. */ datum_zerlegen(von, &vtag, &vmonat, &vjahr); datum_zerlegen(bis, &btag, &bmonat, &bjahr); for (x = 0; x < anz_daten; x++) { datum_zerlegen(std[x].datum, &xtag, &xmonat, &xjahr); a= xjahr > vjahr; /* x.year>start.year*/ b= xjahr == vjahr; /* x.year==start.year*/ c= xmonat > vmonat; /* x.month>start.month*/ d= xmonat == vmonat; /* x.month==start.month*/ e= xtag >= vtag; /* x.day >= start.day */ f= xjahr < bjahr; /* x.year<end.year */ g= xjahr == bjahr; /* x.year ==end.year*/ h= xmonat < bmonat; /* x.month<end.month*/ i= xmonat == bmonat; /* x.month==end.month*/ j = xtag >= btag; /* x.day >= end.day */ if ((a || b && (c || d && e)) && (f || g && (h || i && j))){ printf("%s %5.6f %s %-10s %s\n", std[x].datum, std[x].std, std[x].kostenstelle, std[x].name, std[x].arbeit); } } printf("\n"); } void bouble_sort(void) { int a, b; struct studen temp; for (a = 1; a < anz_daten; ++a) { for (b = anz_daten - 1; b >= a; --b) { if (greater(std[b-1].datum, std[b].datum)) { temp = std[b-1]; std[b-1] = std[b]; std[b] = temp; } } } } int greater(char const * s1, char const* s2) { int vtag, vmonat, vjahr; int xtag, xmonat, xjahr; int a, b, c, d, e; /* Terms in the boolean expresion. */ datum_zerlegen(s1, &xtag, &xmonat, &xjahr); datum_zerlegen(s2, &vtag, &vmonat, &vjahr); a = xjahr > vjahr; /* x.year>start.year */ b= xjahr == vjahr; /* x.year== start.year*/ c= xmonat > vmonat; /* x.month>start.month */ d= xmonat == vmonat; /* x.month==start.month*/ e= xtag > vtag; /* x.day >= start.day */ return a || b && (c || d && e); } void quick_sort(void) { qsort(std, anz_daten, sizeof std[0], compare); } int compare(const void * b, const void * a) { char const * ap = (*(struct studen *)a).datum; char const * bp = (*(struct studen *)b).datum; if (strcmp(ap, bp) == 0) return 0; if (greater(ap, bp)) return 1; return -1; }
Brett Fishburne also provided an excellent critique which I will make avaible on our website.
If James contacts me we can discuss his prize.
This issue's code is a little different because it is an instructor's code. Obviously I am not goin to name names but I thought you might like to see what some students get to work with.
As always there will be a prize provided by Blackwells Bookshops in colaboration with Addison-Wesley. By the way if other book sellers and/or publishers would like to sponsor one of uour columns I would be very happy to discuss it with them.
sponsored by Blackwells Bookshops & Addison Wesley
// header: array.h #ifndef _MYARRAY #define _MYARRAY #include <assert.h> #include <iostream.h> template <class BType, class IType = int> // defaults are OK here too class Array { protected: BType *arrayData; // pointer to data int lobound, hibound; // 'actual' bounds bool outOfRange(IType i); // function to test bounds public: // constructor Array(int SZ = 16, IType lo = 0); // copy constructor Array(const Array &x); ~Array(); // destructor IType lo() const; // return this->lobound IType hi() const; // return this->hibound // overload the [] operator BType& operator[](IType i); // overload the = operator Array<BType, IType>& operator= (const Array<BType, IType> &x); void grow(int); // increase array size }; // implementation template <class BType, class IType> Array<BType, IType>::Array(int SZ, IType lo) : lobound(lo), hibound(lo + SZ - 1) { assert(SZ > 0); arrayData = new BType[SZ]; assert (arrayData != 0); } template <class BType, class IType> Array<BType, IType>::Array(const Array<BType, IType> &x) : lobound(x.lobound), hibound(x.hibound) { arrayData = new BType[hibound - lobound + 1]; assert (arrayData != 0); (*this) = x; // use assignment } template <class BType, class IType> Array<BType, IType>::~Array() { delete [] arrayData; } template <class BType, class IType> inline IType Array<BType, IType>::lo() const { return lobound; } template <class BType, class IType> inline IType Array<BType, IType>::hi() const { return hibound; } template <class BType, class IType> bool Array<BType, IType>::outOfRange(IType i) { if (i < lobound || i > hibound) { cout << "Array index " << i << " is out of range" << endl; return true; } else return false; } template <class BType, class IType> BType& Array<BType, IType>::operator[](IType i) { assert (!outOfRange(i)); return(arrayData[i - lobound]); } template <class BType, class IType> Array<BType, IType>& Array<BType, IType> ::operator=(const Array<BType,IType> &x) { assert (hibound-lobound == x.hibound-x.lobound); // arrays must be same size for (int i = 0; i <= hibound - lobound; ++i) arrayData[i]=const_cast<Array<BType, IType> &>(x).arrayData[i]; return *this; } template <class BType, class IType> void Array<BType, IType>::grow(int newS) { // YOU ARE TO CODE THIS } #endif
The problem set by the instructor was to write the code for that last template function. The result must compile, link and execute with a file of code the instructor had got written. When you come to critique the code a good place to start would be to consider the problem the students actually have. This is the only place that they are allowed to add code. They are not allowed to change any of the code provided by the instructor.
I would like you then to look at all aspects of the code. Remember that code provided by an instructor under the limitations stated would be exemplar code. Do you think this is, if not why not.
Would you also comment on the suitability of the problem. For example would you develop template code this way?
Entries in by June 14th.
Notes:
More fields may be available via dynamicdata ..