Journal Articles

CVu Journal Vol 17, #4 - Aug 2005 + Student Code Critiques from CVu journal.
Browse in : All > Journals > CVu > 174 (12)
All > Journal Columns > Code Critique (70)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Student Code Critique Competition 35

Author: Administrator

Date: 03 August 2005 05:00:00 +01:00 or Wed, 03 August 2005 05:00:00 +01:00

Summary: 

Body: 

Prizes provided by Blackwells Bookshops & Addison-Wesley

Please note that participation in this competition is open to all members. The title reflects the fact that the code used is normally provided by a student as part of their course work.

This item is part of the Dialogue section of C Vu, which is intended to designate it as an item where reader interaction is particularly important. Readers' comments and criticisms of published entries are always welcome, as are possible samples.

Before We Start

Remember that you can get the current problem set in the ACCU website (http://www.accu.org/journals/). This is aimed at people living overseas who get the magazine much later than members in the UK and Europe.

Student Code Critique 34 Entries

I send an unsigned char called tipus which is meant to be some hex number from 0x00 to 0xff and which should decide the string to be returned...

If the unsigned int called valor I send is above 0xA000 I use the struct decidetest to send a string back, else, I send a string selected from the struct decide.

If no substitution is made, then the string returned is always a space in html...  

For some reason I always get the static char escape string returned... any idea? (the function compiles all right...)

char* Detect_type(unsigned char tipus, unsigned int valor)
{
int i;
static char escape[16] = "  ";
static struct decide {
unsigned num;
char *string;
} cadena [] = {
         {0x00, "Sobre V PK "},
         {0x02, "Sobre V RMS "},
         {0x0E, "--Power OFF--"},
         {0x10, "--Power ON--"},
         {0xff, " "},
};

static struct decidetest {
unsigned numero;
unsigned char num;
char *string;
} cadenatest [] = {
         {0x00, "Sobre V PK Test"},
         {0x02, "Sobre V RMS Test"},
         {0x0E, "--Power OFF--"},
         {0x10, "--Power ON--"},
         {0xff, " "},
};

if (valor >= 0xA000)
  {
    for(i = 0; i < sizeof(cadenatest) /
        sizeof(cadenatest[0]); i++)
    if(cadenatest[i].num == tipus)
      return cadenatest[i].string;
  }
else
  {
    for(i = 0; i < sizeof(cadena) /
        sizeof(cadena[0]); i++)
    if(cadena[i].num == tipus)
      return cadena[i].string;
  }

return (escape);
}

From Roger Leigh

The Initial Problem

We are told the function "compiles all right". This is not true:

$ gcc -c orig.c
orig.c: In function 'Detect_type':
orig.c:22: warning: initialization makes integer from pointer without a cast
orig.c:22: error: initializer element is not computable at load time
orig.c:22: error: (near initialization for 'cadenatest[0].num')
orig.c:22: error: initializer element is not constant
orig.c:22: error: (near initialization for 'cadenatest[0]')

The following changes were made to correct various problems with the code. The code is compliant with the ISO C99 standard.

Looking at the source code, we see that struct decidetest has three members, of which only two are used. Comparing with struct decide, numero looks out of place. If this is removed, the function compiles, and appears to work as intended (I added a simple test program to test each case, including failure). It's always a good idea to do this when writing code: when making the following changes I could check for breakage after each change.

Cosmetic Problems

Detect_type is an unusual name. Think about why you are using a particular naming scheme. Common types are CamelCaseNames and lower_case_underscored. As an example, I usually use camel case for structure and class names, and lower case for all function names, methods and variables. The aim of the capitalisation and underscoring is to separate individual words to give readable and meaningful identifiers. Often there will be a common suffix within a particular program or library, a "namespace". I renamed the function to detect_type.

Your indentation was mostly acceptable, but indentation following curly brackets varied depending on whether they were enclosing the function body or any other use. I re-indented it using Emacs.

The identifiers used, tipus and valor, do not appear to have any meaningful bearing on their use within the function. It is important to use identifiers that convey meaning. I renamed tipus to value.

valor is an unsigned integer, yet is used in a Boolean manner. Quite what the special significance of 0xA000 is is not at all clear. This sort of undocumented magic inside functions will only cause maintenance problems later: keep this localised to where it actually means something. Presumably this has meaning elsewhere in your code, but here a simple true/false value is sufficient. I replaced unsigned int valor with bool test. The caller can simply use (valor >= 0xA000) to achieve the same behaviour.

escape is only used once. Why not eliminate it entirely?

The return value is in parentheses. return is a keyword, not a function, making this unnecessary.

struct decide

After removing numero from struct decidetest, struct decide and struct decidetest are now identical. There is no need to define two identical structures, so I removed struct decidetest.

The num member of struct decide is an unsigned int, but valor (now value) is an unsigned char. There's no need for the extra size, so I changed it to unsigned char. Because it's used as a value rather than a character, I then changed all the instances of unsigned char to uint8_t from <stdint.h>, to better reflect its use as a number rather than a character.

The "string" member of struct decide is not const, but it only ever contains string literals. The function also returns a non-const pointer to it. I made all character data const, and made the return type const as well. const correctness prevents accidental modifications, and therefore makes your code more robust.

The initialisers are quite simple, because there are only two members. You might want to consider using C99 named initialisers for bigger structures, which significantly improves readability and protects against accidentally missing out a member (as we saw in the case of struct decidetest).

The last member of both cadena and cadenatest is followed by a trailing comma. Some compilers don't like this.

Quite what the meaning of the "magic" constants 0x00, 0x02, 0x0E, 0x10 and 0xFF are is not apparent. If their only purpose is to be unique and in a certain order, using an enum would make the code cleaner and more maintainable. I haven't changed this because I can't be certain of breaking other code.

Strings

The string escape and most of the other string literals are of length 16. Is this a requirement? If so, the type of "string" in struct decide should also be char [16], and all of the strings starting with "-Power" are too short. If it's not a requirement, just make them all of type const char * (field widths may be specified with a printf format string, illustrated in the test code). I made this change, because hard-coded string length limitations are rarely used in modern code.

Eliminating Redundancy

The for loop that iterates through the various values is repeated twice for both cadena and cadenatest. This can be eliminated by using a simple pointer. This also required adding an extra NULL element to the end of each array. This also allowed i to be removed.

Amended Code

This is the result of making the above changes. I hope you can see that a few simple changes have made the code much more readable and easy to understand. It's not that the code is doing anything very different, but that the intentions of the programmer are made clear because the structure of the code makes this obvious. The code is also simpler, which makes it easier to understand.

Also note the comments preceding the function. It's important to document the public interfaces of our code, and there are many tools available which can extract the comments from the code and produce full API references.

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/**
* detect_type:
* @value: the value to detect the type of
* @test: true if this is a test,
* otherwise false
*
* Detect a type.
*
* Returns a string describing the type of
* @value, or "&nbsp;" if @value was not
* found. The string must not be freed.
*/
const char *
detect_type (uint8_t value,
       bool test)
{
  struct decide
  {
    uint8_t num;
    const char *string;
  };

  static const struct decide cadena[] =
    {
      {0x00, "Sobre V PK"},
      {0x02, "Sobre V RMS"},
      {0x0E, "-Power OFF-"},
      {0x10, "-Power ON-"},
      {0xff, ""},
      {0x00, NULL}
    };

  static const struct decide cadenatest[] =
    {
      {0x00, "Sobre V PK Test"},
      {0x02, "Sobre V RMS Test"},
      {0x0E, "-Power OFF-"},
      {0x10, "-Power ON-"},
      {0xFF, ""},
      {0x00, NULL}
    };

  for (const struct decide *iter =
       (test == true) ? &cadenatest[0] :
                        &cadena[0];
       iter->string != NULL;
       ++iter)
    {
      if (iter->num == value)
        return iter->string;
    }

  return "&nbsp;";
}

// A simple test case
#include <stdio.h>

int
main (void)
{
  printf("Value Test String\n");
  printf("--- --- -----------\n");
  printf("0x00, false: '%-20s'\n", detect_type(0x00, false));
  printf("0x02, false: '%-20s'\n", detect_type(0x02, false));
  printf("0x0A, false: '%-20s'\n", detect_type(0x0A, false));
  printf("0x0E, false: '%-20s'\n", detect_type(0x0E, false));
  printf("0x10, false: '%-20s'\n", detect_type(0x10, false));
  printf("0xFF, false: '%-20s'\n", detect_type(0xFF, false));
  printf("0x00, true: '%-20s'\n", detect_type(0x00, true));
  printf("0x02, true: '%-20s'\n", detect_type(0x02, true));
  printf("0x0A, true: '%-20s'\n", detect_type(0x0A, true));
  printf("0x0E, true: '%-20s'\n", detect_type(0x0E, true));
  printf("0x10, true: '%-20s'\n", detect_type(0x10, true));
  printf("0xFF, true: '%-20s'\n", detect_type(0xFF, true));
  return 0;
}

From Nevin Liber

First observation: I'm not sure what compiler the student was using. I couldn't get it to compile under either CodeWarrior 9.5 or gcc 2.96, even turning off all the warnings I could.

Bug #1: The compiler error message (this one from Metrowerks; gcc is similar) points out the bug:

Error : illegal implicit conversion from
'char *' to 'unsigned char'
Detect_type.c line 34     {0x00, "Sobre V PK Test"},

Tracing back, struct decidetest has three elements while each of the initializers for the cadenatest array only have two elements. Removing element numero from decidetest fixes the bug. Writing a little test harness:

#include "Detect_type.h" /* Detect_type(...) 
                            prototype */
#include <stdio.h>

int main()
{
  unsigned int valor;
  for (valor = 0xa000 - 1;
       valor <= 0xa000 + 1; ++valor)
  {
    static const unsigned char tipus[] =
      { 0x00, 0x02, 0x0e, 0x10, 0xff, 0xbe };
    unsigned t;
    for (t = 0;
         t != sizeof(tipus)/sizeof(tipus[0]);
         ++t)
    {
      printf("Detect_type(0x%.2x, 0x%.4x)"
        " ==\"%s\"\n", tipus[t], valor,
Detect_type(tipus[t], valor));
}
}
return 0;
}

I get the following output:

Detect_type(0x00, 0x9fff) == "Sobre V PK "
Detect_type(0x02, 0x9fff) == "Sobre V RMS "
Detect_type(0x0e, 0x9fff) == "-Power OFF-"
Detect_type(0x10, 0x9fff) == "-Power ON-"
Detect_type(0xff, 0x9fff) == " "
Detect_type(0xbe, 0x9fff) == "&nbsp; "
Detect_type(0x00, 0xa000) == "Sobre V PK Test" 
Detect_type(0x02, 0xa000) == "Sobre V RMS Test" 
Detect_type(0x0e, 0xa000) == "-Power OFF-" Detect_type(0x10, 0xa000) == "-Power ON-"
Detect_type(0xff, 0xa000) == " "
Detect_type(0xbe, 0xa000) == "&nbsp; "
Detect_type(0x00, 0xa001) == "Sobre V PK Test" 
Detect_type(0x02, 0xa001) == "Sobre V RMS Test" 
Detect_type(0x0e, 0xa001) == "-Power OFF-" Detect_type(0x10, 0xa001) == "-Power ON-"
Detect_type(0xff, 0xa001) == " "
Detect_type(0xbe, 0xa001) == "&nbsp; "

Now, the student said if valor is not above 0xa0000, then 'decide' is used. While my intuition makes me suspect he wasn't being precise, since I can't ask him in person, I have to go with what he said.

Bug #2: The edge case of 0xa000 is incorrect.

Fixing those two bugs:

char* Detect_type(unsigned char tipus,
       unsigned int valor)
{
int i;
static char escape[16] = "&nbsp;          ";
static struct decide {
unsigned num;
char *string;
} cadena [] = {
          {0x00, "Sobre V PK "},
          {0x02, "Sobre V RMS "},
          {0x0E, "-Power OFF-"},
          {0x10, "-Power ON-"},
          {0xff, " "},
};

static struct decidetest {
/* Bug #1: unsigned numero; */
unsigned char num;
char *string;
} cadenatest [] = {
          {0x00, "Sobre V PK Test"},
          {0x02, "Sobre V RMS Test"},
          {0x0E, "-Power OFF-"},
          {0x10, "-Power ON-"},
          {0xff, " "},
};

if (0xA000 < valor) /* Bug #2: if (valor >= 0xA000) */
   {
     for(i = 0; i < sizeof(cadenatest) /
         sizeof(cadenatest[0]); i++)
         if(cadenatest[i].num == tipus)
           return cadenatest[i].string;
   }
else
   {
     for(i = 0; i < sizeof(cadena) /
         sizeof(cadena[0]); i++)
     if(cadena[i].num == tipus)
         return cadena[i].string;
   }

return (escape);
}

While this is now producing correct output, it can be made a lot better.

  1. Formatting. At a minimum, function bodies are indented and struct members are indented inside their structs. This just makes it easier to see the underlying structure of the code.

  2. Types. The type of tipus, decide.num and decidetest.num should all match. I would add the following:

    typedef unsigned char tipus_t;
    

    and change the types for all three of those variables to tipus_t.

  3. String literals. escape is declared to have a size of 16 bytes, yet the literal that initializes it has a size of only 15 bytes (including the trailing '\0'). Plus, most of the other string literals have a size of 17 bytes. A reasonable guess is that the caller is expecting any string returned to have a size of 17 bytes. Additionally, there is no reason to store escape in a function static variable, since we can return the literal directly. Finally, since we are returning string literals, the caller shouldn't modify the elements in those string literals; therefore, the function should return a const char*.

And while we are on the subject of const, both cadena and cadenatest should be static const structs, to insure future maintainers of the code don't modify them.

Issue #3: Refactoring the structs

Looking at decide and decidetest now, I noticed that they have exactly the same members. So I made them instances of the same struct, which I call decide_t.

Issue #4: Declarations should be closer to use

The cadenatest array is only needed when valor > 0xA000, and the cadena array is only needed when valor <= 0xA000, so I moved their declarations into their respective if clauses.

Putting all of that stuff together, the code now looks like:

typedef unsigned char tipus_t;

const char* Detect_type(tipus_t tipus, unsigned int valor)
{
  /* static char escape[16] = "&nbsp; "; */
  struct decide_t {
    tipus_t num;
    const char* string;
};
if (0xA000 < valor) {
  static const struct decide_t
    cadenatest[] = {
        {0x00, "Sobre V PK Test"},
        {0x02, "Sobre V RMS Test"},
        {0x0E, "-Power OFF- "},
        {0x10, "-Power ON- "},
        {0xff, " "},
  };
  int i;
  for (i = 0; sizeof(cadenatest) / sizeof(cadenatest[0]) > i; ++i)
    if (cadenatest[i].num == tipus) {
      return cadenatest[i].string;
    }
  } else {
    static const struct decide_t cadena[] = {
        {0x00, "Sobre V PK "},
        {0x02, "Sobre V RMS "},
        {0x0E, "-Power OFF- "},
        {0x10, "-Power ON- "},
        {0xff, " "},
    };
    int i;
    for (i = 0; sizeof(cadena) / sizeof(cadena[0]) > i; ++i)
    if (cadena[i].num == tipus) {
      return cadena[i].string;
    }
  }
  return "&nbsp; ";
}

Issue #5: Eliminate the loops

Hand coded loops are both error-prone and algorithmically slow (O(n)). We can do better by using a switch statement. While some folks might squawk at having multiple returns, I found it easier as I didn't need any local or static variables (other than the return). Here is the complete redesign:

const char* Detect_type(unsigned char tipus, unsigned int valor) {
  if (0xA000 < valor) {
    switch (tipus) {
    case 0x00:
      return "Sobre V PK Test";
    case 0x02:
      return "Sobre V RMS Test";
    case 0x0e:
      return "-Power OFF- ";
    case 0x10:
      return "-Power ON- ";
    case 0xff:
      return " ";
  };
} else {
  switch (tipus) {
  case 0x00:
    return "Sobre V PK ";
    case 0x02:
      return "Sobre V RMS ";
    case 0x0e:
      return "-Power OFF- ";
    case 0x10:
      return "-Power ON- ";
    case 0xff:
      return " ";
    };
     }
  return "&nbsp; ";
}

Commentary

The student claimed the function compiles all right - but they were unlucky with their compiler! As both entrants found with gcc, many compilers will not actually compile the code - and some will only compile it as 'C' code but not as 'C++' code. Actually, I suspect that some warnings were generated for the student but they probably ignored them. Compiler warnings for computer programmers are much like pain is for athletes - something is wrong and simply pressing on may cause more serious damage.

The basic problem was initialising a structure (decidetest) with the wrong number of data values. However the code hid this problem since the two data structures were accessed in such a similar fashion that the reader expected they were defined identically. I would suggest the writer of the code should follow the so-called 'principle of least surprise' and use the same structure for both arrays.

As Roger's solution shows making the two structures share a common structure also makes it easier to avoid code duplication - the same loop can be used for both the data structures (although in this case it also changed the loop termination condition from a count to a test of a 'special value' - NULL).

An approach that could be considered as an alternative to a switch statement is a 'library' solution - since the arrays are sequential the standard 'C' library function bsearch can be used to find the match. This would avoid writing an explicit loop at all but retains the explicit data structure.

The Winner of SCC 34

The editor's choice is: Roger Leigh

Please email to arrange for your prize.

Student Code Critique 35

(Submissions to by September 1st)

Here is a C++ header file with a number of potential problems. Please critique the code to help the student identify the problems and to help them to provide some better solutions.

Note: the class Report is not shown. It contains a large amount of data, which can be explicitly deleted by a call to ClearAll.

// Reports : vector of reports
class Reports : public map<Data*, Report>
{
public:
  Reports() : nIndex(0) {}
  void ClearAll()
  {
    for (iterator iter=begin();
         iter != end(); iter++)
      (*iter).second.ClearAll();
  }

  Report& GetReport(int nReport)
  {
    int nSize = size();
    assert(nReport < nSize);
    if (nIndex == 0 || nIndex > nReport || 
                    nIndex >=(nSize-1))
  {
    iter = begin();
    nIndex = 0;
  }
  for (; iter != end(); iter++, nIndex++)
  {
    if (nIndex == nReport)
      return iter->second;
  }
  // keep compiler happy
  return *((Report*)0);
}

protected:
  int nIndex;
  iterator iter;
}; 

Notes: 

More fields may be available via dynamicdata ..