    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: A Critique of Some Code</title>
        <link>https://members.accu.org/index.php/journals/732</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 10, #6 - Sep 1998</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c135/">106</a>
                    (12)
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;A Critique of Some Code</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 03 September 1998 13:15:27 +01:00 or Thu, 03 September 1998 13:15:27 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e16" id="d0e16"></a></h2>
</div>
<p>The sidebar contains a letter from Spencer Davies <tt class=
"email">&lt;<a href=
"mailto:spencer@mettav.demon.co.uk">spencer@mettav.demon.co.uk</a>&gt;</tt>
that was attached to a program that he submitted for comment. He
also asks a number of other questions. I know that Francis would
want any responses copied to him for publication in future issues
of C Vu. I am not going to look at all of Spencer's code in this
issue as even short programs take up a lot of space and raise many
issues. I know that Spencer has specifically agreed for his name to
be published. That way he gets the chance of quick responses from
other members.</p>
<div class="sidebar">
<p>Francis,</p>
<p>I think that it's about time that I asked both your own and the
groups advice on programming style, approach and idiom with regards
to C.</p>
<p>I have been learning the C programming language on and off for
some time. This has been mainly in my own time, with the view to
using the knowledge gained for my own enjoyment and within my
working environment.</p>
<p>On a couple of occasions I have done a bit of programming in the
work environment (please note we are not actually a programming
group but are rather involved in systems integration) only to be
told by the group leader that my code is a mess which makes his
head swim. He ends up re-writing it. I accept his appraisal of my
work, however he seems unable to give me an explanation as to what
the precise problems are with my coding approach. (At least an
explanation I can understand which doesn't send me off into a
glazed eye state!) That is very disheartening and annoying. I
believe part of my problem is that I have never actually learnt how
to program (I started off with a BASIC manual which came with my
first computer in 1982 and have been going down hill since! )</p>
<p>Apart from a couple of courses, which have tended to teach just
the structures of the languages concerned, everything I have learnt
has been picked up from manuals and the like. Can anyone recommend
an approach which will possibly &quot;unteach&quot; any bad habits I have
picked up and in their place &quot;retrofit&quot; a more reasonable approach
to the task of program design and implementation (or something
along those lines).</p>
<p>I have enclosed my latest private programming attempt, which
apart from the 32bit checksum code is all my own work. This is to
provide an overview of my programming style which people could
possibly critique and provide recommendations as to where I am
going wrong and how to improve matters. Program design was no more
than a rough outline of what I wanted to do and how I foresaw doing
it. The program was initially written under Lattice C 5.6 on an
Atari TT then ported across to Watcom C 10.6 on the PC so the
formatting of comments, etc. might seem a bit weird.</p>
<p>Lastly, everyone is constantly going on about both &quot;object
orientated programming&quot; and C++, to which procedural programming
and C seem to be taking a back seat. With no knowledge of such
approaches and again a glassy eyed look when ever anyone mentions
the likes of &quot;encapsulation&quot; or &quot;operator overloading&quot;, would it be
worthwhile starting again and learning this approach/language or
should I stay with C until my more fundamental problems are
resolved. If it is worth tackling C++ and object orientation where
do I start?</p>
<p>Thanks for any help, advice or just plain moans.</p>
<p>Spencer Davies <tt class="email">&lt;<a href=
"mailto:spencer@mettav.demon.co.uk">spencer@mettav.demon.co.uk</a>&gt;</tt></p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e44" id="d0e44"></a>Relearning</h2>
</div>
<p>One of the problems with ridding ourselves of old habits that we
have acquired through the seat of our pants is that they will be
based on what we find natural. In other words our instincts are at
fault. This makes it very hard to change.</p>
<p>Negative criticism of the form Spenser has effectively received
does little to help. I have a lot of sympathy for him in that he
has learnt the way many of the old timers had to.</p>
<p>Unfortunately, much of the code that you see published is too
poor to use as a model for your own.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e53" id="d0e53"></a>Some
Comments</h2>
</div>
<p>Worse, comments (an essential ingredient in making code easily
understood) in most textbooks are completely inappropriate. They
are focused on helping a novice understand the syntax and semantics
of the language. Comments are supposed to help you understand how
the code solves the problem and to highlight dangerous or arcane
lines of source code.</p>
<p>Much production code is inappropriately commented because the
comments belong to an earlier version, are about the language or
have been added to keep some manager happy.</p>
<p>A few well-chosen comments are worth the time that it took to
write them. Unnecessary comments simply serve to confuse.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e62" id="d0e62"></a>Global
Identifiers</h2>
</div>
<p>Before I go on to examine some of Spencer's source code in
detail I think it is about time that I tried to explain a couple of
problems with using global identifiers.</p>
<p>Some things, such as functions in C, have to use names
(identifiers) that are either at file scope (visible throughout the
file) or at global scope (visible to the linker). Every time you
declare some identifier at global scope you introduce one more
place where the linker can become confused. In C the compiler
creates a single list of global names for everything other than
tags (used by <tt class="literal">struct</tt>, <tt class=
"literal">union</tt> and <tt class="literal">enum</tt>). It has no
mechanism for telling the linker how the name is being used. We can
help by qualifying all names used at file scope (that is outside
prototypes, definitions of <tt class=
"literal">struct</tt>/<tt class="literal">union</tt> types and the
bodies of functions) as <tt class="literal">static</tt> unless we
specifically want to use the objects so named in another file.</p>
<p>Your first reaction is probably to assume that such
qualification ensures that reused names will not cause problems.
That is not quite true because such a reused name may conceal a
global name that should have been used. For example:</p>
<pre class="programlisting">
static void fn(void){
/* do something */
}
/* lots of code */
</pre>
<p>Now when some maintenance programmer uses <tt class=
"function">fn()</tt> somewhere in that 'lots of code' the compiler
will use the local definition even if the maintenance programmer
intended the use of some other <tt class="function">fn()</tt>
defined in another file. As C allows implicit declar-ation of
functions (and too many programmers are too lazy to over-rule that
by providing explicit prototypes) the compiler will happily compile
the code and the linker will never be able to issue a redefinition
error.</p>
<p>The point I am trying to make is that global names are a
precious resource and should not be frittered away without good
reason. Generally <span class="bold"><b>variables</b></span> (as
opposed to function names) should not be visible globally, and even
at file scope there needs to be a good reason for using them.</p>
<p>Another problem with global names is that they produce serious
debugging problems because they are accessible throughout the
program. This means that they can be changed almost anywhere. (Be
very wary of using the address of a file scope variable because
that too means that the value contained may be changed somewhere
else).</p>
<p>The final problem with identifiers is caused by the
pre-processor (both a great blessing and a horrible curse). C does
not provide any mechanism for distinguishing pre-processor
identifiers from compile time identifiers. They use exactly the
same character set and generally follow the same constraints.</p>
<p>This leads to a coding guideline that I believe should be
immutable:</p>
<p><span class="emphasis"><em>Pre-processor identifiers should
NEVER include a lower case letter. All other identifiers should
include at least one lower case letter.</em></span></p>
<p>I am absolutely certain that the multitude of authors that use
entirely upper case identifiers for manifest constants provided via
an <tt class="literal">enum</tt> or via a <tt class=
"literal">const</tt> variable do not understand why the traditional
<tt class="literal">#define</tt> route used upper case. If I come
across <tt class="literal">FALSE</tt>' in someone's code I should
be certain that somewhere there is a pre-processor directive:</p>
<pre class="programlisting">
#define FALSE 0
</pre>
<p>If I come across <tt class="literal">false</tt>' I should expect
to find either:</p>
<pre class="programlisting">
int const false = 0;
</pre>
<p>or</p>
<pre class="programlisting">
enum {false};
</pre>
<p>This separation between pre-processor identifiers and all others
is absolutely essential. It should supersede all other guidelines
about identifiers. Unfortunately, the C Standard Library breaks
this guideline. We have to live with that, but we do not have to
like it nor do we have to emulate it.</p>
<p>Time to look at some code.</p>
<pre class="programlisting">
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
unsigned long crc32_table[256] = { /* lookup table */
0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA, 0x076DC419, 
/* 251 further values, omitted by the Harpist to save space */
};
unsigned long crc32(char filein[]) {
  FILE *in;             /* input file */
  unsigned long crc;    /* CRC value */
  unsigned char *buf;   /* pointer to the input buffer */
  size_t i, j;        /* buffer positions*/
  int k;              /* generic integer */
  /* open file */
  if((in = fopen(filein, &quot;rb&quot;)) == NULL){
      fprintf(stdout, &quot;CRC32: Can't open file %s\n&quot;, filein);
      return -1;
  }
  /* allocate buffer */
  if((buf = (unsigned char *) malloc(32766)) == NULL){
    fprintf(stdout, &quot;CRC32: Can't allocate memory!\n&quot;);
    fclose(in);
    return -2;
  }
  crc = 0xFFFFFFFF; /* preconditioning sets non zero value */
  /* loop through the file and calculate CRC */
  while(1){
      i=fread(buf, 1, 32766, in);
      if(i==0) break;
      for(j=0; j &lt; i; j++){
        k=(crc ^ buf[j]) &amp; 0x000000FFL;
        crc=((crc &gt;&gt; 8) &amp; 0x00FFFFFFL) ^ crc32_table[k];
      }
  }
  crc=~crc; /* postconditioning */
  free(buf);
  fclose(in);
  return crc;
}
</pre>
<p>I do not know whether Spencer wrote this code or if he simply
copied it from a book of algorithms. If it was the latter it only
goes to show how bad much published code is.</p>
<p>What do you think of the comments? What about the variable
names? Rather than telling me that <tt class=
"varname">crc_table</tt> is a look-up table (surely obvious) would
it not have been more helpful to at least either explain the
significance of the entries or to give a reference to somewhere
that describes the algorithm?</p>
<p>If it is a look-up table (it is) why has it not been qualified
as <tt class="literal">const</tt>? If it is only used in this file
(it is) why has it not been qualified as <tt class=
"literal">static</tt>? If it is only used by <tt class=
"varname">crc32</tt>, why hasn't it been declared as a <tt class=
"literal">static</tt> local variable in <tt class=
"function">crc32()</tt>? Finally, why has it been laboriously
initialised in the file? Suppose that some other table of values
was wanted, this mechanism requires recompilation. Try the
following:</p>
<pre class="programlisting">
unsigned long compute_crc32(char const * infile_name) {
  static unsigned long * crc32_table =  init_crc32_table();
/* rest of function */
</pre>
<p>Note that even though <tt class=
"function">init_crc32_table()</tt> will only be called once during
execution of the program I still make it a free-standing function.
I do not want to clutter up <tt class=
"function">compute_crc32()</tt> with details of initialising the
look-up table. I cannot emphasise too strongly how important such
extraction of source code is. Also, note the change of name of the
function. Another small issue but helpful. Function names should
look like verbs because that is the way they are used. crc32 should
be a value because that is the way we would read it.</p>
<p>I have changed the parameter both because I want to <tt class=
"literal">const</tt> qualify it (string literals will be arrays of
<tt class="type">const char</tt> in future) and because it is a C
idiom. Don't just do things differently because you think it might
be better; that makes it harder for others to read your code.
Finally I changed the parameter name to make it more
descriptive.</p>
<p>Next look at how the file is being processed. It looks as if the
programmer does not know that files are already buffered in C.
Indeed the buffering should have been chosen by the implementers to
provide efficient access. I am not saying that the programmer's
mechanism is positively wrong but at the very least there should be
a comment explaining it. Adding your own buffering adds places
where things can go wrong. Until I had a very good reason to do
otherwise I would write something like:</p>
<pre class="programlisting">
  FILE *input_file;
  unsigned long crc32 = 0xFFFFFFFF; /* preconditioning sets non zero value / 
  /* open file */
  input_file = fopen(infile_name, &quot;rb&quot;)
  if(!input_file){
      fprintf(stdout, &quot;CRC32: Can't open file %s\n&quot;, infile_name);
      return file_failure;
  }
    /* loop through the file and calculate CRC */
  {
    int next_byte = getc(input_file);
    while(next_byte != EOF){
      next_byte =(crc 32^ next_byte) &amp; 0x000000FFL;
      crc32=((crc 32&gt;&gt; 8) &amp; 0x00FFFFFFL) ^ crc32_table[next_byte];
      next_byte = getc(input_file);
    }
  }
  crc32=~crc32; /* postconditioning */
  fclose(input_file);
  return crc32;
}
</pre>
<p>Now please note that there were some errors in the original
code. The two error <tt class="literal">return</tt>s propose to
return a negative value through an <tt class="type">unsigned
long</tt>. A little thought reveals that unless you can guarantee
that one or more of the values in the <tt class="type">unsigned
long</tt> range can never be created by the algorithm on which this
function is based then there are no available values for error
return. We cannot even fix the problem the way that <tt class=
"function">getc()</tt> does (by returning an <tt class=
"type">int</tt>, when all the possible non-error values only use an
<tt class="type">unsigned char</tt>) because we have nothing larger
than an <tt class="type">unsigned long</tt> (many systems provide
<tt class="type">long long</tt> but that is not standard at least
until C9X is released) to provide surplus values.</p>
<p>We have a number of possible solutions:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>We can have an extra parameter that is used to return an error
status.</p>
</li>
<li>
<p>We can return the CRC in a parameter and use the return value to
signal success/failure</p>
</li>
<li>
<p>We can raise a signal</p>
</li>
<li>
<p>We can use a global variable to signal an error (like the use of
<tt class="varname">errno</tt> by the standard library)</p>
</li>
</ol>
</div>
<p>The last of these is dangerous for two reasons:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>too many programmers do not bother to check</p>
</li>
<li>
<p>it does not work in multi-threaded environments.</p>
</li>
</ol>
</div>
<p>The penultimate relies on using signal handlers and that would
seem to be something of overkill for this kind of problem.</p>
<p>Of the first two methods, I have a strong preference for the
second one because it is easy to use and therefore stands a good
chance of being used.</p>
<p>Before I provide my final version there are two other points
that need attention. I do not understand why the function returns
the complement. That is the kind of line that deserves a comment.
As I do not know the answer, I cannot supply that comment. The
second issue is that buried in the implementation is the implicit
requirement that the look-up table contains 256 elements. We would
be advised to make this more explicit in our code.</p>
<pre class="programlisting">
int compute_crc32(char const * infile_name, unsigned long * crc32) {
  static unsigned long  crc32_table[256]; /* this algorithm requires a look-up table of 256 elements */
  static int initialised = false;
  FILE *input_file = fopen(infile_name, &quot;rb&quot;);
  if(!initialised ) {
     init_crc32_table(crc32_table); initialised = true;
  }
  /* check file opened OK */
  if(!input_file){
      fprintf(stdout, &quot;CRC32: Can't open file %s\n&quot;, infile_name);
      return false; /* failed */
  }
  *crc32 = 0xFFFFFFFF; /* preconditioning sets non zero value */ 
    /* loop through the file and calculate CRC */
  {
    int next_byte = getc(input_file);
    while(next_byte != EOF){
      next_byte =(*crc 32^ next_byte) &amp; 0x000000FFL;
      *crc32=((*crc 32&gt;&gt; 8) &amp; 0x00FFFFFFL) ^ crc32_table[next_byte];
      next_byte = getc(input_file);
    }
  }
  *crc32=~*crc32; /* postconditioning */
  fclose(input_file);
  return true; /* success */
}
</pre>
<p>I am assuming that such values as <tt class=
"literal">true</tt>', <tt class="literal">false</tt>' have been
provided via an <tt class="literal">enum</tt>. I have left the
writing of <tt class="function">init_crc32_table()</tt> to the
reader because there are several choices available. The values
could be hard coded into that function but they are more likely to
be read from a file. The name of that file might be hard-coded or
it might be provided by an environment variable or by some other
means. The point is that the mechanism by which the table of values
is provided has nothing to do with the implementation of the
algorithm and so should be kept separate.</p>
<p>Note that the absence of <tt class="literal">const</tt>
qualification to the (new) second parameter signifies that it is
for writing back a value. These small details make it easier to
understand the intent of the programmer.</p>
<p>I think I have space and time to look at one more function from
Spencer's code. So here it is. Try reading it first and see if you
can follow it. If you cannot, can you explain what the difficulty
is? I have some ideas but try to form your own to start with.</p>
<pre class="programlisting">
/* Pass either argv[0] or a filename as the progname and a char pointer to act
 as the ini_file buffer.  Note: Only parses lines which begin with a character greater than '#' i.e.
an alphanumeric.  Also the list of statements and values are seperated by either '=',' ' or ','
and each entry is seperated from the next by '\0'.  The last entry is designated by &quot;\0\0&quot;. */
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;stdlib.h&gt;
#include &quot;ini_proc.h&quot;
extern char *ini_buffer;
int ini_proc(char ini_name[]) {
  char *chr_ptr;
  char module_name[]=&quot;INI_PROC&quot;;
  char inifilename[14] = {'\0'};
  long int file_sz;
  char linein[80];
  char *statement;
  char *value;
  FILE *inifile;
  /* Construct INI filename from program name */
  if(strlen(ini_name) &gt; 14) {
    if(!(chr_ptr = strrchr(ini_name, '\\'))) { /* Get last filename from path */
      fprintf(stdout, &quot;%s: Invalid filename to process.\n&quot;, module_name);
      return -4;
    }
    ++chr_ptr;
  } else if(strlen(ini_name) &lt; 2) {
    fprintf(stdout, &quot;%s: Invalid filename to process.\n&quot;, module_name);
    return -3;
  } else {
    chr_ptr = ini_name;
  }
  strcpy(inifilename, chr_ptr);
  if(chr_ptr = strtok(inifilename, &quot;.&quot;)) {
    chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
  }
  strcat(inifilename, &quot;.ini&quot;);
  /* Open and process inifile */
  if((inifile = fopen(inifilename, &quot;r&quot;)) == NULL) {
    fprintf(stdout, &quot;%s: Unable to open %s.\n&quot;, module_name, inifilename);
    return -1;
  }
  /* Allocate suitably sized block of memory for ini_file contents */
  file_sz = filesize(inifile);
  if(!(ini_buffer=(char *)calloc(file_sz, sizeof(char)))) {
    fprintf(stdout, &quot;%s: Failed to allocate %ld bytes for ini file buffer.\n&quot;, module_name, file_sz);
    return -2;
  }
  /* Now read ini file statements into block */
  chr_ptr = ini_buffer;
  while(fgets(linein, 80, inifile) != NULL ) {
    /* if line starts with # or less then ignore */
    if(linein[0] &gt; 35) {
      statement = strupr(strtok(linein, &quot;= ,&quot;));
      value = strtok(NULL, &quot;#;&quot;);
      /* add new entry */
      strcpy(chr_ptr, statement);
      /* go beyond the current entry */
      chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
      /* add new entry */
      strcpy(chr_ptr, value);
      while((chr_ptr[(strlen(chr_ptr)-1)] &lt; 33) &amp;&amp; (chr_ptr[(strlen(chr_ptr)-1)] &gt; 0)) {
        chr_ptr[(strlen(chr_ptr)-1)] = '\0';
      }
      /* go beyond the current entry */
      chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
    }
    /* add second null for end of statement/value entry */
    *chr_ptr = '\0';
  } ++chr_ptr;
  /* Now set last item entry */
  strcpy(chr_ptr, &quot;lAsTeNtRy&quot;);
  chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
  strcpy(chr_ptr, &quot;lAsTeNtRy&quot;);
  chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
  fclose(inifile);
  return 0;
}
</pre>
<p>OK, the first thing that scares me with this code is seeing that
<tt class="type">extern char *</tt> out in global space. At the
very least it can do with a better name and a comment telling me
what it is for and which file initialises it because as it stands
we have a pointer floating around with no idea what it is pointing
at.</p>
<p>At this stage I need a justification for a global variable, and
doubly so if it is a pointer. This is one of the places where I
want to see a comment. I would also like a more informative name
for the function and some brief explanation as to what it does. I
suspect that it is intended to initialise some process. Fine, why
not call it <tt class="function">initialise_process()</tt>. Even
better add something that tells the reader what process is being
initialised. I know that strictly conforming code has a very short
external identifier length (six characters, case insignificant) but
real world compilers are far more generous these days.</p>
<p>Now look at the parameter name, does that suggest a filename
possibly proceeded by a path? While you are trying to answer that
question notice the return statements with magic numbers. Let us
fix that one while we are about it (and note that the exact values
are irrelevant, all we want is to be able to distinguish between
different failure modes. Putting the following in the relevant
header file should help:</p>
<pre class="programlisting">
enum errors {success, name_too_long, name_missing, file_missing, insufficient_RAM};
</pre>
<p>Now back to that parameter. There is a comment that talks about
last filename on the path. Now that does not make sense. Are we to
suppose that the parameter passes a path from which we have to
extract the file name that terminates it? In which case what is the
significance of 14? (Take a guess, this is a command line parameter
which starts with a backslash and refers to a file that is in
standard MSDOS 8.3 format and we need a terminator. No that won't
work because <tt class="function">strlen()</tt> will ignore the
null terminator. You see what I mean? We need better identifiers
and some explanation about what is being done.)</p>
<p>Let us try to reorganise the checks so that they make some sense
(note that long list of local variables riddled with uninitialised
pointers is worrying me - bound to be a source of errors and
maintenance problems). Surely we must first check that the argument
passed is not a null-pointer and points to a string long enough to
be some use. That gives me:</p>
<pre class="programlisting">
if (!ini_name) return name_missing;
if(strlen(ini_name&lt;2)) return name_missing;
</pre>
<p>I think that we should next pull off whatever comes after the
last backslash if there is one. That results in something like:</p>
<pre class="programlisting">
if((chr_ptr = strrchr(ini_name, '\\'))) chr_ptr++;
else chr_ptr = ini_name;
</pre>
<p>At this stage chr_ptr points to everything after the last
backslash, or everything if there is no backslash. Now we must
check that we still have a non-null string with:</p>
<pre class="programlisting">
if(!chr_ptr[0]) return name_missing; /* check first character is not null */
</pre>
<p>Now reading the code it seems that we must add a <tt class=
"filename">.ini</tt> extension but look at this piece of code from
the above:</p>
<pre class="programlisting">
  strcpy(inifilename, chr_ptr);
  if(chr_ptr = strtok(inifilename, &quot;.&quot;)) {
    chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
  }
  strcat(inifilename, &quot;.ini&quot;);
</pre>
<p>I can see what the first line does, and I am not ready for that
yet but the next two lines baffle me. <tt class=
"function">strtok</tt> will return the address of the first
character after a dot if there is one else it returns null. So if
it finds a dot we set <tt class="varname">chr_ptr</tt> to point to
the null terminator after that dot and then make no use of it
because the next time <tt class="varname">chr_ptr</tt> occurs it is
on the left of an assignment. Now I think that the intent is to add
the extension <tt class="filename">.ini</tt> if the tentative
filename does not have an extension. But as written it will
over-write any extension with <tt class="filename">.ini</tt>. The
trouble is that we must devise a mechanism that will correctly
detect filenames that are too long. The first step is to copy nine
characters from where <tt class="varname">chr_ptr</tt> is pointing
(ignore the block of code above, it is pointing to the start of the
putative filename) into <tt class="varname">inifilename</tt>.</p>
<pre class="programlisting">
strncpy(inifilename, chr_ptr, 9);
strtok(inifilename, &quot;.&quot;);  /* replace the first dot by null if there is one */
if(strlen(inifilename)&gt;8) return name_too_long;
strcat(inifilename, '.ini');
</pre>
<p>I am not entirely happy with this because it forcibly makes the
extension .ini. More reasonable would be to make that the default
extension. The simplest way is to replace the last line above
by:</p>
<pre class="programlisting">
strncpy(inifilename, chr_ptr, 13);
if(inifilename[12]) return name_too_long;   /* strncpy fills end of string with nulls */
if (!strrchr(inifilename, '.')) strcat(inifilename, '.ini');
</pre>
<p>By testing the thirteenth character in <tt class=
"varname">inifilename</tt> after the second use of <tt class=
"function">strncpy</tt> we can ensure that fewer than 13 characters
were copied. Actually, I still have not tested any existing
extension to see if it is more than three characters. I am leaving
that as an exercise for the reader. By the way, at this stage you
should not be concerned about code speed because this code only
runs once. Even if it is terribly inefficient it will have no
impact on the overall performance of your program.</p>
<p>Now we are ready to open the file. I abhor unnecessary
complexity in statements so I would write:</p>
<pre class="programlisting">
inifile = fopen(inifilename, &quot;r&quot;);
if(!inifile) return file_missing;
</pre>
<p>I suppose it is about time I explained why I am not providing
error messages. This is the wrong place for them. It is for the
user of the function to decide what messages to print and where to
print them. However if you really want to compose the messages in
this function a slight change will allow you to do so. Change the
return type of the function to <tt class="type">char const *</tt>.
Provide a local static array large enough for the longest message.
Place the message in that and return its address. For the
successful run just return the null pointer. By starting each error
message with a unique digit you can even return the nature of the
problem as well as an appropriate message. This only one way of
solving this kind of problem. The advantage is that more
information is provided in the error message, the disadvantage is
that you provide more clutter in the local source code and that
inevitably leads to mistakes.</p>
<p>Now to another of the errors hiding in this code:</p>
<pre class="programlisting">
  file_sz = filesize(inifile);
  if(!(ini_buffer=(char *)calloc(file_sz, sizeof(char)))) {
    fprintf(stdout, &quot;%s: Failed to allocate %ld bytes for ini file buffer.\n&quot;, module_name, file_sz);
    return -2;
  }
</pre>
<p>Did you spot that <tt class="varname">file_sz</tt> is a
<tt class="type">long int</tt>? Yes that is what the user provided
<tt class="function">filesize()</tt> returns. But this code seems
to be written for an MSDOS system. The first parameter of
<tt class="function">calloc</tt> is <tt class="type">size_t</tt>
and that is usually a <tt class="literal">typedef</tt> for an
<tt class="type">unsigned int</tt> on an MSDOS system. This code
runs a severe risk of allocating too little memory if the requisite
file is larger than 64k. There clearly needs to be some more work
done here. By the way, <tt class="function">sizeof(char)</tt> is
always one, the C standard requires that to be the case. Actually
because this is providing memory for a buffer, I would write a
separate function to do it and my code would look something
like:</p>
<pre class="programlisting">
if(!allocate_space(ini_buffer, filesize(inifile))) return insufficient_RAM;
</pre>
<p>It is difficult to comment on the rest of the code because I do
not know what the writer is trying to achieve. I must admit that I
am worried by the intent to read an entire file into RAM. I feel
that there needs to be some kind of validation on the file size.
One important thing when writing code in any language is to try to
stick to the idioms of the language. For example</p>
<pre class="programlisting">
chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
</pre>
<p>Occurs in several places. While it achieves its objective of
pointing to the null terminator of your current string I would feel
it more natural to write:</p>
<pre class="programlisting">
chr_ptr += strlen(chr_ptr) + 1;
</pre>
<p>The failure to use standard idioms is something that makes code
difficult to read. The experienced programmer expects to read code
and understand it. When faced with non-idiomatic usage the expert
is surprised and probably assumes that the writer does not know the
language (C in this instance). In natural language, failure to use
local idioms reveals that you are an outsider. Outsiders are
usually suspect (that is part of our inheritance as social
animals). In order to learn idioms you must converse with native
speakers. In order to learn program idioms you must study the code
of good programmers.</p>
<p>Let me give you a simple example form the above that illustrates
the problem of non-idiomatic code. Spencer declares his function
with:</p>
<pre class="programlisting">
int ini_proc(char ini_name[])
</pre>
<p>Experienced programmers know that this is converted to</p>
<pre class="programlisting">
int ini_proc(char * ini_name)
</pre>
<p>by the compiler. Many programmers sensibly use the array format
to indicate to users that the address of an array is expected.
However the case of arrays of <tt class="type">char</tt> is
different. It would be quite exceptional to pass the address of a
single <tt class="type">char</tt> (so much so that a comment should
be made any place where that is intended). On the other hand string
literals are semantically (i.e. in their behaviour) read only and
so should be <tt class="literal">const</tt> qualified. C originally
chose not to require that qualification. C++ (and, in future, C)
specifies that the type of a string literal is array of <tt class=
"type">const char</tt> which can decay to <tt class="type">char
*</tt> (unqualified). That conversion is deprecated. Now
consider:</p>
<pre class="programlisting">
int ini_proc(char const ini_name[])
</pre>
<p>Does that convert to:</p>
<pre class="programlisting">
int ini_proc(char const * ini_name)
</pre>
<p>or</p>
<pre class="programlisting">
int ini_proc(char * const ini_name)
</pre>
<p>Even if you know which, will those that read your code? By using
the array form you make those reading your code think in ways that
they would prefer not to. Just write what you mean in the idiomatic
form that others will use. In this case that is:</p>
<pre class="programlisting">
int ini_proc(char  const  *ini_name)
</pre>
<p>Some readers may think that this is being pedantic. After all
there is nothing wrong with what was originally written. That is
true. But we also need to be understood. So, over to the readers
for their comments.</p>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
