    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: A Critique of Some Code Revisited</title>
        <link>https://members.accu.org/index.php/journals/744</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 11, #1 - Nov 1998</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c134/">111</a>
                    (19)
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;A Critique of Some Code Revisited</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 03 November 1998 13:15:28 +00:00 or Tue, 03 November 1998 13:15:28 +00:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a></h2>
</div>
<p>Dear Francis &amp; Spencer,</p>
<p>Please feel free to forward this e-mail to the Harpist &amp; do
anything you like with the sources provided - including throwing it
away, if you wish!</p>
<pre class="programlisting">
extern char *strtok(char * s1, const char * s2);
        strcpy(inifilename, chr_ptr);
        if (chr_ptr = strtok(inifilename, &quot;.&quot;)) {
          chr_ptr = chr_ptr + (strlen(chr_ptr)+1);
        }
        strcat(inifilename, &quot;.ini&quot;);
</pre>
<p>The Harpist is incorrect to say that <tt class=
"function">strtok</tt> returns the address of the character after
the dot if there is one else it returns <tt class=
"literal">NULL</tt>. The only circumstance (see
addendum/corrigendum at end) in which <tt class=
"function">strtok</tt> can return <tt class="literal">NULL</tt>
when <tt class="varname">s1</tt> is non-<tt class=
"literal">NULL</tt>, is if <tt class="varname">s1</tt> contains
only characters from <tt class="varname">s2</tt> (which includes
the case where <tt class="varname">s1</tt> is a zero-length
string). Therefore, inifilename would have to be a string
consisting of just 0 or more &quot;.&quot; characters for a <tt class=
"literal">NULL</tt> return.</p>
<p><tt class="function">strtok</tt> is a complex function that has
potentially undesirable output unless you know precisely what it
does, including behaviour regarding such things as consecutive
separator characters in the source string and which characters it
overwrites in the source string. <tt class="function">strtok</tt>
also has the disadvantage of requiring static storage to remember
pointers for future calls which can cause threading problems, so my
personal preference is to avoid <tt class="function">strtok</tt>
and use the other related library calls directly: <tt class=
"function">strspn</tt>, <tt class="function">strcspn</tt> and
<tt class="function">strpbrk</tt> - which many people I've spoken
to don't appear to have heard of.</p>
<p>I've thrown away the entire function as it stands and replaced
it with the following collection of functions that should do what
the programmer originally wanted except for the method of storing
the contents of the file in memory. I don't like <tt class=
"function">strupr</tt> as it infringes on the C library's
namespace, so I've changed the name used. This set of functions
should be more portable, with the filename-style specific code
containing in just the single set of declarations at the top of
<tt class="function">validate_ini_filename()</tt>. As the Harpist
found too, the format of the file is in indeterminate - some of the
comment in the original source seems to apply to the file and some
to the memory buffer, and then there's the magic <tt class=
"literal">lAsTeNtRy</tt> which is completely undocumented.</p>
<p>I realise that Spencer said that his manager would rewrite
everything totally and not give understandable explanations, but
what I hope that my version of <tt class=
"function">validate_ini_filename()</tt> demonstrates is that code
can be written in such a way that it follows a natural language
version of the algorithm directly. For example, the statements:</p>
<pre class="programlisting">
  spn = strcspn(ptr, extn_separator_str);
  if (spn &gt; max_baselen) return name_too_long;
</pre>
<p>would match with &quot;if the base length is longer than maximum
allowed, then it should be rejected with a 'too long' error.&quot;.
Hopefully, it should be obvious that the code matches that
description rather better than:</p>
<pre class="programlisting">
  strtok(ptr, &quot;.&quot;);
  if (strlen(ptr) &gt; 8) return name_too_long;
</pre>
<p>(although this does rely on the person reading the code knowing
what the <tt class="function">strcspn</tt> function does!)</p>
<p>The removal of magic numbers and of complex pointer arithmetic
should make it easier to read through the functions and understand
what each one is doing, even in the absence of many of the
comments.</p>
<p>I've added a short <tt class="function">main()</tt> function to
drive the rest of the code and dump out the table at the end of the
run.</p>
<pre class="programlisting">
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;stddef.h&gt;
#include &lt;ctype.h&gt;

typedef struct lookup_table_entry {
  struct lookup_table_entry *next;
  const char *value;
  char key[1];
} lookup_table_entry;

typedef struct lookup_table {
  lookup_table_entry *head;
} lookup_table;

enum errors {
  success,
  name_too_long,
  name_missing,
  file_missing,
  insufficient_RAM,
  line_too_long
};

lookup_table *new_lookup_table(void) {
  lookup_table *t = malloc(sizeof(lookup_table));
  if (t != NULL) t-&gt;head = NULL;
  return t;
}

enum errors add_lookup_pair(lookup_table *t, 
                  char *key, char *value) {
  const size_t key_len = strlen(key) + 1;
  const size_t value_len = strlen(value) + 1;
  const size_t te_len = 
            offsetof(lookup_table_entry, key);
  lookup_table_entry *te = 
        malloc(te_len + key_len + value_len);

  if (te == NULL) return insufficient_RAM;
  te-&gt;next = NULL;
  te-&gt;value = te-&gt;key + key_len;
  memcpy(te-&gt;key, key, key_len);
  memcpy(te-&gt;key + key_len, value, value_len);
  if (t-&gt;head == NULL) t-&gt;head = te;
  else {
    lookup_table_entry *te_srch = t-&gt;head;
    while (te_srch-&gt;next != NULL)
                   te_srch = te_srch-&gt;next;
    te_srch-&gt;next = te;
  }
  return success;
}

static void strupr(char *ptr) {
  for (;*ptr; ++ptr) if (islower(*ptr)) *ptr = toupper(*ptr);
}

/* Caller must always free the value placed into 
     *filename.  This code  guarantees that the value
      will always be valid to pass to free()
 */
enum errors validate_ini_filename
                     (const char *arg, char **filename) {
  /* Filename structure definitions and limits */
  const size_t max_baselen       = 8;
  const size_t max_extnlen       = 3;
  const char   dir_sep           = '\\';
  const char   extn_sep          = '.';
  static const char extn_separator_str = &quot;.&quot;;
  static const char dflt_ext[]   = &quot;.ini&quot;;

  size_t spn;
  char *buffer, *ptr;

  /* Allocate enough memory, set return value to 
        point to result  or NULL so that *filename is 
       always valid for safety.  Ensure  that there is 
       room for the extension we may want to add.
   */
  *filename = buffer = 
        malloc(strlen(arg) + sizeof(dflt_ext));
  if (buffer == NULL) return insufficient_RAM;
  strcpy(buffer, arg);

  /* Locate leafname */
  ptr = strrchr(buffer, dir_sep);
  if (ptr != NULL) ++ptr; else ptr = buffer;

  spn = strcspn(ptr, extn_separator_str);  
/* find length of base name */
  /* Check part before the extension separator - is it too long? */
  if (spn &gt; max_baselen) return name_too_long;
  /* Was it zero length? eg.  &quot;.ini&quot; */
  if (spn == 0) return name_missing;

  if (ptr[spn] != extn_sep) {
    /* There wasn't an extension, so add one */
    strcpy(ptr + spn, dflt_ext);
  }
  else if (strlen(ptr + spn + 1) &gt; max_extnlen) {
    /* There was an extension already but was too long */
    return name_too_long;
  }
  return success;
}

enum errors read_ini_file(FILE *f,
               lookup_table *table){
  char buf[80];
  enum errors result = success;

  while (result == success 
    &amp;&amp; fgets(buf, sizeof(buf), f)) {
    char *statement, *value;
    /* Test for truncated line */
    char *ptr = strpbrk(buf, &quot;\n\r&quot;);
    if (ptr == NULL &amp;&amp; !feof(f)) {
      result = line_too_long;
      continue;
    }
    /* Strip unnecessary stuff from end of value.
          Search for  any line or value terminators. */
    ptr = strpbrk(buf, &quot;\n\r#;&quot;);
    if (ptr != NULL) *ptr = '\0';
    /* delimit the statement */
    statement = strtok(buf, &quot;=, &quot;);
    if (statement == NULL 
        || !isalpha(*statement)) continue;
    strupr(statement); 
              /* XXX: non ANSI str[a-z].*  */
    /* Find value - if present */
    value = strtok(NULL, &quot;&quot;);
    if (value != NULL) {
      char *ptr;
      /* Skip leading whitespace */
      value += strspn(value, &quot;=, &quot;);
      /* Chop off trailing whitespace */
      ptr=value;
      while (*ptr  &amp;&amp; !(isspace(*ptr)
             ||iscntrl(*ptr)))  ++ptr;
      *ptr = '\0';
      result = add_lookup_pair(table, statement, value);
    }
  }
  return result;
}


int ini_proc(const char *arg,
                 lookup_table *table) {
  enum errors result = name_missing;

  if (arg != NULL) {
    char *filename;
    result = validate_ini_filename(arg, &amp;filename);
    if (result == success) {
      FILE *inifile = fopen(filename, &quot;r&quot;);
      if (inifile != NULL) {
        result = read_ini_file(inifile, table);
        fclose(inifile);
      }
      else {
        result = file_missing;
      }
    }
    free(filename);
  }

  return result;
}

int main(int argc, char *argv[]) {
  int result = insufficient_RAM;
  lookup_table *table = new_lookup_table();
  if (table != NULL) {
    lookup_table_entry *lte;
    result = ini_proc(argv[1], table);
    for (lte=table-&gt;head; lte; lte=lte-&gt;next) {
      printf(&quot;%s --&gt; %s\n&quot;
                  , lte-&gt;key, lte-&gt;value);
    }
  }
  return result;
}
</pre></div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e115" id="d0e115"></a>Addendum
&amp; Corrigendum</h2>
</div>
<p class="c2"><span class="remark">At first glance you are both
wrong &#9786; The Harpist's assertion is incorrect as is your
assertion that strtok can only return a null pointer if s1 is
non-null. By my reading it can also do so if s2 points to the end
of the string (I think that means that it points one beyond the
end)</span></p>
<p>Did I really say that? &#9786; What I meant was:</p>
<p><tt class="function">strtok(s1,s2)</tt> can return <tt class=
"literal">NULL</tt> under two circumstances:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p><tt class="literal">s1 != NULL</tt> and <tt class=
"varname">s1</tt> contains only 0 or more characters from
<tt class="varname">s2</tt>.</p>
</li>
<li>
<p><tt class="literal">s1 == NULL</tt> and the remainder of the
tokenised string is only 0 or more <tt class="type">char</tt>s from
<tt class="varname">s2</tt>.</p>
</li>
</ol>
</div>
<p class="c2"><span class="remark">If I understand it, the first
call (with a non-null pointer) sets it up and subsequent calls
return tokens until there are no more at which stage it returns a
null pointer.</span></p>
<p>The initial call also returns the first token, if there are any.
I think this just underlines that complexity of <tt class=
"function">strtok</tt>! This, together with the static storage and
the obliteration of the source string, is the reason why I prefer
to use <tt class="function">strspn</tt>, <tt class=
"function">strcspn</tt> &amp; <tt class="function">strpbrk</tt> and
do any overwriting myself. After all, all the implementation of
strtok on my machine does is call the first two with bits of glue
to remember points and write out the zero bytes.</p>
<p>The other issue which I found was not addressed to my
satisfaction by many of the books I have read is what happens when
you change the separator characters. For example, the sequence:</p>
<pre class="programlisting">
strtok(&quot;f1:f2::,f3,&quot;, &quot;:&quot;);
strtok(NULL, &quot;:&quot;);
strtok(NULL, &quot;,&quot;);
strtok(NULL, &quot;,&quot;);
strtok(NULL, &quot;,&quot;);
</pre>
<p>returns what sequence of results? Certainly at first, I was
unsure what I would get - but it should return &quot;f1&quot;, &quot;f2&quot;, &quot;:&quot;,
&quot;f3&quot;, NULL.</p>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
