    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Anonymous Code Exercise Rewrite</title>
        <link>https://members.accu.org/index.php/journals/922</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 11, #6 - Oct 1999 + Programming Topics</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c129/">116</a>
                    (22)
<br />

                                            <a href="https://members.accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c65/">Programming</a>
                    (877)
<br />

                                            <a href="https://members.accu.org/index.php/journals/c129-65/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/journals/c129+65/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Anonymous Code Exercise Rewrite</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 06 October 1999 13:15:33 +01:00 or Wed, 06 October 1999 13:15:33 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a></h2>
</div>
<p>(This is based on the code presented on page page 36 of C Vu
11.5.</p>
<p>The program should take a file whose lines are terminated with
carriage returns ('\r') and display/print it to stdout with the
carriage return replaced by a newline ('\n'). These files occur on
Unix platforms and also on Apple MacIntosh.</p>
<p>I have attempted to include as much error checking as possible
while retaining the original program structure. My code is written
in traditional C, not C++.</p>
<p>The principle has been to check return codes when they are
available. In real life I have seen problems where a failure to
check all return codes has caused much greater problems. One
example was a Korn Shell script which performed a backup on a Unix
box. Every return code was checked, except for the one coming back
from the &quot;mount&quot; instruction which mounted the destination file
system. On one occasion the destination file system was not
available and the temporary area was used by default. The temporary
area filled and a number of critical functions failed to run
overnight as they also required temporary file space. The moral of
the story is that return codes are put there for a purpose.</p>
<p>I have identified one case where the program will fail. If the
file name contains spaces, and it is not enclosed in quotation
marks, then the program will take the argument to be the text up to
the first white space character. I have hit this problem in real
life with command line utilities on Windows NT.</p>
<p>In performing the error checking I have tried to cater for
worst-case failures. I have considered that both stdout and stderr
may have been redirected and so might have problems with output. In
the event of a failure to write a character to stdout we try to
write a message to stderr. If we cannot write a message to stderr,
then we are in trouble and we call the abort() function to clean
up. The possibility of more general character translation is
allowed for with the structure as presented. If we had to translate
more than a few characters I would have used a select/case
construct.</p>
<p>I haven't been able to identify any areas where the program is
non-conformant - but then I don't have a copy of the C standard. I
have checked the library functions in the documentation and it
passes through lint with the exception of a very strange comment
about the enum and variable types.</p>
<p>Catriona</p>
<pre class="programlisting">
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

int main(int argc, char **argv) {
        /*  Re-write of Exercise on Page 36 of CVu 11.5 (July 1999).
          * Program will read a file and put the output to stdout replacing  line returns with newlines.
          * Catriona Siobhan O'Connell 10 July 1999   */
        int ch;
        int newch;
        FILE *infile;
        
        enum ERROR_CODE {ERROR_SUCCESS,
                                        ERROR_MISSING_ARGUMENT,
                                        ERROR_OPEN_INPUT,
                                        ERROR_OUTPUT,
                                        ERROR_CLOSE_INPUT};
</pre>
<p class="c2"><span class="remark">I have three comments to make on
the above. (I have adopted italic arial narrow to identify me. The
Harpist) First, I generally think it is a poor idea to declare an
enum at block scope. I guess that is only a personal preference,
but I think it indicates a style of programming where too much is
placed in a single function.</span></p>
<p class="c2"><span class="remark">Second, I think there is very
little purpose served by naming this kind of enum. You do not
intend to have variables of this type, indeed it is probably a
mistake to have more than a single variable of this kind. Consider
the effect of having a global enum and using that to provide error
returns to various functions that can fail fatally.</span></p>
<p class="c2"><span class="remark">Thirdly, and by far the most
important, never use identifiers that lack any lower case letters
unless they are pre-processor ones. And never use lower case
letters in pre-processor identifiers. That way you can reduce the
chances of an identifier being replaced by the pre-processor.
Consider having the following defined at file scope:</span></p>
<pre class="programlisting c3">
<span class=
"remark">enum { success, missingargument, inputfailure, outputfailure, fileclosefailure};</span>
</pre>
<p class="c2"><span class="remark">Note that the enumerated values
are more descriptive and I have avoided that very ugly
ERROR_SUCCESS. I am flirting with the idea of not using uppercase
and underscores in my identifiers. If you really like this upper
case style use the traditional C <tt class=
"literal">#define</tt>s.</span></p>
<pre class="programlisting">
 /*  Check if exactly one argument has been passed to the program else raise an error and exit */
         
        if (argc != 2) {
                if (EOF == fprintf(stderr,&quot;No file name given.\n&quot;)) abort();
</pre>
<p class="c2"><span class="remark">Did you check the spec for the
return value of <tt class="function">fprintf</tt>? <tt class=
"constant">EOF</tt> does not come into it. You must
write:</span></p>
<pre class="programlisting c3">
<span class=
"remark">                if (fprintf(stderr,&quot;No file name given.\n&quot;)&lt;0) abort();</span>
</pre>
<p class="c2"><span class="remark">because <tt class=
"function">fprintf</tt> is only required to return an unspecified
negative value if it fails. There are several places where you made
this mistake. If you want to say that the Standard C Library is
insistent, I agree but we have to work with what we
have.</span></p>
<pre class="programlisting">
            if (EOF == fprintf(stderr,&quot;Syntax: %s filename\n&quot;,argv[0])) abort();
                exit(ERROR_MISSING_ARGUMENT);
        }
</pre>
<i><span class="remark">Why did you not detect too many parameters?
That would help with the whitespace problem.</span></i>
<pre class="programlisting">
 /*  Try to open the argument as a file.  If this fails then raise an error and exit.  */
         
        if ((infile = fopen(argv[1], &quot;rb&quot;)) == NULL) {
</pre>
<p class="c2"><span class="remark">I am coming to prefer this line
written this way:</span></p>
<pre class="programlisting c3">
<span class=
"remark">                     if (!(infile = fopen(argv[1], &quot;rb&quot;))) {</span>
</pre>
<p class="c2"><span class="remark">because it clearly say what it
means 'if not infile...'</span></p>
<pre class="programlisting">
          if (EOF == fprintf(stderr,&quot;Cannot open file %s\n&quot;,argv[1])) abort();
                exit(ERROR_OPEN_INPUT);
        }
        
        /*  Read each character from the file.
          * Perform any required character transformations and  then output to stdout.
          * If the output to stdout fails then an error will be raised and the  program will exit.  */
         
        while (EOF != (ch = getc(infile))) {
                if (ch == '\r') newch = '\n';
                else  newch = ch;
</pre>
<p class="c2"><span class="remark">I am not sure why you did not
write:</span></p>
<pre class="programlisting c3">
<span class=
"remark">       while(EOF !=(newch = getc(infile)))     if (newch == '\r') newch = '\n';</span>
</pre>
<p class="c2"><span class="remark">why the apparently purposeless
intermediate variable?</span></p>
<pre class="programlisting">
           if (EOF == putchar(newch)) {
                        if (EOF == fprintf(stderr,&quot;Problem writing to stdout.\n&quot;)) abort();
                        exit(ERROR_OUTPUT);
                }
        }
        
        /*  Close the input file. 
           * If this fails, we exit the program and allow the system to attempt  the clean-up process. */
         

        if (EOF == fclose(infile)) {
                if (EOF == fprintf(stderr,&quot;Problem closing the input file.\n&quot;)) abort();
                exit(ERROR_CLOSE_INPUT);
        }
        
        /* Despite everything which could have gone wrong we have reached the  end of the file successfully.  */
        return(ERROR_SUCCESS);
</pre>
<p class="c2"><span class="remark">Don't you agree that the above
is a particularly ugly line of code? And while we are considering
it, why return this time and exit everywhere else? Let me offer a
little challenge: rewrite this function so that it has a single
exit point (via a final return statement. And using <tt class=
"literal">goto</tt> is not an acceptable solution.</span></p>
<pre class="programlisting">
}
</pre></div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e118" id="d0e118"></a>The Harpist's
Code Review Reviewed by the Tuner</h2>
</div>
<p>Of course most of the article (C Vu Sept 98 (page 18) is
excellent, but I have some niggles. I'll simply take them in order
within the article, so these comments will be rather
disjointed.</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;Every time you declare some identifier at global scope you
introduce one more place where the linker can become confused&quot;</p>
</blockquote>
</div>
<p>I feel this anthropomorphic comment is unhelpful, especially for
an inexperienced reader. Linkers don't become confused, any more
than kettles or televisions become confused. The linker does what
it's told to do. If the programmer breaks type safety then if
anyone is confused, it's not the linker, it's the programmer.</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;Why has it [the look-up table 'crc_table'] been laboriously
initialised in the file? Suppose that some other table of values
was wanted, this mechanism requires recompilation.&quot;</p>
</blockquote>
</div>
<p>A reasonable question to ask, but the Harpist seems to conclude
that there is only one answer. Taking this point of view to the
limit, no data within a program should be initialised, everything
should be read from data files. I hope that is obviously
ridiculous.</p>
<p>Whether to initialise within a program, or use external data, is
a design decision. I assume the given code is calculating a &quot;well
known&quot; CRC (and of course comments referring to such an algorithm
description would have improved the code). Putting the data in an
external file might make the code more flexible, but at the risk of
the data file being lost during distribution of the program binary,
or 'well-intentioned' or accidental modification of that data by
some user. Moreover, by introducing generality in this way, the
reader must consider not only the code, but also look elsewhere for
the look-up table - so in a sense the data has become &quot;more global&quot;
(a downside that the Harpist is rightly conscious of in other
contexts).</p>
<p>And what is &quot;laborious&quot; about its initialisation?</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;I do not understand why the function returns the
complement.&quot;</p>
</blockquote>
</div>
<p>This is fairly common in checksum and CRC algorithms. I haven't
taken time to look at the detail of this one, but consider a
trivial example. Form a checksum by adding values together. A
sequence of all zeros will have a zero checksum. Imagine this is
being used on some electronic board, a wire comes loose, the data
is corrupted to all zeros. The problem is not detected because the
checksum is also corrupted to zero.</p>
<p>Inverting the checksum is a way of solving this, but checksums
are not as good as CRCs in detecting some other errors (e.g. data
reversals). A well-designed CRC algorithm takes into account many
different failure modes. The comment &quot;postconditioning&quot; might well
be a sufficient comment if only the author had made reference at
the start of the function to the source of the algorithm.</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;.. buried in the implementation is the implicit requirement
that the look-up table contains 256 elements.&quot;</p>
</blockquote>
</div>
<pre class="programlisting">
[Harpist's new code] static unsigned long crc32_table[256]; /* this algorithm requires a look-up table of 256 elements */ 
</pre>
<p>Oh, come on! This is a little too close to the commenting style
that gives us</p>
<pre class="programlisting">
        count = 0;  /* Set count to zero */
</pre>
<p>If you're calculating a CRC for an array of bytes, then unless
you're doing something pretty strange (something that would indeed
need a comment), you're stepping through the array a byte at a time
and using each byte to index into the look-up table. A byte
(unsigned) ranges from 0 to 255, hence 256 values in the table.</p>
<p>If any of that comes as a bolt from the blue, then such a reader
has no business altering code that calculates CRCs. The Harpist
himself says &quot;Comments are supposed to help you understand how the
code solves the problem and to highlight dangerous or arcane lines
of source code&quot;. Adding trivial comments does not add value, it
adds clutter.</p>
<pre class="programlisting">
[Harpist's new code] if ((chr_ptr = strrchr(ini_name, '\\'))) chr_ptr++; else chr_ptr = ini_name; 
</pre>
<p>Bearing in mind that the article is tutorial in nature, I would
prefer to see something like:</p>
<pre class="programlisting">
chr_ptr = strrchr(ini_name, '\\');
if (chr_ptr != NULL) chr_ptr++;
</pre>
<p>Confusing '<tt class="literal">==</tt>' with '<tt class=
"literal">=</tt>' is too easy to do, though modern compilers issue
warnings. Such a style in a tutorial article may encourage people
to turn off those useful warnings. When you take a compiler out of
its box, you should flip all its warning switches to on, I
(strongly) believe. And if you ever need to turn any off, you
should have a very good reason.</p>
<pre class="programlisting">
[existing code] if (chr_ptr = strtok(inifilename, &quot;.&quot;)) {   chr_ptr = chr_ptr + (strlen(chr_ptr) + 1); 
</pre>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;.. but the next two lines baffle me .. <tt class="literal">set
chr_ptr</tt> .. and then make no use of it because the next time
<tt class="varname">chr_ptr</tt> occurs it is on the left of an
assignment&quot;</p>
</blockquote>
</div>
<p>Huh? And how about the occurrences on the RIGHT of the same
assignment? Of course it's made use of. &quot;<tt class=
"literal">strncpy(inifilename, chr_ptr, 9);</tt>&quot;</p>
<p>I'm not certain that <tt class="varname">chr_ptr</tt> invariably
points at a string less than 9 <tt class="type">char</tt>. Will the
destination be terminated?</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;Provide a local static array large enough for ..&quot;</p>
</blockquote>
</div>
<p>On page 24 the Harpist suggests using a local static array for
returning an error message. Yet on page 21 he is concerned about
mechanisms that don't work in multi-threaded environments.</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;.. worried by the intent to read an entire file into RAM.&quot;</p>
</blockquote>
</div>
<p>If the .ini file is generated by the same (or a companion)
program, perhaps the designer has ensured that such files are less
than 64K. Sometimes it's very convenient to hold a small file as an
array in memory. But, as the Harpist says, a file size check is
desirable.</p>
<p>Even if the files are externally generated, assuming they're
small might be a practical restriction. &quot;Dir \windows\*.ini /os&quot; on
my system shows the largest file is about 10K. Microsoft provides
the Notepad text editor, which can't handle files over 64K. If
Microsoft does it, it must be right. Ooops, I think I can feel a
bullet hole in my foot!</p>
<p>Still, before anyone runs away with the idea that I thought
badly of the article, quite the contrary. It's not much to take
issue with, in an article of 8 densely packed pages of text.</p>
<p>As for Spencer, I hope he's encouraged to persevere, it's clear
he's on an upward spiral. Many of us have seen far worse code than
his, in production code that has been paid for. And, most serious
of all, those concerned are often uninterested in improving.</p>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
