    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Student Code Critique Competition</title>
        <link>https://members.accu.org/index.php/articles/1175</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Student Code Critiques from CVu journal. + CVu Journal Vol 14, #3 - Jun 2002</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c184/">Journal Columns</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c183/">Code Critique</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c114/">143</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c183-114/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c183+114/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Student Code Critique Competition</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 05 June 2002 13:15:52 +01:00 or Wed, 05 June 2002 13:15:52 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e13" id="d0e13"></a></h2>
</div>
<p><span class="bold"><b>Prizes provided by Blackwells Bookshops
&amp; Addison-Wesley</b></span></p>
<p class="c2"><span class="remark">Please note that participation
in this competition is open to all members. The title reflects the
fact that the code used is normally provided by a student as part
of their course work.</span></p>
<p class="c2"><span class="remark">Note that this item is part of
the Dialogue section of C Vu which is intended to designate it as
an item where reader interaction is particularly important.
Readers' comments and criticisms of published entries are always
welcome.</span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e24" id="d0e24"></a>Student Code
Critique 15: The Question</h2>
</div>
<p>There is a problem with this code to reverse a given string in
place with the use of pointers. It gives a segmentation fault
whenever it tries to swap two characters in the string. The fault
comes in the line indicated in the following code:</p>
<pre class="programlisting">
#include &lt;string.h&gt;
void swap(char *a, char *b){
  char tmp;
  tmp = *a;
  *a = *b;  
//this is where the seg. fault comes
  *b = tmp;
}
void reverse(char *str){
  char *a, *b;
  for(int i=0; i&lt;strlen(str)/2; i++){
    a = &amp;(str[i]);
    b = &amp;(str[strlen(str)-i-1]);
    printf(&quot;swapping %c and %c...\n&quot;,*a, *b);
    swap(a, b);
  }
  printf(&quot;the reversed string is: %s\n&quot;, str);
}
main() {
  reverse(&quot;cheese&quot;);
}
</pre></div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e31" id="d0e31"></a>The
Entries</h2>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e34" id="d0e34"></a>Solution from
Michael Ellis <tt class="email">&lt;<a href=
"mailto:mike@miju.demon.co.uk">mike@miju.demon.co.uk</a>&gt;</tt></h3>
</div>
<p>Having read them for a couple of years, I thought it was about
time I had a go at one of these code critiques, so here goes...</p>
<p>The first task is obviously to find out why this code seg
faults. A segmentation fault occurs when the processor is asked to
read or write to memory that doesn't exist or write to memory
marked as read-only. Strictly speaking, the first case is actually
a bus-error, but many systems merge these two classes of faults
into one.</p>
<p>In this code the reason is pretty obvious in that the test
harness is calling <tt class="function">reverse()</tt> with a
constant string: constants are often held in read-only memory, and
reversing a section of read-only memory is probably going to be
quite hard. Some compilers automagically copy constant strings into
writable memory when passed to functions expecting non-constant
arguments (such as <tt class="function">reverse()</tt>) however
this is very much a compiler dependant feature (<i><span class=
"remark">Not one I have ever come across, and it would worry me if
I did. Francis</span></i>). I believe that most compilers will have
a switch to enable a warning when such code is compiled, but many
will ship with it turned off.</p>
<p>The first task, therefore, is to re-write the test harness to
operate with a writable chunk of memory. It is also worth
considering the possible ways in which <tt class=
"function">reverse()</tt> might fail: what happens if it's passed a
NULL pointer, an empty string, a string with an odd number of
characters and so on. Testing for the correct handling of a NULL
pointer is tricky since <tt class="function">reverse()</tt> doesn't
return any value, so the best we can do is call <tt class=
"literal">reverse(NULL)</tt> and see whether it seg.faults.</p>
<p>Whenever I write functions like <tt class=
"function">reverse()</tt>, I always include a complete test harness
inside &quot;<tt class="literal">#ifdef TEST/#endif</tt>&quot;. Needless to
say, functions like <tt class="function">reverse()</tt> exist in
their own compilation unit (roughly a .c file). For reverse(), my
test harness would look a little bit like this (modified to reduce
the number of lines):</p>
<pre class="programlisting">
#ifdef TEST
#include &lt;stdio.h&gt;
int main(void) {
  bool failed = false;
  struct{
    char original[10];
    char const * result;
  } tests[] = {
      { &quot;&quot;,        &quot;&quot;},
      { &quot;a&quot;,       &quot;a&quot;},
      { &quot;ab&quot;,      &quot;ba&quot;},
      { &quot;abc&quot; ,    &quot;cba&quot;},
      { &quot;abcd&quot;,    &quot;dcba&quot;}
    };
  for(int test = 0; test &lt; 
    sizeof(tests)/sizeof(tests[0]); ++test) {
    printf(&quot;Reversing \&quot;%s\&quot;\n&quot;,
         tests[test].original);
    printf(&quot;expecting \&quot;%s\&quot;\n&quot;,
         tests[test].result);
    reverse(tests[test].original);
    printf(&quot;      got \&quot;%s\&quot;\t&quot;,
         tests[test].original);
    if(strcmp(tests[test].original,
         tests[test].result) == 0) {
      puts(&quot; Pass.\n&quot;);
    } else {
      puts(&quot; Failed!\n&quot;);
      failed = true;
    }
  }
  if(failed) puts(&quot;\n\nSome tests failed.\n&quot;);
  else   puts(&quot;\nAll tests passed.\n&quot;);
// Test for handling of NULL
  puts(&quot;testing NULL pointer handling.\n&quot;);
  reverse(NULL);
  puts(&quot;Well,  it didn't seg-fault!.\n&quot;);
  return 0;
}
#endif
</pre>
<p>Note that I can add new tests very easily just by extending the
test array - a little trickery using <tt class=
"function">sizeof()</tt> means I don't even have to update a
&quot;number of tests&quot; variable. There is a slight issue with the fixed
size of ten characters for the original string that I could get
around, but it doesn't seem worth it for a test harness.</p>
<p>Notice also that every variable is initialised as soon as it is
declared - this is a nice feature of the new version of C which I
have used in the for-loop. Sadly I can't use it in reality as I
don't have a C99-compliant compiler, however as the original code
uses exactly this technique inside <tt class=
"function">reverse()</tt> I guess the student must be richer than
my employer! (<i><span class="remark">or uses a C++ compiler.
Francis</span></i>)</p>
<p>While writing the new test harness I started to think about the
function signature for <tt class="function">reverse()</tt>:</p>
<pre class="programlisting">
void reverse(char * str)
</pre>
<p>I don't like this for two reasons - first, &quot;reverse&quot; sounds like
a more general-purpose array reversal function than a specific
string-reversal function. Second, the formal parameter name of
&quot;str&quot; is reserved by the C-standard, as is every other name
beginning str. Next, the formal parameter is a non-constant
pointer, but as a general rule I like to make pointer parameters
constant so that I have some confidence in their value. If I need
to change the pointer, I can always copy it to a local variable
inside my function, and being a constant pointer doesn't mean I
can't write to whatever it actually points to. Finally, <tt class=
"literal">void</tt> functions miss out on the important trick shown
by many of the standard library functions of returning their
calling parameter (e.g. <tt class="function">strcat()</tt>). This
makes the function much easier to use in some cases, for example
inside the parameter list of a call to another function like
<tt class="function">sprintf()</tt>. I'd therefore change the
function signature to be more like:</p>
<pre class="programlisting">
char * my_strrev(char * const ptr)
</pre>
<p>Moving on to the actual code of <tt class=
"function">reverse()</tt>, it looks like the code is going to be
very inefficient: just a cursory look showed that <tt class=
"function">strlen()</tt> would be called at least once for every
character in the original string, yet reversing a string doesn't
alter its length, so why not call <tt class=
"function">strlen()</tt> once and hold the result in a (constant)
variable?</p>
<p>Looking further, there are a lot of array subscripting
operations, and my experience with embedded systems has taught me
the hard way that array subscripting is a very slow operation best
reserved for when you need to access elements of an array in a
random order. As <tt class="function">reverse()</tt> only needs to
access characters sequentially we would be far better off to use
pointers directly, and given the original specification (&quot;reverse a
string in place with the use of pointers&quot;) which sounds
suspiciously like a homework assignment, <tt class=
"function">reverse()</tt> probably isn't what the lecturer wanted
as pointers are only used to pass two characters into <tt class=
"function">swap()</tt>.</p>
<p>Obviously we need to access the string from both ends, so two
pointers moving in opposite directions along the string are the
easiest way to proceed. Initialising the first pointer is easy -
it's the formal parameter to the function, but the second pointer
will need to be &quot;walked&quot; along the string to find the end. As each
character is copied, the pointers are moved closer together by one
character each, and the whole string has been reversed when the
second pointer reaches or passes the first.</p>
<p>The next thing I spotted was that a helper function called
<tt class="function">swap()</tt> is used to actually exchange two
characters in the string. In C99 this could be marked as an inline
function, or in C89 it could be implemented as a macro. I have to
be honest and say that I probably wouldn't do either in this case:
I'd probably but the exchanging directly into <tt class=
"function">my_strrev()</tt>.</p>
<p>The final thing to hit me was that <tt class=
"function">reverse()</tt> doesn't just reverse the string - it also
prints it out. I'm going to assume this is debugging added to find
out why the original was seg.faulting, however I would have
preferred to see this sort of code enclosed in some way to make it
obvious, for example a call to <tt class=
"function">debug_printf()</tt> or wrapped in &quot;<tt class=
"literal">#ifdef DEBUG/#endif</tt>&quot;.</p>
<p>So what does <tt class="function">my_strrev()</tt> look like
after all these changes?</p>
<pre class="programlisting">
char * my_strrev(char * const ptr) {
  char * beg = ptr;
  char * end = ptr;
// Handle a NULL pointer gracefully...
  if(ptr == NULL) return NULL;
// Walk the end pointer up to the nul character
  while(*end != '\0') ++end;
// end now points to the nul - step back by one
  --end;
// Now reverse the string until the pointers 
// cross over
  while(beg &lt; end) {
// Swap two characters and move the pointers
    char tmp = *beg;
    *beg++ = *end;
    *end-- = tmp;
  }
  return ptr;
}
</pre>
<p>You can see that I've actually walked the end pointer one
character too far and have had to back it up again. There are a
whole host of other ways I could have achieved the same result by
adding +1 and -1 all over the code, however I feel this is the most
obvious. I've also used post-increment on both of the pointers:
normally I'd actually write this as two separate stages (swapping
the characters and then moving the pointers) however I know that
the number of lines of code needs to be kept down for printing in
C-Vu and this code is still pretty readable, especially as the
comment highlights the pointers moving.</p>
<p>Fixing the original seg.fault was quite easy. [<i><span class=
"remark">well implicitly you did, but being picky, you did not do
so explicitly. Francis</span></i>] Doing a code-review of the rest
of the functions shows that a lot can be changed, but how much
better is the new code? That depends on how you define &quot;better&quot;.
One metric might be how fast the code is, and fortunately I have
access to some really powerful code profiling tools. I ran the same
tests on both <tt class="function">my_strrev()</tt> and <tt class=
"function">reverse()</tt>. <tt class="function">reverse()</tt> took
an average of 1.64us, of which only 0.34us was in <tt class=
"function">reverse()</tt> itself, the rest being in <tt class=
"function">swap()</tt> and <tt class="function">strlen()</tt>.
<tt class="function">my_strrev()</tt> took an average of 0.43us
with no function calls. So was it worth it? Possibly. If <tt class=
"function">my_strrev()</tt> is only used once or twice inside a
huge application, a saving of 1.21us simply isn't worth the effort.
However, if <tt class="function">my_strrev()</tt> is used in the
middle of a tight loop called a hundred thousand times per second
in a real-time critical system, saving 121us might make the
different between your code being fast enough and crashing. As I
spend most of my time in the latter environment, it's probably
worth it for me. As the original code was probably a homework
assignment, the time saving really isn't worth it. Perhaps this
goes to prove the old adage regarding optimisation:</p>
<p>Rule 1 (for all programmers): Don't do it.</p>
<p>Rule 2 (for experts only): Don't do it yet.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e199" id="d0e199"></a>Solution from
Robert Lytton</h3>
</div>
<p>There are two issues that need to be addressed in this code;
what is 'C' and the use of constant data.</p>
<div class="sect3" lang="en">
<div class="titlepage">
<h3><a name="d0e204" id="d0e204"></a>What is 'C'</h4>
</div>
<p>This my second attempt at this paragraph. I have been having
nightmares about Francis' corrections and must try to minimise them
before anything gets published! Many compilers today are willing to
accept a wide range of code 'standards' in the source files. Some
compilers will pass back warnings that the code is not conforming
to a certain standard, some accept the source unquestioning, whilst
others will refuse to compile it. The two main standards to be
aware of are the original 'ANSI C' (ISO/IEC 9899:1990) (now
referenced as C89 for clarity) and the new 'C99' (ISO/IEC
9899:1999). So what standard should I adopt? I guess in time I will
be using C99 but at present I need the security of being able to
move code between different compilers. There are some target
processors that may take years to be fitted with a C99 compiler,
the likes of lcc are important tools and I know nothing of its
future conformity. There are two choices; the chosen compiler will
decide the standard or write code that conforms to both the old C
Standard and the new 'C99' one. For me the latter is the preferable
solution and so the // commenting style is out along with a few
other things. Turning to the code, it conforms to neither standard
at present, the following steps must be taken:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>The function <tt class="function">main()</tt> always returns an
<tt class="type">int</tt>. In C89 this is the implicit type but C99
requires us to state it. C99 will add an implicit <tt class=
"literal">return 0</tt> for us but a return statement is needed by
C89. The function arguments, it seems, are optional in both
standards&hellip; (further investigation needed).</p>
<pre class="programlisting">
  int main() {
    return 0;
  }
</pre></li>
<li>
<p>In ANSI, the for-loop iterator <tt class="varname">i</tt> must
be declared at the start of a block (after curly brackets),</p>
</li>
<li>
<p>The for-loop iterator <tt class="varname">i</tt> should be of
type <tt class="type">size_t</tt> so that it may be compare with
the return value of <tt class="function">strlen()</tt>,</p>
<pre class="programlisting">
  {
    size_t i;
    for(i=0; i&lt;strlen(str)/2; i++)
    {
</pre></li>
<li>
<p>To use printf(), the stdio.h header needs to be included,</p>
<pre class="programlisting">
  #include &lt;stdio.h&gt;
</pre></li>
</ol>
</div>
</div>
<div class="sect3" lang="en">
<div class="titlepage">
<h3><a name="d0e249" id="d0e249"></a>The use of
constant data</h4>
</div>
<p>When a compiler creates a binary it will separate the address
range into areas for code, constant data and writable data (this is
simplified). All these areas may reside in RAM but on some systems
constant data and code may reside in ROM or be protected by an MMU
from being written to. In the example code the string literal
<tt class="literal">&quot;cheese&quot;</tt> will be placed in the constant
data area. The compiler is quite entitled to overlay the memory
used to store the string literal for other string literals, such as
<tt class="literal">&quot;cheese&quot;</tt> and <tt class=
"literal">&quot;bigcheese&quot;</tt>, declared else where in the code. The
compiler assumes that this data is un-modifiable and is therefore
making the best use of the memory.</p>
<p>When the code tries to reverse the string it uses the original
constant data as its work-space. This will either fail due to the
data being in ROM, cause the MMU to issue a segmentation fault or
succeed, breaking the expectation of the code and compiler! The
last scenario is very dangerous, the next time we come to use the
expected constant value <tt class="literal">&quot;cheese&quot;</tt> we will
in fact be using <tt class="literal">&quot;eseehc&quot;</tt>.</p>
<p>The problem may be simply solved by 'moving' the string from
constant storage into writable storage, viz:</p>
<pre class="programlisting">
  char word[] =&quot;cheese&quot;;
  reverse(word);
</pre>
<p>The string literal <tt class="literal">&quot;cheese&quot;</tt> will still
be compiled into the constant data area but on calling the function
<tt class="function">main()</tt>, the data will be copied onto the
stack and pointed to by the variable <tt class="varname">word</tt>.
<tt class="varname">word</tt>'s type informs the compiler that the
object it is pointing to is non-constant.</p>
<p>It would be nice if the compiler stopped us when we were trying
to write to constant data. The problem is that C allows us to pass
a <tt class="type">const char*</tt> as a parameter to a function
that expects a non-const <tt class="type">char*</tt></p>
<p>[<i><span class="remark">not on any compiler I have used. The
problem is that some Standard C Library functions return a plain
<tt class="type">char*</tt> which is based on a <tt class=
"type">char const *</tt> parameter. Francis</span></i>].</p>
<pre class="programlisting">
  const char word[] =&quot;cheese&quot;;  
  reverse(word); /* word is of type const
                 char* */
</pre>
<p>In C++ this should cause a compile time error, where as in C you
may get a warning. To compile the C++ code, or just remove the
warning, you would have to change <tt class=
"function">reverse()</tt> to accept a parameter of type <tt class=
"type">const char*</tt>. Inside <tt class="function">reverse()</tt>
you would then have to change variable a and b to type <tt class=
"type">const char*</tt> and also the function <tt class=
"function">swap()</tt> to take <tt class="type">const char*</tt>
parameters. The function <tt class="function">swap()</tt> would
then cause a compile time error when you tried to write to the
de-referenced <tt class="type">const char*</tt>. This is exactly
what you want to happen, because it is an error!</p>
<p>[<i><span class="remark">Note that as far as standards are
concerned 'warnings' and 'errors' both comply with any requirement
for a diagnostic. Francis</span></i>]</p>
<p>There is another way around the problem of the C++ compiler
refusing to compile buggy code or just to remove the compiler
warnings in C. The constant data may be cast to be non-constant,
viz:</p>
<pre class="programlisting">
  const char word[] =&quot;cheese&quot;;
  reverse((char*)word);
</pre>
<p>If you need a cast to make something compile without error or
warning... then think again!! [<i><span class="remark">yes, if you
lie to the compiler the resulting mess is entirely your fault.
Francis</span></i>]</p>
<p>In C, string literals are specified to be un-modifiable.
However, string literals do not have the type <tt class=
"type">const char[]</tt> (thereby allowing them to be passed to
library functions that take <tt class="type">char*</tt>). In C++ it
also seems that the constness of string literals is not revealed to
the compiler and hence a C++ compiler will compile <tt class=
"literal">reverse(&quot;cheese&quot;)</tt> without an error and possibly
without a warning. [<i><span class="remark">But if you overload
reverse to provide a version for arrays of <tt class="type">const
char</tt> it will select that one. Francis</span></i>]</p>
<p>The lessons to be learnt are:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>Use string literals with care;</p>
</li>
<li>
<p>Use the const qualifier more rigorously;</p>
</li>
<li>
<p>Heed compiler warnings on constness.</p>
</li>
</ul>
</div>
<p>The corrected code is presented below.</p>
<p>I have taken the opportunity to remove the overhead of several
calls to <tt class="function">strlen()</tt>. I decided not to
inline <tt class="function">swap()</tt> at this stage but have made
it available only to functions in this file by adding the keyword
<tt class="literal">static</tt>. I have taken the precaution of
initialising the pointers <tt class="varname">a</tt> and <tt class=
"varname">b</tt> in <tt class="function">reverse()</tt> to zero. By
always initialising pointers, you are more likely to discover bugs
caused by de-referencing pointers that would otherwise be randomly
initialised. A more effective step would have been to reduce the
scope of <tt class="varname">a</tt> and <tt class="varname">b</tt>
by placing them inside the for-loop and then initialise them with
their final values, viz:</p>
<pre class="programlisting">
for(;;){
  char *a = &amp;(str[i]);
  char *b = &amp;(str[len-i-1]);
</pre>
<p>Finally it is always wise to check if you are re-inventing the
wheel and in this case someone has already gone to the effort of
writing the code for you. The function <tt class=
"function">strrev()</tt>, found in <tt class=
"filename">string.h</tt>, [<i><span class="remark">Only if you are
using an extended Standard C Library as that function is not
supported by the Standard. Francis</span></i>] may be used in place
of <tt class="function">reverse()</tt> and <tt class=
"function">swap()</tt>. However the need to 'move' the string
literal into writable memory still exists. If we were using C++ we
could have used a <tt class="type">string</tt> type, protecting us
from the problems associated with string literals.</p>
<pre class="programlisting">
#include &lt;string.h&gt;
#include &lt;stdio.h&gt;
static void swap(char *a, char *b){
  char tmp;
  tmp = *a;
  *a = *b;
  *b = tmp;
}
void reverse(char *str){
  char *a=0, *b=0;
  size_t i, len=strlen(str);
  for(i=0; i&lt;len/2; i++){
    a = &amp;(str[i]);
    b = &amp;(str[len-i-1]);
    printf(&quot;swapping %c %c...\n&quot;, *a, *b);
    swap(a,b);
    printf(&quot;the reversed string is: %s\n&quot;, str);
  }
}
int main(){
  char word[] = &quot;cheese&quot;;
  reverse(word);
/* reverse(word) may be replaced by strrev(word) */
  return 0;
}
</pre></div>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e430" id="d0e430"></a>Solution from
Victoria Catterson <tt class="email">&lt;<a href=
"mailto:vic@cowlet.org">vic@cowlet.org</a>&gt;</tt></h3>
</div>
<div class="sect3" lang="en">
<div class="titlepage">
<h3><a name="d0e435" id="d0e435"></a>The Problem</h4>
</div>
<p>The fundamental problem with this program is that the parameter
passed to the reverse function is a string literal. Also called a
string constant, a string literal is an array of characters
enclosed in double quotes, in this case &quot;cheese&quot;. Attempting to
change the contents of a string literal results in undefined
behaviour[1].</p>
<p>In this program, the string literal is being modified by the
function swap, which attempts to swap two characters, the addresses
of which are passed as parameters. The first time swap is called,
the addresses of the first and last characters (c and e) of the
string are the parameters. The line:</p>
<pre class="programlisting">
*a = *b;
</pre>
<p>attempts to overwrite the character 'c' with the character 'e',
thus altering the string literal. It is at this point that the
segmentation fault occurs.</p>
<p>There is a simple solution to this problem: do not pass a string
literal to the reverse function. Instead, declare a character array
with &quot;cheese&quot; as its contents, and then pass the address of the
array to the function. The main function then becomes:</p>
<pre class="programlisting">
main () {
  char food[] = &quot;cheese&quot;;
  reverse (food);
}
</pre>
<p>It should be noted that &quot;cheese&quot; is still a string literal, but
it is being used to initialise the array food in this version. It
is therefore the characters of food that are being modified by
swap, and not the string literal itself. This is an important point
to understand, as the following declarations are completely
different.</p>
<pre class="programlisting">
char hunk[] = &quot;cheese&quot;;
char *slice = &quot;cheese&quot;;
</pre>
<p>The first allocates an array of <tt class="type">char</tt>, of
size (<tt class="literal">strlen (&quot;cheese&quot;) + 1</tt>), and copies
the characters of the string literal to the characters of the
array. The second merely sets a pointer to <tt class=
"type">char</tt> to the address of the string literal. Passing hunk
to reverse is fine, because it is the contents of the array that
are modified. Passing slice is illegal, because the code will
attempt to modify the string literal itself.</p>
</div>
<div class="sect3" lang="en">
<div class="titlepage">
<h3><a name="d0e465" id="d0e465"></a>Other
Comments</h4>
</div>
<p>While the string literal was the major problem with the program,
there are a couple of additional points to address.</p>
<p>The <tt class="function">main</tt> function must be defined as
returning <tt class="type">int</tt>. As there is no return type
explicitly stated for this <tt class="function">main</tt>, the
default return type of <tt class="type">int</tt> applies
(<i><span class="remark">But that is no longer true in the latest C
Standard, where support for implicit <tt class="type">int</tt> has
been removed Francis</span></i>). However, no value is returned at
the end of the function. The value returned by <tt class=
"function">main</tt> can be used to determine if any errors
occurred in execution of the program [<i><span class="remark">More
correctly, the value returned by <tt class="function">main</tt> is
passed to <tt class="function">exit()</tt>. C programmes run as if
you had invoked them with <tt class="literal">exit(main())</tt>
Francis</span></i>]. By convention, a zero return value indicates
that the program ran successfully.</p>
<p>Additionally, <tt class="function">main</tt> is always defined
as either <tt class="literal">int main (int argc, char
**argv)</tt>, or <tt class="literal">int main (void)</tt>. In this
case, no arguments are required and so <tt class=
"literal">void</tt> should be used. Omitting the <tt class=
"literal">void</tt> is old-style C, and not ANSI C.</p>
<p>The student has used the C++ or Java style comment, beginning
the line with &quot;//&quot;. C comments are enclosed within slashes and
stars<sup>[<a name="d0e524" href="#ftn.d0e524" id=
"d0e524">1</a>]</sup>, i.e.:</p>
<pre class="programlisting">
/* This is a C comment */ 
</pre>
<p>[<i><span class="remark">but see below. Francis</span></i>]</p>
<p>This, in addition with the previous point, suggests that the
student learned C++ or Java before learning C. If this is the case,
then particular note should be made of these points, as the student
is likely to repeat them from habit otherwise.</p>
<p>In the reverse function, <tt class="function">strlen</tt> is
called more than once for the same string. If the length of the
string will not change, it is better to call <tt class=
"function">strlen</tt> only once, and save the result in a variable
for reuse. Efficiency is not an issue with this program, but it is
a good idea to make simple optimisations where possible, as long as
it is not at the expense of clarity. In this case, <tt class=
"function">strlen</tt> is called twice in every iteration of the
loop, which is inefficient and no more clear than calling it once
before the loop, saving the result to a variable, and using the
variable in place of the <tt class="function">strlen</tt>
calls.</p>
<p>The standard library function <tt class="function">printf</tt>
is called in the reverse function, and so stdio.h must be included,
in addition to string.h.</p>
<p>Some of the identifiers chosen could be better. The variable
name <tt class="varname">str</tt> is fine, but it is worth noting
that all identifiers beginning with &quot;str&quot; followed by a lowercase
letter are reserved[2]. Therefore names like string should be
avoided. In addition, the function named reverse reverses the
string, displaying each swap operation as it is carried out, and
then prints out the reversed string. As the name only describes
half the functionality of the function, either the name is wrong or
the function does too much. In this case, I would suggest moving
the call to <tt class="function">printf</tt> to the <tt class=
"function">main</tt> function, so reverse only reverses the string.
In general, a function name should describe what it does, and the
function should do as little else as possible.</p>
</div>
<div class="sect3" lang="en">
<div class="titlepage">
<h3><a name="d0e573" id="d0e573"></a>Corrected
Version</h4>
</div>
<pre class="programlisting">
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
void swap (char *a, char *b){
  char tmp;
  tmp = *a;
  *a = *b;
/* this is where the seg fault used to come */
  *b = tmp;
}
void reverse (char *s){
  char *a, *b;
  int i;
  int len = strlen(s);
  for(i = 0; i &lt; len/2; i++){
    a = &amp;(s[i]);
    b = &amp;(s[len-i-1]);
    printf(&quot;swapping %c &amp; %c...\n&quot;, *a, *b);
    swap(a, b);
  }
}
int main(void){
  char food[] = &quot;cheese&quot;;
  reverse(food);
  printf(&quot;the reversed string is: %s\n&quot;, food);
  return 0;
}
</pre></div>
<div class="sect3" lang="en">
<div class="titlepage">
<h3><a name="d0e578" id="d0e578"></a>My Solution</h4>
</div>
<pre class="programlisting">
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
void reverse(char *s){
  int i, j;
  char temp;
  int len = strlen (s);
  for (i = 0; i &lt; len/2; ++i) {
    j = len-i-1;
    temp = s[i];
    s[i] = s[j];
    s[j] = temp;
  }
}
int main(void){
  char food[] = &quot;cheese&quot;;
  printf(&quot;The string before reversal is: %s\n&quot;, food);
  reverse(food);
  printf(&quot;The reversed string is: %s\n&quot;, food);
  return 0;
}
</pre></div>
<div class="sect3" lang="en">
<div class="titlepage">
<h3><a name="d0e583" id="d0e583"></a>Notes on My
Solution</h4>
</div>
<p>I did not think there was anything to be gained from having a
separate swap function. The reverse function is relatively small,
and does not lose clarity by containing the code that performs the
swap. However, making it a separate function would slow the program
down, as the overhead of a needless function call would be incurred
every iteration of the loop.</p>
<p>Also, I added a second <tt class="function">printf</tt> call to
the <tt class="function">main</tt> function, to let the user know
what the string is before it is modified. This gives the user
slightly more information about what the program is doing, which
generally makes a program easier to use.</p>
<p>[1] The C Programming Language by Brian Kernighan and Dennis
Ritchie, 2nd Ed. Section 5.5: Character Pointers and Functions,
page 104</p>
<p>[2] ISO/IEC 9899:1999 subclause 7.26.10 and subclause
7.26.11</p>
</div>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e600" id="d0e600"></a>Solution from
Tony Houghton <tt class="email">&lt;<a href=
"mailto:tony@realh.co.uk">tony@realh.co.uk</a>&gt;</tt></h3>
</div>
<p>The problem with this code was quite easy to spot for a seasoned
C programmer; most would get the compiler to spot it for them by
enabling warnings (in gcc's case -Wwrite-strings). The problem is
that the author is trying to apply <tt class=
"function">reverse()</tt> to a string literal, while the
implementation treats literals as <tt class="literal">const</tt>
and has write-protected the memory it's stored in.</p>
<p>The moral is that the first thing to do when learning to use a
new compiler is find out how to enable as many warnings as
possible. I really think compilers should enable all warnings by
default and let advanced programmers disable them if necessary (for
legacy code etc), instead of the other way round.</p>
<p>An easy fix is to use <tt class="function">strdup()</tt> to
create a writable copy of the string. However, <tt class=
"function">strdup()</tt> is not strictly standard, and not always
provided. Luckily it's very easy to implement, so I've written an
implementation called <tt class="function">str_dup()</tt>. Most
names beginning with &quot;str&quot; are reserved by and for the standard,
but that only applies to those where the &quot;str&quot; is followed by
another lower-case letter.</p>
<p>I want to keep <tt class="function">reverse()</tt>'s
implementation of altering the input string, as I'll discuss later,
so I'll put the call to <tt class="function">str_dup()</tt> in
<tt class="function">main()</tt> for now. So the implementation of
<tt class="function">main()</tt> becomes:</p>
<pre class="programlisting">
reverse(str_dup(&quot;cheese&quot;));
</pre>
<p>The code now compiles without the warning about a discarded
qualifier, and more importantly runs and produces the expected
result instead of seg-faulting. However, as we've been sensible and
enabled other warnings as well, we see that <tt class=
"function">printf()</tt> needs declaring, <tt class=
"function">main</tt>'s declaration is wrong, and it needs a return
value. We should also be tidy and free the temporary string we've
created. So we add some includes (stdio.h for <tt class=
"function">printf()</tt>, stdlib.h for <tt class=
"function">free()</tt> and the <tt class="function">malloc()</tt>
in our strdup implementation) to the top of the source file, and
<tt class="function">main()</tt> becomes:</p>
<pre class="programlisting">
int main(void){
  char *str = str_dup(&quot;cheese&quot;);
  reverse(str);
  free(str);
  return 0;
}
</pre>
<p>I recently embarrassed myself on a newsgroup about the
definition of <tt class="function">main</tt>: if you're not using
arguments, I learnt the correct declaration is <tt class=
"literal">int main(void)</tt> rather than <tt class="literal">int
main()</tt>. [<i><span class="remark">But only in C.
Francis</span></i>]</p>
<p>I don't see a need to change <tt class="function">swap()</tt>;
there are a number of changes I'd like to make to <tt class=
"function">reverse()</tt> though. I'll keep the implementation that
it alters the string it's passed, because there'll often be times
when this is the desired behaviour for performance reasons, rather
than creating a new string. But it's common practice to return a
copy of the pointer to the string in such cases, for convenience
and to enable tail-call optimisations. The name <tt class=
"function">reverse()</tt> is fine, because it's obvious by the
argument type that it's a string we're reversing. The name &quot;str&quot;
for the argument is fine, not being followed by a lower-case
letter.</p>
<p>To make <tt class="function">reverse()</tt> more versatile, we
don't want it to print the result, nor the messages for debugging
swap if we want to use it for real. The best thing to do with the
final <tt class="function">printf</tt> is move it into <tt class=
"function">main()</tt>. For now the debugging messages can be sent
to <tt class="literal">stderr</tt> instead of <tt class=
"literal">stdout</tt>; then they can be easily be suppressed or
redirected to a file independently of the output we really
want.</p>
<p>Working down the <tt class="function">reverse()</tt> function,
<tt class="varname">a</tt> and <tt class="varname">b</tt> are
declared earlier than necessary, so we can move that inside the
loop and combine the declarations with assignments. We could do
without the variables altogether and use expressions, but with the
debugging statement it's more convenient to keep them, and even if
we remove the debugging, the use of temporary variables won't cause
any run-time overhead with a modern compiler. I prefer the syntax
of pointer arithmetic to logically dereferencing with [], then
taking the address again, so I've changed that in the assignment
statements too.</p>
<p>I've moved the declaration of i outside the loop, because I've
got into the habit of trying to make my code portable, and ANSI C
is still much more widely supported than C99. It should also be
<tt class="type">size_t</tt> instead of a simple <tt class=
"type">int</tt>. There's a strong case for using <tt class=
"function">strlen()</tt> in each instance that the length of a
string is needed, but I've decided to call it just once and store
the result in a variable here. This is because we're not changing
the length of the string, but as we are changing the content, the
compiler might decide it isn't safe to optimise it away to one
call, and performance would suffer, especially as it's used twice
each time round the loop.</p>
<p>Finally, as we wanted to use <tt class="function">reverse()</tt>
on, effectively, a <tt class="literal">const</tt> string, it would
be useful to provide an alternative version of the function that
operates on and returns a copy. I'll call this version <tt class=
"function">reverse_copy()</tt>; it's a very simple forwarding
function, so if I were still using C99 it would be a candidate for
inlining. Now all that remains is to alter <tt class=
"function">main()</tt> to use <tt class=
"function">reverse_copy()</tt> instead of <tt class=
"function">strdup()</tt> and <tt class=
"function">reverse()</tt>.</p>
<p>At the risk of going on too long for such a short piece of code,
there is one more thing that can be done. So far we've stuck to a
single source file for this example. The functions are all
potentially useful though, so it would be a good idea to put them
in a separate file from <tt class="function">main</tt> so that it
can be built into a useful library for use in other programs in
future. With this in mind, I'll go back on what I said before about
keeping the function names and give them all a prefix of <tt class=
"literal">str_</tt> to indicate they're user-supplied code to
supplement the standard library's string operations. It's good
practice to decide on a prefix for each source file in a large
program and use it religiously on every function name and non-local
variable: it aids debugging and reduces the chance of name
clashes.</p>
<p>Here is the code. Where a function has a comment next to its
header declaration I've saved space by not providing a similar
comment in the source file:</p>
<pre class="programlisting">
#ifndef STR_EXTRA_H
#define STR_EXTRA_H
/* Used instead of strdup() in case the latter isn't provided by the platform */
char *str_dup(const char *);
/* Swaps the characters pointed to by a and b */
void str_swap(char *a, char *b);
/* Reverses str and returns a copy of the pointer (not a copy of the string) */
char *str_reverse(char *str);
/* Returns a reversed copy of str, which may be free()'d */
char *str_reverse_copy(const char *str);
#endif /* STR_EXTRA_H */

/* str_extra.c */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
char *str_dup(const char *str){
  char *ret = malloc(strlen(str) + 1);
  strcpy(ret, str);
  return ret;
}
void str_swap(char *a, char *b){
  char tmp;
  tmp = *a;
  *a = *b;
  *b = tmp;
}
char *str_reverse(char *str){
  size_t i;
  const size_t len = strlen(str);
  for(i = 0; i &lt; len/2; ++i)  {
  char *a = str + i;
  char *b = str + len - i - 1;
  fprintf(stderr, &quot;swapping %c &amp; %c...\n&quot;, *a, *b);
  swap(a, b);
  }
  return str;
}
char *str_reverse_copy(const char *str){
  return reverse(str_dup(str));
}
/* End of str_extra.c */
/* main.c */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &quot;str_extra.h&quot;
int main(void){
  char *str = reverse_copy(&quot;cheese&quot;);
  printf(&quot;the reversed string is: %s\n&quot;, str);
  free(str);
  return 0;
}
/* End of main.c */
</pre></div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e763" id="d0e763"></a>And the
Winner</h2>
</div>
<p>While I am happy to run this column for a while longer (the most
tedious part is formatting everything) I think it should be the
Editor to choose the winner. So over to James.</p>
<p class="c2"><span class="remark">I have chosen Michael Ellis,
even if he has a slightly performance driven perspective, because
of the clarity of what he writes, and the emphasis on testing and
avoiding pointless optimisations, even if he does miss out on the
minor trick that I think &quot;str&quot; is not technically reserved as a
name -- it ought to be avoided anyway.</span></p>
<p class="c2"><span class="remark">James</span></p>
<p>OK Mike, email me (<tt class="email">&lt;<a href=
"mailto:Francis.Glassborow@ntlworld.com">Francis.Glassborow@ntlworld.com</a>&gt;</tt>)
with your choice of book (under &pound;40) and I will arrange for
you to get it.</p>
<p><span class="emphasis"><em>Francis</em></span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e782" id="d0e782"></a>Student Code
Critique 16</h2>
</div>
<p>This one is a little bit different. Prompted by the emails for
help that Bjarne Stroustrup showed me, I want you to look at the
fillowing very simple piece of code. The problem is obvious, but
saying that is not going to help the student. We need to change his
mental model so that it is obvious to him too. So write a response
to the student. In the response you should try to identify the
causes of this kind of misunderstanding so that the student will
have a clear idea of the differences between a declaration,
definition and call of a function. You may think this is easy, but
if it is, why do so many students fall into this kind of trap? (Oh
and there are a couple of ordinary bugs in there as well, which
should not be ignored.)</p>
<pre class="programlisting">
#include&lt;stdio.h&gt; 

void count_up(int start, int end, int step){
  printf(&quot;enter the upper limit&quot;);
  scanf(%d; &amp;start)
  printf(&quot;enter your lower limit&quot;);
  scanf(%d; end);
  printf(&quot;enter step size&quot;);
  scanf(%d; step)
  int x;
  for(x=start; x&lt;=finish;x=x+step) {
    printf(&quot;\n %d&quot;,x);
  }
}

int main(){
  count_up(1, 10, 1);
  count_up(-10, 10, 2);
  return 0;
}
</pre>
<p>I want to give everyone a chance to enter, but please try to get
your entry in as early as possible and anyway not latter than July
16.</p>
</div>
<div class="footnotes"><br>
<hr class="c3" width="100">
<div class="footnote">
<p><sup>[<a name="ftn.d0e524" href="#d0e524" id=
"ftn.d0e524">1</a>]</sup> This is not entirely accurate, as the
student's version is supported by the C99 standard [<i><span class=
"remark">for both late declaration of variables and for // style
comments</span></i>]. However, as C99 is not widely used, it is
more likely the student is required to write C89 compatible code.
The student's version is invalid C89 code. [<i><span class=
"remark">It is even more likely that the student's instructor does
not know the difference. Francis</span></i>]</p>
</div>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
