    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Returning Early and Taking a Break</title>
        <link>https://members.accu.org/index.php/journals/718</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 8, #1 - Feb 1996</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Returning Early and Taking a Break</h1>
<p>
<strong>Date:</strong> Wed, 07 February 1996 13:15:26 +00:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e16" id="d0e16"></a></h2>
</div>
<p>I read the rather overlong 'Caught in the Net' column in C Vu
7.6 with rising frustration because there was far too much
'<span class="emphasis"><em>I do it this way</em></span>' instead
of '<span class="emphasis"><em>The advantage/disadvantage of doing
that is ...</em></span>' It reminded me strongly of one of my
Computer Science lecturers declaring '<span class="emphasis"><em>No
globals - they are bad</em></span>'. When I asked him what I should
do with data that was essentially program-wide, such as '<tt class=
"literal">errno</tt>' (or, nowadays, such C++ objects as
<tt class="literal">cout</tt> and <tt class="literal">cin</tt>), the
reply suggested that he had no idea as to what was wrong with using
global variables. His answer was to create a data structure that
could be passed as an argument to every function/procedure. I
contend that is a cure that is worse than the disease.</p>
<p>The real problem with globals is that they extend the test
environment for every line of code. The programmer must consider
possible effects that the state of every variable may have on her
code. For example, any line using a math function in C may change
the state of <tt class="literal">errno</tt> (but as you never
check, it won't worry you ;-). <tt class=
"literal">cout</tt>/<tt class="literal">cin</tt> may go into a
non-functioning state as a result of any use. If you never check,
your program may behave oddly. In other words the state of
<tt class="literal">cout</tt>/<tt class="literal">cin</tt> is
relevant throughout the program. Putting such items in an
<tt class="literal">auto</tt> variable in <tt class=
"literal">main()</tt> and then relaying it everywhere does
absolutely nothing to help. It actually hinders, because passing
the data-structure will be an overhead of at least one pointer per
function call (probably eating up precious local memory resources)
as well as time taken to pass the value. Worse, <tt class=
"literal">errno</tt> will not work in this way. Think about it: it
is designed to communicate between the maths library and users of
the library. I know that we have alternatives such as exceptions in
C++, but my point is that they are alternatives and intelligent
programmers must choose the best compromise for the task they are
handling. Trite phrases such as '<span class="emphasis"><em>Globals
are bad</em></span>' do nothing to improve programming.</p>
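<p>To make the point about <tt class="literal">errno</tt> concrete,
here is a minimal sketch of what checking it involves. It uses
<tt class="literal">strtol()</tt> rather than a maths function, on
the assumption that the mechanism is the same: the library reports a
range error through the program-wide <tt class="literal">errno</tt>.
The name <tt class="literal">parse_long</tt> is hypothetical.</p>

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to a long and reports failure through
   its return value, so the caller need not inspect errno directly.
   Returns 0 on success, -1 if the value is out of range or not a number. */
int parse_long(const char *text, long *value) {
    char *end;
    errno = 0;                       /* errno is global: clear stale state first */
    long v = strtol(text, &end, 10);
    if (errno == ERANGE) return -1;  /* overflow or underflow, reported via errno */
    if (end == text) return -1;      /* no digits were consumed */
    *value = v;
    return 0;
}
```

<p>Note that the state of <tt class="literal">errno</tt> has to be
reset before the call and examined straight after it; any intervening
library call could change it.</p>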
<p>While I am on the subject of the silly advice that gets thrown
around let me add a quick one about choice of identifiers. The
primary guide should be '<span class="emphasis"><em>whatever makes
code more readable</em></span>.' The programming analogues of
'nouns', 'verbs', 'adverbs' etc. should be easy to distinguish in
the context of their use. Parameters in prototypes should have
identifiers that make their use clear to the client (user of the
function). Parameters in definitions should be named to help the
implementor (and future maintainers) express algorithms etc.
cleanly. There is nothing wrong, per se, with short variable names,
and in many cases there is a lot wrong with overly long ones - they
can obscure the underlying structure of source code. Think of
single letter variables as akin to pronouns in natural languages -
only to be used where it is plain what they refer to.</p>
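<p>A small sketch of the distinction between prototype and definition
names. C allows the two to differ, so each can serve its own reader.
The function <tt class="literal">mean</tt> and its parameter names are
invented for illustration, as is the choice of 0 for an empty set.</p>

```c
#include <stddef.h>

/* Prototype, as it might appear in a header: names tell the client what to pass. */
double mean(const double *samples, size_t sample_count);

/* Definition: short names keep the arithmetic easy to read. */
double mean(const double *x, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += x[i];
    }
    return n ? sum / (double)n : 0.0;   /* empty set: 0 by assumption */
}
```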
<p>On the subject of making code readable, it is worth reflecting
on the common commenting guidelines that are dear to the hearts of
so many. I think comments should be like footnotes, non-intrusive
extra information for the less well informed and incidental
information that all will need but that does not fit into the main
flow. If your code needs more than a comment every dozen lines then
it is poorly written. Experienced programmers should not need to
have common algorithms identified and the need to comment
individual identifiers is a sure indication of poorly chosen ones.
Even C programmers should avoid declaring variables remote from the
point of first use. A variable's lifetime should extend from the
point of first use (or just a bit earlier) to the point of last
use. Generally the scope (i.e. where the identifier is visible)
should be less than a dozen lines of code and if it is more than a
couple of dozen lines you should be re-examining your coding style
(sometimes large blocks are necessary, but often they just
represent the programmer's fuzzy focus on a problem).</p>
<p>You should always comment on your reasons for making choices
between alternatives - e.g. your reason for selecting a specific
sort algorithm should be given even if it is just that you could
code it easily. You should also use comments to mark sensitive
parts of your code, or places that will need care in future. For
example C++ class designers should always comment where they are
allowing compiler provision of default constructors etc. because
the decision may need to be reviewed later on.</p>
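<p>As an illustration, here is a sketch in which the only comments
are the reason for the choice of algorithm and one note on scope. The
function name and the assumptions about the data are invented for the
example.</p>

```c
#include <stddef.h>

/* Insertion sort chosen on purpose: the lists are assumed to be short
   (tens of items) and nearly sorted, and it is easy to get right.
   Review this choice if the lists grow. */
void sort_readings(int *a, size_t n) {
    for (size_t i = 1; i < n; i++) {
        int key = a[i];          /* declared at first use, gone after this pass */
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}
```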
<p>Too many comments are counterproductive - they are not read and
often not updated. Like noisy compilers that give too many warnings
and thereby conceal the important ones, too many comments hide the
ones that matter.</p>
<p>Anyone who has had to endure the traditional style of formal
geometrical proof will know its format of assertions followed
by justification. Actually the form of proof that was acceptable
earlier this century would have made Euclid's toes curl. As we
mature in our understanding of a subject we need fewer pointers to
help us follow it. But there is an important difference between
mathematical proofs and writing code: the former skip forward in
as large a step as the reader can manage, with the comment
supporting the step. But code necessarily has to be complete - we
haven't yet got to compilers that fill in our intentions by reading
our comments.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e80" id="d0e80"></a>Returning</h2>
</div>
<p>'<tt class="literal">return</tt>' and '<tt class=
"literal">break</tt>' are controlled '<tt class=
"literal">gotos</tt>'. By the way, I believe that all programming
languages need a global goto mechanism. Pascal provides one which
Donald Knuth uses in both his TeX and METAFONT programs in order
to handle programs in a 'panic' state - i.e. when the program has
detected that it is no longer behaving reliably. C provides setjmp
and longjmp to handle such circumstances, while C++ provides
exceptions. I do not know of any instance where a local goto (such
as that provided by C) is necessary, though I can speculate that in
the early days having such a mechanism might have helped the then
existing compiler technology. Using a goto to jump out of a nest of
structures has always seemed suspect to me as indicative of a poor
implementation.</p>
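<p>For readers who have not met them, the following is a minimal
sketch of how <tt class="literal">setjmp</tt> and
<tt class="literal">longjmp</tt> can provide such a global jump out
of a 'panic' state. All the function names are hypothetical, and a
single static jump buffer is assumed, so the sketch is not
re-entrant.</p>

```c
#include <setjmp.h>

static jmp_buf panic_point;          /* where control resumes after a panic */

static void panic(void) {
    longjmp(panic_point, 1);         /* never returns to its caller */
}

static int scale(int v) {
    if (v < 0) panic();              /* state judged unreliable: bail out */
    return v * 2;
}

static int process(int v) {
    return scale(v) + 1;
}

/* Returns 0 and stores the result, or -1 if a panic unwound the calls. */
int run_guarded(int v, int *result) {
    if (setjmp(panic_point) != 0) {
        return -1;                   /* arrived here via longjmp */
    }
    *result = process(v);
    return 0;
}
```

<p>The jump passes straight through <tt class="literal">process()</tt>
without any of the intermediate functions having to know about the
failure.</p>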
<p>First let me dismiss the spurious claim that allowing multiple
returns somehow increases the complexity of a piece of code - on
the contrary, avoiding it can sometimes increase complexity by
introducing an otherwise unnecessary state variable (basically one
that says '<span class="emphasis"><em>The task is complete. I am
just passing through to get to the end.</em></span>') As far as I
am concerned you should return from a function precisely at the
point at which its task has been completed (do not forget that any
function should only supply functionality for a single concept).
(<span class="emphasis"><em>Francis has passed back a comment from
Sean Corfield on this style which I will tackle at the end of this
article.</em></span>)</p>
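<p>A sketch of the state variable in question, using an invented
linear search. The first version keeps a single return and needs a
<tt class="literal">found</tt> flag to coast to the end; the second
returns at the point of completion.</p>

```c
#include <stddef.h>

/* Single return: a 'found' flag says "done, just passing through". */
int index_of_single_return(const int *a, size_t n, int wanted) {
    int result = -1;
    int found = 0;
    for (size_t i = 0; i < n && !found; i++) {
        if (a[i] == wanted) {
            result = (int)i;
            found = 1;
        }
    }
    return result;
}

/* Early return: leave as soon as the task is complete. */
int index_of_early_return(const int *a, size_t n, int wanted) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] == wanted) return (int)i;
    }
    return -1;                       /* not present */
}
```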
<p>By returning at the point of completion those reading the code
can see your intent and beliefs. Nothing is worse than maintaining
code where one has to wade through reams of code only to discover
that a particular thread was already complete.</p>
<p>Let me give you a short example where it is easy to code with
either a single return or with two.</p>
<pre class="programlisting">
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

/* Returns the new buffer; the caller stores it: p = replace(p, s); */
char * replace(char * destination, const char * source){
  if(destination == source) return destination;
  free (destination);
  destination = (char *)malloc(strlen(source)+1);
  if (destination) strcpy(destination, source);
  return destination;
}
</pre>
<p>versus:</p>
<pre class="programlisting">
char * replace(char * destination, const char * source){
  if (destination != source) { 
    free (destination);
    destination = 
      (char *)malloc(strlen(source)+1);
    if (destination) strcpy(destination, source);
  }
  return destination;
}
</pre>
<p>(I assume that you can read these without comment; clearly
destination must be a pointer to dynamic memory. Oddly there is no
way that I know of whereby I can make that constraint statically
enforceable.)</p>
<p>Should I check source is not a null pointer? I think, on
reflection, that I should.</p>
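<p>One way the null check might look, as a sketch only. The name
<tt class="literal">replace_checked</tt> is hypothetical, and two
policies are assumptions of the sketch rather than anything stated
above: a null source leaves the destination untouched, and the
function returns the buffer so that the caller can store the new
pointer.</p>

```c
#include <stdlib.h>
#include <string.h>

/* destination must be null or point to malloc'd memory.
   Returns the buffer the caller should use from now on. */
char *replace_checked(char *destination, const char *source) {
    if (source == NULL) return destination;         /* nothing to copy */
    if (destination == source) return destination;  /* aliases: already done */

    char *copy = malloc(strlen(source) + 1);
    if (copy == NULL) return destination;           /* keep the old text on failure */

    strcpy(copy, source);
    free(destination);
    return copy;
}
```

<p>A caller would write something like
<tt class="literal">p = replace_checked(p, text);</tt>. Each exceptional
case is excluded at the top, in the style of the first version.</p>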
<p>What is the difference between the two implementations? In the
first I explicitly check for the exceptional condition (that
destination and source are aliases for the same string) and clearly
exclude it from further processing. In the second case, I check for
the norm and thereby implicitly exclude the exception case. I
believe that the first style is easier to follow but in the final
analysis it is like the difference between English and German
(where do you put the primary clause in a multi-clause sentence?).
On second thoughts, perhaps it is more like the difference between
writing paragraphs composed of short sentences instead of single
monolithic sentences.</p>
<p>Note that my focus is on making code intelligible to a
reasonably experienced reader. This allows intent to be maintained
in future revisions. I know some (perhaps many) of the experts will
completely disagree but I ask them to reflect on the variety of
writing styles that co-exist for natural language.</p>
<p>My simple guideline is 'return when you have finished'. Of
course like all guidelines for writing it has to be used
intelligently, but for me the test is whether I can read the code
with understanding rather than struggle to use it as a guide to the
inner workings of the programmer's mind. Note that in a world of
reusable code, people really should not be changing working code.
If we can teach programming styles that help the programmer to get
it right first time then the needs of future maintainers are of
much less importance. It is a pre-requisite of code reuse that we
have correct code to reuse.</p>
<p>In a recent discussion with Francis about code complexity he
suggested a criterion for choosing between '<tt class="literal">if
... else</tt>' constructs and '<tt class="literal">switch ...
case</tt>' ones. He suggested that '<tt class="literal">if ...
else</tt>' should be restricted to genuine conceptual boolean
decisions while '<tt class="literal">switch ... case</tt>' should
be used in all cases where there were potentially more than two
choices even if only one case clause was actually written. I think
this has much to commend it. '<tt class="literal">switch ...
case</tt>' constructs are far easier to maintain by adding,
modifying or removing cases. I think that I would always include a
default clause even if it was a null statement. Nested ifs always
leave me feeling less than secure because they are so easy to get
wrong; nonetheless they are frequently necessary, though much less
often than they appear in current code.</p>
<p>Let us have a couple of pieces of code. First a case that is
often treated as <tt class="literal">boolean</tt> but isn't:</p>
<pre class="programlisting">
unsigned char retire_at(char gender){
  switch (gender) {
    case 'm' : return male_retirement_age;
    case 'f' : return female_retirement_age;
    default : return handle_unknown_gender();
  }
}
</pre>
<p>Some compilers may demand an extra <tt class=
"literal">return</tt> statement because they cannot do a flow
analysis to determine that all actual paths through the code
terminate with a return statement. Note that the default clause
calls a function whose value is returned. This function might not
return because it calls <tt class="literal">abort()</tt>,
<tt class="literal">exit()</tt> or uses <tt class=
"literal">longjmp</tt> or (in C++) throws an exception. It is not
the task of the <tt class="literal">retire_at()</tt> function to
determine the correct action if gender does not contain an
appropriate value; it just needs to provide a peg on which the
solution can be hung. This assists code reuse because different
<tt class="literal">handle_unknown_gender()</tt> functions can be
provided as appropriate to the application being written.</p>
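<p>To show the peg in use, here is a sketch that completes the
fragment with one possible
<tt class="literal">handle_unknown_gender()</tt>. The retirement ages
and the counter are invented for illustration; another application
might equally supply a version that calls
<tt class="literal">abort()</tt>.</p>

```c
/* Illustrative values only; the text above does not fix them. */
enum { male_retirement_age = 65, female_retirement_age = 60 };

static int unknown_gender_seen = 0;  /* hypothetical record of bad input */

/* One possible policy: note the problem and return 0 as a sentinel. */
unsigned char handle_unknown_gender(void) {
    unknown_gender_seen++;
    return 0;
}

unsigned char retire_at(char gender) {
    switch (gender) {
        case 'm': return male_retirement_age;
        case 'f': return female_retirement_age;
        default:  return handle_unknown_gender();
    }
}
```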
<p>Note the multiple return statements. Of course I could write the
code something like:</p>
<pre class="programlisting">
unsigned char retire_at(char gender){
  unsigned char return_value;
  switch (gender) {
    case 'm' : return_value = male_retirement_age;
      break;
    case 'f' : return_value = female_retirement_age;
      break;
    default : return_value = handle_unknown_gender();
  }
  return return_value; 
}
</pre>
<p>However this code cannot be more efficient, is probably less
efficient, and is vulnerable to forgetful programming (forgetting a
break). I believe that the coding style of the latter is defective
because it adds unnecessary detail and potential breakage
points.</p>
<p>Writing this function with '<tt class="literal">if ...
else</tt>' we have:</p>
<pre class="programlisting">
unsigned char retire_at(char gender){
  unsigned char return_value;
  if (gender == 'm') return_value = male_retirement_age;
  else if (gender == 'f') return_value = female_retirement_age;
  else return_value = handle_unknown_gender();
  return return_value;
}
</pre>
<p>This is even more defective because it requires more care from
the programmer to get it right. Just think how easy it is to patch
the first example to handle 'M' and 'F':</p>
<pre class="programlisting">
unsigned char retire_at(char gender){
  switch (gender) {
    case 'M' :
    case 'm' : return male_retirement_age;
    case 'F' :
    case 'f' : return female_retirement_age;
    default : return handle_unknown_gender();
  }
}
</pre>
<p>Or what about the change to manage potential use in non-English
applications?</p>
<pre class="programlisting">
enum Gender { male='m', female = 'f'};
unsigned char retire_at(char gender){
  switch (gender) {
    case male : return male_retirement_age;
    case female : return female_retirement_age;
    default : return handle_unknown_gender();
  }
}
</pre>
<p>Now only the <tt class="literal">enum</tt> Gender needs to be
touched to support characters that would be more logical in other
languages. (Of course <tt class="literal">male_retirement_age</tt>
and <tt class="literal">female_retirement_age</tt> are probably
also provided via an enum though there are other options that would
support late determination of these values).</p>
<p>Now to a case that is essentially boolean:</p>
<pre class="programlisting">
void something(int size){
  assert (size&gt;0);
  {
    int * handle;
    handle = (int *)malloc (size * sizeof(int));
    if (!handle) out_of_memory((void *)handle, size*sizeof(int));
    /* do rest of process */
  }
}
</pre>
<p>Once again I call a function to deal with the problem case. I do
not know if this function can fix the problem for me and then
return to continue the process so I pass it enough information so
that a fix can be provided if possible. Most likely the program
will terminate through a mechanism that is appropriate to whichever
of C or C++ that it is being compiled as.</p>
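<p>What such a handler might do is open to choice. The following is
a sketch under stated assumptions: unlike the call shown above, this
variant returns a pointer so that a successful fix can be handed
back, and it relies on a hypothetical reserve block set aside at
start-up. If the retry fails it terminates the program.</p>

```c
#include <stdio.h>
#include <stdlib.h>

static void *emergency_reserve = NULL;   /* hypothetical block held back for emergencies */

void reserve_emergency_memory(size_t bytes) {
    emergency_reserve = malloc(bytes);
}

/* Called after malloc has failed. Either returns usable memory or does not return. */
void *out_of_memory(void *failed, size_t bytes) {
    (void)failed;                        /* null at this point; kept to mirror the original call */
    free(emergency_reserve);             /* hand the reserve back and try once more */
    emergency_reserve = NULL;

    void *retry = malloc(bytes);
    if (retry == NULL) {
        fprintf(stderr, "out of memory: %zu bytes requested\n", bytes);
        exit(EXIT_FAILURE);
    }
    return retry;
}
```

<p>With this variant the caller would write
<tt class="literal">if (!handle) handle = out_of_memory(handle, size*sizeof(int));</tt>
so that processing can continue when the fix succeeds.</p>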
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e204" id="d0e204"></a>Breaking
early</h2>
</div>
<p>While I am happier with <tt class="literal">switch()</tt>
statements that either consistently finish each case with return,
or finish each with break I do not rule out mixtures where some
cases logically terminate the function's intent while others leave
more to do. I would strongly advocate some sensible layout
convention to make mixtures manifest (i.e. stand out). It is the
use of <tt class="literal">break</tt> in iterations that frequently
causes debate. I like to view it as akin to a split infinitive in
English, undesirable as a rule but tolerable in <span class=
"emphasis"><em>extremis</em></span>. Removing split infinitives
often requires substantial reworking of a sentence; removing break
from iterations often means reworking code. If you really cannot
find a readable alternative that avoids the use of <tt class=
"literal">break</tt>, use it but discuss it with other programmers
to see if you simply have a coding blind-spot.</p>
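<p>A sketch of the kind of reworking meant here, with invented
function names. Both versions add up values until a negative
sentinel is met; the second states the stopping rule in the loop
condition instead of using <tt class="literal">break</tt>.</p>

```c
#include <stddef.h>

/* With break: the exit is buried in the loop body. */
long sum_with_break(const int *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] < 0) break;         /* negative value ends the data */
        sum += a[i];
    }
    return sum;
}

/* Reworked: the loop condition says when the iteration stops. */
long sum_without_break(const int *a, size_t n) {
    long sum = 0;
    size_t i = 0;
    while (i < n && a[i] >= 0) {
        sum += a[i];
        i++;
    }
    return sum;
}
```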
<p>On the subject of iterations, try to use the three options
provided by C to distinguish different intents. The <tt class=
"literal">for(...;...;...)</tt> is best reserved for counted
situations. Yes, it can be used to the exclusion of the other
alternatives but that often hides your intent. If you want to
iterate something n times use a <tt class="literal">for(i=0;
i&lt;n; i++)</tt>; if you want to run some code as long as a
condition is true use a <tt class="literal">while(...)</tt>. If the
code must be run at least once then use a <tt class="literal">do {...}
while(...);</tt> statement. Examine the following code
fragment:</p>
<pre class="programlisting">
/* other code */
int c=1;   /* int, because getc() returns an int */
while (c!='Q') {
  c=getc(stdin);
  /*iteration process */
}
</pre>
<p>Why has <tt class="literal">c</tt> been initialised? To
ensure that the keyboard is read at least once. But consistent use
of the most appropriate construct will make this intent much
clearer, even if you elect to initialise <tt class="literal">c</tt>
as a matter of principle, and safer when you forget to initialise
<tt class="literal">c</tt>.</p>
<pre class="programlisting">
int c=0;   /* int, because getc() returns an int */
do {
  c=getc(stdin);
  /* iteration process */
}while (c!='Q');
</pre>
<p>What I am saying is that you should choose code constructs that
help readers understand what you intend to do. There is much more
to writing readable code than following some naming conventions and
layout rules.</p>
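<p>The three intents side by side, as a sketch with invented
functions: a counted loop, a loop that may run zero times, and a
loop whose body must run at least once.</p>

```c
/* Counted: the number of passes is known in advance, so for. */
unsigned long factorial(unsigned n) {
    unsigned long product = 1;
    for (unsigned i = 2; i <= n; i++) {
        product *= i;
    }
    return product;
}

/* Conditional: there may be nothing to skip, so while. */
const char *skip_spaces(const char *s) {
    while (*s == ' ') {
        s++;
    }
    return s;
}

/* At least once: even 0 has one digit, so do ... while. */
unsigned digit_count(unsigned n) {
    unsigned digits = 0;
    do {
        digits++;
        n /= 10;
    } while (n != 0);
    return digits;
}
```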
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e252" id="d0e252"></a>What About
<tt class="literal">continue</tt>?</h2>
</div>
<p>Well what about it? If you have understood what I have written
so far you will realise that my guideline is simply 'Use continue
when its use will help make code easier to understand. Don't use it
just to get out of a complicated spot.'</p>
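<p>A sketch of a use that seems to meet that guideline: skipping
items that need no processing. The function and the convention that
a negative value marks a failed reading are assumptions made for the
example.</p>

```c
#include <stddef.h>

/* Average of the valid readings; 0.0 if there are none. */
double average_valid(const int *readings, size_t n) {
    long sum = 0;
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (readings[i] < 0) continue;   /* failed reading: nothing to do */
        sum += readings[i];
        used++;
    }
    return used ? (double)sum / (double)used : 0.0;
}
```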
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e260" id=
"d0e260"></a>Pre-conditions and Post-conditions</h2>
</div>
<p>When Francis mentioned the gist of my first draft to Sean
Corfield, Sean quite correctly pointed out that the reason for
having a single return point in a function is to provide a point
where post-conditions can be inserted by those maintaining or
reworking code. The problem with this is that it does not work in
C++ because the existence of exceptions may cause early return from
almost any code segment. I think it was Francis who suggested that
we could make use of an '<tt class="literal">atreturn()</tt>'
function to do for single functions what <tt class=
"literal">atexit()</tt> does for programs.</p>
<p>If you want to provide pre-conditions and post-conditions you
must examine other coding techniques. First you should consider
non-invasive methods (i.e. don't touch working code) such as
wrapping the function in another one. You will still have to make
some changes - you may need to rename the original function and
some value parameters may need to be converted to reference ones
(if you are using C++ to support that). Let me give a very simple
example:</p>
<pre class="programlisting">
int fn (int i);
</pre>
<p>It does not matter what this function does, but suppose that you
wish to apply a pre-condition that the argument is positive and a
post-condition that the return value is between 1 and 100 and that
i is in the range -50 to 50.</p>
<p>In the implementation file for fn() add the following:</p>
<pre class="programlisting">
int fn (int i) {
  assert (i&gt;0);
  int j=oldfn(i);
  assert (i&lt;51 &amp;&amp; i &gt;-51);
  assert (j&gt;0 &amp;&amp; j&lt;101);
  return j;
}
</pre>
<p>Change the original definition of <tt class="literal">fn()</tt>
so that the first line is now:</p>
<pre class="programlisting">
static inline int oldfn(int &amp; i) {
</pre>
<p>As long as you only want to impose conditions on the parameters
and return value this mechanism will work fine in C++. It is not so
easy in C because the lack of reference variables makes checking
post conditions on parameters difficult.</p>
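<p>For comparison, a sketch of the nearest C equivalent. The
parameter becomes a pointer, which means every use of
<tt class="literal">i</tt> inside the original function has to be
rewritten as <tt class="literal">*i</tt>, so the change is no longer
non-invasive. The body of <tt class="literal">oldfn()</tt> is
invented purely so that the example runs.</p>

```c
#include <assert.h>

/* Stand-in for the original function, now taking a pointer so that
   the wrapper can see what happened to i. Body invented for the sketch. */
static int oldfn(int *i) {
    *i -= 1;
    return *i * 2 + 1;
}

int fn(int i) {
    assert(i > 0);                   /* pre-condition */
    int j = oldfn(&i);
    assert(i > -51 && i < 51);       /* post-condition on the parameter */
    assert(j > 0 && j < 101);        /* post-condition on the result */
    return j;
}
```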
<p>If you want to apply post-conditions to internal variables of a
function you have no option but to touch the internals of the
function. If you want to make the post-conditions robust against
exceptions you have to start exploring such things as local classes
with destructors that handle the post-conditions but that is
another story.</p>
<p>Actually, I think retro-fitting post-conditions that are applied
to local variables is generally a poor idea. Organising code for
such purposes requires a somewhat different style from scratch.
Perhaps what I am saying is that when you write code for a safety
critical (or just mission critical) product you should not expect
to reuse code that was written in a less demanding environment. In
general code should be written to criteria that are appropriate to
its purpose. Just as applying the standards of games programming to
writing code for a fly-by-wire aircraft is inadequate, applying the
standards appropriate to writing code for the control systems of a
nuclear generator to a word-processor is probably a costly error.
'<span class="emphasis"><em>You know what I mean.</em></span>' will
do in some circumstances while '<span class="emphasis"><em>I need
to know you know what I mean.</em></span>' is a minimal requirement in
others.</p>
<p>Good programming requires the use of an appropriate coding
style. A good programmer understands this; a poor one does
everything the same way. Let me finish with a short Zen story.</p>
<p>It is said that one of the great Zen masters always answered
questions by holding up his index finger. One of his students noted
this and decided to adopt this deep and wise response to questions.
One day the master asked the student a question. When the student
raised his index finger, the master cut it off.</p>
<p>Rules of thumb are fine as long as you understand their meaning.
They are dangerous (even deadly) if you use them without
understanding.</p>
</div>
</p>
</div>
</channel>
</rss>
