    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Comment on &#8220;Problem 11&#8221;</title>
        <link>https://members.accu.org/index.php/articles/179</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + CVu Journal Vol 16, #1 - Feb 2004</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Comment on &#8220;Problem 11&#8221;</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Sun, 01 February 2004 19:38:02 +00:00</p>
<p><strong>Summary:</strong>&nbsp;<p>The first step in finding problems in the code is to
identify the problem the code is trying to solve. The discussion in
the C Vu article is basically about curiosities in the way the
C++ standard library std::istream is defined, but I will make
the perhaps unwarranted assumption that the code is really about
not the uses of std::istream but, more generally, how to write a
read routine that can effectively and safely capture data from an
input stream. Actually, as the first problem below illustrates,
neither of these issues can be effectively addressed without the
other.</p>
</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2></h2>
</div>
<p>The proposed improvement to the templated read function is that
it starts an approach to handling different input conditions by
having the user distinguish between two types of stream ending
conditions, reading just an end-of-file and reading a carriage
return along with end-of-file. (Do I have this right?)</p>
<p>This is a start, but only useful to illustrate idiosyncrasies of
STL <tt class="computeroutput">istreams</tt>. It still has problems
with <tt class="computeroutput">std::istream</tt>, but as a lesson
in reading computer input it is deficient in the following
ways:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>The most basic problem here is a lack of &quot;separation of concerns&quot;:
there should be separate routines that each perform one function and
do it well. The lack is particularly unfortunate here, since it is
especially important to avoid tight coupling between system support
routines (reading input) and client application routines (processing
input).</p>
<p>This basic problem is manifest here in multiple ways:</p>
<div class="itemizedlist">
<ul type="circle">
<li>
<p>The client routine is expected to test multiple stream ending
conditions, reported with different syntax and in two different
domains; one in that of the input mechanism, one in that of the
read routine.</p>
</li>
<li>
<p>The test for a dummy value is clever, but is, at best, an
awkward and somewhat dubious general approach to detecting
particular conditions (should we perhaps label this a hack?).</p>
</li>
<li>
<p>Such approaches can easily lead to error-prone code.</p>
<p>As implemented here, the two conditions to test are redundant,
since a dummy value has to be returned for end-of-file, whether a
carriage return was present or not. Thus not only is the client
code overly complex, but the strategy is faulty. Also, if the
&quot;dummy value&quot; actually happens to be present in the input stream,
it will indeed be treated as is any other value.</p>
</li>
<li>
<p>Detecting different ending conditions is relevant to the input
processing domain; processing different ending conditions is
relevant to the client domain.</p>
</li>
<li>
<p>Testing multiple conditions in multiple ways will not scale
well, when other conditions are considered. The example considers a
special case, but, with slight extension for instance, the read
routine might be adapted easily to process console output directed
to a file, where there may be end-of-line, and possibly carriage
return characters, separating data items.</p>
</li>
</ul>
</div>
</li>
<li>
<p>The error handling is rigid with no flexibility for adaptation
to either the application environment or the client needs.</p>
<p>The read routine throws an exception for stream errors; but even
worse the routine buries its own private <tt class=
"computeroutput">fgw::bad_input</tt> exception. On the other hand,
the client routine may well wish to continue processing for bad
input, which may be either unreadable for the specified type (input
stream domain failure) or invalid (either as defined in the data,
the read routine or the client processing domain).</p>
</li>
<li>
<p>The <tt class="computeroutput">in.bad()</tt> condition is not
tested, which is the one more deserving of an exception. Actually
for a pre-standard library the <tt class="computeroutput">fail</tt>
bit may cover this case. But then, the read routine would throw a
<tt class="computeroutput">bad-data</tt> exception, when the error
actually is failure to read the data, whether good or bad.</p>
</li>
<li>
<p>For beginners especially, the code fails to take a valuable
opportunity to demonstrate basic and consistent mechanisms for
preventing invalid data values from getting past the application
external interfaces.</p>
</li>
<li>
<p>In any case, there needs to be consistent support for applying
both general application-wide policies and client-routine-specific
policies, for error handling and for error reporting alike.
Developing those policies is another subject, but the basic
interfaces are crucial and can be made reasonably simple.</p>
</li>
<li>
<p>The input data appears to be constructed twice, once in the read
routine and once in the client routine, and probably with different
constructors. Typically this may not actually be a problem, but
this behavior can lead to subtle problems.</p>
</li>
<li>
<p>If, as suggested here, the client code needs to be abstracted
from the details of std::istream error conditions, why have any
dependency on std::istream? Perhaps, even more useful than
templating the input data type, is abstracting the concept of an
input source.</p>
</li>
<li>
<p>Names are critical. Here the routine does not read the input
stream; it reads the next item in the input stream. Hence the
routine could be called <tt class=
"computeroutput">readNext</tt>.</p>
</li>
<li>
<p>A simple, but important, advantage of abstracting the input
source type is that now the function of the routine is not merely
<tt class="computeroutput">readNext</tt>, but more generally
<tt class="computeroutput">getNext</tt>.</p>
<p>And, we already have a powerful and applicable mechanism in C++
for <tt class="computeroutput">getNext</tt> processing - in the
form of iterators, which are applicable here.</p>
</li>
<li>
<p>The routine is at too low a level for many uses, forcing the
client to devise one of many possible iteration constructs. In the
face of multiple exit conditions, these are too often
error-prone.</p>
</li>
<li>
<p>The routine can only read input of one data type. This is
appropriate for &quot;self-defining&quot; streams, which, for instance,
provide tokens to identify the next item in the stream. There are
numerous other approaches to data type extension, probably well
beyond the intent here, but the applicability and limitation should
at least be noted.</p>
</li>
</ul>
</div>
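<p>The point about separating <tt class="computeroutput">in.bad()</tt>
from the other failure states can be made concrete. What follows is
only a minimal sketch under stated assumptions, not the code from the
C Vu article: the names <tt class="computeroutput">ReadResult</tt> and
<tt class="computeroutput">tryRead</tt> are hypothetical, and a
standard-conforming <tt class="computeroutput">std::istream</tt> is
assumed.</p>

```cpp
#include <istream>
#include <sstream>

// Hypothetical category names; they are not taken from the original article.
enum class ReadResult { Ok, EndOfInput, Unreadable, SourceFailure };

// Try to extract one item of type T and report which kind of outcome occurred.
// `dest` is only modified when the extraction succeeds.
template <typename T>
ReadResult tryRead(std::istream& in, T& dest) {
    T tmp{};
    if (in >> tmp) {
        dest = tmp;
        return ReadResult::Ok;
    }
    // badbit: the stream itself failed (for example an I/O error).
    if (in.bad()) return ReadResult::SourceFailure;
    // eofbit together with failbit: the input ran out before an item was read.
    // Limitation of this sketch: a partial token at the very end of the input
    // is also reported as EndOfInput.
    if (in.eof()) return ReadResult::EndOfInput;
    // failbit alone: characters were present but could not be read as a T.
    return ReadResult::Unreadable;
}
```

<p>With this shape the client tests a single result in a single
domain. Reading <tt class="computeroutput">int</tt> values from a
<tt class="computeroutput">std::istringstream</tt> holding
<tt class="computeroutput">12 x</tt> should yield
<tt class="computeroutput">Ok</tt> and then
<tt class="computeroutput">Unreadable</tt>, with no dummy value
involved.</p>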
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Solution
Steps</h2>
</div>
<p>The problem issues above can be addressed systematically in a
series of steps. These are not all meant for one lesson, but each
is straightforward enough, even for beginners. They are all also
invaluable in their own rights for other problems. In fact, the
process here goes far towards an objective of teaching programming
based on principles and practices, rather than just belabouring
syntax and semantics.</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Provide a status variable parameter, which reports all
conditions that the application may or may not want to consider. In
its simplest form this is a string of bit flags, although
supplementary data about the condition may be of interest also. A
higher level might introduce predicates, such as <tt class=
"computeroutput">status.isValid()</tt>.</p>
</li>
<li>
<p>Rather than directly reporting the failure codes particular to a
specific source, conditions need to be mapped to categories of
concern to the client.</p>
<p>Here, some such conditions might include: invalid parameters
(e.g., invalid port or URL), inaccessible input, un-initialized
input (e.g., an un-opened file or un-connected port) or
un-initializable input (e.g., open or connect failures), insufficient
security permissions, source failure, source warning, unreadable data,
special delimiter (carriage return, end-of-line, white-space,
other), invalid data, along with provision for two or three
additional conditions to be used for specific implementations.</p>
</li>
<li>
<p>Allow the interface to set the conditions to abort on, to return
to the user, or to just skip over, and the conditions to be
reported to the application environment in any case.</p>
</li>
<li>
<p>Parse all errors reported by the source.</p>
</li>
<li>
<p>Issues of memory management, references, pointers, multiple
constructors - with possibly different behaviour, and data object
copying, all rear their awesome heads here as elsewhere. Better,
and simpler, is for the client routine to specify where the data is
to go.</p>
</li>
<li>
<p>Use the convention of returning a null, or invalid <tt class=
"computeroutput">end()</tt>, pointer, rather than attempting to
define dummy values. Think of all the fun that the C convention of
terminating strings with <tt class="computeroutput">\0</tt> has
caused.</p>
</li>
<li>
<p>Use a template parameter for the input source type as well as
the data type, and introduce template specialization to show
<tt class="computeroutput">std::istream</tt> handling.
Parameterizing the input source type is important, since it is, or
should be, an incidental focus of the application routine. In
particular, consistent handling of all input sources is invaluable
for an application and makes possible extensions to files,
communication protocols, database interfaces, GUIs, and sequences
in general.</p>
</li>
<li>
<p>Represent the source as a forward iterator parameter that wraps
either the actual source or an existing iterator. It is useful to
illustrate a complete templated iterator solution, but it is only
necessary to develop details for the basic template components, and
here only for <tt class="computeroutput">std::istream</tt>. The
rest can be left for reference to the standard definitions. On the
client side, begin and end iterators, for-loops, and dereferencing
idioms are simple and natural.</p>
</li>
<li>
<p>A fundamental extension is for the template code to test both
the input source parameter type and input data parameter type for
<tt class="computeroutput">isValid</tt> routines, and use these to
check the input data values.</p>
</li>
<li>
<p>Both the error status conditions and the exception flags are now
better included in the iterator template class, rather than the
function parameter list.</p>
</li>
<li>
<p>Have the template code also test the iterator parameter type for
an <tt class="computeroutput">onError</tt> interface and report
errors to that interface.</p>
</li>
<li>
<p>Actually there are two parts (handling and reporting) to an
<tt class="computeroutput">onError</tt> routine and hence the
possibility for two routines:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>The first maps the conditions from a particular environment into
the more general client interface. It may also need to set a flag
to indicate whether resuming input is possible, and to provide a
mechanism for doing so.</p>
</li>
<li>
<p>The second, which may be part of the input routine itself,
passes information identifying the details to a common higher level
application reporting mechanism, for appropriate logging and
recording.</p>
</li>
</ul>
</div>
</li>
<li>
<p>A small, but valuable generalization is to look for an input
mapping routine in an interface borne by the iterator. This allows
data types and values in the input domain to be directly
transformed to data types and values in the client domain.</p>
</li>
<li>
<p>Similarly a filter routine can be used, if present, to bypass
unneeded source data.</p>
</li>
<li>
<p>Illustrate support for <tt class="computeroutput">to_string</tt>
and <tt class="computeroutput">from_string</tt> serialization
routines, for use with operators <tt class=
"computeroutput">&lt;&lt;</tt> and <tt class=
"computeroutput">&gt;&gt;</tt> for derived types.</p>
</li>
<li>
<p>When adapted for output, the iterator can also contain
formatting flags and delimiters.</p>
</li>
<li>
<p>This leads to raising the level of the routine.</p>
<p>Better, for many but not all purposes, would be a copy routine
(or move routine, if the input is consumed) following the STL
syntax - here, with <tt class="computeroutput">end()</tt> to be set
for the iterator return of conditions flagged by the caller. For
some applications, which need a lower level involvement in handling
special conditions, selected <tt class="computeroutput">end()</tt>
conditions can be processed by the client routine, with begin()
used to allow an attempt at resumption of input.</p>
</li>
</ol>
</div>
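<p>Steps 1, 3 and 5 above can be sketched together as follows. This
is a hedged illustration rather than a definitive design: the names
<tt class="computeroutput">Condition</tt>,
<tt class="computeroutput">ReadPolicy</tt> and
<tt class="computeroutput">readNext</tt> are assumptions made for the
example, only three of the condition categories are modelled, and the
source is fixed as <tt class="computeroutput">std::istream</tt> for
brevity.</p>

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical condition flags; powers of two so that they combine into masks.
enum Condition : unsigned {
    kNone          = 0,
    kEndOfInput    = 1u << 0,
    kUnreadable    = 1u << 1,
    kSourceFailure = 1u << 2
};

// Hypothetical policy: which conditions abort (throw) and which are skipped.
// Any other condition is returned to the caller.
struct ReadPolicy {
    unsigned abortOn  = kSourceFailure;
    unsigned skipOver = kNone;
};

// Read the next item into a destination chosen by the client.
// Returns kNone on success, otherwise the condition that stopped the read.
template <typename T>
unsigned readNext(std::istream& in, T& dest, const ReadPolicy& policy = {}) {
    for (;;) {
        T tmp{};
        if (in >> tmp) {
            dest = tmp;
            return kNone;
        }
        unsigned cond = kUnreadable;
        if (in.bad())      cond = kSourceFailure;
        else if (in.eof()) cond = kEndOfInput;

        if (cond & policy.abortOn) {
            throw std::runtime_error("readNext: condition configured to abort");
        }
        if (cond == kUnreadable && (policy.skipOver & kUnreadable)) {
            in.clear();          // reset failbit so reading can resume
            std::string junk;
            in >> junk;          // discard the offending token
            continue;
        }
        return cond;             // report the condition to the client
    }
}
```

<p>A client that sets <tt class="computeroutput">skipOver</tt> to
<tt class="computeroutput">kUnreadable</tt> and reads
<tt class="computeroutput">int</tt> values from
<tt class="computeroutput">1 x 3</tt> should receive 1 and 3 and then
<tt class="computeroutput">kEndOfInput</tt>; a client that adds
<tt class="computeroutput">kUnreadable</tt> to
<tt class="computeroutput">abortOn</tt> gets an exception instead. The
read routine is the same in both cases; only the policy differs.</p>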
<p>And these seventeen progressive steps, I think, provide an
outline of a fairly complete solution to the problem of creating a
code structure for simply, safely, and effectively transferring
input data into an application framework, and by simple extension
output data (the homework exercise?). Various interfaces can be
made more general and more sophisticated as necessary, without
impact on client code. Alternatively, if client code needs to adapt
to additional conditions this can be added in a consistent and
compatible manner.</p>
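<p>The iterator and copy steps already have a standard-library
counterpart that can serve as a starting point. The sketch below uses
the real <tt class="computeroutput">std::istream_iterator</tt> and
<tt class="computeroutput">std::copy</tt>; the wrapper name
<tt class="computeroutput">readAll</tt> is hypothetical. Note that the
standard iterator reports every stopping condition as the same end
iterator, which is exactly the limitation the richer iterator outlined
above is meant to remove.</p>

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Copy every item of type T that can be read from `in` into a vector.
// Reading stops at the first failed extraction, whatever its cause.
template <typename T>
std::vector<T> readAll(std::istream& in) {
    std::vector<T> items;
    std::copy(std::istream_iterator<T>(in),   // begin: reads on demand
              std::istream_iterator<T>(),     // end: default-constructed
              std::back_inserter(items));
    return items;
}
```

<p>After the call the client can still ask the stream why reading
stopped: <tt class="computeroutput">in.eof()</tt> is expected to be
true when the input was simply exhausted, and false when an unreadable
item ended the copy early.</p>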
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2></h2>
</div>
<p>The final result, or outline for a result, is considerably more
complicated than the initial small example, but there are many
valuable pedagogical reasons for developing it. In particular, it
should be emphatically taught when not to use code that is error
agnostic.</p>
<p>The fundamental lesson here is that there is a considerable
difference between production code and code for beginning exercises
or prototyping. This is easily spouted as a general principle, but
is difficult to teach effectively. The sample problem here provides
an ideal basis for illustrating this issue systematically and
indicating approaches to dealing with it.</p>
<p>The next most fundamental lesson is to assign responsibilities
appropriately, then to design interfaces that handle the
responsibilities, and finally to allow flexibility by providing
mechanisms to delegate responsibility for policies appropriately.
Here there are separate responsibilities in several places:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>for the input routine, in being complete in some definable
sense,</p>
</li>
<li>
<p>for the client interface, in specifying a request,</p>
</li>
<li>
<p>for a higher level routine, in parameterizing the request
according to design parameters and constraints,</p>
</li>
<li>
<p>for the input data class, in maintaining consistency and
integrity constraints according to class invariants,</p>
</li>
<li>
<p>for consistent error handling and reporting policies at the
application level, and for flexibility for appropriate
interventions by the using client.</p>
</li>
</ul>
</div>
<p>Understanding tradeoffs of where and how to apply generality,
simplicity, ease of use, and allowance for specific conditions is
fundamental. The solution should illustrate use of templates,
constructors, default parameters, and environment variables and
routines (including exception handlers), as appropriate, to design
and apply constraints and policy.</p>
<p>Also fundamental is the realization that error handling is
basic for any significant code that is to actually be employed for
useful purposes. By analogy perhaps, with a numerical analysis
computation, the result is generally not of value, other than as a
guess at usefulness, unless error analysis has been performed to
determine how good the result actually is.</p>
<p>One basic tenet about error handling, that emphatically applies
here, is that applications need to catch all erroneous inputs at
the external interfaces. This can then limit significantly the data
validity testing needed later.</p>
<p>Since the student will undoubtedly be exposed to them, the
lesson might include tradeoffs in various approaches to error
returns through special values, (e.g., <tt class=
"computeroutput">end()</tt>), through pairs, through bit flags,
through status objects, through exceptions, etc. The lesson can
emphasize the dangers, particularly for critical application
interfaces, of starting with more limited approaches that are
inflexible and that do not scale.</p>
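<p>Three of these error-return styles can be put side by side on one
small task. This is a sketch for comparison only; all four function
names are hypothetical, and the choice of -1 as the special value is an
assumption made to show the weakness of that style.</p>

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Shared helper: true if an int could be read from the start of `text`.
bool parseInt(const std::string& text, int& out) {
    std::istringstream in(text);
    int tmp = 0;
    if (in >> tmp) {
        out = tmp;
        return true;
    }
    return false;
}

// Style 1, special value: -1 means failure, so a genuine -1 is ambiguous.
int parseOrSentinel(const std::string& text) {
    int v = 0;
    return parseInt(text, v) ? v : -1;
}

// Style 2, pair: the value travels with an explicit success flag.
std::pair<int, bool> parseWithFlag(const std::string& text) {
    int v = 0;
    const bool ok = parseInt(text, v);
    return std::make_pair(v, ok);
}

// Style 3, exception: failure cannot be ignored, but must be caught somewhere.
int parseOrThrow(const std::string& text) {
    int v = 0;
    if (!parseInt(text, v)) {
        throw std::invalid_argument("not an integer: " + text);
    }
    return v;
}
```

<p>The special-value version returns -1 both for the text
<tt class="computeroutput">oops</tt> and for the text
<tt class="computeroutput">-1</tt>, so the caller cannot tell bad input
from valid data. The pair and exception versions keep the two cases
apart, at the cost of a slightly heavier interface.</p>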
<p>The final result may seem more complex than needed for what
seems like a simple problem, but I would respectfully disagree with
the premise. The problem posed is not trivial; and ignoring basic
issues makes for an incomplete solution, not a simple solution.
Reasonably simple solutions can still be arrived at by dealing with
each issue separately and appropriately.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e271" id=
"d0e271"></a>Techniques</h2>
</div>
<p>The lessons here are general but the implementations, if they
are to be illustrated in C++ code, are admittedly non-trivial. As
examples though, the techniques can be easily taught as idioms, to
be imitated, and these idioms are also useful in many broader
contexts.</p>
<p>From a teaching and learning perspective, there are only two
roads to writing useful code in C++. The first is to understand the
C++ language and library standard, and particular compiler
deviations from it, in detail (not particularly to be recommended).
The second is by extensive reading and following of useful models
(which is what all the worthwhile C++ beginner and intermediate
texts provide). Ideally this is accomplished with a mentor.</p>
<p>The basic techniques here include:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>Basic bit flag masks to indicate status or state; supported by
enums that are powers of two, operations on sets of flags, and by
<tt class="computeroutput">status.isXXX()</tt> type predicates.</p>
</li>
<li>
<p>Rudiments of exception handling.</p>
</li>
<li>
<p>Type generalization through templates, with basic template
specialization.</p>
</li>
<li>
<p>STL iterator concepts, at least at a high level, and their use
in general algorithms such as copy.</p>
<p>In particular, a strong preference, if only for consistency, for
using STL constructs and concepts where appropriate can be
inculcated. For instance, encapsulating iteration (here copy) in a
library routine, rather than using a variety of for, while and do
constructs is worthwhile.</p>
</li>
<li>
<p>Parameterization options through template parameters, <tt class=
"computeroutput">typedef</tt> statements, constructor arguments,
default function parameters and environment support (here, at
least, exception handlers).</p>
</li>
<li>
<p>Testing types and objects for extended interfaces through
compile time (template based) and run time (dynamic cast)
techniques. Here, the solution tries to allow existing data objects
and iterators to be used, but takes advantage of additional
capabilities if provided.</p>
</li>
<li>
<p>And yes, idiosyncrasies of various input mechanisms also can be
explored.</p>
</li>
</ul>
</div>
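<p>The compile-time test for an extended interface can be sketched
with a small detection trait. This is one possible technique (expression
SFINAE, available from C++11), not necessarily the one the outline had
in mind; <tt class="computeroutput">HasIsValid</tt>,
<tt class="computeroutput">passesValidation</tt> and
<tt class="computeroutput">Percent</tt> are hypothetical names.</p>

```cpp
#include <type_traits>
#include <utility>

// Primary template: by default a type is assumed to have no isValid().
template <typename T, typename = void>
struct HasIsValid : std::false_type {};

// Chosen only when `t.isValid()` compiles for a const T.
template <typename T>
struct HasIsValid<T, decltype(void(std::declval<const T&>().isValid()))>
    : std::true_type {};

// Tag dispatch: call isValid() when it exists...
template <typename T>
bool checkImpl(const T& value, std::true_type) { return value.isValid(); }

// ...and otherwise accept the value as it is.
template <typename T>
bool checkImpl(const T&, std::false_type) { return true; }

template <typename T>
bool passesValidation(const T& value) {
    return checkImpl(value, HasIsValid<T>());
}

// Example data type that opts in to validation.
struct Percent {
    int value;
    bool isValid() const { return value >= 0 && value <= 100; }
};
```

<p>Existing types such as <tt class="computeroutput">int</tt> pass
through unchanged, while a type like
<tt class="computeroutput">Percent</tt> gets its invariant checked
without the read routine knowing anything about it in advance.</p>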
<p>Perhaps the final lesson is my perception of C++ as a really
ugly tool for developing beautiful constructs. As one mentor once
said, &quot;You don't ask a cow why it works the way it does, you just
learn to milk it.&quot;</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2></h2>
</div>
<p>The goal of making C++ more accessible to novices is admirable,
but oversimplifying the issues does not appear useful; nor does
dwelling on details of <tt class="computeroutput">std::istream</tt>
to the exclusion of more basic issues.</p>
<p>The discussion above leads to approaches to that goal on two
levels:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>At the client level, the final copy routine is indeed simple,
and can illustrate the power of the tailoring mechanisms to provide
a significant range of underlying functionality including:
comprehensive handling of unusual conditions, full reporting of
error conditions, the ability to adapt to any input source, the
ability to map data from different sources to common types,
scaling, formats and representations, and the ability to filter
extraneous input.</p>
</li>
<li>
<p>At the development level, the analysis of problems and solutions
illustrates both design considerations needed for building code
that can adapt to a broad range of application needs, as well as
coding considerations in the use of C++ facilities for
accomplishing this. This surely is a worthwhile introduction to
what programming is all about.</p>
</li>
</ul>
</div>
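<p>At the client level, filtering and mapping can stay very small.
The following sketch assumes a hypothetical helper named
<tt class="computeroutput">readMapped</tt> built on the standard
<tt class="computeroutput">std::istream_iterator</tt>; the filter and
mapping callables stand in for the interfaces that the outline places
on the iterator itself.</p>

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Read items of type In, drop those rejected by `keep`, and convert the
// remaining ones to type Out with `map`.
template <typename In, typename Out, typename Keep, typename Map>
std::vector<Out> readMapped(std::istream& in, Keep keep, Map map) {
    std::vector<Out> result;
    std::for_each(std::istream_iterator<In>(in), std::istream_iterator<In>(),
                  [&](const In& item) {
                      if (keep(item)) {
                          result.push_back(map(item));
                      }
                  });
    return result;
}
```

<p>For example, a client could read lengths in centimetres, discard
negative readings, and receive metres by calling
<tt class="computeroutput">readMapped&lt;int, double&gt;</tt> with one
lambda that tests <tt class="computeroutput">cm &gt;= 0</tt> and
another that returns <tt class="computeroutput">cm / 100.0</tt>.</p>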
</div>
</p>
</div>
</channel>
</rss>
