    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Questions &amp; Answers</title>
        <link>https://members.accu.org/index.php/journals/742</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 10, #6 - Sep 1998</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Questions &amp; Answers</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Sat, 05 September 1998 13:15:28 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a></h2>
</div>
<p>I have had several phone calls over the last couple of months
relating to using Borland's (sorry, Inprise's) development tools. The
result has been considerable gnashing of teeth and irritation on my
part with the unhelpfulness of some of Inprise's technical support.
I am afraid that I have mislaid the names of the querists.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e24" id="d0e24"></a>A Question
About Templates</h2>
</div>
<p>The original question was about writing templates for
collections of mathematical types (e.g. a matrix that could be
instantiated for all arithmetic types including complex ones). The
gist of the question was whether the compiler should issue error
messages for member functions that could not be compiled for the
current template argument. To focus on the problem I wrote the
following brief program using a couple of very trivial classes.</p>
<pre class="programlisting">
#include &lt;iostream&gt;
using std::cout;
using std::endl;
template&lt;typename T&gt; class X {
  T t;
public:
  void report() const { cout &lt;&lt; t.report(); }
  void report1() const { cout &lt;&lt; t.report1(); }
};
class A {
public:
  char const* report() const { return &quot;report&quot;; }
  char const* report1() const { return &quot;report 1&quot;; }
};
class B {
public:
  // B deliberately has no report1()
  char const* report() const { return &quot;report&quot;; }
};
int main(){
  X&lt;A&gt; a;
  a.report(); cout &lt;&lt; endl; a.report1();
  cout &lt;&lt; endl;
  X&lt;B&gt; b;
  b.report();   // X&lt;B&gt;::report1() is never used
  return 0;
}
</pre>
<p>Borland C++ 5.02 fails to compile this because it tries to
compile an X&lt;B&gt;::report1() even though this function is never
used. I tried this program on three other compilers that I had to
hand. Watcom C++ 11.0 fails in exactly the same way that Borland
C++ 5.02 does. Metrowerks' CodeWarrior and Microsoft Visual C++ 5.0
had no problems. Borland technical support then told me that their
C++ 5.3 compiler had no problem with the code. Now some of you may
want to know where you can buy that compiler. The simplistic answer
is that you cannot. Borland C++ 5.02 is the last of that line. But
I will tackle that when I deal with the next question.</p>
<p>For many years I have reminded people that Watcom had a line of
excellent compilers that were worth looking at. You can imagine my
feelings when they both acknowledged this flaw in their current
product and declined to provide any information on when they might
release a new version with better template support. In my opinion,
any compiler that fails to compile the above code is unusable for
modern C++. The whole of the STL relies on the language's guarantee
that member functions of templates that are not used for a specific
instantiation will not be compiled.</p>
<p>While Microsoft 5.0 handles this code correctly, it has a number
of other problems including its flawed implementation of the vector
template and its lack of support for 16-bit DOS applications.
Borland C++ Builder 3.0 and CodeWarrior Pro 3.0 share that latter
limitation. Hopefully the vector problem will be corrected in the
forthcoming release of VC++ 6.0.</p>
<p>There is a secondary question and that is how you can write
templates that will compile with compilers that refuse to limit
themselves to instantiating only member functions that are used.
There is an ugly workaround: move all the potentially troublesome
member functions out into the enclosing namespace. Be careful, because you
must now explicitly declare them as templates and add an extra
parameter. In the above example we would need something like:</p>
<pre class="programlisting">
template &lt;typename T&gt; class X;
template &lt;typename T&gt;
void report1(X&lt;T&gt; const &amp; xt){
  cout &lt;&lt; xt.t.report1();
}
</pre>
<p>Note that you have to pre-declare the template class so that you
can declare the correct parameter in <tt class=
"function">report1()</tt>. Now you have to break encapsulation by
making <tt class="function">report1()</tt> a <tt class=
"literal">friend</tt> of <tt class="classname">class X</tt>. Worse
still, you have to change the call in <tt class=
"function">main</tt> from <tt class="literal">a.report1()</tt> to
<tt class="literal">report1(a)</tt>. If you have no choice of which
compiler to use you will have to use ugly hacks like this but your
time would be better spent persuading the holder of the purse
strings that you need a better compiler.</p>
<p>You may think that I am being unreasonable in expecting modern
compilers to compile DOS Apps but judging by the number of
telephone and email enquiries I get there is still a need for
these. At current costs there would seem to be little justification
for running less than some form of WIN32 based system, however
there is the matter of stability. If I am running some form of
control system on a dedicated machine I certainly will be more
interested in stability and reliability than in all the latest
features. Such applications aren't at the glamorous end of the
market but they continue to exist and will do so for quite a few
years more. At the same time the writers of such applications would
like to use compilers that support Standard C++ or even just
Standard C. It is becoming increasingly difficult to even buy the
latter. I would be very happy to publish details of currently
available C and/or C++ compilers that support DOS development. If
you know of any, send me a brief write-up for my 'Members'
Experiences' column.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e66" id="d0e66"></a>Can I Use
Borland C++ Builder for Ordinary C++?</h2>
</div>
<p>When this question arose in a telephone call, I simply answered
yes. That seemed obvious to me because there had to be a compiler
in there somewhere. Then I tried it. I found the Console Wizard and
used it to create a suitable starting point. I wrote a simple little
program, then compiled and ran it.</p>
<p>The next thing I tried was adding another file. At this stage I
became completely confused and lost it entirely. I do not believe
that the mechanism for producing multi-file console apps is
anywhere near intuitive. I actually hit upon a workable solution by
pure chance. For the benefit of those interested let me outline
what seems to work for me. First use the console wizard to create a
default project. Now just before <tt class="function">main()</tt>
add the line:</p>
<pre class="programlisting">
int mymain();
</pre>
<p>Then replace the provided <tt class="literal">return 0;</tt>
with <tt class="literal">return mymain();</tt></p>
<p>That sets things up so we can focus on conventional C++
programming. The next problem is that files must be added to the
project in a very specific fashion unless you want to get bogged down in
tedious details. First the files you want to add must already exist
(even if they are empty) and should have a <tt class=
"filename">.cpp</tt> extension. Then you must use the 'add to
project' item in the project drop down menu.</p>
<p>You will need to rename your <tt class="function">main()</tt> as
<tt class="function">mymain()</tt>. Note that this means that it
must have return statements because the C++ special rule for
<tt class="function">main()</tt> will not apply to <tt class=
"function">mymain()</tt> which will behave as an ordinary
function.</p>
<p>I tested this scheme using the Harpist's implementation of my
old Tartan program; the one that uses the 'playpen' window. I
tidied up the problem it had with Microsoft's (MFC, I think)
<tt class="literal">__try</tt> and <tt class=
"literal">__except</tt>. Use of these Microsoft items seems to be
a problem for those porting code to other compilers. Can someone
write up a way of dealing with these without just commenting them
away?</p>
<p>After that it compiled and linked to produce an executable that
behaved as expected.</p>
<p>I would be happy to publish any improvements you have found to
the above strategy. One interesting point is that Borland C++
Builder supports use of <tt class="filename">conio.h</tt> when you
are using console mode.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e119" id="d0e119"></a>Problems with
C++</h2>
</div>
<p>This time I do know who raised the question but because it
results in more than a little implicit criticism of her employer's
source code I am keeping it to myself. The company is one that
employs a number of excellent C programmers who have a good
understanding of how to use C to achieve their objectives. That may
be proving a handicap when they use C++. The following email
arrived when the writer needed to let off steam about the code she
was maintaining. Unfortunately, I think she blames the language
when I think the problems are caused by a failure to understand
that C++ is very different from C.</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>I'm not a fan of C++ anyway but there is one feature that is
particularly annoying. Class declarations go in a single header
file with both public and private members. This header file has to
be included in any component that uses the public members. When the
private members of the class are changed (REM this is the whole
point of object-oriented code) both the header file and the CPP
file for that class are usually affected. This means that every
component that includes the header gets rebuilt unnecessarily.</p>
<p>In the old days when C programmers were in charge of devising
their own object-oriented approach, it was possible to put public
stuff in one header and private stuff in another. Then only the
source file included the private stuff and components were only
rebuilt when actually necessary.</p>
<p>I hate sitting around waiting for a compiler to do things that I
know shouldn't be done just because some moronic humans decided
that other moronic humans couldn't be trusted to program safely in
C. There's always a better C solution to a C++ problem - especially
if one is not a human...</p>
</blockquote>
</div>
<p>Let me start by responding to that parenthetical remark as to
the purpose of OO. Object-orientation is not just writing ADTs.
There is much more to it and data hiding is just one feature that
helps support OO. Encapsulation of functionality is important
because it helps to prevent accidental abuse of functions by
calling them in contexts where they were not intended. Inheritance
is important because it allows programmers to use other people's
work without having problems when they make changes, debug it etc.
Inheritance supports low level reuse that can save a large amount
of maintenance effort. Every time a programmer resorts to cut and
paste s/he is adding another place where errors can lurk
undetected. Virtual member functions support polymorphism and
though you can do this in C it is orders of magnitude harder when
you want to add another variation. In effect to provide all these
features in C you have to hand code what C++ automates. Then there
is the issue of type safe linkage, automatic in C++ but requiring a
level of professional discipline in C that is only exhibited by a
tiny fraction of programmers.</p>
<p>If your work has no need of polymorphism (note that it is
impossible to use OOP without some implementation of dynamic
binding) and handles relatively small quantities of code (C++ was
really designed for work where fifty thousand lines of source is a
small project) then by all means stick with C. But if you elect to
use C++ then learn to write it properly.</p>
<p>The writer of the email wants to hide data and other
implementation details out of sight of the application so that
changes do not force rebuilds. It might have crossed her mind that
a language that is widely used in the telecommunications industry
(where projects of a million lines plus are not unusual) must have
ways of avoiding repeated rebuilds. Of course there is a way, and it is
actually very simple as long as you are not trying to write
hierarchies of polymorphic classes (even that is not difficult, but
needs a little more skill and experience).</p>
<p>The first thing you do is to list all the functionality that a
user of the class will want. You also decide on a name for both the
public type and its private implementation. For example:</p>
<pre class="programlisting">
class hiddenMyType ;
class MyType {
  hiddenMyType * hidden;
public:
  MyType();
  MyType(MyType const &amp;);
  ~MyType();
  MyType &amp; operator=(MyType const &amp;);
  int example();
  // other function declarations
};
</pre>
<p>The only implementation detail that the user can see is that
somewhere there is a class called <tt class=
"classname">hiddenMyType</tt>. All that they need to use this class
is the object code that is produced by compiling its implementation
file. Any changes to the implementation file will require
relinking, but that cannot be avoided: if you change your code, the
new code has to get into the executable somehow.</p>
<p>When you look at the definition of <tt class=
"classname">hiddenMyType</tt> you will find that its 'public'
interface is (barring replacing <tt class="classname">MyType</tt>
by <tt class="classname">hiddenMyType</tt>) identical to that for
<tt class="classname">MyType</tt>. In fact <tt class=
"classname">hiddenMyType</tt> is exactly what you would have
written for <tt class="classname">MyType</tt> had you not been
worried about rebuild times, creating public dependencies on
implementation details etc. This being the case, it is trivial to
convert an overly exposed class into a largely hidden one.</p>
<p>Implementation of <tt class="classname">MyType</tt>
functionality is simple, but let me provide a sample to help you
believe that.</p>
<p>Constructors:</p>
<pre class="programlisting">
MyType::MyType()
          : hidden(new hiddenMyType){}
MyType::MyType(MyType const &amp; mt)
      : hidden(new hiddenMyType(*mt.hidden)){}
</pre>
<p>Destructor:</p>
<pre class="programlisting">
MyType::~MyType(){ delete hidden;}
</pre>
<p>Assignment:</p>
<pre class="programlisting">
MyType &amp; MyType::operator=(
                    MyType const &amp;mt){
  hiddenMyType *temp = new hiddenMyType(*(mt.hidden));
  delete hidden, hidden = temp;
  return *this;
}
</pre>
<p>Other functions:</p>
<pre class="programlisting">
int MyType::example(){
  return hidden-&gt;example();
}
</pre>
<p>Currently you will have to write a slightly different forwarding
function where the return type is void (just leave out 'return').
When compilers catch up you will not even have to do that because
one of the late changes to C++ was to allow you to return an
expression of void type to a void return type.</p>
<p>Now you may notice that this mechanism leans on the use of new
to provide the real objects. If your program creates and destroys
many instances of <tt class="classname">MyType</tt> you might think
that you could not pay the efficiency cost for this mechanism. If
that proves to be the case, a very small change restores maximum
efficiency. All you need to do is to throw away the intermediate
wrapper and rename <tt class="classname">hiddenMyType</tt> as
<tt class="classname">MyType</tt>. This is a mechanism that is not
available in languages such as Java where objects are always
created dynamically. The skilled C++ class designer can even
provide an intermediate 'optimisation' by writing a specific
version of new/delete for <tt class=
"classname">hiddenMyType</tt>.</p>
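<p>Pulling the fragments above together, a minimal single-file sketch of the technique might look like this. The <tt class="varname">value</tt> member and the <tt class="function">set()</tt> function are hypothetical, added only so that there is some state to copy and observe; in practice the two halves would live in a header and an implementation file.</p>

```cpp
// ---- what would go in the header: all that client code ever sees ----
class hiddenMyType;                 // a name only, no layout details

class MyType {
    hiddenMyType* hidden;
public:
    MyType();
    MyType(MyType const&);
    ~MyType();
    MyType& operator=(MyType const&);
    int example();
    void set(int v);                // hypothetical, for illustration
};

// ---- what would go in the implementation file ----
class hiddenMyType {
    int value;                      // hypothetical private data
public:
    hiddenMyType() : value(0) {}
    int example() { return value; }
    void set(int v) { value = v; }
};

MyType::MyType() : hidden(new hiddenMyType) {}

MyType::MyType(MyType const& mt)
    : hidden(new hiddenMyType(*mt.hidden)) {}

MyType::~MyType() { delete hidden; }

MyType& MyType::operator=(MyType const& mt) {
    // Copy before deleting, so self-assignment is harmless.
    hiddenMyType* temp = new hiddenMyType(*mt.hidden);
    delete hidden;
    hidden = temp;
    return *this;
}

int MyType::example() { return hidden->example(); }

// Forwarding a void function: simply leave out 'return'.
void MyType::set(int v) { hidden->set(v); }
```

<p>Changing the data members of <tt class="classname">hiddenMyType</tt> then touches only the implementation file; code that includes the header needs relinking but not recompiling.</p>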
<p>Just as I would not blame C for the crassly stupid source code
some programmers serve up, I would not blame C++ for the inept
source code that many C++ programmers write. Of course an
experienced C programmer with no more than average knowledge of C++
idioms and techniques will have quite a lot of difficulty with
maintaining the kind of C++ that other C programmers have
written.</p>
<p>Finally, I do not believe that any programmer using OO would use
C if given the choice. If you think otherwise I suspect you have
not fully grasped what OOP is about. On the other hand that is not
to say that C is useless. Despite the claims of the enthusiasts,
OOP is not the answer to all programming problems. In fact there
are whole areas of software architecture where using OOP is a
disaster.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e207" id="d0e207"></a>Comparing
doubles from Jon Jagger <tt class="email">&lt;<a href=
"mailto:Jon.Jagger@qatraining.com">Jon.Jagger@qatraining.com</a>&gt;</tt></h2>
</div>
<p>Re C Vu 10.5 p 33 comparing two doubles...</p>
<p>This question is covered in the excellent C FAQs book by Steve
Summit. I quote from p250... &quot;<span class="quote">Since the
absolute accuracy of floating point values varies, by definition,
with their magnitude, the best way of comparing two floating point
values is to use an accuracy threshold that is relative to the
magnitude of the numbers being compared.</span>&quot;</p>
<p>It then suggests...</p>
<div class="literallayout">
<p>     NO    <tt class="literal">if (a == b)</tt><br />
     YES   <tt class="literal">if (fabs(a - b) &lt;=
epsilon * fabs(a))</tt></p>
</div>
<p>Which is what Knuth suggests [4.2.2 pp 217-6]</p>
<p>However there was a thread in comp.lang.c on this (25.1.96) in
which the poster (Charlie Coats) suggested this was flawed if
<tt class="varname">a</tt> equals zero. He suggested that a safer
version is...</p>
<pre class="programlisting">
|a - b| / sqrt(a*a + b*b + epsilon*epsilon) &lt; epsilon
</pre>
<p>and the choice of epsilon*epsilon == 1e-20 gives a colloquial
meaning of:</p>
<p>&quot;<span class="quote">a and b agree to about 10 significant
digits</span>&quot;</p>
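<p>As a rough sketch, the test attributed to Charlie Coats could be wrapped in a function like the one below. The name <tt class="function">nearly_equal</tt> and the default tolerance are assumptions made for this illustration, not something taken from the thread.</p>

```cpp
#include <cmath>

// |a - b| / sqrt(a*a + b*b + epsilon*epsilon) < epsilon
// The epsilon*epsilon term keeps the denominator non-zero when both
// a and b are zero. Note that a*a or b*b can overflow for very large
// inputs; this sketch makes no attempt to guard against that.
inline bool nearly_equal(double a, double b, double epsilon = 1e-10) {
    double const scale = std::sqrt(a * a + b * b + epsilon * epsilon);
    return std::fabs(a - b) / scale < epsilon;
}
```

<p>With the default tolerance, <tt class="literal">nearly_equal(0.0, 0.0)</tt> is true, whereas the simple relative test would have had nothing to scale by.</p>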
<p>Another suggestion (also from C FAQs) ...</p>
<p>&quot;<span class="quote">Doug Gwyn suggests a relative difference
function. It returns the relative difference of two real numbers:
0.0 if they are exactly the same; otherwise the ratio of the
difference to the larger of the two:</span>&quot;</p>
<pre class="programlisting">
#define Abs(x)      ((x) &lt; 0 ? -(x) : (x))
#define Max(a,b)    ((a) &gt; (b) ? (a) : (b))
double RelDif(double a, double b) {
      double c = Abs(a);
      double d = Abs(b);
      d = Max(c,d);
      return d == 0.0 ? 0.0 : Abs(a-b)/d;
}
</pre>
<p>typical use being</p>
<pre class="programlisting">
     if (RelDif(a,b) &lt;= TOLERANCE) ...
</pre>
<p>P.S. Just noticed in the above that d could be zero.</p>
<p class="c2"><span class="remark">(but then <tt class=
"literal">Abs(a-b)/d</tt> will not be evaluated. However there is
an apparently much more serious problem if <tt class=
"varname">a</tt> and <tt class="varname">b</tt> are very large
while <tt class="varname">d</tt> is very small. I think I would
prefer to do this in C++ and be able to catch an exception when
necessary)</span></p>
<p class="c2"><span class="remark">Thanks Jon. I must admit that I
am astounded to see someone of Doug's expertise provide code that
so grossly breaches the conventions re pre-processor
identifiers.</span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e272" id="d0e272"></a>Why Does the
Compiler Insert int? From: David Jepps <tt class=
"email">&lt;<a href=
"mailto:djepps@compuserve.com">djepps@compuserve.com</a>&gt;</tt></h2>
</div>
<div class="blockquote">
<blockquote class="blockquote">
<p>In my testing of some new code against some old code I came
across the following.</p>
<p>When I accidentally defined and declared a function as</p>
<pre class="programlisting">
static Function_name(int parameter) {

    double a_double = 0.00;
     /*code*/
     return a_double;
}
</pre>
<p>You can see what I have missed out? I should have defined and
declared the function as</p>
<pre class="programlisting">
static double Function_name(int parameter).
</pre>
<p>Testing highlighted the problem. When the function should have
been returning 7.35 it returned 7.0. It seemed to round to an
integer.</p>
<p>The compiler gave no warning though I expect that Lint would
have done.</p>
</blockquote>
</div>
<p>Let me start by saying that I am very surprised that lint did
not warn you of the mismatch between the deduced return type and
the actual return type. Perhaps someone had been customising the
copy you were using so that it did not warn about such problems
(good versions of lint such as PCLint allow users to determine what
they want checked, older generic versions of lint have a reputation
for being a pain because warnings cannot be suppressed).</p>
<p>Now to your problem. For some reason that I have never
understood the designers of C, Kernighan and Ritchie, not only
experimented with the syntax of declarations (once described by
Bjarne Stroustrup as an interesting failure that we all continue to
live with. Only the designers of Java know why they elected to
continue the experiment. C++ had to because it needed direct use of
legacy code.) but they also decided to have a default type. This
decision has been the bane of implementors' lives as well as being
the source of innumerable errors in source code. The rule is
simple: if the compiler can deduce that something is a declaration
and the base type is missing it must assume that the missing type
is int. Note that 'must'. The compiler has no room for discretion.
What drives implementors up the wall is that they have to write
very clever parsers that can identify these instances of implicit
<tt class="type">int</tt>.</p>
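<p>The effect David saw can be reproduced with the <tt class="type">int</tt> written out explicitly. The sketch below is in C++, which no longer accepts implicit <tt class="type">int</tt>, so the first function spells out what the C compiler assumed; the function names and the value 7.35 are chosen purely for illustration.</p>

```cpp
// What the C compiler effectively compiled: a missing return type
// is taken to be int, so the double is converted on return.
int function_as_compiled(int /*parameter*/) {
    double a_double = 7.35;
    return static_cast<int>(a_double);   // fractional part is discarded
}

// What was intended: the return type is stated, nothing is lost.
double function_as_intended(int /*parameter*/) {
    double a_double = 7.35;
    return a_double;
}
```

<p>The first version yields 7 and the second 7.35. The conversion from <tt class="type">double</tt> to <tt class="type">int</tt> truncates towards zero rather than rounding, which happens to look like rounding for this particular value.</p>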
<p>What is even sillier is that of all the changes that you could
make that might invalidate legacy code this one must be easy to
fix. The compiler can already deduce the need for an <tt class=
"type">int</tt> so all we need is a special version that generates
new source code with those implicit <tt class="type">int</tt>s made
explicit. However being technically easy to fix does not make it
easy to persuade diehards to change the rule. For years C++
laboured under the increasing burden of supporting implicit
<tt class="type">int</tt> (C++ has potential for far more complex
parsing problems because of such things as templates). Finally it
was reported to the C++ Standards Committees that C9X was going to
grasp the nettle and remove the already deprecated (in other words
C89 had issued a warning that the feature might go) implicit
<tt class="type">int</tt> feature. This information arrived just in
time for the change to be made in C++. I will not say that it was
made with cries of joy but sighs of relief could be heard in some
places.</p>
<p>Ironically, the C Standards Committees had not actually made a
final decision to remove implicit <tt class="type">int</tt> despite
considerable encouragement from those who wanted a safer language.
However when C++ was known to have removed it, the balance swung
sufficiently for C9X to actually remove it (but there is always
time for them to recant and put it back &#9786;).</p>
<p>Alongside implicit <tt class="type">int</tt> we have that other
horror in C (that C++ refused to support from day 1) the implicit
function declaration. Consider:</p>
<pre class="programlisting">
#include &lt;stdio.h&gt;
int main() {
  printf(&quot;%f&quot;, sqrt(7));
  return 0;
}
</pre>
<p>A C++ compiler will decline to compile this on the grounds that
<tt class="function">sqrt()</tt> has not been declared. A C
compiler is required to compile it and deduce a declaration for
<tt class="function">sqrt()</tt>. The deduced declaration will
be:</p>
<pre class="programlisting">
int sqrt(int);
</pre>
<p>The return type is our old friend implicit <tt class=
"type">int</tt>. The parameter type is <tt class="type">int</tt>
because that is the type of 7 (used in the call).</p>
<p>The object code generated by the compiler is now passed to the
linker to produce an executable. When the linker determines that
the user has not provided a function called <tt class=
"function">sqrt</tt> (note that C does not provide type information
to the linker, C++ does) it searches the available libraries to
resolve the problem. It finds sqrt in the maths library and links
that in. Now we have a complete mismatch between the assumptions of
the compiler and the function actually called. One of the major
tasks of lint was to identify such type mismatches, however it
might not help here because lint works with source code and the
library with the conflicting implementation is object code. Of
course good versions of lint will list externals that it has been
unable to check.</p>
<p>As far as I know the C Standards Committees have resisted all
efforts to require functions to be declared before use. They
happily add all sorts of new features, but why bother to break the
bad legacy code of lazy or ignorant programmers when all sane
programmers ensure that all their functions have visible
prototypes? Of course we all sometimes forget to make prototypes
visible &#9786;.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e350" id="d0e350"></a>Real number
comparisons from Colin Hersom <tt class="email">&lt;<a href=
"mailto:colin@hedgehog.cix.co.uk">colin@hedgehog.cix.co.uk</a>&gt;</tt></h2>
</div>
<p>In your piece in C Vu 10.5 on checking for equality of two real
numbers, your solution suggested writing an entirely new function
to perform the test. Now I do not disagree with the solution, but I
wonder whether it would be wise to suggest the production of such a
function to someone who is insufficiently aware of the properties
of floating point numbers and their implementation. There is already a
function available, with behaviour sufficiently well understood
to be used, which can be brought into use in a way that produces
the answer required, not quite as efficiently, maybe, but possibly
more understandably. If we observe that comparing two values is
equivalent to comparing their difference with zero, the comparison
can correctly be made by:</p>
<pre class="programlisting">
  round(x-y, 5) == 0.0
</pre>
<p>Why? Well, when <tt class="literal">x-y</tt> is within 0.00005
of zero, <tt class="function">floor()</tt> will generate a zero,
and if any compiler/machine generated a non-zero value by dividing
zero by a positive number I would be extremely worried. I would
advocate, however, that this calculation be wrapped in a function
(or macro) so that if problems are found then it can be revised
locally.</p>
<p>I am concerned, however, that the wrong problem may have been
solved. You state that the originator worried about comparing 1.41
with 1.4017 to two places, because it compared equal when rounding
before comparison compares unequal. Does the problem domain require
that rounded values be compared, not to compare to a specific
number of places? It is unusual, but it could be right. In this
case, surely the answer is to use some form of fixed point, rather
than floating point, representation. Wrapping up integers in a
structure/class is probably an effective way of doing this, but a
certain degree of knowledge is required to get all the operators
correct. By the way, if I create a class in C++ to do this, I get
a copy constructor and an assignment operator automatically (assuming I
don't declare my own), but I don't get equality operators - why not? Surely if
shallow copying is correct for a class, then shallow comparison is
also correct?</p>
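<p>To make the fixed point idea concrete, here is one possible sketch of integers wrapped in a class. The class name <tt class="classname">Fixed2</tt>, the choice of two decimal places and the use of <tt class="function">std::lround</tt> (which needs a more recent standard library than was available when this letter was written) are all assumptions made for illustration.</p>

```cpp
#include <cmath>

// A value held as a whole number of hundredths, so that comparison
// is exact integer comparison of already-rounded values.
class Fixed2 {
    long hundredths;
public:
    // Round once, on the way in, to the nearest hundredth.
    explicit Fixed2(double v) : hundredths(std::lround(v * 100.0)) {}

    long raw() const { return hundredths; }

    // Equality has to be written by hand; it is not generated.
    friend bool operator==(Fixed2 a, Fixed2 b) {
        return a.hundredths == b.hundredths;
    }
    friend bool operator!=(Fixed2 a, Fixed2 b) {
        return !(a == b);
    }
};
```

<p>Under this scheme 1.41 is stored as 141 and 1.4017 as 140, so the two compare unequal, which is the rounded-before-comparison behaviour the originator appears to have wanted.</p>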
<p>Let us go right back to the beginning. You state: &quot;on rare
occasions this line produces inconsistent results... successive
evaluations of the conditional evaluate differently&quot;. Now I have to
assume that you mean two evaluations with the same input values.
Did you mean that the same program running on the same data
produces different results? Or did you mean that two different
calculations, which happen to produce the same input, give
different comparisons? The former case is worrying - why should a
program behave differently on the same data? The second case
depends on how the inputs to the two comparisons were deemed to be
the same. If you print out the values of <tt class="literal">x
&amp; y</tt> before the comparison, are you getting the full story?
No. The binary/decimal approximations strike again. If the bit
values were displayed, rather than the value converted to decimal,
then I suspect that they would appear subtly different (down in a
low-significance bit). Does the <tt class=
"function">compare_doubles</tt> solve the problem? Well yes and no.
For the particular occasions that were causing a problem
originally, the answers will probably turn out consistently. But
other times, when the difference between <tt class="literal">x
&amp; y</tt> appears to be 0.00001, the same problem will occur,
sometimes evaluating equal, sometimes not.</p>
<p>Colin</p>
<p class="c2"><span class="remark">Thanks for your input. The
problem as presented to me was multi-layered. Your reuse of the
original function is interesting but still leaves a considerable
maintenance task, as substantial changes would have to be made to a
large volume of existing code. The person raising the problem was
responsible neither for any part of the original code nor for the
original design. Perhaps I can
elaborate.</span></p>
<p class="c2"><span class="remark">Apparently information about the
date of a process was stored together with information
about a flow rate. For some bizarre reason processes were not
assigned unique identifiers but it was assumed that two processes
that started on the same date and had the same flow rate (to some
number of decimal places) would be the same process. The comparison
was intended to identify a process from its start date and flow
rate. You may find this a weird design, I certainly do, but the
maintenance programmer was faced with this legacy code to port to
new equipment. The newly compiled code generally worked; however,
test code running exactly the same data produced inconsistent
answers for certain rare data sets. Such inconsistencies might
actually be mathematically correct, even if surprising. To
understand this you need to understand rounding rules where large
data sets are in use. For example, suppose you had a sample of ten
thousand values quoted to one decimal place and they were to be
rounded to whole numbers. If you follow the rule that many learnt
in school, all values ending in .0, .1, .2, .3, .4 would be rounded
down and all others would be rounded up. Let us list those in a table and
see the consequences (we suppose that there are exactly one
thousand instances of each terminal digit).</span></p>
<div class="informaltable">
<table border="1" cellspacing="0">
<colgroup>
<col width="8%" />
<col width="8%" />
<col width="9%" />
<col width="8%" />
<col width="8%" />
<col width="9%" />
<col width="8%" />
<col width="8%" />
<col width="9%" />
<col width="8%" />
<col width="8%" />
<col width="9%" />
</colgroup>
<tbody>
<tr>
<td>terminal digit</td>
<td>.0</td>
<td>.1</td>
<td>.2</td>
<td>.3</td>
<td>.4</td>
<td>.5</td>
<td>.6</td>
<td>.7</td>
<td>.8</td>
<td>.9</td>
<td>total</td>
</tr>
<tr>
<td>rounding error (for 1000 cases)</td>
<td>0</td>
<td>-100</td>
<td>-200</td>
<td>-300</td>
<td>-400</td>
<td>500</td>
<td>400</td>
<td>300</td>
<td>200</td>
<td>100</td>
<td>500</td>
</tr>
</tbody>
</table>
</div>
<p class="c2"><span class="remark">Now you see that the rule that
says round terminal fives up breaks symmetry and introduces a
systematic error. The simplest rule to correct for this is to
alternately round five up and down. As this is a deterministic
algorithm you could computerise it by providing your rounding
function with a memory (via a <tt class="literal">static</tt>
variable). Doing so would break the above design even though it
would be mathematically superior and much preferred by
statisticians.</span></p>
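<p class="c2"><span class="remark">The two rounding rules can be sketched as below. This is only an illustration under stated assumptions: values are held as whole numbers of tenths (so 125 means 12.5) to keep the arithmetic exact, and the names <tt class="function">round_half_up</tt>, <tt class="function">round_half_alternate</tt> and <tt class="function">total_error_tenths</tt> are hypothetical.</span></p>

```c
#include <assert.h>

/* Round a non-negative value given in tenths (125 means 12.5)
   to a whole number, always rounding a terminal .5 up. */
long round_half_up(long tenths)
{
    assert(tenths >= 0);
    return (tenths + 5) / 10;
}

/* As above, but a terminal .5 is rounded up and down alternately.
   The static variable is the "memory" of the previous decision. */
long round_half_alternate(long tenths)
{
    static int round_up_next = 1;
    long whole, digit;

    assert(tenths >= 0);
    whole = tenths / 10;
    digit = tenths % 10;
    if (digit == 5) {
        int up = round_up_next;
        round_up_next = !round_up_next;   /* flip for the next .5 */
        return up ? whole + 1 : whole;
    }
    return digit > 5 ? whole + 1 : whole;
}

/* Total rounding error, in tenths, over `count` samples of each
   terminal digit 0..9, using the rounding rule passed in. */
long total_error_tenths(long (*round_fn)(long), long count)
{
    long total = 0, digit, i;

    for (digit = 0; digit < 10; digit++) {
        for (i = 0; i < count; i++) {
            long tenths = 10 * i + digit;          /* the value i.digit */
            total += 10 * round_fn(tenths) - tenths;
        }
    }
    return total;
}
```

<p class="c2"><span class="remark">With one thousand samples per terminal digit, <tt class="function">total_error_tenths(round_half_up, 1000)</tt> gives 5000 tenths, which is the 500 in the table, while the alternating rule gives zero. Note that two successive calls of <tt class="function">round_half_alternate(25)</tt> return different answers, which is exactly what would upset a design that relies on identical inputs comparing equal.</span></p>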
<p class="c2"><span class="remark">Again, I agree that making the
design work really needs fixed-point arithmetic. That also means
making substantial changes to existing code.</span></p>
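<p class="c2"><span class="remark">A fixed-point version of the comparison might look something like the sketch below. Everything here is an assumption made for illustration: the scale of five decimal places, the day-number date, and the names <tt class="literal">FLOW_SCALE</tt>, <tt class="literal">process_key</tt>, <tt class="function">flow_from_parts</tt> and <tt class="function">same_process</tt> are hypothetical and are not taken from the code under discussion.</span></p>

```c
#include <assert.h>

/* Hypothetical scale: flow rates are held as whole counts of 0.00001. */
#define FLOW_SCALE 100000L

typedef struct {
    long date;        /* hypothetical: start date as a day number */
    long flow_units;  /* flow rate in units of 1/FLOW_SCALE */
} process_key;

/* Build a fixed-point flow rate from whole and fractional parts,
   e.g. flow_from_parts(12, 34567) represents 12.34567. */
long flow_from_parts(long whole, long fraction)
{
    assert(whole >= 0 && fraction >= 0 && fraction < FLOW_SCALE);
    return whole * FLOW_SCALE + fraction;
}

/* Exact integer comparison: there is no tolerance, so the answer cannot
   depend on how a floating-point value happened to be computed. */
int same_process(process_key a, process_key b)
{
    return a.date == b.date && a.flow_units == b.flow_units;
}
```

<p class="c2"><span class="remark">Because sums of such values are computed in integers, 0.1 plus 0.2 is exactly 0.3 in this representation, which is the property the original design silently assumed.</span></p>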
<p class="c2"><span class="remark">Now let me answer your aside.
The only reason that C++ provides copy construction and copy assignment by
default is compatibility with existing C code. C <tt class=
"literal">struct</tt>s can be both copied and assigned. In C these
are bitwise operations and were so in the early versions of C++.
Clearly this was an error and so C++ fairly soon provided
memberwise copying. Where there are no user-defined copy constructors
the two copying algorithms produce the same result. Default
constructors and destructors are provided for the same reasons. Had
there not been a compatibility problem there would have been no
compiler-generated functions. There are good reasons for only using
compiler-generated functions for pure value (attribute) types, and
even then you might be well advised to provide them yourself. Pure
object types should not support public copying (though a clone
function makes sense). Once you have declared a private or
protected copy constructor you will need to declare a public
default constructor if you want default objects, and you will need to
declare the destructor virtual if your objects may have
polymorphic varieties.</span></p>
<p class="c2"><span class="remark">To return to the problem. It was
one that many of us have to deal with: how to improve code that has
been written for a seriously flawed design when there is neither
time nor other resources to do it over. Try explaining the above
mathematical subtleties to management. What do you do when
apparently rare problems require a major effort to fix? You know,
like the once in a millennium problem that we just happen to have
approaching.</span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e459" id="d0e459"></a>More On
Random Number Generators from Graham Jones <tt class=
"email">&lt;<a href=
"mailto:graham@balnakeil.demon.co.uk">graham@balnakeil.demon.co.uk</a>&gt;</tt></h2>
</div>
<p>Random numbers again. I felt very unfairly criticised by your
comments in C Vu 10.5. My <tt class="function">uniformrand()</tt>
was never intended to be any kind of attempt to improve a poor RNG.
Indeed, the latter part of my article (after the &quot;Quality of your
<tt class="function">rand()</tt>&quot; section) was based on the
assumption that the RNG in use was not the source of any
distribution problems.</p>
<p>The problem with your <tt class="function">uniformrand()</tt>
(and Stewart Brodie's) is that if <tt class=
"constant">RAND_MAX</tt> is small, say 32767, the numbers returned
will be rather spaced out. It will never generate any number
between .00001 and .00003. After just 250 calls, it will, more
likely than not, have returned the same number twice. For the kind
of uses that I have for a function such as <tt class=
"function">uniformrand()</tt> these are gross defects. They are far
more important than the deficiencies caused by even a rather poor
<tt class="function">rand()</tt>. My <tt class=
"function">uniformrand()</tt> was an attempt to &quot;fill in the holes&quot;
to the extent possible given the precision of doubles.</p>
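<p>The figure of 250 calls can be checked with the usual "birthday" calculation. The sketch below assumes that each call returns one of 32768 equally likely values, independently of the others; the name <tt class="function">repeat_probability</tt> is hypothetical.</p>

```c
#include <assert.h>

/* Probability that n independent draws from `values` equally likely
   outcomes contain at least one repeat (the "birthday" calculation). */
double repeat_probability(int n, double values)
{
    double no_repeat = 1.0;
    int k;

    assert(n >= 0 && values >= 1.0);
    for (k = 0; k < n; k++) {
        if (k >= values)
            return 1.0;                      /* more draws than outcomes */
        no_repeat *= (values - k) / values;  /* draw k misses the k earlier ones */
    }
    return 1.0 - no_repeat;
}
```

<p>For 250 draws from 32768 values the result comes out above one half, consistent with the claim. The spacing claim follows from the same assumption: adjacent results differ by about 1/32768, which is slightly more than 0.00003.</p>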
<p>How on Earth did you come to the conclusion that I was trying to
repair a bad RNG? How ever did you manage to explain the connection
between <tt class="constant">DBL_EPSILON</tt> and the number of
times <tt class="function">rand()</tt> ought to be called to reduce
dependence between successive values?</p>
<p>You also criticised my other code. I think you are wrong, but if
you're right, you need to explain more than you have so far.
Suppose I have a high quality <tt class="function">rand()</tt> with
<tt class="constant">RAND_MAX</tt> equal to 32767. Now I want to
generate evenly distributed integers in the range 1 to 20000, i.e.,
the kind of problem Silas originally posed. According to you I
should abandon my RNG and get another one. As far as I can see, one
call to <tt class="function">rand()</tt> produces 15 (nearly)
random bits, two calls produce 30, three 45, and so on. This is the
idea my code is based on, and I'd like to know what's wrong with
it. Obviously there will be particular situations where you would
be better off obtaining another RNG, but what is wrong with the
general principle? It's not enough to say I am building on a
&quot;possibly unreliable base&quot;, since the potential replacement is
&quot;possibly unreliable&quot; too - indeed, it may be worse. On re-reading
my article, I noticed there is a real error in my code, but oddly,
no-one has commented on that.</p>
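<p>The general principle described here, taking 15 bits per call and combining calls to obtain more, might be sketched as follows. This is not the code from the original article, which is not reproduced in this letter; the names <tt class="function">rand15</tt>, <tt class="function">rand30</tt> and <tt class="function">rand_in_range</tt> are hypothetical, and masking the low 15 bits assumes that those bits are of usable quality, which is not true of every <tt class="function">rand()</tt>.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Take 15 bits from one rand() call. The C standard guarantees
   RAND_MAX >= 32767; that the low 15 bits are good is an assumption. */
unsigned long rand15(void)
{
    return (unsigned long)rand() & 0x7FFFUL;
}

/* Two calls give 30 bits, i.e. a value in 0 .. 2^30 - 1. */
unsigned long rand30(void)
{
    unsigned long high = rand15();

    return (high << 15) | rand15();
}

/* Integer in lo..hi inclusive. Values at or above the largest multiple
   of the span are rejected, so every result is backed by the same
   number of 30-bit inputs. */
long rand_in_range(long lo, long hi)
{
    unsigned long span, limit, r;

    assert(lo <= hi && (unsigned long)(hi - lo) < (1UL << 30));
    span = (unsigned long)(hi - lo) + 1;
    limit = (1UL << 30) - (1UL << 30) % span;
    do {
        r = rand30();
    } while (r >= limit);
    return lo + (long)(r % span);
}
```

<p>A call such as <tt class="function">rand_in_range(1, 20000)</tt> then corresponds to the problem posed above. The rejection loop only removes the bias from the remainder operation; it says nothing about the quality or cycle length of the underlying generator.</p>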
<p class="c2"><span class="remark">Thanks for patiently explaining
your objective. Obviously I was deeply confused by your code
because I <span class="bold"><b>knew</b></span> that it would not
generate more values, just different ones. That is why I jumped to
the erroneous conclusion that you were trying to flatten the
distribution produced by a bad RNG. I was guilty of being
insufficiently detailed in what I said. You cannot increase the
number of values generated though you can change those values in a
variety of different ways. It is important not to confuse the
number of bits used to provide a value with the 'fineness' of the
distribution.</span></p>
<p class="c2"><span class="remark">The fundamental flaw in using
<tt class="function">rand()</tt> or any other algorithmic
pseudo-random number generator based on a single seed is that a good
one will have a cycle length equal to <tt class=
"constant">RAND_MAX+1</tt>. In other words it can never generate a
repeat until all other values have been generated. Composing random
values through several calls to an RNG does not solve that problem,
nor will it add more values to the set of possible ones. Actually,
if the number of calls you make per value is not coprime with the
cycle length of your RNG you will reduce the cycle length of
possible values. The maximum number of distinct values that you can
generate from a PRNG is the maximum cycle length. For single seed
generators that is normally no larger than <tt class=
"constant">RAND_MAX+1</tt>. It is possible to have much longer
cycles if the PRNG uses a wider internal range than that exposed by
the return values. It is also possible to have far more complicated
cycles of values if you use a PRNG that has two parameters
controlling returned values.</span></p>
<p class="c2"><span class="remark">If you need an RNG that can
return any number within a range with equal probability at every
call you cannot use an algorithm-based PRNG. For such purposes
you must use some truly random mechanism. You can sometimes get
acceptable performance if the cycle length of your PRNG is several
orders of magnitude larger than the range of values you want to
use.</span></p>
<p class="c2"><span class="remark">Let me give a small example.
Suppose that you have a pseudo-random number generator that
generates single digits and that the cycle is:
5-&gt;7-&gt;2-&gt;0-&gt;9-&gt;3-&gt;4-&gt;1-&gt;8-&gt;6-&gt;5. Now
you use it to generate two digit values by successive calls to it.
The result will be (depending on where you start in the cycle) one
of two five number cycles:
57-&gt;20-&gt;93-&gt;41-&gt;86-&gt;57&hellip; or
72-&gt;09-&gt;34-&gt;18-&gt;65-&gt;72&hellip; Generating 3-digit values
will be slightly better in the sense that you will be back to a
single cycle of ten values:
572-&gt;093-&gt;418-&gt;657-&gt;209-&gt;341-&gt;865-&gt;720-&gt;934-&gt;186-&gt;572&hellip;</span></p>
<p class="c2"><span class="remark">Where is the improved
uniformity? That is right, it was an illusion.</span></p>
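<p class="c2"><span class="remark">The small example above can be reproduced with a few lines of code. This is a sketch for checking the cycle lengths only; the names <tt class="literal">digit_cycle</tt>, <tt class="function">composed_cycle_length</tt> and <tt class="function">composed_value</tt> are hypothetical.</span></p>

```c
#include <assert.h>

/* The ten-step digit cycle from the example. */
static const int digit_cycle[10] = {5, 7, 2, 0, 9, 3, 4, 1, 8, 6};

/* Each composed value consumes `calls` digits of the cycle. Count how
   many values are produced, starting at cycle position `start`,
   before the generator is back at that position. */
int composed_cycle_length(int calls, int start)
{
    int pos, count = 0;

    assert(calls > 0 && start >= 0 && start < 10);
    pos = start;
    do {
        pos = (pos + calls) % 10;
        count++;
    } while (pos != start);
    return count;
}

/* The value composed from `calls` successive digits read at `pos`. */
long composed_value(int calls, int pos)
{
    long value = 0;
    int i;

    assert(calls > 0 && pos >= 0);
    for (i = 0; i < calls; i++)
        value = 10 * value + digit_cycle[(pos + i) % 10];
    return value;
}
```

<p class="c2"><span class="remark">Two calls per value give a cycle of five from either starting point, and three calls give a cycle of ten, as listed above. In no case are there more than ten distinct values, however many digits each one has.</span></p>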
<p class="c2"><span class="remark">If those that have got this far
understand the subtle viciousness of the pseudo part of PRNGs they
will be able to identify the flaw in the following seductive piece
of code:</span></p>
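<pre class="programlisting">
#include &lt;stdlib.h&gt;   /* rand() */

/* Build a value between 0 and 1 one decimal digit at a time. */
double random_value(void){
  int i;
  double value=0;
  for(i=0; i&lt;15; i++){
    value += (rand() % 10);   /* next pseudo-random digit, 0..9 */
    value /= 10;              /* shift all digits one place to the right */
  }
  return value;
}
</pre>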
<p class="c2"><span class="remark">All it does is to generate
fifteen significant figures in succession. But how many different
values can it generate?</span></p>
</div>
</p>
</div>
</channel>
</rss>
