    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Taking A Chance</title>
        <link>https://members.accu.org/index.php/articles/730</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 8, #2 - Apr 1996</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c136/">082</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Taking A Chance</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 06 April 1996 13:15:27 +01:00 or Sat, 06 April 1996 13:15:27 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a>Bottom Slice by
Francis</h2>
</div>
<p>In the five years that I have edited CVu no previous item has
ever caused such a large response. Clearly ACCU members have been
infected with lottery fever along with the rest of the population.
What am I referring to? That little item in the last issue of CVu
about a broken program to calculate lottery entries.</p>
<p>I am going to provide the bread to make up a sandwich while
letting The Harpist provide the filling. I hope you will find the
result helpful. I hope the experts among you will take the time to
read this item through to the end. This is for two reasons. The
first is that I would like your feedback on the material. The
second is that I suspect that you may learn something - I certainly
did.</p>
<p>I would very much like the rest of you to hunt out your own
questions and problems, because if they trigger even a small
fraction of what this one has then all of us will become better
programmers.</p>
<p>Before I deal with the actual bugs in the program let me
introduce you to your English dictionary. Go and find it and look
up the word meretricious and you will know why all your meritorious
efforts have gone unrewarded. OK, it was a nasty trick to play and
as a result it has produced a single indisputable winner, the only
competitor to identify what metric was being used even if there was
rather too much merit in his response. Was there a point to this?
Well I wanted to remind everyone that you should look up what you
do not know in the appropriate reference book, not just guess at a
suitable meaning from context. Anyone who knew the meaning of
'meretricious' but assumed that I had confused it with
'meritorious' should remember that I was once a teacher and words
are the tools of my trade.</p>
<p>No, I am not telling you what it means - look it up.</p>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e33" id="d0e33"></a>The Bugs</h2>
</div>
<p>Everyone found the major bug - the under dimensioned array of
five ints to store six results. Several speculated as to why it bit
in the way it did. A single member not only explained it but also
proposed a change to the program to demonstrate the problem
consistently on all compilers. I would have felt very guilty about
not giving him the prize had it not been that his own program would
have won a prize for the most obfuscated algorithm, and I am still
not sure what it does in all circumstances.</p>
<p>To save you hunting for the last issue, here is the problem code
again.</p>
<pre class="programlisting">
#include &lt;stdio.h&gt; #include &lt;time.h&gt; int main() { int loop=0, count=0; int num[5]; int test = 0; srand((unsigned)time(NULL)); num[loop] = (1+rand()%49); printf(&quot;Random number no %d is %3d\n&quot;,loop+1,num[loop]); loop++; while (loop &lt;= 5 ) { count = 0; test = 0; num[loop] = (1+rand()%49); while (count &lt; loop) { if(num[loop] == num[count]){ test = 1; count = loop; } else count++; } if(test==0) { NULL printf(&quot;Rand number no %d is %3d\n&quot;, loop+1, num[loop]); loop++; } } return 0; }
</pre>
<p>The significant feature is the optimiser for Turbo C will almost
certainly put <tt class="literal">count</tt> into a register (as
well as <tt class="literal">test</tt> and <tt class=
"literal">loop</tt>). Unoptimised code, and presumably this was the
situation with GNU C, may have locate <tt class=
"literal">count</tt> contiguous to the array <tt class=
"literal">num</tt>. In such circumstances count may coincide with
the notional location of <tt class="literal">num[5]</tt>. This will
not matter until you generate the sixth value at which point one of
two things happens, the sixth value is less than six and gets
incremented by each pass of the inner <tt class=
"literal">while</tt> loop. Or it is greater than five and the
<tt class="literal">while</tt> loop terminates immediately.
Sometimes this value will be the same as one of the first five.</p>
<p>If you want to demonstrate this with your current compiler of
choice, correctly declare <tt class="literal">num</tt> as an array
of six <tt class="literal">int</tt>s and then use <tt class=
"literal">num[5]</tt> instead of <tt class="literal">count</tt>
throughout the rest of the program.</p>
<p>So that is the obvious bug, and the explanation for its working.
There is another one lurking that most of you missed (and that
includes some very experienced professional programmers - no names,
you know who you are). The writer clearly intends to include the
correct header files, so where is <tt class=
"literal">stdlib.h</tt>? Without it the compiler makes a best guess
at the parameters and return values for <tt class=
"literal">rand()</tt> and <tt class="literal">srand()</tt>. It will
get those for <tt class="literal">srand()</tt> wrong as it will set
up code to pass an <tt class="literal">int</tt> and return an
<tt class="literal">int</tt>. The code the linker will find in the
library will expect an <tt class="literal">unsigned int</tt> and
return a <tt class="literal">void</tt>. Did I hear a mutter of
'Well, what does it matter? The code will still work.' That kind of
attitude won't cut it. Of course a C++ compiler will spot the
problem, and so will the kind of type-safe linker that comes with
Clarion's compilers.</p>
<p>A couple of people wanted to grumble about the '<tt class=
"literal">return 0</tt>' as they wanted it replaced with
'<tt class="literal">return EXIT_SUCCESS</tt>'. This is an example
of a magic number (the '<tt class="literal">0</tt>') that most
experienced programmers are quite happy with. What about the other
magic numbers? The '<tt class="literal">49</tt>' and the
'<tt class="literal">5</tt>'. May be the programmer would have got
it right if he had used manifest constants, probably not but ...
There are also some nasty problems lurking for the C++ programmer
who uses <tt class="literal">NULL</tt>. The consensus of opinion
these days is to simply use <tt class="literal">0</tt> when you
want a null pointer and avoid the use of <tt class=
"literal">NULL</tt>.</p>
<p>I think all the points in the last paragraph are relatively
small and are the kind of thing that some programmers will waste
endless hours arguing about.</p>
<p>The final point I have on the code itself is that I am less than
happy with it being written as one single monolithic function, a
feeling shared by most of those writing about it. Remember the
Harpist's guiding rule, 'A function should (conceptually) only do
one thing.' The other 'rule of thumb' that has been broken is 'If
you find yourself writing something twice, it should have been done
once as a function.'</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e138" id="d0e138"></a>Deeper
Problems</h2>
</div>
<p>I am going to leave The Harpist to write about many of the
problems associated with using rand() but there is one that needs
some explanation now (The Harpist missed the problem when we were
brainstorming on the subject). This is the problem of ensuring that
your algorithm will terminate. Let me use a simpler problem.
Suppose that you are using a (pseudo-)random number generator to
simulate tossing a pair of coins. The core code might look
something like this:</p>
<pre class="programlisting">
enum Pair {TT, TH, HT, HH}; enum Pair result toss( void ){ int coin1 = rand()% 2; int coin2 = rand()% 2; return (coin1 + 2 * coin2) }
</pre>
<p>What are the possible return values? Well you might reasonably
expect a fairly equal distribution of 0 (TT), 1 (TH), 2 (HT) and 3
(HH). Most current implementations of <tt class=
"literal">rand()</tt> will generate that kind of result but there
is at least one infamous implementation that generates alternate
odd and even values with the result that the above code can only
return 1 or 2. This particular problem is so blatant that the poor
quality of the implementation is quickly noticed. However despite
all expectations to the contrary even worse implementations are (in
my opinion - if you can quote text to the contrary please send it
in) conforming. As the ISO C Standard does not define a
pseudo-random number (a pretty difficult thing to do anyway) the
following seems to be conforming:</p>
<pre class="programlisting">
static unsigned int next =1; void srand(unsigned int seed) { next = seed;} int rand () { return abs(next); }
</pre>
<p>Yes, I know that this is plain crazy but it is only an extreme.
How good is the implementation of rand() that your Standard C
Library provides? Under what circumstances will it break your
program because it fails to meet your implicit expectations?</p>
<p>This is a quality of implementation issue, but most programmers
just assume that the implementation they are using is good enough.
One odd aspect of <tt class="literal">rand()</tt> is that it is
explicitly prohibited from generating genuine random numbers
because setting the seed requires that the subsequent generated
sequence is fully determined. Even not calling <tt class=
"literal">srand()</tt> does not help because that is defined as
equivalent to calling <tt class="literal">srand(1)</tt>. I'll leave
The Harpist to explore the consequences of this in the current
context.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e165" id="d0e165"></a>Alternative
Algorithms</h2>
</div>
<p>The first criterion for acceptance is that whatever algorithm
you use should demonstrably terminate in a predictable time. My
instinctive choice was to apply a standard shuffle routine. The
heart of this is a small function that exchanges the last item in
the list for a randomly chosen item (possibly itself).</p>
<pre class="programlisting">
void choose_last(int array[], int entries){ int item = rand()%entries; int temp = array[item]; array[item]=array[entries-1]; array[entries-1]=temp; return; }
</pre>
<p>Now we can use this function to do a complete shuffle of an
array:</p>
<pre class="programlisting">
void shuffle(int array[], int size) { int last; for(last=size; last&gt;2; last--) choose_last(array, last); return; }
</pre>
<p>Set up an array of 49 <tt class="literal">int</tt>s and
initialise it to contain each of the integer values 1-49 (how you
do this is irrelevant, the obvious iteration of <tt class=
"literal">numbers[i] = i+1;</tt> will do fine), call <tt class=
"literal">shuffle()</tt> with the array name and 49, now use the
entries in the first six (last six or whatever rule takes your
fancy) as your selection.</p>
<p>This was such an obvious solution for me that I was amazed to be
offered an entirely different one, which I had never consciously
seen before but is perfectly reasonable. The basis of this solution
was to apply a 'weight' to each entry. First we need a struct to
provide pairs of values and weights:</p>
<pre class="programlisting">
typedef struct { int value; int wt; } weighted_value;
</pre>
<p>And next an ordering function:</p>
<pre class="programlisting">
int compare(weighted_value i, weighted_value j){return (i.wt-j.wt);}
</pre>
<p>And finally we can write our shuffle function:</p>
<pre class="programlisting">
void shuffle(weighted_value array[], int size){ int count; for(count=0; count&lt;size; count++) array[count].wt = rand(); qsort(array, size, sizeof(weighted_value), compare); return; }
</pre>
<p>You may have to play around with some nasty casts to pass that
function address to qsort() as many compilers are very picky. The
correct cast is:</p>
<pre class="programlisting">
(int (*)(const void *, const void *))
</pre>
<p>Many books never bother with detailing this because they are
still thinking in terms of writing code without prototypes. In such
cases, what the compiler usually does works. Once you have
prototypes around the compiler can see that the type of the
function address being passed does not match the type of the
function pointer declared as the fourth parameter of <tt class=
"literal">qsort()</tt>. I've known quite a few programmers abandon
<tt class="literal">qsort()</tt> because they could not work out
the correct cast - function types are not the easiest thing to
declare. This shuffle is used very like the previous one.</p>
<p>Any other shuffling algorithms? Perhaps I should warn you that
many of the ones people invent for themselves are badly biased.
Don't let that deter you because we can all learn from broken
solutions as well as working ones.</p>
<p>Time to hand over to The Harpist. I'll be back when he has
finished to give you the prize winner as well as my solution to
writing a lottery selection generator.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e215" id="d0e215"></a>The Filling -
A Fair Chance by The Harpist</h2>
</div>
<p>We all have a sense of what we mean by random. We have some very
firm expectations. The trouble is there are a surprising number of
tarpits for even the experts (I'll not be surprised if I manage to
fall into a couple).</p>
<p>Let me give you a small example. You toss a coin 20 times and it
lands heads every time. You are offered a bet on the next toss. Do
you back heads or tails? Those who know nothing about coins and
probability usually want to back heads again. Those with
superficial knowledge back tails because they reckon it is about
time that tails came up to start balancing the results. The
statisticians go to great pains to explain that the past history of
results is irrelevant (the coin lacks memory and self-awareness) so
the chance is 50-50 either way. So what is your answer?</p>
<p>Notice that all you know is that you tossed the coin, you know
nothing about the 'fairness' of it. The only evidence you have for
this coin is that it has shown a very strong tendency to land
heads. If you look at the coin you might even find that it was
two-headed. We know that biased coins exist, even after checking
that it is not double-headed we still do not know if this coin is
unbiased. Indeed all the evidence we have suggests that it is
biased towards landing heads. We should always use the best
evidence available. Before we toss the coin at all, that is our
knowledge of the general behaviour of coins. Once we have tossed it
a few times we have should take into account the specific behaviour
of this coin. So back it to land heads again. If it isn't biased
you still have your 50-50 chance, but if it is biased it is almost
certainly biased towards heads.</p>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e224" id="d0e224"></a>What Is
Random?</h2>
</div>
<p>Well what is your answer? I think we generally mean that all
potential selections have an equal chance. You can use a single
six-sided die to select one of six items. But notice that using two
standard dice to generate totals between 2 and 12 does not fill
that definition of randomness - there is a strong bias towards the
middle numbers. In other words you can use a die to generate a
random number but you cannot use one to generate a random total. I
know that this is simplistic, but I am not writing an in depth
article on Statistics (something that Francis maintains is one of
the worst understood items in common use - he used to have endless
battles with other maths teachers and examiners who got their
answers wrong.)</p>
<p>Now what are we intending to choose at random when we select
lottery numbers? Surely not the individual numbers. Think that
through. Any single number is not much use to us. To win the big
one we need to select a specific six numbers. The selection
mechanism used by the lottery is designed to ensure that all the
possible selections are equally likely. If you want to work out how
many different selections are possible evaluate:</p>
<pre class="programlisting">
(49 x 48 x 47 x 46 x 45 x 44) / (6 x 5 x 4 x 3 x 2)
</pre>
<p>This is 13983816, or almost 14 million. Keep that number in
mind.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e235" id="d0e235"></a>Pseudo-Random
Number Generators</h2>
</div>
<p>It is very difficult to define one of these. What we want is a
deterministic mechanism that will select numbers from a particular
range without any apparent relationship between one number and the
next choice. Of course there is a connection or else the mechanism
would not be deterministic. In a sense, the one thing that a
pseudo-random number generator cannot do is generate random
numbers. What we make do with is some form of randomising the
starting point for a function that generates a sequence of numbers
that does not appear to follow any well defined pattern.</p>
<p>Note very carefully that the only function that can possibly use
a random number (which must be supplied by some outside device) is
<tt class="literal">srand()</tt>. Unless you call srand() again,
the sequence generated by a particular implementation of <tt class=
"literal">rand()</tt> is fully determined, and each time you run
your program, supplying the same seed value to <tt class=
"literal">srand()</tt> you will get the same results. This means
that you cannot get more results from your program than there are
starting seeds. (We will see in a moment that this isn't quite
true)</p>
<p>How many starting points does C allow you? Look in <tt class=
"literal">limits.h</tt>. Don't make Francis' mistake and look at
<tt class="literal">RAND_MAX</tt> (found in <tt class=
"literal">stdlib.h</tt>). The critical value is <tt class=
"literal">UINT_MAX</tt> because <tt class="literal">srand()</tt>
takes an <tt class="literal">unsigned int</tt> argument. That gives
you the maximum number of results that your program can generate if
<tt class="literal">srand()</tt> is only called once (calling it
several times only helps if you can provide some external mechanism
to provide a random seed. The system clock won't cut it)</p>
<p>Now on the 16-bit systems most of you are using <tt class=
"literal">UINT_MAX</tt> is usually 65535. That is a long way from
14 million. In other words the kind of program that most write to
generate lottery selections are biased. We could actually generate
all the possible solutions by iterating our program for all the
65536 (including 0) seeds available. Assuming that you have a good
quality pseudo-random number generator and a 32-bit <tt class=
"literal">unsigned int</tt> you have a reasonable chance that your
program can generate all selections with roughly equal
probability.</p>
<p>This twin-problem of good quality pseudo-random number
generators with a sufficient number of seeds to ensure that all
selections are possible is not just an academic quibble. There are
already several commercial programs on the market. If one of these
suffers from the limited seed problem (more than likely) then you
may find that you are sharing your 'big one' with a large number of
other people.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e284" id="d0e284"></a>Cycles</h2>
</div>
<p>As well as the number of different starting points (seeds) for a
pseudo-random number there is the issue of how long the cycle of
values is before it repeats. This depends on the algorithm you are
using. Some algorithms start with lead ins of a hundred or so
values that are dependant on the initial seed before rapidly
entering a short cycle of a dozen or fewer values. If you are
testing a pseudo-random number generator it is well worth
generating a few thousand values which are discarded, before
generating the actual test values. This will help weed out these
poor quality algorithms.</p>
<p>Many algorithms only cycle through a subset of the possible
values, do not confuse this with the length of the cycle itself.
While it is true that many of the simpler pseudo-random number
generators do not have values repeat in a cycle this does not have
to be true. The reason why some people think this is an essential
feature of cycles is that they are thinking of pseudo-random number
generators that reseed themselves from the last value
generated.</p>
<p>Let me give you a simple example (not very good as a
pseudo-random number generator, but it does generate a cycle with
repeated values.</p>
<pre class="programlisting">
static unsigned int next =1; void srand(unsigned int seed) { next = seed;} int rand(){ next++; return (next * next) % 10; }
</pre>
<p>This generates the cycle 0-1-4-9-6-5-6-4-9-1-0-1-4...</p>
<p>While the common problem is short cycles it is worth remembering
that the best pseudo-random number generators can have very long
cycles, many times longer than <tt class="literal">RAND_MAX</tt>.
In other words there may be a much richer texture of patterns
available if only we were not limited by <tt class=
"literal">srand()</tt> taking an <tt class="literal">unsigned
int</tt> argument. Is there anyway to capitalise on these longer
cycles?</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e308" id=
"d0e308"></a>Persistence</h2>
</div>
<p>This term refers to data that survives the execution of a
program so that it can be used again. Sadly the C Standard makes no
provision for a persistent pseudo-random number generator. Any
reasonable pseudo-random number generator will have some internal
state variables that control the next value generated. There is no
way of accessing these (well a kind implementor could provide you
the information, but I do not know of one that does). The best you
can do is to store the starting seed and the running total of the
number of calls to rand(). At the restart of your program you would
have to seed the pseudo-random number generator and then iterate
through all the previous calls to <tt class="literal">rand()</tt>
(discarding the return values) to get back to the place in the
sequence that you last used. The obvious problem with this is that
you have an ever increasing start-up overhead (try benchmarking a
million calls to <tt class="literal">rand()</tt>)</p>
<p>Considering our current problem, we can use a different bit of
persistent data to widen the range of end selections. Assuming that
we are using a shuffle technique as the core of our selection
generator, we can store the final order in a data file. Subsequent
uses of the program could use the most recent stored values to
initialise the array. That is, instead of:</p>
<pre class="programlisting">
int i; int num[49]; for(i=0; i&lt;49; i++) num[i]=i+1; /* program */
</pre>
<p>Use something like:</p>
<pre class="programlisting">
int i; int num[49]; FILE *numbers = fopen(&quot;last&quot;, &quot;r&quot;); for(i=0; i&lt;49; i++) fscanf(numbers, &quot;%d&quot;, num+i); fclose(numbers) /* rest of program */ numbers = fopen(&quot;last&quot;, &quot;w&quot;); for(i=0; i&lt;49; i++) fprintf(numbers, &quot;%d&quot;, num[i]);
</pre>
<p>(How much better I find the C++ io! Look at that horrible lack
of symmetry between <tt class="literal">fscanf</tt> and <tt class=
"literal">fprintf</tt>.)</p>
<p>I think that this technique will provide a reasonable random
selection of lottery numbers even on a system with 16-bit
<tt class="literal">unsigned int</tt>s. Anyone got anything better?
Come to that, are there any approaches that do not rely on some
form of shuffle?</p>
<p>One last thing before I hand you back to Francis. Anyone who
wants to learn more about pseudo-random number generators could do
much worse than read chapter 3 of 'The Art of Computer
Programming'. It's the first chapter of volume 2 (Seminumerical
Algorithms). Originally written in 1969 with a second edition in
1981 it is still one of the most authoritative works available. The
maths gets fairly heavy at times, but don't let that put you off,
there is much there that you will understand even without following
all the math.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e342" id="d0e342"></a>Top Slice -
Getting Objective by Francis Glassborow</h2>
</div>
<p>One contributor thought about using some object orientation.
I'll leave his name out because I think his actual solution was far
below his normal exemplary standards. But he actually set my
thinking onto a new path.</p>
<p>Often we use pseudo-random number generators because we want
repeatability, and the seed mechanism guarantees that. But this is
exactly what we do not want for this problem. We want to generate a
selection of 6 numbers from 49 with only about one chance in 14
million that the same selection will be made by any other use of
the same program.</p>
<p>Let me use a bit of O-O. I want to emulate two things, a
continuous shuffling of the available numbers. The choice of the
one that is at the selection point at a random moment. Now human
time runs at a very different rate from our modern machine time. We
can capitalise on this.</p>
<p>Consider the following little function:</p>
<pre class="programlisting">
int new_top(int array[], int entries){ int item = rand()%entries; int temp = array[item]; array[item]=array[entries-1]; array[entries-1]=temp; return kbhit(); }
</pre>
<p>And using it something like:</p>
<pre class="programlisting">
int main(){ int size=49; int num[49]; int count; for(count=0; count&lt;size; count++) num[count]=count+1; srand((unsigned)time(NULL)); printf(&quot;Press a key to get each number \n&quot;); for(count=0; count&lt;6; count++){ while(!new_top(num, size)); /* shuffle till a key is hit */ printf(&quot;%d &quot;, num[--size]); /* then output current end entry */ } printf(&quot;\nWell thats your six. Best of luck.\n&quot;); return 0; }
</pre>
<p>You may want to do some inlining. You may also want to suppress
all output till the end. The purpose is to get the machine
shuffling as fast as possible so that even those trying to repeat
cannot possibly duplicate the precise timing of keypresses that
would require.</p>
<p>If you are willing to put up with a little tedium you could try
this variation of a shuffle function that could replace the one
that I started with (in the other slice).</p>
<pre class="programlisting">
void shuffle(int array[], int size) { int last; printf(&quot;Make %d keypresses&quot;, size-1); for(last=size; last&gt;2; last--) while(!new_top(array, last); return; }
</pre>
<p>I think that this is about the most random shuffle you are
likely to get from current hardware. Of course this is not a
Windows program because it relies on kbhit(). Which leads to two
variations for those that want to try their hands at this idea.
First convert this shuffling idea to run effectively on an event
driven GUI system (MS Windows for most of you). Second redesign it
and implement it for a multi-threading system with one thread
shuffling and the other thread watching the keyboard, mouse or
whatever. This is actually quite hard, because you will have to
handle the reducing array for each selection event. You also want
the time slices to be kept small else the granularity will make
repeatability possible. I feel that this could be quite a nice
introduction to MT programming.</p>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e367" id="d0e367"></a>The
Winner</h2>
</div>
<p>Tony Sumner was the only contributor to notice my malicious
qualification on entries (thank goodness there was one, and only
one) so he wins a copy of the Anniversary edition of The Mythical
Man-Month.</p>
<p>I would like to thank all the following whose contributions
added much to an enjoyable brainstorming between me and The
Harpist. I promise you that none of us got it all right.</p>
<div class="sidebar">
<p><span class="bold"><b>Contributors:</b></span></p>
<p><span class="emphasis"><em>Nigel Armstrong</em></span> (who, I
think noticed that the prize was for a meretricious submission, but
his wasn't),</p>
<p><span class="emphasis"><em>Raymond Butler</em></span> (who kept
things simple),</p>
<p><span class="emphasis"><em>U P Cheah</em></span> (who suggests
that such misunderstandings of array size is particularly prevalent
among those converting from BASIC to C),</p>
<p><span class="emphasis"><em>Geoff</em></span> (trouble with email
is that surnames often get sliced off, perhaps it is just as well,
because he thought that the algorithm in the original was OK)</p>
<p><span class="emphasis"><em>Alan Griffiths</em></span> (The
editor: The ISDF Newsletter) provided a rather comprehensive list
of minor points re the published code as follows:</p>
<p>There are a number of other problems with the code, the first
couple of the following list are appropriate even to novices (as I
presume the author to be), the latter ones are more related to the
ideal of &quot;quality code&quot;.</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>If the random number generator were to repeat the same five (or
fewer) numbers then the program would not terminate. (Although this
unlikely to be a problem it makes getting a result a matter of
luck.)</p>
</li>
<li>
<p><tt class="literal">srand()</tt> and <tt class=
"literal">rand()</tt> are accessed without prototypes, the header
<tt class="literal">&lt;stdlib.h&gt;</tt> should be included.</p>
</li>
<li>
<p>The choice of variable names is poor - <tt class=
"literal">count</tt> is not used as a count and <tt class=
"literal">test</tt> does not indicate what the test is for.</p>
</li>
<li>
<p>The use of &quot;magic numbers&quot; throughout code is a frequent source
of errors - especially when maintaining code. For example, if the
range of numbers were to change to 1-99 then it is necessary to
check the whole program to locate the code that needs changing.</p>
</li>
<li>
<p>Because the algorithm is poorly structured it is necessary to
repeat (almost) the code to output the numbers. Again, this
increases the effort needed to understand or change the
program.</p>
</li>
<li>
<p>The variables <tt class="literal">count</tt> and <tt class=
"literal">test</tt> are declared with a scope that extends far
beyond their use. This can lead to &quot;dual purpose&quot; variables with
misleading names.</p>
</li>
</ul>
</div>
<p>In a lengthy commentary <span class="emphasis"><em>Derek
Johnson</em></span> writes:</p>
<p><span class="emphasis"><em>Assuming that <tt class=
"literal">rand()</tt> produces something approximating to an even
distribution, then there is zero probability that the algorithm
fails to terminate. Even if the same algorithm were used to find an
order for all 49 numbers, then the expected time is still finite -
it might be rather large, but it will still finish some time. Maybe
what you were thinking of is that the algorithm has no guaranteed
time at which it terminates, which is quite another
thing.</em></span></p>
<p>And later</p>
<p><span class="emphasis"><em>Under Sun Unix (version 4, at least)
rand produces odd &amp; even numbers alternately!</em></span></p>
<p>In other words he knows that there are very bad generators out
there, but does not believe it will effect this program. Somewhere
I have a book on <span class="emphasis"><em>Murphy's
Law</em></span>. Programmers have to be less trusting.</p>
<p><span class="emphasis"><em>Alec Ross</em></span> included a good
clear analysis of the reason why the problem manifested the way it
did.</p>
<p><span class="emphasis"><em>Kevin Howe</em></span> produced an
alternative selection mechanism that effectively used an array of
flags. He then used the generated value to count through the so far
unselected choices. As he mentions, this means that no sorting is
necessary if you want to produce your list in numerical order, his
version does not disturb the implicit order.</p>
<p><span class="emphasis"><em>Peter Wippell</em></span> came up
with a pure C++ based solution relying on using a set container to
keep out duplicates. Sadly this is vulnerable to the quality of
implement-ation of the pseudo-random number generator.</p>
<p>There are a couple of other anonymous contributions sitting in
my directory tree. Try to remember to include your name in the body
of material that you send in electronic form. I know I should be
efficient and tag all incoming files but we all know I'm not.</p>
<p>Thanks for all the input. I think that it has been very
worthwhile (at least I have learnt something) now who has the next
little bit to trigger another tutorial?</p>
</div>
</div>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
