    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Multithreading (2)</title>
        <link>https://members.accu.org/index.php/articles/774</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + CVu Journal Vol 11, #2 - Feb 1999</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Multithreading (2)</h1>
<p>
<strong>Date:</strong> Wed, 03 February 1999 13:15:29 +00:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a></h2>
</div>
<p>To summarise the previous article: Don't do it if you don't have
to!</p>
<p>More specifically, we looked at the risk of conflicting use of
data objects and resources. We touched on external and internal
locking. It should also have been apparent that established 'good
things' such as encapsulation, weak coupling and avoiding static
data all help to reduce these locking issues.</p>
<p>Next on my agenda of doom is yet another way that threads can
fail.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e28" id="d0e28"></a>Deadlock</h2>
</div>
<p>This tragic condition can suddenly strike when you least expect
it. Imagine two threads, each needing to share access to two
objects. These objects need to be locked for overlapping periods
with respect to each thread. Suppose the first thread locks the
first object, does some work and then locks the second object. Also
suppose that the second thread locks the second object and then
locks the first object. I'm sure that you will have spotted that if
the second thread locks the second object after the first thread
has locked the first but before it has locked the second, both
threads will wait for their next lock forever. That is, each thread
is waiting for the other thread to release one of the two objects
but neither will.</p>
<p>In some cases, this problem can be avoided by waiting for both
locks in one blocking call. In Win32, the function is <tt class=
"function">WaitForMultipleObjects</tt>, passing a 'wait for all
objects' parameter. Not all systems have this facility though.
There may also be other reasons why it is difficult to acquire all
locks together. As with many multithreading problems, this one is
best avoided by cunning design.</p>
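<p>As a minimal sketch of acquiring both locks in one blocking call,
here is the idea in standard C++ rather than Win32. It swaps in
<tt class="function">std::scoped_lock</tt> (C++17) for
<tt class="function">WaitForMultipleObjects</tt>; the
<tt>Account</tt> type and the <tt>transfer</tt> and
<tt>run_transfers</tt> functions are invented purely for
illustration.</p>

```cpp
#include <mutex>
#include <thread>

// Hypothetical shared object guarded by its own mutex.
struct Account {
    std::mutex m;
    int balance = 0;
};

// Acquires both mutexes in a single call. std::scoped_lock applies a
// deadlock-avoidance algorithm, so callers may pass the two accounts
// in either order.
void transfer(Account& from, Account& to, int amount) {
    if (&from == &to) return;  // locking the same mutex twice is not allowed
    std::scoped_lock both(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}

// Two threads transfer in opposite directions. This is the pattern
// that can deadlock when each thread takes its locks one at a time.
int run_transfers() {
    Account a, b;
    a.balance = 100;
    b.balance = 100;
    std::thread t1([&] { for (int i = 0; i < 1000; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < 1000; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

<p>Both threads finish and the combined balance is unchanged. The
same cannot be guaranteed if <tt>transfer</tt> locks
<tt>from.m</tt> and then <tt>to.m</tt> in two separate steps.</p>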
<p>What makes deadlock so insidious is that often it isn't easy to
spot. This is particularly true when some of the locking is hidden
inside methods of objects where the lock applies to more than just
the current instance. That may not be very clear, so let me try to
expand on that thought.</p>
<p>Remember the choice between internal and external locking? As an
example, internal locking can work very well for operations on a
queue; the queue will only need to be locked when putting a message
in, removing it or perhaps clearing messages. Each of these
operations will need locking only on the queue instance and only
require protection against each other. Each queue instance will
have its own mutex instance.</p>
<p>You may make a case for external locking purely for the reason
of performance, where you may want to add or remove several
messages in a burst of activity. Nevertheless, internal locking
would be the best and easiest for most cases.</p>
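<p>A queue with internal locking might look something like the
following sketch. The class name <tt>MessageQueue</tt> and its
method names are my own assumptions, and messages are plain strings
to keep it short. Each instance carries its own mutex and every
operation locks only that instance.</p>

```cpp
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical message queue with internal locking.
class MessageQueue {
public:
    // Adds a message to the back of the queue.
    void put(std::string msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        messages_.push_back(std::move(msg));
    }

    // Removes and returns the oldest message, or an empty optional
    // if the queue holds nothing.
    std::optional<std::string> try_get() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (messages_.empty()) return std::nullopt;
        std::string msg = std::move(messages_.front());
        messages_.pop_front();
        return msg;
    }

    // Discards all queued messages.
    void clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        messages_.clear();
    }

private:
    std::mutex mutex_;                 // one mutex per queue instance
    std::deque<std::string> messages_;
};
```

<p>The client never sees the mutex, so it cannot forget to lock or
lock in the wrong order. The price is one lock and unlock per
message, which is the performance argument for external locking
mentioned above.</p>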
<p>Suppose instead that an operation required a lock on an external
object that is shared by many instances of the class. The worst
case is a static instance that is shared by all such instances.
Now, we already know that static objects are dangerous when
accessed by more than one thread, so naturally they must be
protected.</p>
<p>Now consider the case where one thread, during an operation on an
instance of this class, holds the lock on the shared object, and a
second thread performs an operation that happens to call an
operation on another instance of the class in question. Normally the
second thread simply waits its turn. The trouble comes if the second
thread has already obtained a lock on another static object and the
first thread then needs the lock on that same object: each thread now
holds what the other wants. If you allow static objects to be
accessed inside library or other widely used classes, the chances of
this kind of deadlock arising are high unless you are very
careful.</p>
<p>There are ways to avoid such problems. The best is to eliminate
the need to lock class wide resources as opposed to instance
resources by not having class resources. An alternative is to
ensure that your internal locks truly are internal. If they are
held while calling into external objects' code, where you have no
control over that code, there is a risk that that code could lead
to a deadlock. Circumstances where you have no control over the
external code include obvious ones such as virtual functions,
callbacks and templated classes. Don't forget that overloaded
operators and constructors may hold surprises as well as the more
obvious method calls.</p>
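<p>One way to keep an internal lock truly internal is to copy what
you need while holding the lock and call the external code only
after releasing it. The sketch below assumes a hypothetical
<tt>Notifier</tt> class that stores client callbacks; none of these
names come from a real library.</p>

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical class that calls back into client code.
class Notifier {
public:
    using Callback = std::function<void(int)>;

    void subscribe(Callback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        callbacks_.push_back(std::move(cb));
    }

    void notify(int value) {
        // Copy the callbacks under the lock...
        std::vector<Callback> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            snapshot = callbacks_;
        }
        // ...and run the external code with the lock released, so a
        // callback may take its own locks or call back into this object.
        for (const auto& cb : snapshot) cb(value);
    }

private:
    std::mutex mutex_;
    std::vector<Callback> callbacks_;
};
```

<p>Had <tt>notify</tt> held <tt>mutex_</tt> during the loop, a
callback that called <tt>subscribe</tt> would try to lock a mutex
its own thread already holds. The cost of the snapshot is a copy of
the callback list on every notification.</p>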
<p>A last resort, if the class wide resource is necessary, is to
use external locking on the resource. By that, I mean that the
client of such objects must obtain a lock on the class wrapping all
operations fully. This puts a burden on the user of your class and
makes the locking quite coarse-grained but at least it will be
thread safe.</p>
<p class="c2"><span class="remark">Sergey Ignatchenko gave a good
worked-through example of a problem of this kind in the C++ Report
article '<i class="citetitle">STL Implementations and Thread
Safety</i>'. There was a reprinted copy in the ACCU conference
pack, which is how I came to read it.</span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e58" id="d0e58"></a>Priorities</h2>
</div>
<p>For most purposes, you can safely work with threads all running
across an even playing field. What happens at the edge of the field
is beyond the scope of this article. If you really need to assign
different priorities to your threads, there are a couple of things
to bear in mind.</p>
<p>The first is simply that if a high priority thread is always
busy, no lower priority threads will get a look in (except by
special dispensation from the operating system). This is a
statement of the obvious really, just a reminder that threads
should always be suspended or waiting when there's no work to
do.</p>
<p>The term given to busy high priority threads hogging the machine
is 'thread starvation'. This is a symptom of either a badly written
program or an overloaded (or dying) machine.</p>
<p>The second is that if a high priority thread shares a resource
with a low priority thread, then while the low priority thread holds
the lock on that resource, the high priority thread cannot run until
the lock is released. That is to say that low priority threads should
lock such shared resources for the minimum time.</p>
<p>This problem is known as 'priority inversion'. A more
complicated scenario is known as 'indirect priority inversion'.
This involves one or more threads with an intermediate priority
holding off the low priority thread whilst it has the
above-mentioned resource locked. This is easily avoided by not
allowing very low priority threads to access resources shared with
very high priority threads, so I will say no more about it.</p>
<p>In practice, there are usually only three levels of priority
that you are likely to consider using:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>High priority threads will handle external events that need to
be serviced promptly. The worst case of this is some kind of
real-time system where buffers might overflow if left too long.</p>
</li>
<li>
<p>Medium priority threads to do the main work of the
application.</p>
</li>
<li>
<p>Low priority (or idle) threads that do a low priority task,
such as updating the progress bar!</p>
</li>
</ol>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e83" id="d0e83"></a>Stopping your
threads</h2>
</div>
<p>In effect, there are two kinds of threads, those that iterate
repeatedly and those that simply perform some operation and finish
naturally. The latter aren't very interesting.</p>
<p>In one scenario, you might want to allow the loop body to wait
for data, then process the data. If you want the thread to process
all the available data, then finish gracefully, then it must be
able to determine that the data is finished before exiting the
loop. If the thread is reading messages from a queue, the message
could tell the thread to stop. That is, the producer thread puts a
stop message into the queue and the consumer thread eventually gets
it, possibly putting a stop message into the next queue if it's a
producer itself.</p>
<p>This mechanism only works for the simple case of one producer,
one consumer.</p>
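<p>The stop message idea can be sketched as follows for the one
producer, one consumer case. I have assumed that an empty
<tt>std::optional</tt> stands for the stop message; the
<tt>BlockingQueue</tt> class and
<tt>consume_until_stopped</tt> function are hypothetical.</p>

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>

// Assumption: an empty optional plays the role of the stop message.
using Message = std::optional<int>;

class BlockingQueue {
public:
    void put(Message m) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(m);
        }
        ready_.notify_one();
    }

    // Blocks until a message is available, then removes and returns it.
    Message get() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        Message m = items_.front();
        items_.pop();
        return m;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Message> items_;
};

// The producer queues ten numbers followed by the stop message. The
// consumer processes everything queued ahead of the stop message and
// then leaves its loop.
int consume_until_stopped() {
    BlockingQueue q;
    int total = 0;
    std::thread consumer([&] {
        for (;;) {
            Message m = q.get();
            if (!m) break;  // stop message received
            total += *m;
        }
    });
    for (int i = 1; i <= 10; ++i) q.put(i);
    q.put(std::nullopt);
    consumer.join();
    return total;
}
```

<p>Because the queue is first in, first out, the consumer sees all
ten numbers before it sees the stop message.</p>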
<p>Another, more popular mechanism is to use a 'stop' event
associated with the thread. This is set when you want to terminate
the thread. The thread checks for this event by waiting for either
the stop event or the object that signals that data is ready. After
the wait has returned, it can check if the stop event was set and
break the loop accordingly. Managing the closedown process can be
done by first signalling all running threads to stop and then
waiting on each thread. This is possible because threads become
signalled once they are terminated.</p>
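<p>Here is a rough portable equivalent of the stop event, using a
condition variable whose predicate is 'stop requested or data
ready' in place of a Win32 wait on two objects. The
<tt>Worker</tt> class and <tt>shut_down</tt> function are
assumptions of mine, and joining the
<tt class="function">std::thread</tt> stands in for waiting on the
thread handle.</p>

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical worker that sums the integers posted to it.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { request_stop(); join(); }

    void post(int item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(item);
        }
        wake_.notify_one();
    }

    // Sets the 'stop event' without waiting for the thread.
    void request_stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_requested_ = true;
        }
        wake_.notify_one();
    }

    // Waits for the thread to finish.
    void join() {
        if (thread_.joinable()) thread_.join();
    }

    int total() const { return total_.load(); }

private:
    void run() {
        for (;;) {
            int item;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Wake for either reason: stop requested or data ready.
                wake_.wait(lock, [this] { return stop_requested_ || !items_.empty(); });
                if (stop_requested_) return;  // stop wins over pending data
                item = items_.front();
                items_.pop();
            }
            total_ += item;
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<int> items_;
    bool stop_requested_ = false;
    std::atomic<int> total_{0};
    std::thread thread_;  // declared last so the other members exist before run() starts
};

// Closedown: signal every thread first, then wait on each in turn.
void shut_down(Worker& a, Worker& b) {
    a.request_stop();
    b.request_stop();
    a.join();
    b.join();
}
```

<p>Note that in this sketch a stop request takes precedence over
queued data, so items still in the queue may never be processed. If
you need the queue drained first, the stop message approach is the
better fit.</p>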
<p>Oh, I suppose I should mention that what you don't do is to call
the <tt class="function">TerminateThread</tt> API.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e99" id="d0e99"></a>More
threads</h2>
</div>
<p>My favourite design concept with threads is to de-couple them
by passing messages between threads using queues. I can get quite
boring on the subject; it is my one solution to all threading
problems. However, there are many ways that this basic concept can
be used.</p>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e104" id="d0e104"></a>Many producers
and consumers</h3>
</div>
<p>There are many circumstances where you will need an indefinite
number of threads that produce messages. One that commonly occurs
is in a server where each client connection is serviced by one
thread that receives data on that connection. Many applications
actually handle the whole of the processing of that data within
that thread. However, they will still have to fight amongst
themselves for limited resources.</p>
<p>Often you may want to limit the number of threads that do the
bulk of the processing. The main reason for this is that the
resources available may be limited. For example, a database cache
that has many threads making conflicting demands on it may well be
run ragged trying to service them all. At the very least, it will
require more memory to run smoothly.</p>
<p>On the other hand, if there are multiple processors available,
you will want to take full advantage of them. A single thread
handling all database requests will only be able to use one
processor.</p>
<p>In either case, a dynamically changing number of threads will be
running, servicing a varying number of client connections. These
threads can be responsible for putting each message into one or
more queues. There may be more than one type of consumer thread;
each type will need its own queue.</p>
<p>There are two ways that I know of that a producer may signal
that data is available. One is to set an event that is waited for
by the consumer. The consumer may then eat all the queued messages
before waiting again.</p>
<p>The more common alternative is to use a semaphore. This is
incremented for each message that is put into the queue and
decremented for each wait completed by the consumer. Note that a
plain event can be used in place of the semaphore if you prefer.
The important feature is that after each wait, the thread only
consumes just one message. This allows the workload to be shared
amongst many consumer threads.</p>
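<p>The semaphore scheme might be sketched like this. To stay within
older compilers I have hand-rolled a small counting
<tt>Semaphore</tt> from a mutex and a condition variable; that
class, the <tt>run_pool</tt> function and the use of a negative
number as a per-consumer stop message are all assumptions made for
the example.</p>

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Minimal counting semaphore built from a mutex and condition variable.
class Semaphore {
public:
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Several consumers share one queue. The semaphore counts queued
// messages, and each completed wait entitles a consumer to exactly one.
int run_pool(int consumers, int messages) {
    std::mutex queue_mutex;
    std::queue<int> queue;
    Semaphore available;
    std::atomic<int> total{0};

    auto put = [&](int value) {
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            queue.push(value);
        }
        available.release();  // increment once per message queued
    };

    std::vector<std::thread> pool;
    for (int i = 0; i < consumers; ++i) {
        pool.emplace_back([&] {
            for (;;) {
                available.acquire();  // decrement once per completed wait
                int value;
                {
                    std::lock_guard<std::mutex> lock(queue_mutex);
                    value = queue.front();
                    queue.pop();
                }
                if (value < 0) break;  // assumed stop message
                total += value;
            }
        });
    }

    for (int i = 1; i <= messages; ++i) put(i);
    for (int i = 0; i < consumers; ++i) put(-1);  // one stop message each
    for (auto& t : pool) t.join();
    return total;
}
```

<p>Which consumer handles which message is up to the scheduler, but
every message is handled exactly once.</p>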
<p>Because this architecture is so common, it is worth thinking
about how to complete the circle and return data to the client.</p>
<p>If data is returned to the client in response to client requests
only, the producer thread described above could in fact wait for
the response from the consumer that handles the message. Thus, its
loop would consist of receiving a message, putting it in a queue
for processing, waiting for a response and then sending the
response.</p>
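<p>One possible way to have the producer wait for the consumer's
response is to send a <tt class="function">std::promise</tt> along
with each request. This is only a sketch: the <tt>Request</tt>
structure, the echo reply and the use of an empty request as a stop
signal are assumptions for illustration.</p>

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical request that carries its own reply channel.
struct Request {
    std::string text;
    std::promise<std::string> reply;
};

class RequestQueue {
public:
    void put(Request r) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(r));
        }
        ready_.notify_one();
    }

    Request get() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        Request r = std::move(items_.front());
        items_.pop();
        return r;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Request> items_;
};

// Consumer: answers each request. An empty request means stop (assumption).
void serve(RequestQueue& q) {
    for (;;) {
        Request r = q.get();
        if (r.text.empty()) {
            r.reply.set_value("");
            return;
        }
        r.reply.set_value("echo: " + r.text);
    }
}

// Producer side: queue the request, then block until the reply arrives.
std::string ask(RequestQueue& q, std::string text) {
    Request r;
    r.text = std::move(text);
    std::future<std::string> answer = r.reply.get_future();
    q.put(std::move(r));
    return answer.get();
}

std::string round_trip(const std::string& text) {
    RequestQueue q;
    std::thread consumer(serve, std::ref(q));
    std::string response = ask(q, text);
    ask(q, "");  // ask the consumer to stop
    consumer.join();
    return response;
}
```

<p>In a real server, <tt>ask</tt> would sit between receiving a
message from the client and sending the response back.</p>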
<p>If, on the other hand, the server sends data at random intervals
to the client, there will need to be a separate sending thread
created for each connection.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e125" id="d0e125"></a>Thread
pools</h3>
</div>
<p>In the example above, the consumer threads are acting as a pool
of threads. How many are active at one time depends on the load the
machine is under. In this case, the number of these threads is
determined by you, either a constant number or a number derived at
run-time from the number of processors.</p>
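<p>Choosing the pool size could be as simple as the sketch below.
The <tt>pool_size</tt> function is hypothetical. It relies on
<tt class="function">std::thread::hardware_concurrency</tt>, which
is only a hint and may return zero when the number of processors
cannot be determined.</p>

```cpp
#include <algorithm>
#include <thread>

// Returns the number of consumer threads to create. A non-zero
// argument fixes the number; zero means derive it from the number of
// processors, falling back to one thread if that is unknown.
unsigned pool_size(unsigned fixed = 0) {
    if (fixed != 0) return fixed;
    unsigned processors = std::thread::hardware_concurrency();  // may be 0
    return std::max(1u, processors);
}
```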
<p>The client connection handling threads could also be created as
a pool. This is advantageous if the connections are likely to be
brief but frequent. There is an overhead involved in creating and
destroying threads that may dominate the actual data
processing.</p>
<p>In this case, two strategies are available. One is to create a
fixed number of threads, setting a limit on the number of
simultaneous connections that may be supported. This moves the
thread creation overhead to the server startup. Another strategy is
to dynamically manage the thread pool by allowing it to grow as
incoming connections are made but not necessarily exiting the
threads as the connections are closed. This is an interesting
exercise in statistics!</p>
<i><span class="remark">You might also refine your strategy to
refuse connections when you detect that the server is overloading.
This may sound difficult but the nature of the message queuing
gives you a clear indicator in the form of the number of queued
messages waiting for the main consuming threads. I'm assuming that
these are the threads that are doing most of the work.</span></i>
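<p>The overload check in the remark above might be sketched as
follows. The <tt>WorkQueue</tt> class, the
<tt>should_accept</tt> function and the
<tt>high_water_mark</tt> parameter are all hypothetical; choosing a
sensible threshold is left to you.</p>

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical queue feeding the main consuming threads.
class WorkQueue {
public:
    void put(int msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(msg);
    }

    // Removes the oldest message into 'out'; returns false if empty.
    bool try_get(int& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

    // Number of messages waiting to be processed.
    std::size_t depth() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;  // mutable so depth() can lock it
    std::deque<int> items_;
};

// Accept a new connection only while the backlog is below the threshold.
bool should_accept(const WorkQueue& q, std::size_t high_water_mark) {
    return q.depth() < high_water_mark;
}
```

<p>The depth is only a snapshot, so the answer may be slightly out
of date by the time the connection is accepted. For a load
indicator that hardly matters.</p>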
<p>In some systems, you may wish to make a single thread pool
available for threads performing a multitude of tasks. This is
easily achieved but I have to say that I've never felt the
need.</p>
<p>Under Win32 on NT, an interesting alternative to the thread per
connection scenario is to use 'I/O completion ports'. These allow
you to handle the incoming data with a thread pool with far fewer
threads than you would otherwise need. Essentially, your threads
handle data as it arrives for any connection. This mechanism can
also be used for outgoing data.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e140" id="d0e140"></a>Threads are
more robust?</h2>
</div>
<p>I've seen this argument put forward on occasion. The idea is
that by using many threads, if one fails, it won't take the whole
application down with it. Interestingly, the same case is often
made to justify the use of multiple processes. There may be truth
in this. However, there are many ways that a thread stopping may
interfere with the operation of the remainder, particularly if they
do so while in possession of a shared resource. On the other hand,
taking the connection handling thread as an example, if a
connection fails and the thread waits for a long timeout, the other
connections will be unaffected. Good use of exception handling may
allow a thread to fail in such a way that the rest can stagger
on.</p>
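<p>As a small illustration of a thread failing on its own, the
sketch below catches exceptions at the top of each thread function.
The <tt>handle_connection</tt> function and its deliberate failure
for connection 2 are invented for the example. In standard C++, an
exception that escapes a thread function calls
<tt class="function">std::terminate</tt> and ends the whole
process, so the catch is what lets the rest stagger on.</p>

```cpp
#include <atomic>
#include <exception>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical connection handler; connection 2 fails on purpose.
void handle_connection(int id) {
    if (id == 2) throw std::runtime_error("connection lost");
}

// Runs one thread per connection and returns how many completed.
int run_connections(int count) {
    std::atomic<int> completed{0};
    std::vector<std::thread> threads;
    for (int id = 0; id < count; ++id) {
        threads.emplace_back([id, &completed] {
            try {
                handle_connection(id);
                ++completed;
            } catch (const std::exception&) {
                // This thread gives up; the others are unaffected.
            }
        });
    }
    for (auto& t : threads) t.join();
    return completed;
}
```

<p>This does nothing about a thread that fails while holding a
shared resource. Releasing locks through scoped guard objects, so
that stack unwinding unlocks them, covers part of that problem.</p>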
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e145" id="d0e145"></a>Well, that's
it!</h2>
</div>
<p>Looking back, these articles were neither a tutorial nor a
complete overview of multithreaded programming! I hope that I've
highlighted most of the pitfalls and that the design ideas at least
help.</p>
<p>One final checklist then:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Write reentrant functions where possible, at least drop all
static data!</p>
</li>
<li>
<p>General-purpose library code should either be intrinsically
thread-safe, or should be conditionally built in a thread-safe
version. If it needs internal locking, make sure you get it
right!</p>
</li>
<li>
<p>Beware of performance degradation caused by fine-grained
internal locking.</p>
</li>
<li>
<p>Document the external locking requirements of your objects.</p>
</li>
<li>
<p>Design the threading aspects of your applications at the highest
level; never try to bolt threads onto existing code.</p>
</li>
<li>
<p>De-couple threads and limit the reach of threads by limiting the
binding between objects used by different threads.</p>
</li>
</ol>
</div>
</div>
</p>
</div>
</channel>
</rss>
