    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Thread Pooling: An Investigation</title>
        <link>https://members.accu.org/index.php/articles/461</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + Overload Journal #41 - Feb 2001</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Thread Pooling: An Investigation</h1>
<p>
<strong>Date:</strong> Mon, 26 February 2001 16:46:05 +00:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a></h2>
</div>
<p>When designing fast server applications, the choice between a
single thread and multiple threads of execution must be weighed
against the relative complexity of each implementation.</p>
<p>The single threaded approach should not be dismissed without a
thorough investigation. For a low-usage application that processes
requests quickly, it can provide an adequate solution. By using
asynchronous I/O, reasonable rates of data throughput can be
achieved. The big advantage of this approach is that there is no
contention for shared resources. That means no thread synchronisation
and no possible deadlock to track down.</p>
<p>The problem here is scalability. In an ideal world, running the
application on a dual processor machine would let it handle twice as
many requests. The single threaded solution will not scale this
way; its single thread will plod on regardless of the number of
processors. After persuading Big Company Limited to buy the great
new Mega Tsunami&trade; software, the last thing they will want is
the quad CPU server sitting at 10% utilisation.</p>
<p>In order to take advantage of the four CPUs at the application's
disposal, multithreading is one solution. Another chance for the
sales team as <span class="trademark">Mega Tsunami MT</span>&trade;
is released. There are three approaches that can be taken:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>A thread per request.</p>
</li>
<li>
<p>A pool of worker threads.</p>
</li>
<li>
<p>An adaptive thread pool.</p>
</li>
</ul>
</div>
<p>The Mega Tsunami MT&trade; application in this article is
designed to listen for requests on a 'well known' port and process
the request, replying with data if necessary. One thread will be
created to listen on the port.</p>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e43" id="d0e43"></a>A Thread Per
Request</h3>
</div>
<p>This approach will scale well. As each client connects, the
listening thread creates a new thread to process the request. By
being careful with the design, synchronisation issues can be kept
to a minimum as each request is cocooned within its own thread.
This is important because it would be unfortunate to have Mega
Tsunami MT&trade; utilising only 20% of the quad-CPU server because
most of the threads are waiting to acquire a shared resource.</p>
<p>There are drawbacks to this approach. Although it is
more efficient to create a thread than a process, thread creation still
takes time. On the plus side, that is all the listening thread has
to do before listening again. If the application is to process
thousands of requests a second, an underlying limit of the
operating system will stop the application in its tracks: there
is an upper limit on the number of threads an operating system
can create and schedule. If this limit is reached the connection
cannot be accepted for processing, and the application must decide
what action to take. Well before this limit, the application is
likely to become bound by context switches between its threads.</p>
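<p>The thread per request model can be sketched in a few lines. This is
a minimal illustration only: it uses the standard C++ <tt class="classname">std::thread</tt>
rather than the Win32 calls used later in the article, the requests are
plain integers standing in for client connections, and the names
<tt class="function">HandleRequest</tt> and <tt class="function">ServeThreadPerRequest</tt>
are hypothetical.</p>

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical request handler: stands in for the real per-connection work.
inline void HandleRequest(int request, std::atomic<int>& handled) {
    (void)request;
    ++handled;
}

// Creates one thread per request, then waits for all of them.
// Returns the number of requests handled.
inline int ServeThreadPerRequest(const std::vector<int>& requests) {
    std::atomic<int> handled(0);
    std::vector<std::thread> threads;
    threads.reserve(requests.size());
    for (int r : requests) {
        // std::thread's constructor throws std::system_error when the
        // system cannot start another thread. This sketch does not
        // recover from that; a real server would have to.
        threads.emplace_back(HandleRequest, r, std::ref(handled));
    }
    for (std::thread& t : threads) t.join();
    assert(handled == static_cast<int>(requests.size()));
    return handled;
}
```

<p>Calling <tt class="function">ServeThreadPerRequest</tt> with eight
requests creates eight threads. Nothing in the sketch bounds that number,
which is exactly the drawback described above.</p>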
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e50" id="d0e50"></a>Context
Switching</h3>
</div>
<p>With multiple processes and threads running on the server, the
operating system must maintain the current CPU registers, its
context, for each one. When a thread is scheduled to run, the
system must initialise the CPU with its context. The thread's
execution will then continue from where it left off. This context
switch is not free; some of the processor time must be spent
performing the switch. If there are 200 live requests currently
being serviced on the quad server, only four can run concurrently.
The other 196 are waiting to be scheduled. If a thread is blocking
on an I/O request, it will not be scheduled until the I/O operation
completes. This is a partial saviour for the thread per request
model. Assuming that the server does not just compute a result, but
performs some I/O as well, some of the threads will be sleeping.
The 200 threads will be in one of the following states: sleeping,
executing or waiting.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e55" id="d0e55"></a>A pool of worker
threads</h3>
</div>
<p>If only four threads can run concurrently, why create more? The
application creates four worker threads and adds them to a pool of
available threads. When a request is received, a thread from the
pool is allocated to action it. When the processing is finished it
is returned to the free pool. In itself this is not a good
solution. If the server now gets 200 requests, only four will be
serviced at a time; the rest will either time out or be processed by what
appears, from the client's point of view, to be a slow server.</p>
<p>If a form of asynchronous I/O is applied to the worker threads
it is now possible to get the 4 threads to serve more than 4
connections. After the thread processes the request it will return
data. When this data is returned with asynchronous I/O the thread
will not block; execution will continue as though the send
operation had completed in zero time. The thread can now wait for more
connections and I/O completion notifications.</p>
<p>The underlying problem of four threads is still there, but they are
now handling many more client connections. If the server receives a
large enough number of requests, there will still come a point at
which connections time out.</p>
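<p>A fixed pool can be sketched as a set of workers draining a shared
queue. The sketch below is an assumption-laden illustration in standard
C++, not the framework developed later: requests are integers, the
processing step is left as a comment, and
<tt class="function">ServeWithFixedPool</tt> is a hypothetical name.</p>

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Processes every request with a fixed number of workers that share one
// queue. Returns the number of requests processed.
inline int ServeWithFixedPool(const std::vector<int>& requests,
                              unsigned workers) {
    assert(workers > 0);
    std::queue<int> pending;
    for (int r : requests) pending.push(r);

    std::mutex m;       // protects pending and processed
    int processed = 0;

    auto worker = [&]() {
        for (;;) {
            {
                std::lock_guard<std::mutex> lock(m);
                if (pending.empty()) return;  // nothing left: worker exits
                pending.pop();                // take the next request
            }
            // ... request processing would go here, outside the lock ...
            std::lock_guard<std::mutex> lock(m);
            ++processed;
        }
    };

    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workers; ++i) pool.emplace_back(worker);
    for (std::thread& t : pool) t.join();
    return processed;
}
```

<p>With 200 requests and four workers every request is eventually
processed, but never more than four at a time; the remainder wait in the
queue, which is where a client would see the slow server.</p>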
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e64" id="d0e64"></a>Adaptive Thread
Pool</h3>
</div>
<p>So what is the solution? The application cannot just create
threads on a whim, and using one thread per processor will have
problems with a large number of requests.</p>
<p>An adaptive thread pool starts with one thread per
processor and, under a large number of requests, creates more
threads up to an application defined upper limit. When the load on
the server falls, the number of threads in the pool is
reduced to one thread per processor.</p>
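<p>One possible sizing rule, given purely as an illustration, is to
clamp the number of outstanding requests between the processor count and
the upper limit. The function name
<tt class="function">TargetPoolSize</tt> and the choice of pending
requests as the measure of load are assumptions; the article leaves the
algorithm to the application.</p>

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical sizing rule: one thread per processor is the floor, the
// pool grows with the backlog, and it never exceeds the upper limit.
inline unsigned TargetPoolSize(unsigned processors,
                               unsigned pendingRequests,
                               unsigned upperLimit) {
    assert(processors > 0 && upperLimit >= processors);
    unsigned wanted = std::max(processors, pendingRequests);
    return std::min(wanted, upperLimit);
}
```

<p>On a quad processor server with an upper limit of 16, this rule gives
4 threads when idle, 10 threads for 10 pending requests and 16 threads
for 200 pending requests.</p>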
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e71" id="d0e71"></a>Adaptive Thread
Pool Example</h2>
</div>
<p>This example is going to be simple. The requirements are:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>The worker thread's function will be application defined.</p>
</li>
<li>
<p>The creation of the threads will be application defined.</p>
</li>
<li>
<p>The shutdown mechanism for the threads will be application
defined.</p>
</li>
<li>
<p>The algorithm used to control the number of threads will be
application defined.</p>
</li>
</ol>
</div>
<p>There are two classes that need to be defined immediately,
<tt class="classname">Thread</tt> and <tt class=
"classname">ThreadPool</tt>. <tt class="classname">Thread</tt> will
be a base class for the user defined thread and the <tt class=
"classname">ThreadPool</tt> will contain a collection of threads.
The <tt class="classname">Thread</tt> class will provide an
interface to create, start, stop and pause the thread. The class
will also declare two pure virtual functions: the stop function and the
thread entry function. This interface will allow any class derived from
<tt class="classname">Thread</tt> to be run as part of the pool or
as a stand-alone thread. The class definition is:</p>
<pre class="programlisting">
class Thread {
public:
  Thread();
  // virtual, so that derived threads can
  // be destroyed through a Thread pointer
  virtual ~Thread();
  bool Start();
  bool Pause();

  bool Create(unsigned int createType =
                    CREATE_SUSPENDED);
  // application defined callback to stop
  // the thread
  virtual bool Stop() = 0;
  // application defined callback for the
  // thread procedure
  virtual int ThreadProc() = 0;
private:
  HANDLE mThread;
};
</pre>
<p>Making the <tt class="classname">Thread</tt> class abstract
answers requirements 1 and 3. As the thread pool will contain
threads, it makes sense to provide the same control over the pool as
over individual threads. To satisfy requirement 2, which lets the
application create the threads, the thread pool must be passed a means of
creating them. In the example I chose to use a function: why
define a class when a function will do? The thread pool cannot
operate without the function, so it is passed as a parameter of the
constructor.</p>
<pre class="programlisting">
// create a function for specific
// object creation
//
// Thread * CALLBACK CreateWorkerThread();
typedef Thread * (CALLBACK *CreateWorker)();
class ThreadPool
{
public:
  ThreadPool(CreateWorker cw);
  ~ThreadPool();
  bool Start();
  bool Stop();
  bool Pause();
  
  // create the pool
  bool Create();
private:
  typedef std::list&lt;Thread *&gt; Threads;
  Threads mThreads;
  // user callback to create the threads
  CreateWorker mCw;
};
</pre>
<p>Solving requirement 4 requires a bit of thinking. The algorithm
controlling the thread pool is application defined. How does it
know the current number of threads, how many are free, how many
are processing and which it should signal to die? The answer is to define
an interface class (<tt class="classname">ThreadPoolAlgorithm</tt>)
that the application implements and from which the <tt class=
"classname">ThreadPool</tt> class inherits, so that information from
the pool can be passed to the concrete algorithm. The easiest way
to do this when the base class is not known in advance is to use
templates.</p>
<pre class="programlisting">
template&lt;typename base&gt;
class ThreadPool : public base
</pre>
<p>Now an instance of the pool can be created thus:</p>
<pre class="programlisting">
ThreadPool&lt;MyAlgorithm&gt; myPool;
</pre>
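<p>The effect of inheriting from a template parameter can be seen in a
cut-down sketch. The classes
<tt class="classname">FixedLimitAlgorithm</tt> and
<tt class="classname">PoolSketch</tt> are hypothetical stand-ins with
only two of the algorithm functions, and no threads are actually
created.</p>

```cpp
#include <cassert>

// Hypothetical algorithm: allows the pool to grow up to a fixed limit.
class FixedLimitAlgorithm {
public:
    FixedLimitAlgorithm() : mCount(0) {}
    bool CanAddThread() { return mCount < kLimit; }
    void ThreadAdded() {
        // only valid after CanAddThread() has returned true
        assert(mCount < kLimit);
        ++mCount;
    }
    int Count() const { return mCount; }
private:
    static const int kLimit = 8;
    int mCount;
};

// The pool inherits from whatever algorithm it is given and calls the
// algorithm's functions as if they were its own.
template<typename base>
class PoolSketch : public base {
public:
    // Tries to grow the pool by one thread.
    // Returns false when the algorithm refuses.
    bool Grow() {
        if (!this->CanAddThread()) return false;
        // ... a real pool would create the thread here ...
        this->ThreadAdded();
        return true;
    }
};
```

<p>A <tt class="classname">PoolSketch&lt;FixedLimitAlgorithm&gt;</tt>
grows eight times and then refuses. Swapping in a different algorithm
class changes that behaviour without touching the pool.</p>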
<p>The interface, <tt class="classname">ThreadPoolAlgorithm</tt>,
tells the framework what functions it can call. If the
concrete implementation passed to the <tt class=
"classname">ThreadPool</tt> does not supply these functions, a compile time
error is generated. What should this interface contain?</p>
<pre class="programlisting">
class ThreadPoolAlgorithm {
public:
  // returns true if a thread is to be 
  // added
  virtual bool CanAddThread() = 0;
  // called when a thread is added to the
  //  pool
  virtual void ThreadAdded() = 0;
  // called when a thread is removed from
  // the pool
  virtual void ThreadRemoved() = 0;
  // returns true when thread should exit
  virtual bool CanThreadExit() = 0;
  // called when a thread starts
  // processing
  virtual void ThreadBusy() = 0;
  // called when a thread stops
  // processing
  virtual void ThreadFree() = 0;
  // shutdown thread pool
  virtual void Shutdown() = 0;
};
</pre>
<p>Why use an abstract base class (ABC)? There is no reason that
one must be used, as long as a class supplies the implementations
for the required functions. The ABC is there to remind the
programmer what functions to supply.</p>
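<p>As an illustration of what an application might supply, here is a
sketch of one concrete algorithm. The class
<tt class="classname">BusyCountAlgorithm</tt>, its minimum and maximum
parameters and its growth rule are all assumptions made for this
example; the interface is repeated, with a virtual destructor added, so
that the sketch stands alone.</p>

```cpp
#include <cassert>
#include <mutex>

class ThreadPoolAlgorithm {
public:
    virtual ~ThreadPoolAlgorithm() {}
    virtual bool CanAddThread() = 0;
    virtual void ThreadAdded() = 0;
    virtual void ThreadRemoved() = 0;
    virtual bool CanThreadExit() = 0;
    virtual void ThreadBusy() = 0;
    virtual void ThreadFree() = 0;
    virtual void Shutdown() = 0;
};

// Hypothetical algorithm: grow when every thread is busy, shrink when
// the pool is above its minimum size and at least one thread is idle.
class BusyCountAlgorithm : public ThreadPoolAlgorithm {
public:
    BusyCountAlgorithm(int minThreads, int maxThreads)
        : mMin(minThreads), mMax(maxThreads),
          mTotal(0), mBusy(0), mShutdown(false) {
        assert(minThreads > 0 && maxThreads >= minThreads);
    }
    bool CanAddThread() override {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mShutdown || mTotal >= mMax) return false;
        return mTotal < mMin || mBusy == mTotal;
    }
    void ThreadAdded() override {
        std::lock_guard<std::mutex> lock(mMutex);
        ++mTotal;
    }
    void ThreadRemoved() override {
        std::lock_guard<std::mutex> lock(mMutex);
        --mTotal;
    }
    bool CanThreadExit() override {
        std::lock_guard<std::mutex> lock(mMutex);
        return mShutdown || (mTotal > mMin && mBusy < mTotal);
    }
    void ThreadBusy() override {
        std::lock_guard<std::mutex> lock(mMutex);
        ++mBusy;
    }
    void ThreadFree() override {
        std::lock_guard<std::mutex> lock(mMutex);
        --mBusy;
    }
    void Shutdown() override {
        std::lock_guard<std::mutex> lock(mMutex);
        mShutdown = true;
    }
private:
    std::mutex mMutex;  // the pool's threads call in concurrently
    int mMin, mMax, mTotal, mBusy;
    bool mShutdown;
};
```

<p>The mutex is there because the functions are called from the pool's
worker threads as well as from the thread that owns the pool.</p>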
<p>The next question is how does the pool know the state of a
thread? The thread should have some way to call back into the
<tt class="classname">ThreadPool</tt>. Because there is no
requirement to expose the inner workings of the pool, another
interface class (<tt class="classname">ThreadPoolControl</tt>)
needs to be defined.</p>
<pre class="programlisting">
class ThreadPoolControl{
public:
  virtual bool CanThreadExit() = 0;
  virtual void RemoveThread(Thread* t)=0;
  virtual void ThreadBusy () = 0;
  virtual void ThreadFree () = 0;
};
</pre>
<p>This interface exposes part of the thread pool and thread pool
algorithm implementations to the <tt class="classname">Thread</tt>
class. By changing the inheritance of <tt class=
"classname">ThreadPool</tt> to:</p>
<pre class="programlisting">
template&lt;typename base&gt;
class ThreadPool : public ThreadPoolControl, public base
</pre>
<p>And changing the thread constructor to accept a pointer to a
<tt class="classname">ThreadPoolControl</tt> object, the thread can
fulfil its requirement to let the pool know its state. It can also
query if it should remain in the pool. If it should not it can let
the algorithm know it is leaving. The thread base class has the
following functions added to support this.</p>
<pre class="programlisting">
class Thread {
public:
  .
  .
  .
  void ThreadBusy();
  bool CanThreadExit();
  void ThreadFree();
  // tells the pool this thread is leaving
  void RemoveThread(Thread * t);
private:
  ThreadPoolControl * mController;
};
</pre>
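<p>The added functions presumably do little more than forward to the
controller. The sketch below shows one way that could look. It is a
guess at the shape rather than the article's implementation: the
<tt class="classname">Thread</tt> here has none of the Win32 members,
the handling of a null controller for a stand-alone thread is an
assumption, and <tt class="classname">CountingControl</tt> is a
hypothetical controller that only counts calls.</p>

```cpp
#include <cassert>

class Thread;

class ThreadPoolControl {
public:
    virtual ~ThreadPoolControl() {}
    virtual bool CanThreadExit() = 0;
    virtual void RemoveThread(Thread* t) = 0;
    virtual void ThreadBusy() = 0;
    virtual void ThreadFree() = 0;
};

// Cut-down Thread: only the controller forwarding is shown.
class Thread {
public:
    // A null controller means the thread runs stand-alone (assumption).
    explicit Thread(ThreadPoolControl* controller)
        : mController(controller) {}
    void ThreadBusy() { if (mController) mController->ThreadBusy(); }
    void ThreadFree() { if (mController) mController->ThreadFree(); }
    // A stand-alone thread is never asked to leave a pool.
    bool CanThreadExit() {
        return mController ? mController->CanThreadExit() : false;
    }
private:
    ThreadPoolControl* mController;
};

// Hypothetical controller used to observe the calls.
class CountingControl : public ThreadPoolControl {
public:
    CountingControl() : busy(0), removed(0), exitAllowed(false) {}
    bool CanThreadExit() override { return exitAllowed; }
    void RemoveThread(Thread*) override { ++removed; }
    void ThreadBusy() override { ++busy; }
    void ThreadFree() override {
        assert(busy > 0);  // every ThreadFree must follow a ThreadBusy
        --busy;
    }
    int busy;
    int removed;
    bool exitAllowed;
};
```

<p>Because the thread sees only
<tt class="classname">ThreadPoolControl</tt>, the pool can change its
internals without any application thread class being recompiled against
them.</p>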
<p>An example of an application defined <tt class=
"classname">ThreadProc</tt> is:</p>
<pre class="programlisting">
int PoolWorkerThread::ThreadProc() {
  bool finished = false;
  while(!finished)
  {
    DWORD ret = ::WaitForSingleObject(mEvent, 1000);
    if(WAIT_TIMEOUT == ret)
    {
      ThreadBusy();
      ::Sleep(500);
      finished = CanThreadExit();
      ThreadFree();
    }
    else
    {
      // event signalled ...
      finished = true;
    }
  }
  
  RemoveThread(this);
  return 0;    
}
</pre>
<p>In this example the thread waits for <tt class=
"literal">mEvent</tt> to be signalled. If the wait times out,
the thread performs some simulated processing and then asks whether it
may exit. If the event is signalled, the thread exits.</p>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e183" id="d0e183"></a>Shutting down
the pool</h3>
</div>
<p>Shutting down a thread pool is very much application dependent.
The pool has no idea of the mechanism used by the thread to wait
for action notifications. It could be waiting on events, I/O
completion ports or an application-defined mechanism. The
<tt class="methodname">Stop</tt> method of the class <tt class=
"classname">Thread</tt> is pure virtual to force application
defined threads to implement this stopping mechanism. If the
<tt class="function">ThreadProc</tt> function is waiting on
multiple events, one of which is application stop, the stop
function can signal this event and return. The thread will receive the
event the next time it waits. To
ensure that the thread pool algorithm does not try to create more
threads, it is also informed of the shutdown.</p>
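<p>The signal and return behaviour of <tt class="methodname">Stop</tt>
can be sketched without Win32 events. The following uses a standard C++
condition variable in place of an event object, which is a substitution
made for this example; <tt class="classname">StoppableWorker</tt> and
<tt class="function">RunAndStop</tt> are hypothetical names, and the
10 millisecond timeout is arbitrary.</p>

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker whose Stop() signals a flag that the thread
// procedure waits on.
class StoppableWorker {
public:
    StoppableWorker() : mStop(false), mPasses(0) {}

    // Signals the stop flag and returns without waiting for the thread.
    bool Stop() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mStop = true;
        }
        mCond.notify_all();
        return true;
    }

    // Thread procedure: each wait either sees the stop flag and ends the
    // loop, or times out and does one unit of simulated processing.
    int ThreadProc() {
        std::unique_lock<std::mutex> lock(mMutex);
        while (!mCond.wait_for(lock, std::chrono::milliseconds(10),
                               [this] { return mStop; })) {
            ++mPasses;
        }
        return 0;
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    bool mStop;
    int mPasses;
};

// Usage: start the worker, signal stop, then wait for the thread to exit.
inline int RunAndStop() {
    StoppableWorker w;
    int rc = -1;
    std::thread t([&] { rc = w.ThreadProc(); });
    bool signalled = w.Stop();
    t.join();
    assert(signalled);
    return rc;
}
```

<p>If <tt class="methodname">Stop</tt> is called before the thread has
started waiting, the predicate is checked first and the thread exits
without blocking, so the signal cannot be lost.</p>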
<p>One possible implementation of the thread pool stop is:</p>
<pre class="programlisting">
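bool ThreadPool::Stop(){
  HANDLE * h = NULL;
  // n is declared here so that it is
  // still in scope for the wait below
  DWORD n = 0;
  // tell the algorithm about the shutdown
  Shutdown();
  {
    // stop the threads
    Guard&lt;CriticalSection&gt;
                 guard(&amp;mCritSec);
    Threads::iterator i;
    // signal stop and collect the handles
    h = new HANDLE[mThreads.size()];
    for(i = mThreads.begin(); i !=
             mThreads.end(); ++i, ++n)
    {
      Thread * t = (*i);
      t-&gt;Stop();
      h[n] = t-&gt;operator HANDLE();
    }
  }
  // wait for all to exit; note that
  // WaitForMultipleObjects accepts at most
  // MAXIMUM_WAIT_OBJECTS handles
  bool ok = (0 == n) || (WAIT_FAILED !=
    ::WaitForMultipleObjects(n, h,
                  TRUE, INFINITE));
  delete [] h;
  return ok;
}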
</pre>
<p>In the world of Win32 this code will inform the algorithm that
the pool is shutting down. Then it will signal each thread to stop
and call <tt class="function">WaitForMultipleObjects</tt> to wait
for all of the threads to exit.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e206" id=
"d0e206"></a>Conclusion</h2>
</div>
<p>The approach described provides a reusable framework for both
free standing threads and those that are part of a thread pool. The
mechanism to control the number of threads in the pool is defined by
the application and easily used by the pool. An application could
create a number of pools all using different strategies to control
the number of threads. The framework requires no prior knowledge of
the application thread functions and can therefore be packaged into
a library to be used in many projects.</p>
<p>Finally, thanks to Mark Radford and Martin Lucus for helping to
turn a set of disjointed notes into ...</p>
<div class="bibliography">
<div class="titlepage">
<h2><a name="d0e213" id="d0e213"></a>References</h2>
</div>
<div class="bibliomixed">
<p class="bibliomixed"><span class="citetitle"><i class=
"citetitle">Advanced Windows</i></span> (J Richter)</p>
</div>
<div class="bibliomixed">
<p class="bibliomixed"><span class="citetitle"><i class=
"citetitle">Object-Oriented Multithreading Using C++</i></span>
(Hughes and Hughes)</p>
</div>
<div class="bibliomixed">
<p class="bibliomixed"><span class="citetitle"><i class=
"citetitle">MSDN</i></span> (Microsoft)</p>
</div>
</div>
</div>
</p>
</div>
</channel>
</rss>
