    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Self Registering Classes - Taking polymorphism to the
limit</title>
        <link>https://members.accu.org/index.php/articles/597</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Design of applications and programs + Overload Journal #27 - Aug 1998</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Self Registering Classes - Taking polymorphism to the
limit</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Fri, 27 August 1999 18:22:42 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a></h2>
</div>
<p>In this article, I wish to propose a method of allowing easy
addition and removal of classes from an application. This will use
registration of class-factory functions to emulate virtual
constructors.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e22" id=
"d0e22"></a>Introduction</h2>
</div>
<p>One of the main aims of an Object-Oriented programming language
is to attempt to reduce coupling between the parts of a program by
encapsulating the functionality and state of data structures within
class instances, and for those classes to expose as little as
possible to the outside world. Taken to an extreme, this becomes
component-based software development, in which an application may
comprise components written using a variety of languages and
possibly running on disparate machines and architectures, but for
now, we'll consider a single monolithic application.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e27" id="d0e27"></a>Coupling</h2>
</div>
<p>Firstly, what is the coupling problem?</p>
<p>Simply stated, it's the tendency for a subsystem A to know how
subsystem B works, and vice versa. Any change to A requires a
change to B, any change to B requires a change to A. Extend this to
subsystems C, D and E, and a combinatorial explosion of
dependencies occurs. Since larger systems tend to have more
subsystems, one of the primary tasks of the software engineer on
such projects is to avoid such reciprocal knowledge.</p>
<p>Ideally, then, a subsystem should have no knowledge of any
subsystem that knows about it, and the grand design then tends
toward the composition of more complex subsystems from simpler
ones, somewhat like this, where an arrow means 'knows about':</p>
<div class="c2"><img src=
"/var/uploads/journals/resources/bellingham_ploymorphism_classes.png" align=
"middle"></div>
<p>In this case, whoever is implementing B doesn't need to know
about A, and the implementor of C needs to know only about C.</p>
<p>In general, an attempt to design in this way will lead to
reduced maintenance problems, and produce cleaner code. It
shouldn't be hard to see that conceptually each subsystem roughly
corresponds either to a single class, or to a class with helper
classes that the client need not know about.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e41" id="d0e41"></a>Back to
reality</h2>
</div>
<p>In real life, it's rarely this easy. Subsystems may need to
notify their parents of changes, proxy classes may be returned that
multiple subsystems need to understand, and the result becomes
somewhat more of a cobweb. However, with suitable use of callback
functions, notifications mean that a subsystem doesn't actually
know anything about its owner, and common classes should be
considered almost as built-in types and changed about as frequently
&#9786;.</p>
<p>However, there is another potential problem, and that is hinted
at by the &quot;Law of 7 plus or minus 2&quot;. It is well known that human
beings have problems really understanding what's going on when a
large number of entities is under consideration, unless all the
entities are the same, as in an array or list. In this case,
consider the following:</p>
<div class="c2"><img src=
"/var/uploads/journals/resources/bellingham_ploymorphism_classes2.png" align=
"middle"></div>
<p>In this case, subsystem J has to know how all the subsystems
from A to I work. However, much of the time, many of these
subsystems, although different in detail, do similar work, and this
is where a language such as C++ can simplify things by presenting
all of these as being effectively the same class, by allowing the
designer to use polymorphism.</p>
<p>By providing an abstract base class which exposes a common
interface for all of these classes, instead of 9 subsystems A to I,
we should be able to treat it as 9 copies of a single subsystem
that just happen to be different internally.</p>
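<p>As a minimal sketch of this idea, the listing below uses
hypothetical names (Subsystem, SubsystemA, SubsystemB and DescribeAll
are not part of the article's example) to show an owner that works
purely through the abstract base class:</p>
<pre class="programlisting">
#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical common interface shared by the similar subsystems
class Subsystem
{
public:
  virtual ~Subsystem() { }
  virtual std::string Describe() const = 0 ;
} ;

class SubsystemA : public Subsystem
{
public:
  virtual std::string Describe() const { return &quot;A&quot; ; }
} ;

class SubsystemB : public Subsystem
{
public:
  virtual std::string Describe() const { return &quot;B&quot; ; }
} ;

// The owner (J in the diagram) names only the base class
std::string DescribeAll(const std::vector&lt;Subsystem *&gt;&amp; parts)
{
  std::string result ;
  for (std::size_t i = 0 ; i &lt; parts.size() ; ++i)
    result += parts[i]-&gt;Describe() ;
  return result ;
}
</pre>
<p>DescribeAll() mentions only Subsystem, so adding a further
derived class would not require any change to it.</p>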
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e53" id="d0e53"></a>The problem of
creation</h2>
</div>
<p>Indeed, careful use of C++ virtual functions does allow us to
use polymorphism to dramatically reduce the number of times that an
owner actually has to know about which concrete class it is
currently using. However, there is one major function that cannot
be made virtual: the constructor. As a result, there is often a
switch statement, that looks something like this:</p>
<pre class="programlisting">
void Figure1Func(int objectType, int param)
{
  GraphicItem * AC = NULL ;
  switch(objectType)
  {
  case 0:
    AC = new TextItem(param) ; break ;
  case 1:
    AC = new Box(param) ; break ;
  //...
  case 99:
    AC = new FilledEllipse(param); break;
  }

  if (AC)
  {
    AC-&gt;DoWhatever();
    delete AC ;
  }
}
</pre>
<p><span class="bold"><b>Figure 1 - calling constructors from a
switch statement</b></span></p>
<p>Also, it is frequently the case that there will be a requirement
to serialise such items in or out of memory. Serialising out is
easy - it just requires a suitable virtual function call, and the
object will write itself out. Serialising into memory, though, is
harder - because there is no existing object that can be called
that is known to be of the right type. So, a switch statement will
occur there as well:</p>
<pre class="programlisting">
GraphicItem * Figure2Func(std::istream&amp; inputstream) 
{
  int objectType = -1 ;  // stays -1 if the read fails
  GraphicItem * AC = NULL ;

  inputstream &gt;&gt; objectType ;
  switch(objectType) 
  {
    case 0:
      AC = new TextItem(inputstream) ;
      break ;
    case 1:
      AC = new Box(inputstream) ;
      break ;
    // ...
    case 99:
      AC = new FilledEllipse(inputstream) ;
      break ;
  }

  //  Hand the new object (or NULL for an unknown type) to the caller
  return AC ;
}
</pre>
<p><span class="bold"><b>Figure 2 - serialising from a switch
statement</b></span></p>
<p>If the application is only ever to have a fixed number of such
classes, there wouldn't be too much of a problem. Unfortunately for
software developers, there is rarely such a creature as a finished
program. New classes get added in. Special versions get written
that have classes deliberately left out. Menus exist listing the
options, and these need to be changed. Sooner or later, someone is
going to miss updating the switch statements correctly, and all
hell will be let loose.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e72" id="d0e72"></a>Banishing the
constructor</h2>
</div>
<p>The whole problem is that the owner has to know exactly what
concrete classes are available. It would be so much simpler if a
list could be built automatically. And who knows better than the
classes themselves?</p>
<p>Consider a class:</p>
<pre class="programlisting">
class GraphicItem
{
protected:
  GraphicItem(int param) { ; }

public:
  virtual ~GraphicItem () { }  //  DoWhatever() keeps the class abstract
  virtual void DoWhatever () = 0 ;
} ;
</pre>
<p><span class="bold"><b>Figure 3a: GraphicItem.h</b></span></p>
<p>We may then derive the concrete types from it, like this:</p>
<pre class="programlisting">
class FilledEllipse : public GraphicItem
{
private:
  FilledEllipse(int param) ;

public:
  virtual ~ FilledEllipse () ;
  virtual void DoWhatever () ;

  static GraphicItem * 
                 Construct (int param) ;
  enum { ID = 99 } ;
  //  Different for each class
} ;
</pre>
<p><span class="bold"><b>Figure 3b: FilledEllipse.h</b></span></p>
<p>This class has a private constructor, and a public class factory
function - i.e., a function that returns a constructed instance of
the class. The class factory function actually uses the private
constructor.</p>
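<p>A minimal sketch of how the two fit together, using cut-down
stand-ins for the classes in Figures 3a and 3b so that it compiles on
its own. The size member and the Size() accessor are hypothetical
additions, present only so that the result can be inspected:</p>
<pre class="programlisting">
class GraphicItem
{
protected:
  GraphicItem(int) { }

public:
  virtual ~GraphicItem() { }
  virtual void DoWhatever() = 0 ;
} ;

class FilledEllipse : public GraphicItem
{
private:
  FilledEllipse(int param) : GraphicItem(param), size(param) { }
  int size ;  // hypothetical state, for illustration only

public:
  virtual void DoWhatever() { }
  int Size() const { return size ; }

  static GraphicItem * Construct(int param) ;
  enum { ID = 99 } ;
} ;

// A static member function may call the private constructor;
// code outside the class can only go through Construct()
GraphicItem * FilledEllipse::Construct(int param)
{
  return new FilledEllipse(param) ;
}
</pre>
<p>Writing new FilledEllipse(3) outside the class would fail to
compile, so Construct() is the only way in.</p>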
<p>We could have a table (or better yet, a map), of these class
factory functions against class IDs, and the client code could then
scan the table for the right function to call in order to construct
a new FilledEllipse given only an ID:</p>
<pre class="programlisting">
#include &lt;map&gt;
#include &quot;GraphicItem.h&quot;
// typedefs to reduce typing later
//
typedef GraphicItem * 
         (*ClassFactoryFn)( int params) ;
typedef std::map&lt;int, ClassFactoryFn&gt; FactoryMapType ;
typedef FactoryMapType::const_iterator FactoryMapIter ;

FactoryMapType FactoryMap ;

//  Somehow FactoryMap is initialised ...

void Figure4Func(int objectType, int param)
{
  FactoryMapIter it = 
            FactoryMap.find(objectType) ;
  if ( it != FactoryMap.end())
  {
    GraphicItem * AC = 
                    (*it).second(param) ;
    AC-&gt;DoWhatever();
    delete AC ;
  }
}
</pre>
<p><span class="bold"><b>Figure 4: using a factory
map</b></span></p>
<p>You will see that, if FactoryMap is constructed to contain
object IDs and function pointers to the class factories, the client
has no idea at all what the real objects constructed are. This is
polymorphism taken to the limit. Note especially that it doesn't
have to include the subsidiary include files for the individual
concrete types - all it needs to know is listed in the abstract
base class declaration.</p>
<p>Since there should only be a single instance of the Factory and
it should exist for the whole program run, it's probably best
implemented using the pattern:</p>
<pre class="programlisting">
FactoryMapType&amp; FactoryMap()
{
  static FactoryMapType FMT ;
  return FMT ;
}
</pre>
<p><span class="bold"><b>Figure 5: a singleton factory
map</b></span></p>
<p>This means that anything attempting to access it cannot see it
before it's constructed.</p>
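<p>The effect can be sketched in isolation. In the listing below, Fn,
MapType, TheMap, Seven, EarlyUser and CallEntry are hypothetical
names, with Fn standing in for ClassFactoryFn. The point is only that
the constructor of a static object can safely use the map, because
the map is constructed on the first call to TheMap():</p>
<pre class="programlisting">
#include &lt;map&gt;

typedef int (*Fn)() ;  // hypothetical stand-in for ClassFactoryFn
typedef std::map&lt;int, Fn&gt; MapType ;

MapType&amp; TheMap()
{
  static MapType m ;  // constructed the first time control reaches here
  return m ;
}

int Seven() { return 7 ; }

// A namespace-scope object whose constructor uses the map.
// Whenever this runs, TheMap() has built the map before returning it
struct EarlyUser
{
  EarlyUser() { TheMap()[1] = Seven ; }
} ;
static EarlyUser earlyUser ;

// Returns -1 when nothing is stored under the key
int CallEntry(int key)
{
  MapType::const_iterator it = TheMap().find(key) ;
  return it != TheMap().end() ? it-&gt;second() : -1 ;
}
</pre>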
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e111" id="d0e111"></a>Building the
class factory map</h2>
</div>
<p>&quot;Aha,&quot; I hear you say, &quot;this has only moved the problem
elsewhere. Something has to build the Factory map, and that
something has to know about the functions.&quot; Well, not quite.</p>
<p>What if the classes themselves cooperate in building the map, or
at least helper classes do? All the client has to supply is a
function for the classes to register themselves:</p>
<pre class="programlisting">
void RegisterFactory(int ID, ClassFactoryFn fn)
{
  FactoryMap()[ID] = fn ;
}
</pre>
<p><span class="bold"><b>Figure 6a: registering with the
factory</b></span></p>
<p>Now all that is required is to ensure that this function is
called for each of the classes. That can be done by a helper
class:</p>
<pre class="programlisting">
template&lt;class T&gt; class FactoryRegistrar
{
public:
  FactoryRegistrar()
  {
    RegisterFactory(T::ID, T::Construct);
  }
} ;
</pre>
<p><span class="bold"><b>Figure 6b:
FactoryRegistrar.h</b></span></p>
<pre class="programlisting">
#include &quot;FactoryRegistrar.h&quot;
#include &quot;FilledEllipse.h&quot;

static FactoryRegistrar&lt;FilledEllipse&gt; FRFE ;

//  Implementation of FilledEllipse
</pre>
<p><span class="bold"><b>Figure 6c:
FilledEllipse.cpp</b></span></p>
<p>The construction of the static helper class does the class
registration. Assuming one module per concrete object, then all
that needs to be done is to link the required modules to the main
client code, and on program startup, the FactoryRegistrars get
constructed, the class factory functions get registered and the
client suddenly &quot;knows&quot; about the available classes.</p>
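<p>Condensed into a single file, the pieces from Figures 4 to 6 might
fit together as sketched below. This assumes a Box class with a
hypothetical Sides() member in place of DoWhatever(), so that the
result can be checked; Create() is likewise a hypothetical helper and
not part of the article's code:</p>
<pre class="programlisting">
#include &lt;map&gt;

class GraphicItem
{
public:
  virtual ~GraphicItem() { }
  virtual int Sides() const = 0 ;  // hypothetical, for the demo only
} ;

typedef GraphicItem * (*ClassFactoryFn)(int param) ;
typedef std::map&lt;int, ClassFactoryFn&gt; FactoryMapType ;

FactoryMapType&amp; FactoryMap()
{
  static FactoryMapType FMT ;
  return FMT ;
}

void RegisterFactory(int ID, ClassFactoryFn fn)
{
  FactoryMap()[ID] = fn ;
}

template&lt;class T&gt; class FactoryRegistrar
{
public:
  FactoryRegistrar() { RegisterFactory(T::ID, T::Construct) ; }
} ;

// Would normally live in its own source file (Box.cpp)
class Box : public GraphicItem
{
  Box(int) { }

public:
  virtual int Sides() const { return 4 ; }
  static GraphicItem * Construct(int param) { return new Box(param) ; }
  enum { ID = 1 } ;
} ;
static FactoryRegistrar&lt;Box&gt; FRBox ;

// Client code: knows IDs and GraphicItem, nothing else.
// Returns 0 when no class has registered under that ID
GraphicItem * Create(int objectType, int param)
{
  FactoryMapType::const_iterator it = FactoryMap().find(objectType) ;
  return it != FactoryMap().end() ? it-&gt;second(param) : 0 ;
}
</pre>
<p>Because everything here sits in one translation unit, the
registrar is constructed before Create() is first used.</p>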
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e137" id="d0e137"></a>The snake in
the grass</h2>
</div>
<p>But there is a problem with this approach. In fact, there are
two, closely related.</p>
<p>According to the ISO C++ Standard, &sect;3.6.2 (Initialization
of non-local objects [basic.start.init]):</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;It is implementation-defined whether the dynamic initialization
(_dcl.init_, _class.static_, _class.ctor_, _class.expl.init_) of an
object of namespace scope with static storage duration is done
before the first statement of main or deferred to any point in time
after the first statement of main but before the first use of a
function or object defined in the same translation unit.&quot;</p>
</blockquote>
</div>
<p>This means that the implementation may never construct our
FactoryRegistrar at all: initialisation only has to happen before
the first use of a function or object in that translation unit, and
nothing in the unit can be used until the registrar has run.</p>
<p>Secondly, it might be useful to build a library of these
classes. However, modern linkers making use of such a library will
only include those units which they can see are used. Again,
because no function call is made into these units, the linker will
totally ignore them. This becomes even more obvious when you
consider a set of ten classes, of which you want five - only pure
telepathy on the part of the linker would help it.</p>
<p>So, we need an answer.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e153" id="d0e153"></a>The huge
source unit option</h2>
</div>
<p>The first method is crude, but it should work - compiler limits
aside. Simply create a source file that will be linked in, and
#include within it all the source files for the classes you want.
It will also need a function called within it before the Factory
map is used for the first time:</p>
<pre class="programlisting">
void InitGraphics ()
{
}

// Change these lines to change
// which classes are available
//
#include &quot;FilledEllipse.cpp&quot;
#include &quot;Box.cpp&quot;
</pre>
<p><span class="bold"><b>Figure 7: AllGraphics.cpp</b></span></p>
<p>You'll need to ensure that the headers can be multiply included,
and it would be an extremely good idea to put the contents of each
of the sources within its own namespace. This solution means that
the statics should be constructed, as long as some function in this
unit gets called. However, putting the classes into a library is no
longer possible, and a full compilation of this unit is required,
which may be quite time consuming, whenever a configuration change
occurs.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e165" id="d0e165"></a>The one call
option</h2>
</div>
<p>An alternative method is somewhat cleaner. Again, we define a
function that the client code should call. But this time, it calls
a function in each of the class units to be used in this
configuration:</p>
<pre class="programlisting">
extern void InitialiseFilledEllipse() ;
extern void InitialiseBox();

void InitGraphics ()
{
  // Change these lines to change
  // which classes are available
  //
  InitialiseFilledEllipse() ;
  InitialiseBox() ;
}
</pre>
<p><span class="bold"><b>Figure 8a: AllGraphics.cpp</b></span></p>
<pre class="programlisting">
#include &quot;FactoryRegistrar.h&quot;
#include &quot;FilledEllipse.h&quot;

void InitialiseFilledEllipse()
{
  static FactoryRegistrar&lt;FilledEllipse&gt; FRFE;
}

// Implementation of FilledEllipse
</pre>
<p><span class="bold"><b>Figure 8b:
FilledEllipse.cpp</b></span></p>
<p>Now we can place the class units into a library, and because we
know that the class factory registrar will be constructed, we know
that the class factories will be registered. Also, when a
configuration is changed, it's a much smaller unit that gets
recompiled.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e177" id="d0e177"></a>Cleaning
up</h2>
</div>
<p>By now, we have two functions that are global, but that deal
with the singleton FactoryMap, either directly or indirectly:
RegisterFactory() and InitGraphics(). It makes sense to fold them
into the FactoryMap itself: RegisterFactory() becomes a member
function, and the functionality in InitGraphics() runs when the map
is first created. So let's see what our final result looks like:</p>
<pre class="programlisting">
class GraphicItem
{
protected:
  GraphicItem(int param) { ; }

public:
  //  Defined inline, since derived destructors call it
  virtual ~GraphicItem () { }
  virtual void DoWhatever () = 0 ;
} ;
</pre>
<p><span class="bold"><b>GraphicItem.h</b></span></p>
<pre class="programlisting">
#include &quot;GraphicItem.h&quot;
#include &lt;map&gt;

typedef GraphicItem * (*ClassFactoryFn)( int param) ;

class GraphicsFactoryMapImpl : public std::map&lt;int, ClassFactoryFn&gt;
{
public:
  GraphicsFactoryMapImpl() ;
  void Register(int ID, ClassFactoryFn fn) ;
} ;

typedef GraphicsFactoryMapImpl::const_iterator GraphicsFactoryIter ;

GraphicsFactoryMapImpl&amp; GraphicsFactoryMap() ;

template&lt;class T&gt; class GraphicsFactoryRegistrar
{
public:
  GraphicsFactoryRegistrar()
  {
    GraphicsFactoryMap().
                Register(T::ID, T::Construct);
  }
} ;
</pre>
<p><span class="bold"><b>GraphicsFactoryMap.h</b></span></p>
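<pre class="programlisting">
#include &quot;GraphicsFactoryMap.h&quot;

#define INCLUDE_UNIT(a) extern void Initialise##a();Initialise##a() ;

GraphicsFactoryMapImpl &amp; GraphicsFactoryMap()
{
  static GraphicsFactoryMapImpl FMT ;
  static bool registered = false ;

  if (!registered)
  {
    //  Set the flag first: each registrar calls
    //  GraphicsFactoryMap() again to reach Register()
    registered = true ;

    // Change these lines to change
    // which classes are available
    //
    INCLUDE_UNIT(FilledEllipse)
    INCLUDE_UNIT(Box)
  }
  return FMT ;
}

//  The units are not registered from this constructor: a registrar
//  would then re-enter GraphicsFactoryMap() while FMT is still
//  being initialised, which is undefined behaviour
GraphicsFactoryMapImpl::GraphicsFactoryMapImpl()
{
}

void GraphicsFactoryMapImpl::Register(int ID, ClassFactoryFn fn)
{
  (*this)[ID] = fn ;
}
</pre>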
<p><span class="bold"><b>GraphicsFactoryMap.cpp</b></span></p>
<pre class="programlisting">
//  No need for a separate header
// since nothing else includes it
//
#include &quot;GraphicsFactoryMap.h&quot;

namespace {
class FilledEllipse : public GraphicItem
{
private:
  FilledEllipse(int param) ;

public:
  virtual ~FilledEllipse () ;
  virtual void DoWhatever () ;

  static GraphicItem *
                  Construct(int param) ;
  enum { ID = 99 } ;
} ;

//  Actual implementation here ...

} /* namespace anonymous */

extern void InitialiseFilledEllipse () ;
void InitialiseFilledEllipse ()
{
  static GraphicsFactoryRegistrar&lt;FilledEllipse&gt; GFR ;
}
</pre>
<p><span class="bold"><b>FilledEllipse.cpp</b></span></p>
<pre class="programlisting">
#include &quot;GraphicsFactoryMap.h&quot;

void SomeFunc(int objectType, int param)
{
  GraphicsFactoryIter it = GraphicsFactoryMap().find(objectType) ;
  if ( it != GraphicsFactoryMap().end())
  {
    GraphicItem * AC = (*it).second(param) ;
    AC-&gt;DoWhatever();
    delete AC ;
  }
}
</pre>
<p><span class="bold"><b>Actual usage</b></span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e202" id=
"d0e202"></a>Conclusion</h2>
</div>
<p>In reality, there are likely to be more functions than just a
simple class factory that will want to be registered - and it's
quite feasible that the registration will insert string
descriptions into menus as well. This example should be sufficient
to demonstrate a methodology that can be extended to such cases
safely and easily.</p>
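<p>As a rough sketch of that extension, assume each class supplies a
static MenuText() function alongside Construct(). The registered
entry could then carry a menu caption as well as the factory.
ClassInfo, ClassInfoRegistry, ClassInfoRegistrar, MenuText and
MenuEntries are all hypothetical names:</p>
<pre class="programlisting">
#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

class GraphicItem
{
public:
  virtual ~GraphicItem() { }
} ;

typedef GraphicItem * (*ClassFactoryFn)(int param) ;

// Hypothetical registry entry: a menu caption plus the factory
struct ClassInfo
{
  std::string menuText ;
  ClassFactoryFn construct ;
} ;

typedef std::map&lt;int, ClassInfo&gt; ClassInfoMap ;

ClassInfoMap&amp; ClassInfoRegistry()
{
  static ClassInfoMap registry ;
  return registry ;
}

template&lt;class T&gt; class ClassInfoRegistrar
{
public:
  ClassInfoRegistrar()
  {
    ClassInfo info ;
    info.menuText = T::MenuText() ;
    info.construct = T::Construct ;
    ClassInfoRegistry()[T::ID] = info ;
  }
} ;

class Box : public GraphicItem
{
  Box(int) { }

public:
  static GraphicItem * Construct(int param) { return new Box(param) ; }
  static const char * MenuText() { return &quot;Box&quot; ; }
  enum { ID = 1 } ;
} ;
static ClassInfoRegistrar&lt;Box&gt; boxRegistrar ;

// Builds the menu captions from whatever has registered, in ID order
std::vector&lt;std::string&gt; MenuEntries()
{
  std::vector&lt;std::string&gt; entries ;
  for (ClassInfoMap::const_iterator it = ClassInfoRegistry().begin() ;
       it != ClassInfoRegistry().end() ; ++it)
    entries.push_back(it-&gt;second.menuText) ;
  return entries ;
}
</pre>
<p>The menu code then needs no list of class names of its own; it
shows whatever has been registered.</p>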
</div>
</p>
</div>
</channel>
</rss>
