Journal Articles

Overload Journal #34 - Oct 1999 + Programming Topics
Browse in : All > Journals > Overload > 34 (11)
All > Topics > Programming (877)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Uncounted pointers in C++

Author: Administrator

Date: 26 October 1999 17:50:55 +01:00 or Tue, 26 October 1999 17:50:55 +01:00

Summary: 

Body: 

Sometimes I come across an idea that strikes me as "interesting, but is there really a use for it". Less frequently, I subsequently find so many uses for it that I can't imagine how I ever survived without. I first encountered the idea of 'weak references' in discussions of Java garbage collection. Since then I've found several applications for the idea in C++.

The idea of a reference is much more general than the C++ language element of the same name. The broader definition used in design, garbage collection, and other programming languages includes any mechanism for referring to an object. In C++ this would include pointers (both base types and the various types of 'smart pointer'). It is this more general definition that I use in this article.

Garbage Collection is a tool for managing the memory referred to by references - it works on the principle that when there are no active references to an object then the program can never use the object again and the memory allocated to it may be reclaimed. This works well in languages like Java that have support for references in the language. (In Java all objects are managed by references.)

Not all references need be equal, and two of the variations that have been developed are 'strong references' and 'weak references'. The garbage collection tool only considers strong references when deciding if an object may be collected. Any weak references to an object will become null if the object is collected.

There are other variations, for example, in Java there are now three reference classes WeakReference, SoftReference and PhantomReference. These are essentially hooks into the Java garbage collection mechanism with different collection semantics.

smart pointers

For a variety of reasons garbage collection is not standard in C++ and in its most general form would be impractical to implement. For instance, C++ allows one to encrypt a pointer (e.g. by xoring it with a password), to write it out to a file, destroy all trace of the original, reload the encrypted version, decrypt it and regain access to the original object.

Instead of garbage collection C++ programs use a variety of memory management techniques. One of the most effective of these is the use of a smart pointer - classes that emulate a pointer and know when an object is to be delete. The standard library contains only one design of smart pointer - the auto_ptr template but there are many other possibilities. (For some of the possibilities check out http://www.boost.org/ and arglib on my website at http://www.octopull.demon.co.uk/.)

The idea of strong and weak references inspired the counted_ptr and uncounted_ptr templates in my arglib library. (We'll deal with uncounted_ptr shortly, after discussing counted_ptr.)

The strong reference implementation is counted_ptr and works by keeping count of the number of references to an object so that it may be deleted when the count reaches zero. (There is another implementation of this same idea on the boost website as shared_ptr.)

The problem with counted_ptr

As proponents of garbage collection will be quick to point out - there are some problems with this form of reference counting. If you've used a version of counted_ptr or any other kind of reference counting pointer then you will know the main issue is that of loops.

A loop occurs when two (or more) objects - such as "Pippa" and "Alan" in Example 1 - contain a cycle of references to each other. In the example Pippa is never deleted because of the reference in Alan, but Alan is never deleted because of the reference in Pippa.

Two other things you will notice in the example are parts of a test harness framework: the macro ARG_TEST and the class instance_count. ARG_TEST logs successful tests or reports unsuccessful ones, instance_count is a utility class that counts the instances in existence.

Example 1

class man;


class woman
{
public:
  void set_husband(counted_ptr<man> value);
private:
  counted_ptr<man> husband;
  instance_count   t;
};


class man
{
public:
  void set_wife(counted_ptr<woman> value);
private:
  counted_ptr<woman> wife;
  instance_count     t;
};


void woman::set_husband(counted_ptr<man>value)
  { husband = value; }
void man::set_wife(counted_ptr<woman> value)
  { wife = value; }


void f()
{
  {
    counted_ptr<woman> pippa(new woman);
    ARG_TEST(1 == instance_count::instances);
    {
      counted_ptr<man> alan(new man); 
      ARG_TEST(2==instance_count::instances);

      alan->set_wife(pippa);
      pippa->set_husband(alan);
    }
    ARG_TEST(2 == instance_count::instances);
  }

  // "Look ma - I've leaked!"
  ARG_TEST(2 == instance_count::instances);
}

Before encountering the idea of weak references and developing explicit support for them I used a mix of smart and raw pointers (with an often arbitrary choice of direction for the ownership). However, as Example 2 illustrates, this leads to the need for Pippa (say) to notify Alan that she's being deleted.

Example 2

class man;


class woman
{
public:
  void set_husband(counted_ptr<man> value);
  ~woman();
private:
  counted_ptr<man> husband;
  instance_count   t;
};


class man
{
public:
  man();
  void set_wife(woman* value);
private:
  woman*         wife;
  instance_count t;
};


void woman::set_husband(counted_ptr<man>value) 
{ husband = value; }

woman::~woman()
{ if (husband.get()) husband->set_wife(0); }

man::man()
  : wife(), t()
{}

void man::set_wife(woman* value)
{ wife = value; }


void f()
{
  {
    counted_ptr<woman> pippa(new woman);
    ARG_TEST(1 == instance_count::instances);
    {
      counted_ptr<man> alan(new man); 
      ARG_TEST(2== instance_count::instances);

      alan->set_wife(pippa.get());
      pippa->set_husband(alan);
    }
    ARG_TEST(2 == instance_count::instances);
  }
  ARG_TEST(0 == instance_count::instances);
}

Even in our simple example this is ugly. In more general usage this can lead to a need to maintain a list of objects to notify and the complexity of maintaining it. Other designs, such as external association lists may be more manageable, but this is still a far cry from the effortlessness we might expect from smart pointers.

uncounted_ptr

An uncounted_ptr refers to the same object as a counted_ptr but isn't included in the reference count. This makes it the weak reference type corresponding to counted_ptr's strong reference. When the object is deleted (because there are no more counted_ptrs referring to it) the uncounted_ptr becomes zero.

With uncounted_ptr in our toolkit we can now revisit the example. The loop problem is solvable by changing the link implementation between 'man' and 'woman' to uncounted_ptr. This breaks the loop by allowing Pippa to be deleted in Example 3. The example also illustrates that there is no dangling reference left in Alan - an ever-present danger when using raw pointers.

Example 3

class man;


class woman
{
public:
  void set_husband(uncounted_ptr<man> value);
private:
  uncounted_ptr<man> husband;
  instance_count t;
};


class man
{
public:
  void set_wife(uncounted_ptr<woman> value);
private:
  uncounted_ptr<woman> wife;
  instance_count t;
};


void woman::set_husband(uncounted_ptr<man>value)
{ husband = value; }

void man::set_wife(uncounted_ptr<woman> value)
{ wife = value; }


void f()
{
  {
    counted_ptr<woman> pippa(new woman);
    ARG_TEST(1 == instance_count::instances);
    {
      counted_ptr<man> alan(new man); 
      ARG_TEST(2== instance_count::instances);

      alan->set_wife(pippa);
      pippa->set_husband(alan);
    }
// alan has gone, but no need to inform pippa
    ARG_TEST(1 == instance_count::instances);
  }
  ARG_TEST(0 == instance_count::instances);
}

using smart pointers

There are a few points to note about any smart pointers you may be using. Most importantly, you need to understand the ownership regime that they implement. Most smart pointers, like the standard auto_ptr, take ownership of an object, so initialising two auto_ptrs from a single raw pointer value will lead to problems. (Because both pointers are entitled to delete the object, but this will lead to undefined behavior.)

Like auto_ptr, counted_ptr assumes ownership of the referenced object. Unlike auto_ptr it shares (not transfers) the ownership by means of the copy construction assignment operations. This gives counted_ptr the required value semantics needed to be used successfully in standard library containers.

The uncounted_ptr template refers to an object owned by one or more counted_ptrs, but it does not share in the ownership of the object. An uncounted_ptr automatically becomes 0 when the object is deleted. (Note however, that dereferencing an uncounted_ptr implicitly creates a temporary counted_ptr preserving the object long enough to complete the current expression.)

counted_ptr interface

template<typename pointee_type>
class counted_ptr : public  specialised_counted_ptr_base<pointee_type>
{
  typedef specialised_counted_ptr_base<pointee_type> base_type;

public:
  /** Construction */
  counted_ptr() ; /// "0 initialised" pointer
  explicit counted_ptr(pointee_type* p); /// Takes ownership of p
  counted_ptr(const counted_ptr& rhs); /// Take value of rhs
  template<typename rhs_pointee_type>
  counted_ptr(const specialised_counted_ptr_base<rhs_pointee_type>& rhs); /// Take value of rhs
  ~counted_ptr() throw();

  /** Accessors */
  pointee_type* get() const; /// Returns contents
  pointee_type* operator->() const; /// Dereference op
  pointee_type& operator*() const; ///  Dereference op

  /** Mutators */
  counted_ptr& operator=(const counted_ptr& rhs);
  template<typename rhs_pointee_type>
    counted_ptr& operator=(const specialised_counted_ptr_base<rhs_pointee_type>& rhs);
  void reset(pointee_type* p); /// Delete existing contents (and take ownership of p)
  void swap(base_type& with) throw():/// Swaps contents with "with"
};

uncounted_ptr interface

template<typename pointee_type>
class uncounted_ptr : public specialised_counted_ptr_base<pointee_type>
{
  typedef specialised_counted_ptr_base<pointee_type> base_type;

public:
  /** Construction */
  uncounted_ptr(); /// "0 initialised" pointer
  template<typename rhs_pointee_type>
  uncounted_ptr( const specialised_counted_ptr_base<rhs_pointee_type>& rhs); /// Take value of rhs
  ~uncounted_ptr() throw();

  /** Accessors */
  pointee_type* get() const; /// Returns contents
  pointee_type* operator->() const; /// Dereference op
  pointee_type& operator*() const; ///  Dereference op

  /** Mutators */
  template<typename rhs_pointee_type>
  uncounted_ptr& operator=( const specialised_counted_ptr_base<rhs_pointee_type>& rhs);
  void reset(pointee_type* p); /// Delete existing contents (and take ownership of p)
  void swap(base_type& with) throw(); /// Swaps contents with "with"
};

Notes: 

More fields may be available via dynamicdata ..