C# Generics - Beyond Containers of T

Overload Journal #74 - Aug 2006 + Programming Topics   Author: Steve Love
Steve Love takes a look at generics in C# v2.0, how to use them to simplify code and even remove dependencies.

One of the bigger differences between the latest version of C# and its predecessors is the addition of Generics. This facility is in fact provided and supported by the runtime (actually the Common Language Infrastructure, or CLI, specification), and exposed in the language of C# from version 2.0. Programmers already familiar with C++ templates or Java generics will immediately spot that they share a common base motivation with C# - the provision of type-safe generic containers of things. Syntactically they are similar, too, but it would have been grotesque for C# to choose an entirely different syntax just to be unique. Where these languages really diverge is in the implementation details. A discussion of these differences is beyond the scope of this article; this isn't a comparison between languages, rather an exploration of what C# generics offer. If you think that means you can do:

public class Stack< T >
{
}
    

and

private T Swap< T > ( T left, T right )
{
}
    

well, you're right, but that's not the whole story. This article is for people who already know you can do those things and are starting to wonder if they can do anything else.

Groundwork

At its most basic level, using generics is about writing less code. At a slightly higher level, it's about saying what you mean in code. Without generics, the contents of a list have to be manually cast from object to obtain the real thing. With a generic list container, you write less code because the casts are no longer required. A list container parameterised on the type of its contents also says what it is right on the tin. Speak less, say more.

The next thing generics give you is support from your compiler. If you try to get an integer out of a container of strings, the compiler tells you you're dumb. Defects like this, if left until the program runs, are much harder to find and fix. Generics provide stronger type-safety. Wise programmers depend on their compilers being smarter than they are, however smug the error messages are.
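As a small illustration of both points, the sketch below contrasts the pre-generics ArrayList with List< string >. The class and method names (CastDemo, LengthOfFirstUntyped, LengthOfFirstTyped) are made up for this example only.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class CastDemo
{
    // Pre-generics style: ArrayList stores object, so a cast is needed on the way out.
    public static int LengthOfFirstUntyped()
    {
        ArrayList names = new ArrayList();
        names.Add( "Ada" );
        string first = ( string )names[ 0 ];   // checked only when the program runs
        return first.Length;
    }

    // Generic style: the element type is part of the list's type, so no cast is needed.
    public static int LengthOfFirstTyped()
    {
        List< string > names = new List< string >();
        names.Add( "Ada" );
        string first = names[ 0 ];
        // int wrong = names[ 0 ];   // rejected by the compiler: a string is not an int
        return first.Length;
    }
}
```

Both methods return the same value; the difference is that a mistake in the second is caught by the compiler, whereas the first would fail only at runtime.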

Note that all the above is about using generic types, delegates, methods, etc. If you're writing generic code, you need to invest a bit in allowing the programmers using your code to write less code, say what they mean and expect the compiler to spot their misuses. Given that often the "other programmers" will be you later on, it's worth every penny.

The example (nope - it's not a stack)

Double Dispatch [Wiki] is a common pattern used to figure out the concrete type of an object when all you have is an interface. The Visitor Pattern is often used for this, but the intent of Double Dispatch is different. A common example is for Shape objects, and a basic implementation is in Listing 1. For the purposes of the example, the ShapeHandler class has a Receive method to stand in for client code. The important point is that the code needs to access the concrete Shape instances when it has only a Shape interface available.

interface Shape
{
  void Accept( Handler handler );
}
class Circle : Shape
{
  public void Accept( Handler handler )
  {
    handler.Handle( this );
  }
}
class Square : Shape
{
  public void Accept( Handler handler )
  {
    handler.Handle( this );
  }
}
interface Handler
{
    void Receive( Shape s );
    void Handle( Circle c );
    void Handle( Square s );
}
class ShapeHandler : Handler
{
    public void Receive( Shape shape )
    {
        shape.Accept( this );
    }
    public void Handle( Circle c )
    {
        Console.WriteLine( "Circle" );
    }
    public void Handle( Square s )
    {
        Console.WriteLine( "Square" );
    }
}
Listing 1

What's worth paying attention to in this example is what would be required if a new member of the Shape hierarchy gets added: the Handler interface grows a new method, and the ShapeHandler implementation grows a new method in sympathy. This isn't too onerous, really, but it's a nasty dependency. All of the classes shown must be in the same assembly, because to do otherwise would introduce a circular reference, which is not allowed. In practice this means that the Handler interface is redundant. We could just as easily use the concrete ShapeHandler object in the Accept methods.

We can use C# generics to remove some of the code duplication in this example, and more importantly, we can break the dependency between the Handler interface, and the concrete implementations of the Shape interface.

There are two areas in Listing 1 where code is duplicated. Firstly, each of the concrete implementations of the Shape interface requires an Accept method which contains identical code. Further implementations of Shape require exactly the same code. The second duplication is also a source of the dependency already mentioned: each of the implementations of Shape is mentioned in both the Handler interface and its implementation.

The ideal situation we'd like to achieve is to have the Shape and Handler interfaces living in their own assemblies (or perhaps a single common interfaces assembly), with concrete Shapes in one separate assembly, and the concrete ShapeHandler in another. We also want to divorce the Accept method from the Shape interface: it is not a Shape's responsibility to do double dispatch, it has other things to think about. We can in fact go one step further, and remove the duplication in the Square and Circle classes by moving the Accept method to a base class.

The Handler

The first pass at breaking this up is to remove the dependency between the Handler interface and the concrete Shape classes. This is where C#'s generics can help.

Generic code is representable regardless of the types it operates on (although we'll talk about type constraints later). "Representable" usually means some combination of:

  • An algorithm works irrespective of the types on which it operates - e.g. Swap or sort.
  • The storage of objects is the same whatever their type - e.g. Containers of "T".

Our requirements don't really fit either of these: the whole point of using double-dispatch is to vary the behaviour on the concrete type of the object we have in hand, and we're not really storing objects. What we can say is that the interface for the Handler is the same for each type of Shape object we consider. By using a generic interface, our Handler interface and ShapeHandler implementation become as shown in Listing 2.

interface Handler< HandledType >
{
  void Receive( Shape s );
  void Handle( HandledType c );
}
class ShapeHandler : Handler< Circle >,
   Handler< Square >
{
  public void Receive( Shape s )
  {
    s.Accept( this );
  }
  public void Handle( Circle c )
  {
    Console.WriteLine( "Circle handled" );
  }
  public void Handle( Square s )
  {
    Console.WriteLine( "Square handled" );
  }
}
Listing 2

This code demonstrates a simple use of generics, where the Handler has a generic parameter indicating the handled type. This handled type is then used by the Handle method declaration. When the interface is implemented, a Handle method for each of the generic arguments (in this case Circle and Square) must be implemented. Handler is a straightforward generative interface. In particular, it allows ShapeHandler to explicitly declare what types of Shape it is a handler for - it is like a label on a tin.

Note that the Handler interface no longer has the dependencies upon Circle and Square. This means we can safely put the Handler interface into a separate assembly, and have broken that part of the dependency circle. It presents other challenges, though.

Recall from Listing 1 that the Shape interface looked like this:

interface Shape
{
  void Accept( Handler handler );
}
    

Now that the Handler interface has a generic parameter, this is no longer valid. The problem is, what would we put as the argument to it?

interface Shape
{
  void Accept( Handler< ? > handler );
}
    

We could use the ShapeHandler class directly, but that would again make the Handler interface redundant, and would re-introduce the circular dependency between Shape and ShapeHandler. We could try the same trick as with Handler and make Shape a generic, but that would cause a different kind of difficulty. If Shape were generic, we would have to name its type parameter at every use, precluding uses like

List< Shape< ? > > shapes;
    

The answer is to use a generic method. Instead of making the whole of the Shape interface generic, we make only its Accept method generic. A first attempt might look like this:

void Accept< ShapeType >(
  Handler< ShapeType > handler )
    

This would be valid, parameterising Handler on the generic parameter of Accept, but we have now moved the problem of what to put as a generic argument back to ShapeHandler, in its Receive method:

public void Receive( Shape s )
{
  s.Accept( this );
}
    

This does not compile because the call to Shape.Accept is ambiguous: "this" could be interpreted as either a Handler< Circle > or a Handler< Square > in this context, and we cannot explicitly specify which to use, because at this point, we know only that we have a Shape. The best we can manage is to make the entire type of the handler a generic parameter.

void Accept< HandlerType >(
  HandlerType handler );
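The ambiguity that ruled out the first attempt can be reproduced in isolation. This is a minimal sketch with cut-down versions of the types in this article; InferenceDemo and HandledName are hypothetical names introduced only to show the effect.

```csharp
public class Circle { }
public class Square { }

public interface Handler< HandledType >
{
    void Handle( HandledType item );
}

public class ShapeHandler : Handler< Circle >, Handler< Square >
{
    public void Handle( Circle c ) { }
    public void Handle( Square s ) { }
}

public static class InferenceDemo
{
    // Same form as the first attempt at Accept: generic on the handled type.
    public static string HandledName< ShapeType >( Handler< ShapeType > handler )
    {
        return typeof( ShapeType ).Name;
    }

    public static string Run()
    {
        ShapeHandler handler = new ShapeHandler();
        // string bad = HandledName( handler );   // does not compile: ShapeType cannot be inferred
        // Naming the type argument resolves it, but only if the caller knows the concrete type.
        return HandledName< Circle >( handler ) + "," + HandledName< Square >( handler );
    }
}
```

The explicit type argument is exactly what Receive cannot supply, because all it holds is a Shape.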
    

We then need some way to tell the compiler the actual contents of the handler in the Accept method, and generics can't help here: a runtime cast is required. Listing 3 shows the new Shape hierarchy with this in place. ShapeHandler and its interface remain as in Listing 2.

interface Shape
{
  void Accept< HandlerType >( HandlerType handler );
}
class Circle : Shape
{
  void Shape.Accept< HandlerType >(
     HandlerType handler )
  {
    ( ( Handler< Circle > )handler ).Handle(
       this );
  }
}
class Square : Shape
{
  void Shape.Accept< HandlerType >(
     HandlerType handler )
  {
    ( ( Handler< Square > )handler ).Handle(
       this );
  }
}
Listing 3

Note that it's not possible to cast an unconstrained generic parameter to just anything. A cast to object is always valid, and a cast to an interface type (which is used here) is also OK. Constraining the parameter can make further casts valid.
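A short sketch of those casting rules, using a hypothetical CastRules class that is not part of the double-dispatch code:

```csharp
using System;

public static class CastRules
{
    // Casting an unconstrained T to an interface type compiles; it is checked at runtime.
    public static int CompareToSelf< T >( T value )
    {
        IComparable comparable = ( IComparable )value;
        return comparable.CompareTo( value );
    }

    // Casting T straight to an unrelated class type does not compile,
    // but going via object does (and is again checked at runtime).
    public static string AsText< T >( T value )
    {
        // return ( string )value;          // compile error: cannot convert T to string
        return ( string )( object )value;
    }
}
```

AsText( "abc" ) gives back the string, whereas AsText( 5 ) compiles but throws an InvalidCastException when run. The cast to Handler< Circle > in Listing 3 carries the same risk.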

This code works because generics in C# are a runtime artifact. The Accept method in Shape is implicitly virtual, because it's declared in an interface type. Derived classes (Circle and Square) inherit this method, and if code calls Accept using the Shape reference, the overridden method in the concrete (runtime) type will get called. If you're already familiar with C++ templates, this is probably the biggest difference between them and C# Generics. In C++, templates are a purely compile-time construction, and it's not possible to have a virtual function template.
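The same effect can be seen with an ordinary class hierarchy. The following is only a sketch, with hypothetical Base and Derived classes, showing a generic method that is also virtual:

```csharp
using System;

public class Base
{
    // A generic method can be virtual in C#.
    public virtual string Describe< T >( T value )
    {
        return "Base saw " + typeof( T ).Name;
    }
}

public class Derived : Base
{
    // The override is selected by the runtime type of the object,
    // while T is still supplied (or inferred) at each call site.
    public override string Describe< T >( T value )
    {
        return "Derived saw " + typeof( T ).Name;
    }
}
```

Calling Describe( 42 ) through a Base reference that actually refers to a Derived object runs the Derived version with T as int.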

Mission accomplished

We have reached the point where the circular dependency between concrete Shapes and the Handler interface is gone. We can create a new project structure reflecting rôles and responsibilities, with each different assembly having a specific rôle:

HandlerInterface

- interface Handler< HandledType >

ShapeInterface depends upon HandlerInterface

- interface Shape

Shapes depends upon ShapeInterface and HandlerInterface

- class Circle

- class Square

Application depends upon Shapes, ShapeInterface and HandlerInterface

- class ShapeHandler

There are no circular references, and apart from the main application, all dependencies are on interface-only assemblies. If nothing else, this makes testing easier. There remain some opportunities for overtime, however: we can still improve on what we have. It's time to make good on the promise that we can reduce the amount of duplicated code, and in the process make the whole a bit more tidy and self-describing.

The interfaces of shapes

It has already been noted that the Shape interface has too many responsibilities. Not only is it a shape, with all that implies, it is a shape that can take part in the double-dispatch mechanism we are describing here. That really isn't part of a Shape interface.

This indicates the need for a new interface, which we'll call Dispatchable. This interface exposes the double-dispatch mechanism in the same way as the old Shape interface did - a generic Accept method.

We can still do better. The implementations of the Accept method in the concrete Circle and Square classes are identical. Each class now implements Dispatchable as well as Shape, but the Accept method remains as it was in Listing 3. We have already identified that one of the uses of generic code is when an algorithm can operate regardless of the types using it: this is exactly that example.
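Before the duplication is removed, that intermediate step might look like the sketch below. The Area property and the NamingHandler class are assumptions added for illustration (the article does not say what a Shape actually offers), and the Receive method is left out to keep the example short.

```csharp
using System;

public interface Handler< HandledType >
{
    void Handle( HandledType item );
}

// The double-dispatch role, split out of Shape.
public interface Dispatchable
{
    void Accept< HandlerType >( HandlerType handler );
}

// Shape is now free to describe shapes only; Area is a hypothetical member.
public interface Shape
{
    double Area { get; }
}

public class Circle : Shape, Dispatchable
{
    public double Area { get { return Math.PI; } }   // unit circle, for illustration

    void Dispatchable.Accept< HandlerType >( HandlerType handler )
    {
        ( ( Handler< Circle > )handler ).Handle( this );
    }
}

public class Square : Shape, Dispatchable
{
    public double Area { get { return 1.0; } }       // unit square, for illustration

    // Identical to Circle's Accept apart from the type name: the duplication still to be removed.
    void Dispatchable.Accept< HandlerType >( HandlerType handler )
    {
        ( ( Handler< Square > )handler ).Handle( this );
    }
}

// Hypothetical handler that records which overload was chosen.
public class NamingHandler : Handler< Circle >, Handler< Square >
{
    public string Last = "";
    public void Handle( Circle c ) { Last = "Circle"; }
    public void Handle( Square s ) { Last = "Square"; }
}
```

Because Accept is implemented explicitly, it is reachable only through a Dispatchable reference, which keeps the dispatch machinery out of the public face of each shape.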

We can therefore make a base class common to all concrete shapes which implements the Accept method. Listing 4 shows a first try at what we want to achieve. Taking a leaf from a pattern normally associated with C++ - the Curiously Recurring Template Pattern [Coplien95] - each derived class parameterises its base class with itself.

public class Dispatchable< Dispatched >
   : Dispatchable
{
  void Dispatchable.Accept< HandlerType >(
     HandlerType handler )
  {
    ( ( Handler< Dispatched > )handler ).Handle(
       this );
  }
}
public class Circle : Dispatchable< Circle >, Shape
{
}
Listing 4

The observant among you will notice that we now have two types with the same name - Dispatchable. C# allows types to be "overloaded" based on the number of generic parameters, and thus Dispatchable and Dispatchable< Dispatched > are distinct types.
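A quick way to convince yourself of this, using a hypothetical Box family of types:

```csharp
using System;

// Three distinct types sharing one name, told apart by their number of generic parameters.
public class Box { }
public class Box< T > { }
public class Box< T, U > { }

public static class ArityDemo
{
    public static bool AllDistinct()
    {
        Type plain = typeof( Box );
        Type one = typeof( Box< int > );
        Type two = typeof( Box< int, string > );
        return plain != one && one != two && plain != two;
    }
}
```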

Unfortunately this doesn't compile because the call to Handle is passing this - which is an instance of Dispatchable< Dispatched >. Recall the Handler interface: it is a generic interface, where the Handle method declaration uses the generic parameter. ShapeHandler itself implements the Handle method with the actual concrete type of the shape being used, and so expects either a Circle or a Square. The Dispatched parameter is a generic handle for exactly that - depending on which concrete class implementation is in play - so we might try this:

( ( Handler< Dispatched > )handler )
   .Handle( ( Dispatched )this );
    

Alas this doesn't compile either. The difficulty is that the compiler doesn't know the type of Dispatched because it is resolved at runtime. The types to which we can cast "this" are strictly controlled: we can cast to object, to any direct or indirect base class or interface of this, or to the same type as this - which is usually redundant, but permitted. It cannot be cast to a type that is, as far as the compiler is concerned, an unrelated type.

Constraints

A short diversion into a comparison between C++ templates and C# generics might be illustrative of what is going on. In C++, a template parameter represents any type, and being a compile-time construct, the compiler knows when some facility is used that the supplied argument to the template parameter doesn't support. In C#, generic parameters are not resolved to types until runtime, so during compilation, the parameter still represents any type, but only as an object, the ultimate base class of all types. Therefore only those operations supported by object are permitted by the compiler, without extra information.

The extra information is provided either by explicitly casting the reference (as we've already seen), or by using constraints on the types allowed as arguments to that parameter. Listing 5 shows this in a simple way. Note there are several different types of constraint; a fairly detailed discussion can be found at [MSDN].

class Person
{
  public string Name
  {
    get { return name; }
  }
  private string name;
}
class PersonComparer
{
  public static bool Compare< Sorted >(
     Sorted left, Sorted right )
    where Sorted : Person
    {
      return string.Compare( left.Name,
                              right.Name ) < 0;
    }
}
Listing 5

The where clause on the PersonComparer.Compare method tells the compiler that the parameters are Person objects (or are derived from Person), and thus have a Name property. Without the constraint, this code won't compile because object has no Name property. In addition, if PersonComparer.Compare is called with arguments which are not Person objects, the compiler also issues an error - the constraint applies to the client code as well.
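To see the constraint working from the client side, here is a sketch under a few assumptions: Person is a class with a constructor taking the name (needed so it can be named in a constraint and derived from), Employee is a hypothetical derived class, and an ordinal string comparison is used so the result doesn't depend on culture.

```csharp
using System;

public class Person
{
    private readonly string name;
    public Person( string name ) { this.name = name; }
    public string Name { get { return name; } }
}

// Hypothetical derived type: it satisfies the constraint because it is a Person.
public class Employee : Person
{
    public Employee( string name ) : base( name ) { }
}

public static class PersonComparer
{
    public static bool Compare< Sorted >( Sorted left, Sorted right )
        where Sorted : Person
    {
        // Name is available because Sorted is known to be a Person.
        return string.Compare( left.Name, right.Name,
                               StringComparison.Ordinal ) < 0;
    }
}
```

A call such as PersonComparer.Compare( new Employee( "Alice" ), new Employee( "Bob" ) ) compiles and returns true, whereas PersonComparer.Compare( "Alice", "Bob" ) is rejected by the compiler because string does not derive from Person.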

So finally we should be able to finish the generic Dispatchable base class. From Listing 4, remember we need to be able to cast this to a type suitable for the argument to Handler< Dispatched >.Handle which accepts a reference to either Circle or Square. Depending on context, the Dispatchable class is either a Dispatchable< Circle > or Dispatchable< Square >, with the concrete type substituted for the generic parameter Dispatched.

In order to cast this to Dispatched, Dispatched must be constrained to ensure it's actually the same type as this, so the class declaration becomes:

public class Dispatchable< Dispatched > : Dispatchable
   where Dispatched : Dispatchable< Dispatched >
    

and the cast is now legal, allowing:

void Dispatchable.Accept< HandlerType >(
   HandlerType handler )
{
  ( ( Handler< Dispatched > )handler ).Handle(
     ( Dispatched )this );
}
    

With this in place, we can now add the Dispatchable interface, and perhaps the Dispatchable< Dispatched > base class, since it depends only on the Handler< Dispatched > interface, to the Handler assembly, and the Shape interface is completely independent of the double-dispatch mechanism.

The effort now required to add a new object to the Shape family is to add the class, inheriting from Dispatchable< Dispatched >, add a new Handler derivation to ShapeHandler, and add the overloaded Handle method. No copy-and-paste, and many mistakes in usage will be caught by the compiler. Another benefit of using this generic double-dispatch framework is that ShapeHandler need not be the only handler: there could be multiple implementations of the Handler interface, each handling a different set of Shape objects, with no duplication in the interface or implementations.
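Pulling the pieces together, the finished framework might look something like this. It is a sketch rather than a definitive listing: Triangle, CircleCounter and the Seen list are hypothetical additions, Shape is left empty, and the Receive method is omitted so that the handlers can be driven directly.

```csharp
using System;
using System.Collections.Generic;

public interface Handler< HandledType >
{
    void Handle( HandledType item );
}

public interface Dispatchable
{
    void Accept< HandlerType >( HandlerType handler );
}

// The one and only implementation of Accept.
public class Dispatchable< Dispatched > : Dispatchable
    where Dispatched : Dispatchable< Dispatched >
{
    void Dispatchable.Accept< HandlerType >( HandlerType handler )
    {
        ( ( Handler< Dispatched > )handler ).Handle( ( Dispatched )this );
    }
}

public interface Shape { }

public class Circle : Dispatchable< Circle >, Shape { }
public class Square : Dispatchable< Square >, Shape { }

// The new family member (hypothetical): no Accept code to write.
public class Triangle : Dispatchable< Triangle >, Shape { }

public class ShapeHandler : Handler< Circle >, Handler< Square >, Handler< Triangle >
{
    public readonly List< string > Seen = new List< string >();
    public void Handle( Circle c ) { Seen.Add( "Circle" ); }
    public void Handle( Square s ) { Seen.Add( "Square" ); }
    public void Handle( Triangle t ) { Seen.Add( "Triangle" ); }
}

// A second, independent handler (hypothetical) that only knows about circles.
public class CircleCounter : Handler< Circle >
{
    public int Count;
    public void Handle( Circle c ) { Count++; }
}
```

One thing the compiler cannot catch: handing a Square to CircleCounter compiles, but the cast inside Accept fails with an InvalidCastException at runtime, because CircleCounter is not a Handler< Square >.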

Using a Constraint for the Dispatched parameter in the Dispatchable< Dispatched > class gave the compiler extra information about Dispatched which allowed us to use it as a cast target. The question now arises - could we apply a constraint to HandlerType in the Accept method and so remove the runtime cast?

Unfortunately, the answer is "no".

An interface class specifies a contract which must be adhered to by implementing classes. A constraint on a generic parameter forms part of the interface, and therefore the contract, so implementing classes must match it exactly. The HandlerType parameter would need a constraint in the Dispatchable interface:

interface Dispatchable
{
  void Accept< HandlerType >(
     HandlerType handler )
     where HandlerType : Handler< ? >;
}
    

and we have no way of specifying what to use as the argument to Handler at that point.

In conclusion

There is more to life - and generic code - than containers and simple functions. Generics in C# improve the type-safety of code, which in turn gives us, as programmers, much greater confidence that our code, once compiled, is correct.

The double-dispatch example shows how generics allow a generative interface to remove hard-wired dependencies, how classes can make use of a common generic base class to reduce code duplication, and demonstrates the run-time nature of C# generics, using virtual generic methods. In addition, it shows the trade-off of using type constraints on generic parameters, where restricting the types permitted by user code allows the generic code more freedom in its implementation.

There is much more to generics in C# than can be covered here, including using constraints to improve code efficiency, specifying naked type constraints to match whole families of types (mimicking Java wild-cards), and creating arbitrary objects at runtime.

Generics in C# are not perfect - nothing is! - and there are limitations which can seem to be entirely gratuitous, but they provide a very powerful and expressive framework for improving code by allowing it to speak less, and say more.

Acknowledgements

The idea for using C# generics to implement double-dispatch is the result of inspiration from two sources:

Anthony Williams' article "Message Handling Without Dependencies" [Williams06] discusses managing the dependency problem of double-dispatch in C++ using templates. This got me wondering whether anything like it was possible in C# using generics.

Jon Jagger has an Occasional Software Blog [Jagger], where he uses the Visitor Pattern to demonstrate some properties of generics in C#. This showed me that double-dispatch probably was possible using C# generics.

Thanks to Phil Bass, Nigel Dickens, Pete Goodliffe, Alan Griffiths and Jon Jagger for their helpful comments and insights.

References

[Wiki] http://en.wikipedia.org/wiki/Double_dispatch

[Coplien95] James O Coplien, "Curiously Recurring Template Pattern", C++ Report February 1995

[MSDN] MSDN, "Constraints on Type Parameters (C# Programming Guide)", http://msdn2.microsoft.com/en-us/library/d5x73970.aspx

[Williams06] Anthony Williams, "Message Handling Without Dependencies", Dr Dobb's Journal May 2006, issue 384, available on-line at http://www.ddj.com/dept/cpp/184429055

[Jagger] Jon Jagger, "Less Code More Software (C# 2.0 - Visitor return type)", http://jonjagger.blogspot.com
