Choosing the right parameters for a template can make a significant difference to how useful the template is. In this article, I will present a very simple guideline that, where applicable, can improve a template's flexibility. I will also provide an example of how the standard library itself could have applied this guideline but didn't.
The fundamental idea can be seen in the following example:
template<typename Element> struct inflexible { typedef Element element_t; typedef std::vector<Element> container_t; // ... }; template<typename Container> struct flexible { typedef typename Container::value_type element_t; typedef Container container_t; // ... };
These two templates both define two member typedefs element_t and container_t, presumably for further use internally (not shown). In the first case, although the template can have any element type, it always uses a std::vector as container. The second case is more flexible, since it will work with any container, and any element type, provided that the container has a sufficiently vector-like interface. The principle at work can be stated as follows:
"A template will be more flexible if, instead of internally generating a new type from its arguments, it accepts the generated type directly as a parameter."
Unfortunately, there are some costs associated with this approach, which I will point out first before expanding on the benefits. Firstly, the interface may be less convenient. From the example, client code would have to use flexible<std::vector<int> > instead of simply inflexible<int>. There is an easy solution, which is to provide a convenient interface once the more flexible implementation is available:
template <typename Element> struct convenient : flexible<std::vector<Element> > { };
The second cost is that the documentation for the template will be more complicated. In the example, instead of merely specifying the constraints on the element type, the documentation must now also describe what interface the container type must provide, such as an operator[] member function, insert functions and so on. Ideally, the requirements would also be broad enough to allow for future changes in the implementation of the template, for instance switching internally from using operator[] to random-access iterators.
Lastly, to inter-operate with a wide variety of argument types, the template implementation will need to be more carefully written. In the example, instead of assuming that size_t is the correct type for indexing into the container, the implementation should use typename container_t::size_type.
To see the consequences of writing a more flexible template, let's take a look at std::map:
template <class Key, class T, ...> class map { public: typedef Key key_type; typedef T mapped_type; typedef pair<const Key, T> value_type; // ... };
This should be recognizable as a variant of the first template /inflexible/, since it generates the value_type (using std::pair) from template arguments instead of accepting a value_type parameter directly. If I have a user-defined type that has both key_type and mapped_type in the same object, I have a problem which will be familiar to some readers from their own experience. For example:
struct person { person_id_t m_person_id; //Unique identifier std::string m_surname; std::string m_other_names; // ... }; void foo () { typedef std::map<person_id_t, person> map_t; map_t my_map; person p; person_id_t id; // must generate a pair object for insertion my_map.insert(map_t::value_type (p.m_person_id, p)); // lookup by ID and modification are // convenient my_map[id].m_other_names = "Joe"; }
The insertion is a little cumbersome, and duplicates the person ID for every entry in the map. How much of a problem this is in practice depends on the nature of the key type in use, but it is usually a good idea to avoid data duplication where possible. Alternatively, one could choose to use std::set and have code like this:
struct person_id_less { bool operator()(person const &p1, person const &p2) { return p1.m_person_id < p2.m_person_id; } }; void foo () { typedef std::set<person, person_id_less> set_t; set_t my_set; person p; person_id_t id; // insertion is convenient my_set.insert (p); // find requires a person object instead // of just the ID p.m_person_id = id; // mutable access requires a const_cast const_cast<person &>( *my_set.find (p)).m_other_names = "Joe"; }
So std::set is easier to use as far as insertions are concerned, and does not duplicate any data, but element lookup and modification are made more difficult. In fact, the std::set example has a potentially catastrophic problem, since my_set.find() will return my_set.end() if there is no match, leading to undefined behaviour from the code. None of these problems are insurmountable, but they do reveal some limitations of the two templates' interfaces.
Now consider an alternative template, map2, which applies this article's guideline by accepting the complete value type as a template parameter:
template <class Value, ...> class map2 { public: typedef typename Value::first_type key_type; typedef typename Value::second_type mapped_type; typedef Value value_type; // ... };
This template assumes that the supplied type provides the same interface as std::pair, which means that the map implementation needs almost no changes at all. Unfortunately, this also means that the supplied type must have public member variables first and second which contain the object's key and mapped value, respectively. So allowing the client to provide different value types, but requiring a matching interface, hasn't actually achieved very much in this case.
Of course, we don't have to stop there. Having made the decision to accept value types other than instances of std::pair, it is fairly natural to consider alternative interfaces. For instance, requiring that the value type provide member functions get_key and get_mapped would solve most of the problems. It would then be relatively easy to extend the person class to provide the necessary interface and store person objects directly in a map2.
Unfortunately, this assumes that the value type knows in advance that it is going to be stored in a map. Furthermore, it is not very convenient for user defined types that could appear in different maps with different keys (e.g. person::m_surname would be suitable as an alternative multimap key). A far better solution would be to accept some additional information via a traits class:
template <class Traits, ...> class map3 { public: typedef typename Traits::key_type key_type; typedef typename Traits::mapped_type mapped_type; typedef typename Traits::value_type value_type; // ... }; struct person_id_traits { typedef person_id_t key_type; typedef person mapped_type; typedef person value_type; static key_type const &get_key( value_type const &val) { return val.m_person_id; } static mapped_type &get_mapped( value_type &val) { return val; } static value_type construct( key_type const &key) { // Required by map3::operator[] return value_type (key); } }; void foo () { // person has an unchanged interface typedef map3<person_id_traits> map_t; map_t my_map; person p; person_id_t id; // insertion is convenient my_map.insert (p); // lookup by ID and modification are // convenient my_map[id].m_other_names = "Joe"; }
This template provides convenient interfaces for insertion, lookup and modification. It avoids any data duplication and compares well to the std::map and std::set versions, which each made some of the operations simple but not others.
Before going on, the construct function in the traits class probably requires further explanation. It is necessary because the map3 operator[] accepts just a key as parameter and might have to insert a whole new value into the map. The std::map template has a similar constraint, since it defines operator[] in terms of insert(make_pair(key, T())), requiring that its parameter T be default constructible. This is also quite similar to the lookup problem mentioned for std::set, which requires a complete value object in order to search the container. The advantage of map (or map3) is that this problem only arises in operator[] and not the alternative find member function.
So the traits-based version provides a good solution, because it means that almost any class can be stored in the map3 container without internal changes. There is some work involved in writing a new traits class every time, but it is easy enough to emulate the original std::map interface for the simple cases that it conveniently supports:
template<typename Key, typename T> struct std_map_traits { typedef Key key_type; typedef T mapped_type; typedef std::pair<Key const, T> value_type; static key_type const &get_key( value_type const &val) { return val.first; } static mapped_type &get_mapped( value_type &val) { return val.second; } static value_type construct( key_type const &key) { // Required by map3::operator[] return value_type (key, mapped_type()); } }; template<typename Key, typename T, ...> struct std_map : map3<std_map_traits<Key, T>, ...> { // constructors... }; void bar () { std_map<int, std::string> my_map; my_map[1] = "hello"; }
There are plenty of other applications for a more flexible map. For instance, suppose that we want two different indexes for the same collection of person objects, one which uses the (unique) person ID, and another which uses the (non-unique) surname. A sensible way to do this is to use a reference-counted smart pointer, and maintain two sets of pointers that are sorted by the alternative keys. In our case, client code can re-use its existing traits classes by writing a traits pointer-adaptor template. For example (using the boost reference counted pointer available free from http://www.boost.org):
// Adaptor to convert a traits class for use // via a map3 of boost::shared_ptr values template<typename PlainTraits> struct ptr_traits { private: typedef typename PlainTraits::value_type plain_value_type; public: typedef typename PlainTraits::key_type key_type; typedef typename PlainTraits::mapped_type mapped_type; typedef boost::shared_ptr<plain_value_type> value_type; static key_type const &get_key( value_type const &val) { return PlainTraits::get_key(*val); } static mapped_type &get_mapped( value_type &val) { return PlainTraits::get_mapped (*val); } static value_type construct( key_type const &key) { return value_type( new plain_value_type( PlainTraits::construct(key))); } }; // Define two pointer-based indexes using // different keys map3 <ptr_traits <person_id_traits> > index1; multimap3 <ptr_traits <person_name_traits> > index2; // ... index1[my_id].m_other_names = "Joe";
Note that the fact that the indexes store reference-counted pointers internally is at least partially hidden from the client code. This is not a complete solution, of course, but goes some way towards one (interested readers may also like to investigate the link given under additional reading). An alternative solution using std::set would also be possible, but again is not quite as convenient for searching and modifying. Even ignoring the possibility of find() returning end(), the assignment from above would look more like this:
boost::shared_ptr<person> temp( new person (my_id)); (*my_set.find (temp))->m_other_names = "Joe";
In the map3 solution from above, these details are effectively encapsulated the ptr_traits template.
In the case of std::map, a number of changes were necessary to achieve a real gain in flexibility. However, the first step was to allow the client code to choose their own value type. It is then a fairly obvious improvement to access the keys and mapped values via client-provided functions instead of by using member variables. In this case, a traits class is a convenient way to provide the necessary additional information.
More generally, whenever a template internally generates a new type from its arguments, it is limiting its own flexibility. By accepting the generated type as a parameter instead, the template can become flexible not only in terms of the original parameters, but also in the choice of generated type. There are certain trade-offs in terms of ease of documentation and coding, but potential users of the template may discover unexpected benefits from a more flexible template.
Thanks to Phil Bass for his comments on two earlier drafts of this article.
The indexed_set library under development by Joaquín Mª López Muñoz. Google for "indexed_set" or see the recent discussion at http://lists.boost.org/MailArchives/boost/msg54772.php
A simple implementation of the map3 template for demonstration purposes can be found at http://home.clara.net/raoulgough/map/
Overload Journal #58 - Dec 2003 + Programming Topics
Browse in : |
All
> Journals
> Overload
> 58
(8)
All > Topics > Programming (877) Any of these categories - All of these categories |