
Overload Journal #46 - Dec 2001 + Programming Topics

Title: Metaclasses and Reflection in C++ - Part 2

Author: Administrator

Date: 26 December 2001 16:46:08 +00:00


In an article published in Overload 45 I described the design and implementation of a meta-object layer for C++, MOP. Definitions were provided for both metaclasses and metaobjects. This article continues that discussion by demonstrating how they can be used to solve the design problems found in an Internet bookshop example. Susan, the manager of the store, requires a highly dynamic system to allow the business flexibility she requires to be successful. Using a metaclass her products would be modelled thus:

Now the MOP is complete. Let's use it:

Creating the Product class:

    ClassDef * product = new ClassDef(0,          // no base class for Product
                                      "Product"); // name of class

Adding attributes:

    product->addAttribute(Attribute("Product Number", Type::intT));
    product->addAttribute(Attribute("Name", Type::stringT));
    product->addAttribute(Attribute("Price", Type::doubleT));
    product->addAttribute(Attribute("Weight", Type::doubleT));

Creating the Book class with an attribute list:

    list<Attribute> attrL;
    attrL.push_back(Attribute("Author", Type::stringT));
    attrL.push_back(Attribute("Title", Type::stringT));
    attrL.push_back(Attribute("ISBN", Type::intT));
    ClassDef * book = new ClassDef(product, // base class
                                   "Book",
                                   attrL.begin(), attrL.end());

Creating an object:

    Object * bscpp(book->newObject());

Setting the values for the objects:

Set an int value by index (don't forget that index 0 is ProductNo):

bscpp->setValue(0, RealValue<int>(12345));

Same for a string value:

bscpp->setValue(4, RealValue<string>("Bjarne Stroustrup"));

A better way is to set the value by name, which gives the most derived attribute:

    bscpp->setValue("Title", RealValue<string>("The C++ Programming Language"));
    bscpp->setValue("Weight", RealValue<double>(370));

Getting the values:

Display a book:

    ClassDef::AttrIterator a;
    size_t idx;
    for (a = book->attribBegin(), idx = 0;
         a != book->attribEnd();
         ++a, ++idx)
    {
        cout << a->getName() << ": "
             << bscpp->getValue(idx).asString() << endl;
    }

and we get:

    Product Number: 12345
    Name:
    Price:
    Weight: 370
    Author: Bjarne Stroustrup
    Title: The C++ Programming Language
    ISBN:

So, our MOP is complete. For our sample application, you still have to add a class repository, a nice GUI to define classes and objects, an index for the search engine and an interface for the ShoppingCart, but then you're done, and Susan is happy as she can now create her own new product categories at runtime.

Reflection for existing C++ classes

If you provide the interface for the ShoppingCart in our example, you'll find that it isn't so easy. If all classes are dynamic classes, all access must go through the MOP. For example, a getName() for Book becomes:

    string bookGetName(Object const * book)
    {
        if (book->instanceOf().getName() != "Book")
        {
            // throw some exception
        }
        string name;
        // name = book->author + ": " + book->title; it was so easy...
        string author = book->getValue("Author").get<string>();
        string title = book->getValue("Title").get<string>();
        name = author + ": " + title;
        return name.substr(0, 40);
    }

For a lot of applications, it would be useful to provide some classes of a hierarchy as C++ classes, e.g. Product, but still let the user add classes of the same hierarchy at runtime, e.g. TShirt. So, let's look at this. If we want access through our MOP to C++ classes, we need a getValue() to which we can give the attribute we want to access at runtime. So, here it is:

Value getValue(Object *o, MemberPointer mp) { return o->*mp; }

The magic lies in '->*': This is the pointer-to-member selector of C++.

Pointer to members

You can imagine a pointer-to-member in C++ as an offset[1] from the base address of an object to a specific member. If you apply that offset to such a base address, you get a reference to the member (Fig. 1). But as a pointer-to-member is a normal data type in C++, you can store them in containers, pass them to functions, etc. Thus, you can write the function above, building the fundamental base for our combination of C++-classes and runtime-classes.

Let's look at some details of pointer-to-members. As an example, we use the following class:

    class Product
    {
        // ...
    protected:
        RealValue<double> price;
    };

    class Book : public Product
    {
    public:
        // ...
    private:
        RealValue<string> author, title;
        RealValue<double> weight;
    };

    Book b, *bp = &b;

A pointer-to-member is a type that is derived from two other types: The type of the base object (Book in our example) and the type of the member (RealValue<>). The type decorator for a pointer-to-member is '::*', so let's define two variables with initialization:

    RealValue<string> Book::* bookStringMemPtr = &Book::author;
    RealValue<double> Book::* bookDoubleMemPtr = &Book::weight;

The pointer-to-member selector comes in two variations: as '.*' you can apply it to references of the class and as '->*' it takes a pointer. It is a binary operator: as left operand it takes a reference (or pointer) to an object and as right operand a pointer-to-member. So, with the above definitions, you can do things like:

    b.*bookStringMemPtr = "Bjarne Stroustrup";               // assigns b.author
    bookStringMemPtr = &Book::title;
    bp->*bookStringMemPtr = "The C++ Programming Language";  // assigns bp->title

Of course, as title is a private member of Book, the assignment of the pointer-to-member must be at the scope of that class. But the pointer-to-member itself can be used even if you don't have access privileges to the members (as long as you have access to the pointer-to-member).
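To make the "normal data type" point concrete, here is a minimal sketch that stores pointers-to-members in a container and applies them with both selectors. The struct Item and the helpers sumFields() and setField() are made up for this illustration and are not part of the MOP:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical plain struct, standing in for a class such as Book.
struct Item
{
    double price;
    double weight;
    double discount;
};

// A pointer-to-member is an ordinary value, so it can live in a container.
typedef std::vector<double Item::*> FieldList;

// Sums the members selected by 'fields' for one object.
double sumFields(Item const & item, FieldList const & fields)
{
    double total = 0.0;
    for (std::size_t i = 0; i != fields.size(); ++i)
    {
        total += item.*fields[i];  // '.*' takes a reference on the left
    }
    return total;
}

// Writes through '->*', which takes a pointer on the left.
void setField(Item * item, double Item::* field, double value)
{
    item->*field = value;
}
```

The list of fields is chosen at runtime, yet each access is still type-checked by the compiler: a pointer to a member of another class, or to a member of another type, would not fit into FieldList.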

Pointer-to-member types

A word about the types: RealValue<double> Book::*, RealValue<double> Product::*, and BaseValue Product::* are different types. But are there conversions? The C++ standard provides a conversion from a pointer-to-member of a base class to a pointer-to-member of a derived class. That makes sense: you can apply the offset of a base class member to the base address of a derived object as well, as the base is a part of the derived object[2]. So you can assign bookDoubleMemPtr = &Product::price; as the price is part of each Book instance. The other way around obviously doesn't work: you couldn't initialise a RealValue<string> Product::* with &Book::author, as author is not a member of each instance of type Product.
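The base-to-derived conversion can be tried out in isolation. The following sketch uses two made-up structs, Base and Derived, in place of Product and Book, and assumes a C++11 compiler for static_assert and decltype:

```cpp
#include <type_traits>

struct Base { int id; };
struct Derived : Base { int extra; };

// The member is declared in Base, so even '&Derived::id' has type 'int Base::*'.
static_assert(std::is_same<decltype(&Derived::id), int Base::*>::value,
              "the pointer-to-member type names the declaring class");

int readMember(Derived const & d, int Derived::* mp)
{
    return d.*mp;
}

int readIdOfDerived(Derived const & d)
{
    int Derived::* mp = &Base::id;   // implicit base-to-derived conversion
    return readMember(d, mp);
}

// The reverse direction is never implicit; it needs an explicit static_cast
// and is only meaningful if the member really exists in the base class.
int readIdOfBase(Base const & b)
{
    int Derived::* dmp = &Base::id;
    int Base::* bmp = static_cast<int Base::*>(dmp);
    return b.*bmp;
}
```

Note that only the class part of the type is converted here. The member type (int in this sketch) stays exactly the same in both directions.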

But the standard does not provide a conversion from RealValue<double> Book::* to BaseValue Book::*, though it would be safe: if the result type of the pointer-to-member selector is a reference to a derived class, it can be safely converted to a reference to the respective base class, so it would also be safe to let the compiler do the conversion automatically and therefore also convert the pointers-to-members themselves. As already mentioned, the standard doesn't provide that conversion implicitly and doesn't even allow it explicitly through static_cast, probably because the committee didn't see any use for pointers-to-data-members at all (see [Ball-]), and for pointers-to-member-functions that conversion really doesn't make sense.

The problem for us is: we need that conversion. We want to keep pointers-to-members to all members of a class in one common container, but what could be the type of that container's elements? One option would be to force the conversion through a reinterpret_cast, but the only thing you can safely do with the result of a reinterpret_cast is to reinterpret_cast it back, and for that you have to store the original type as well. So we use another option: we just define the conversion! But as C++ doesn't allow you to define your own conversions to compiler-provided types (and in this sense the pointers-to-members are compiler-defined, though the individual types involved, like Book or BaseValue, are user-defined), we have to define wrapper classes around them.

Here's the implementation:

    template <typename BaseType, typename BaseTargetType>
    class MemPtrBase
    {
    public:
        virtual BaseTargetType & value(BaseType & obj) const = 0;
        virtual BaseTargetType const & value(BaseType const & obj) const = 0;

        // public, so that the MemPtr handle can delete through the base pointer
        virtual ~MemPtrBase() {}

    protected:
        MemPtrBase() {}

    private:
        MemPtrBase(MemPtrBase const &);
        MemPtrBase & operator=(MemPtrBase const &);
    };

    template <typename BaseType, typename BaseTargetType, typename TargetType>
    class TypedMemPtr : public MemPtrBase<BaseType, BaseTargetType>
    {
    public:
        TypedMemPtr(TargetType BaseType::* ptr)
          : p(ptr)
        {}

        BaseTargetType & value(BaseType & obj) const
        {
            return obj.*p;
        }

        BaseTargetType const & value(BaseType const & obj) const
        {
            return obj.*p;
        }

    private:
        TargetType BaseType::* p;
    };

    // this is a handle only
    // (copying a MemPtr would lead to a double delete; a complete
    // implementation needs copy control)
    template <typename BaseType, typename BaseTargetType>
    class MemPtr
    {
    public:
        template <typename BaseType2, typename TargetType>
        explicit MemPtr(TargetType BaseType2::* ptr)
          : p(new TypedMemPtr<BaseType, BaseTargetType, TargetType>
                  (static_cast<TargetType BaseType::*>(ptr)))
        {}

        ~MemPtr() { delete p; }

        BaseTargetType & value(BaseType & obj) const
        {
            return p->value(obj);
        }

        BaseTargetType const & value(BaseType const & obj) const
        {
            return p->value(obj);
        }

    private:
        MemPtrBase<BaseType, BaseTargetType> * p;
    };

Some notes on the code: BaseType is used for the class to which a pointer-to-member is applied (e.g. Book), TargetType is the result type to which a pointer-to-member points (RealValue<double>), and BaseTargetType is the base class of TargetType (BaseValue). MemPtrBase<> is the common base class as we need it (e.g. MemPtrBase<Book, BaseValue>, which stands for BaseValue Book::*), TypedMemPtr<> holds an actual C++ pointer-to-member (e.g. TypedMemPtr<Book, BaseValue, RealValue<double> >), and MemPtr<> is a handle class around MemPtrBase<> to store them in a container. Here, the actual access function is the value() member function. If you want, you can add a global operator '->*' (as a template function), but you can't provide the operator as a member function (as the left operand is not the class instance), and you can't overload '.*' (this is one of the few non-overloadable operators).
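As a rough idea of what such a global operator '->*' could look like, here is a sketch built around a deliberately simplified wrapper. SimpleMemPtr and Point are invented for this example; the same pattern should carry over to the MemPtr handle, but that is an assumption, not something taken from the article's source code:

```cpp
// Hypothetical, stripped-down stand-in for the MemPtr handle: it wraps one
// pointer-to-member and offers a value() accessor.
template <typename Obj, typename Target>
class SimpleMemPtr
{
public:
    explicit SimpleMemPtr(Target Obj::* ptr) : p(ptr) {}
    Target & value(Obj & obj) const { return obj.*p; }
    Target const & value(Obj const & obj) const { return obj.*p; }
private:
    Target Obj::* p;
};

// Non-member '->*': the left operand is a pointer to the object and the right
// operand is the wrapper, so the operator has to be a free function.
template <typename Obj, typename Target>
Target & operator->*(Obj * obj, SimpleMemPtr<Obj, Target> const & mp)
{
    return mp.value(*obj);
}

// Hypothetical class to apply the wrapper to.
struct Point
{
    int x;
    int y;
};
```

With this in place, (&pt)->*py = 5; assigns pt.y for a Point pt and a SimpleMemPtr<Point, int> py(&Point::y), which reads just like the built-in selector.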

The MemPtr constructor is a member template with two template parameters: a BaseType2 and the TargetType. The second one is clear as it defines the actual TypedMemPtr to be constructed, but the BaseType2 is not so obvious. If we omit the BaseType2, so only having:

    // this is a handle only
    template <typename BaseType, typename BaseTargetType>
    class MemPtr
    {
    public:
        template <typename TargetType>
        explicit MemPtr(TargetType BaseType::* ptr)
          : p(new TypedMemPtr<BaseType, BaseTargetType, TargetType>(ptr))
        {}
        // ...
    };

and then we try to create a

MemPtr<Book, BaseValue> mp2(&Product::price);

some compilers give an error, as they cannot work out the correct conversion. This would be to convert RealValue<double> Product::* to RealValue<double> Book::*, which should be done automatically, and then to instantiate MemPtr's constructor with RealValue<double> as TargetType.

One way to solve this is to explicitly cast the pointer-to-member:

    MemPtr<Book, BaseValue>
        mp2(static_cast<RealValue<double> Book::*>(&Product::price));

but that's quite a lot to type. It's actually much easier to move that explicit conversion into the constructor itself and just provide an additional template parameter, as shown in the implementation above. The compiler checks the conversion anyway, so you will get a compile-time error if that conversion is not allowed (e.g. if you try to convert a RealValue<double> Book::* to RealValue<double> Cd::*).

C++ classes

The MemPtrs allow you to access the attribute values of an ordinary C++ object. This helps for one part of the MOP. But what about the attributes themselves? The compiler has the necessary knowledge, but unfortunately there is no standard way to access that knowledge at runtime. So we must provide it and define an interface for it. To allow a smooth integration with our existing ClassDef, we provide an Attribute iterator as interface. For now, we provide the information about the attributes manually, but in a following article we'll explore the use of a pre-processor for that.

So, for our class Book we provide the following functions:

    class Book : public Product
    {
        typedef MemPtr<Book, BaseValue> MemberPtr;
    public:
        // as before
        static size_t ownAttribCount() { return 5; }
        static Attribute * ownAttribBegin()
        {
            static Attribute a[] = {Attribute("Author", Type::stringT),
                                    Attribute("Title", Type::stringT),
                                    Attribute("Publisher", Type::stringT),
                                    Attribute("Price", Type::doubleT),
                                    Attribute("Weight", Type::doubleT)};
            return a;
        }
        static Attribute * ownAttribEnd()
        {
            return ownAttribBegin() + ownAttribCount();
        }
        static MemberPtr * memberBegin()
        {
            static MemberPtr m[] = {MemberPtr(&Book::productNo),
                                    MemberPtr(&Product::weight),
                                    MemberPtr(&Book::author),
                                    MemberPtr(&Book::title),
                                    MemberPtr(&Book::publisher),
                                    MemberPtr(&Book::price),
                                    MemberPtr(&Book::weight)};
            return m;
        }
        static MemberPtr * memberEnd() { return memberBegin() + 7; }
    private:
        RealValue<string> author, title, publisher;
        RealValue<double> price, weight;
    };

Please note the difference between ownAttribBegin() and memberBegin(): the first provides information only about the class's own attributes, while the latter also provides access to base class members. This separation makes sense: while on the object level all data members together form one object, on the class level the base class is a different entity and should be available as a common base class for the meta-object protocol as well. But this separation has consequences: we can't derive a C++ class from a MOP class (but this is no real restriction) and the C++ base class must also be made known to the MOP (but that's useful anyway).

We have no function that provides information about the base classes, as the baseClass in ClassDef must be made a ClassDef instance as well, as noted above. The C++ base classes are not of much use for our application.

With these functions, we can build a ClassDef from a C++ class; it's so easy that we can even provide a helper function for that:

    template <typename CppClass>
    ClassDef makeClass(ClassDef const * base, string const & name)
    {
        return ClassDef(base, name,
                        CppClass::ownAttribBegin(),
                        CppClass::ownAttribEnd());
    }

Now, we can build our ClassDefs for Product and Book and create instances from them:

    ClassDef base(makeClass<Product>(0, "Product"));
    ClassDef book(makeClass<Book>(&base, "Book"));
    book.newObject();

But stop -- though this works, it's not what we want: now, the instances are not genuine C++ objects, but MOP objects, and all access must still go through the MOP.

C++ objects

What we want are real C++ objects that we can also access through the MOP, i.e. through the Object interface. The OO way to do that is to derive our Product from Object, but though other OO languages like Smalltalk do that, I think there's a better, less intrusive option: we provide an adaptor.

From [Gamma-] we learn that there are two options for the adaptor pattern: to design the adaptor class as forwarding wrapper class derived only from Object and containing a Product member or using multiple inheritance and derive the adaptor from Object and Product. For simplicity, we will use the first option, but real world applications often benefit from the second approach. So we provide a wrapper class CppObject that is derived from Object and that holds the original C++ object. It implements the interface of Object (getValue() and setValue()) through our MemPtrs:

    template <typename OrigClass>
    class CppObject : public Object
    {
        typedef MemPtr<OrigClass, BaseValue> MemberPtr;
    public:
        CppObject(ClassDef const * myClass)
          : Object(myClass),
            members(OrigClass::memberBegin()),
            myObject()
        {}

        virtual Object * clone() const { return new CppObject(*this); }

        using Object::getValue; // importing getValue(name)
        using Object::setValue; // importing setValue(name)

        virtual Value getValue(size_t idx) const
        {
            return members[idx].value(myObject);
        }

        virtual void setValue(size_t idx, Value const & v)
        {
            BaseValue * p = &(members[idx].value(myObject));
            p->set(v);
        }

    private:
        MemberPtr * members;
        OrigClass myObject;
    };
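For comparison, the second option, the class adaptor built with multiple inheritance, might look roughly like the following sketch. Reflective, Cd and CdAdaptor are invented names that only stand in for Object, an existing C++ class and its adaptor. The lookup by name is a simple if-chain here rather than the MemPtr machinery:

```cpp
#include <string>

// Hypothetical minimal reflective interface (stands in for Object).
class Reflective
{
public:
    virtual ~Reflective() {}
    virtual std::string valueAsString(std::string const & name) const = 0;
};

// Hypothetical existing C++ class (stands in for Book).
class Cd
{
public:
    Cd(std::string const & a, std::string const & t) : artist(a), title(t) {}
    std::string label() const { return artist + ": " + title; }
protected:
    std::string artist, title;
};

// Class adaptor: it is both a Reflective and a Cd, so there is no forwarding
// and no second object to keep in sync.
class CdAdaptor : public Reflective, public Cd
{
public:
    CdAdaptor(std::string const & a, std::string const & t) : Cd(a, t) {}

    virtual std::string valueAsString(std::string const & name) const
    {
        if (name == "Artist") return artist;
        if (name == "Title") return title;
        return std::string();  // unknown attribute: empty string in this sketch
    }
};
```

The benefit is that one and the same object can be handed to code that expects the reflective interface and to code that expects the original class. The price is that the adaptor needs access to the members, which are protected in this sketch.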

A useful rule of OO design says that only leaf classes should be concrete, so let's define Object as abstract base class and create a new class DynaObject that resembles our former Object for real MOP class instances (Fig.2).

If we now do a prod.newObject(), we still get it wrong: we now get a DynaObject, but we want a CppObject<Product>. To solve that, we must provide the ClassDef with a means to create the correct kind of object, and the simplest way to do that is a factory method: we provide a static creation function in CppObject<> and DynaObject, give a pointer to that function to the ClassDef's constructor, store it and use that function in ClassDef::newObject().

Creation functions for DynaObject and CppObject:

    Object * DynaObject::newObject(ClassDef const * myClass)
    {
        return new DynaObject(myClass);
    }

    template <typename OrigClass>
    Object * CppObject<OrigClass>::newObject(ClassDef const * myClass)
    {
        return new CppObject(myClass);
    }

Changes to ClassDef:

    class ClassDef
    {
    public:
        typedef Object * (*CreateObjFunc)(ClassDef const *);

        template <typename Iterator>
        ClassDef(ClassDef const *, string const &,
                 CreateObjFunc objFunc, Iterator, Iterator)
          : // ...
            createObj(objFunc)
        {
            // ...
        }

        ClassDef(ClassDef const *, string const & name_, CreateObjFunc objFunc)
          : // ...
            createObj(objFunc)
        {
            // ...
        }

        Object * newObject() const
        {
            definitionFix = true;
            return (*createObj)(this);
        }
        // ... as before

    private:
        const CreateObjFunc createObj;
        // ... as before
    };

And a simple change to makeClass:

    template <typename CppClass>
    ClassDef makeClass(ClassDef const * base, string const & name)
    {
        return ClassDef(base, name,
                        CppObject<CppClass>::newObject,
                        CppClass::ownAttribBegin(),
                        CppClass::ownAttribEnd());
    }

Now, everything works. Well -- nearly. If we now try to create a ClassDef for Product with makeClass the compiler complains about creating an abstract class: makeClass gives the ClassDef constructor a pointer to CppObject<Product>::newObject(), and that creates a Product instance as part of CppObject<Product>. This is easily fixed: just call the ClassDef constructor directly with a null-pointer for the creation function, thus prohibiting the creation of a Product instance through the MOP.
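One detail to watch with the null creation function: newObject() as shown above calls through the stored pointer without a check. A guarded variant could look like this sketch. MiniClassDef, MiniObject, createPlain() and isInstantiable() are hypothetical, reduced stand-ins and not the article's ClassDef:

```cpp
#include <stdexcept>
#include <string>

class MiniClassDef;

// Hypothetical stand-in for Object: it only remembers its class.
struct MiniObject
{
    explicit MiniObject(MiniClassDef const * c) : cls(c) {}
    virtual ~MiniObject() {}
    MiniClassDef const * cls;
};

// Hypothetical stand-in for ClassDef, reduced to the factory part.
class MiniClassDef
{
public:
    typedef MiniObject * (*CreateObjFunc)(MiniClassDef const *);

    MiniClassDef(std::string const & n, CreateObjFunc f)
      : name(n), createObj(f)
    {}

    // A null creation function marks the class as abstract for the MOP.
    bool isInstantiable() const { return createObj != 0; }

    MiniObject * newObject() const
    {
        if (createObj == 0)
        {
            throw std::logic_error("class '" + name + "' cannot be instantiated");
        }
        return (*createObj)(this);
    }

private:
    std::string name;
    CreateObjFunc createObj;
};

// Creation function for ordinary, instantiable classes.
MiniObject * createPlain(MiniClassDef const * c)
{
    return new MiniObject(c);
}
```

Whether an exception, a null return value or an assertion is the right reaction depends on the application; the exception is just one choice for this sketch.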

Usage

The MOP, as you have it now, allows you to define the C++ classes as MOP classes as before:

    ClassDef base(0, "Product", 0,
                  Product::ownAttribBegin(),
                  Product::ownAttribEnd());
    ClassDef book(makeClass<Book>(&base, "Book"));

You can create instances of them with

book.newObject();

You can define new classes derived from Product

    ClassDef * tShirt = new ClassDef(&base, "T-Shirt", DynaObject::newObject);
    tShirt->addAttribute(Attribute("Size", Type::stringT));
    tShirt->addAttribute(Attribute("Color", Type::stringT));
    tShirt->addAttribute(Attribute("Name", Type::stringT));
    tShirt->addAttribute(Attribute("Price", Type::doubleT));
    classReg.registerClass(tShirt);

and manipulate instances of existing classes and new classes through the MOP:

A C++ object:

    Object * ecpp(book.newObject());
    ecpp->setValue(5, RealValue<double>(22.50));
    ecpp->setValue(0, RealValue<int>(23456));
    ecpp->setValue(2, RealValue<string>("Scott Meyers"));
    ecpp->setValue("Title", RealValue<string>("Effective C++"));
    ecpp->setValue(6, RealValue<double>(280));
    size_t idx;
    cout << "ecpp:" << endl;
    for (a = book.attribBegin(), idx = 0;
         a != book.attribEnd();
         ++a, ++idx)
    {
        cout << a->getName() << ": "
             << ecpp->getValue(idx).asString() << endl;
    }
    cout << ecpp->getValue("Author").asString() << endl;

And a dynamic object:

    Object * ts(tShirt->newObject());
    ts->setValue(0, RealValue<int>(87654));
    ts->setValue(2, RealValue<string>("XXL"));
    ts->setValue("Color", RealValue<string>("red"));
    ts->setValue("Price", RealValue<double>(25.95));
    ts->setValue("Weight", RealValue<double>(387));
    for (size_t idx = 0; idx != 4; ++idx)
    {
        cout << ts->getValue(idx).asString() << endl;
    }

C++ Interface

You can't access the instances created by the MOP through the Product interface. For instances of C++ classes you can modify CppObject<T> to derive from T, as we have discussed before, or better, provide a member function in CppObject that returns a pointer to myObject. And for instances of MOP classes you can define a wrapper around an Object that implements the Product interface:

    class DynaProduct : public Product
    {
    public:
        DynaProduct(Object const * o) : obj(o) {}

        virtual std::string getName() const
        {
            Value v = obj->getValue("Name");
            return v.get<std::string>();
        }

        virtual double getPrice() const
        {
            Value v = obj->getValue("Price");
            return v.get<double>();
        }

        virtual double getWeight() const
        {
            Value v = obj->getValue("Weight");
            return v.get<double>();
        }

    private:
        Object const * const obj;
    };

And you can't access normal C++ instances of Book through the MOP. To solve this, you could add another constructor that adopts the C++ instance by copying (and consequently you should delete the original one to avoid an object that exists twice), or you could modify the wrapper to hold only a pointer to the C++ object and add a member function to return the controlled C++ object on request. Here, we use the first approach:

    template <typename OrigClass>
    CppObject<OrigClass>::CppObject(ClassDef const * myClass,
                                    OrigClass const & obj)
      : Object(myClass),
        myObject(obj), // calls the copy-ctor of OrigClass, which must be accessible
        members(OrigClass::memberBegin())
    {}

And now, you can do this:

    Book b("Bjarne Stroustrup", "The C++ Programming Language",
           "Addison-Wesley", 27.50, 370);
    CppObject<Book> mb(&book, b);
    Object * ob = &mb;
    cout << "C++ object through MOP" << endl;
    for (a = ob->instanceOf()->attribBegin(), idx = 0;
         a != ob->instanceOf()->attribEnd();
         ++a, ++idx)
    {
        cout << a->getName() << ": "
             << ob->getValue(idx).asString() << endl;
    }

But in general, the important point is that you can define basic classes in C++ at programming time and still allow the user to derive their own classes from these base classes at runtime.

Applications

Is such a MOP approach for C++ actually useful? The pure reflection mechanism based on pointers-to-members is quite useful for persistence libraries -- on relational databases or file formats like XML. This was the application when I first used pointers-to-members in C++, which were just the C++ replacement of the old C offsetof macro that is still in widespread use for that purpose.
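The relation to offsetof can be shown with a small sketch. Record and the two read functions are made up for this comparison; they read the same field once through a raw byte offset and once through a pointer-to-member:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical persistent record with a simple layout.
struct Record
{
    int id;
    double amount;
};

// C style: the offset is just a number, so the caller has to know that the
// bytes at that position really form a double.
double readByOffset(Record const & r, std::size_t offset)
{
    double v;
    std::memcpy(&v, reinterpret_cast<char const *>(&r) + offset, sizeof v);
    return v;
}

// C++ style: the pointer-to-member carries both the class and the member
// type, so a mismatch is a compile-time error.
double readByMemPtr(Record const & r, double Record::* mp)
{
    return r.*mp;
}
```

A call such as readByOffset(r, offsetof(Record, id)) would compile and silently misinterpret the bytes, whereas readByMemPtr(r, &Record::id) is rejected by the compiler.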

But I also came across quite a lot of applications where a handful of pre-defined entities provided 98% of the requirements of the users of the system, but the remaining 2% were so different for different users that a common solution for all was not adequate. For these special cases, a full meta-object protocol approach was really a quite simple and elegant solution that provided all the flexibility the users requested.

Two final remarks

Could the same kind of reflection be achieved with get/set functions instead of pointers-to-members? Perhaps, yes. Some component mechanisms use that approach (e.g. Borland's VCL). In that case, a get() and set() function for each attribute and a specialized TypedMemPtr<> for each attribute for each class is required. That's a lot of work, but with a suitable pre-processor or compiler support that's not an issue. But it's still much more intrusive to add all the getters and setters; and if they are public, they break encapsulation. Though pointers-to-members allow direct access to the data, that encapsulation leak can be much better controlled.

The second remark relates to RealValue<>. Are they really necessary for pure reflection purposes? Actually not. In that case, you could remove BaseTargetType from all MemPtr templates, return the TargetType in TypedMemPtr's getter function, and make the getter function of the MemPtr handle a member template function analogous to the getter function of Value. In the source code for this article (http://www.vollmann.com/download/mop/index.html) you'll find a sample implementation for that.
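A get/set based accessor could be sketched with pointers-to-member-functions as follows. Dvd and Accessor are hypothetical names for this illustration and do not reflect how the VCL or any other component framework implements properties:

```cpp
// Hypothetical class that exposes its attribute only through get/set.
class Dvd
{
public:
    Dvd() : price(0.0) {}
    double getPrice() const { return price; }
    void setPrice(double p) { price = p; }
private:
    double price;
};

// Bundles one getter and one setter of class Obj for an attribute of type T.
template <typename Obj, typename T>
class Accessor
{
public:
    typedef T (Obj::*Getter)() const;
    typedef void (Obj::*Setter)(T);

    Accessor(Getter g, Setter s) : getter(g), setter(s) {}

    // Calls through the stored pointers-to-member-functions.
    T get(Obj const & o) const { return (o.*getter)(); }
    void set(Obj & o, T v) const { (o.*setter)(v); }

private:
    Getter getter;
    Setter setter;
};
```

An Accessor<Dvd, double> acc(&Dvd::getPrice, &Dvd::setPrice) then plays the role that a single pointer-to-data-member plays in the rest of this article, at the cost of two public functions per attribute.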

Though this seems to be a major advantage for pure reflection applications like persistency libraries, in fact I found it in most cases quite useful to have a base class like DbValue for all persistent attributes to provide additional functionality like dirty flags, type conversion specifics, etc.
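To give an impression of the variant without a common value base class, here is a minimal sketch. It is my own illustration of the idea and not the sample implementation from the download; AnyMemPtrBase, AnyTypedMemPtr, AnyMemPtr and Track are invented names, and the type check through dynamic_cast is an assumption about how the member template getter could be made safe:

```cpp
#include <string>
#include <typeinfo>

// Common base without any knowledge of the member's type.
template <typename Obj>
class AnyMemPtrBase
{
public:
    virtual ~AnyMemPtrBase() {}
};

// Holds the actual pointer-to-member and returns the exact member type.
template <typename Obj, typename T>
class AnyTypedMemPtr : public AnyMemPtrBase<Obj>
{
public:
    explicit AnyTypedMemPtr(T Obj::* ptr) : p(ptr) {}
    T & value(Obj & o) const { return o.*p; }
private:
    T Obj::* p;
};

// Handle: the getter is a member template, the caller names the type.
template <typename Obj>
class AnyMemPtr
{
public:
    template <typename T>
    explicit AnyMemPtr(T Obj::* ptr) : p(new AnyTypedMemPtr<Obj, T>(ptr)) {}
    ~AnyMemPtr() { delete p; }

    template <typename T>
    T & get(Obj & o) const
    {
        AnyTypedMemPtr<Obj, T> const * typed =
            dynamic_cast<AnyTypedMemPtr<Obj, T> const *>(p);
        if (!typed) throw std::bad_cast();  // asked for the wrong type
        return typed->value(o);
    }

private:
    AnyMemPtr(AnyMemPtr const &);              // not copyable in this sketch
    AnyMemPtr & operator=(AnyMemPtr const &);
    AnyMemPtrBase<Obj> * p;
};

// Hypothetical class with plain members, no RealValue<> involved.
struct Track
{
    int number;
    std::string title;
};
```

Given an AnyMemPtr<Track> num(&Track::number), the call num.get<int>(t) returns a reference to t.number, while num.get<std::string>(t) throws std::bad_cast.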

Future articles

This article provided reflection and meta-class facilities for data attributes only. A following article will show the application of a MOP for the integration of a scripting language. That will then allow the MOP to be extended with member functions as well.

Another article will look into the capabilities of pre-processors to provide the reflection information.

And yet another article will look at real applications for reflection and meta-classes, like DB libraries.

References

[Kiczales-] Gregor Kiczales, Jim des Rivières, Daniel G. Bobrow: "The Art of the Metaobject Protocol", MIT Press 1991, ISBN 0-262-61074-4

[Coplien] James O. Coplien: "Advanced C++ Programming Styles And Idioms", Chapters 8-10, Addison-Wesley, 1992, ISBN 0-201-54855-0

[Buschmann-] Frank Buschmann, Regine Meunier, Hans Rohnert, Peter Sommerlad, Michael Stal: "Pattern-Oriented Software Architecture: A System of Patterns", Wiley 1996, ISBN 0-471-95869-7

[Gamma-] Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides: "Design Patterns", Addison-Wesley 1994, ISBN 0-201-63361-2

[Ball-] Michael S. Ball, Stephen D. Clamage: "Pointers-to-Members", C++ Report 11(10):8-12, Nov/Dec 1999



[1] In fact, a pointer-to-member is not as easy as an offset. Especially if multiple inheritance comes in, things become more complicated.

[2] It might be necessary to adjust that offset, but the compiler cares for that.
