Journal Articles
Browse in : |
All
> Journals
> Overload
> o116
(6)
All > Topics > Programming (877) Any of these categories - All of these categories |
Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.
Title: Dynamic C++ (Part 2)
Author: Martin Moene
Date: 02 August 2013 18:04:56 +01:00 or Fri, 02 August 2013 18:04:56 +01:00
Summary: Previously we saw how to use some simple dynamic features in C++. Alex Fabijanic and Richard Saunders explore more powerful dynamic tools.
Body:
[Y]ou can’t build a system that is completely statically typed.
~ Bjarne Stroustrup [Venners04]
In this installment of the ‘Dynamic C++’ series of articles, we continue to explore the dynamic solutions in C++ language. We start with Boost type_erasure
[Boost.TypeErasure], a combination of Boost.Any
[Boost.Any] and Boost.Function
[Boost.Function], addressing the C++ runtime polymorphism shortcomings. Next, we look into Val
, a class at the heart of the PicklingTools library [PicklingTools] aimed at interaction with Python environments. We conclude with Facebook’s folly library solution for interfacing the world of web and JSON from C++ – the dynamic
class.
Boost.TypeErasure
According to the author, type_erasure
is a generalization of Boost.Any
and Boost.Function
classes, allowing easy composition of arbitrary type erased operations; it addresses the shortcomings of C++ runtime polymorphism, in particular:
- intrusiveness
- dynamic memory management
- inability to apply multiple independent concepts to a single object.
Library uses some advanced constructs such as concepts and template metaprogramming constructs from boost::mpl
. In a similar fashion to boost::variant
specifying a set of types at construction time that can be contained at runtime, the type_erasure
library specifies at construction time a set of operations that can be performed on it at runtime. This is achieved through a vector of concepts provided at object declaration site as shown in Listing 1 for an incrementable
and ostreamable
object.
any< mpl::vector< copy_constructible<>, typeid_<>, incrementable<>, ostreamable<> > > x(10); ++x; // incrementable std::cout << x << std::endl; // ostreamable |
Listing 1 |
In the example, copy_constructible
allows copying and destruction of the object, whiletypeid_
provides run-time type information so that any_cast
can be used; these effectively make type_erasure any
equivalent to any
[Boost.Any]. Additionally, incrementable
and ostreamable
concepts are specified, allowing incrementing and streaming of the value x
. Operations can have arguments, so replacing incrementable
concept with addable
allows adding of two any
’s. The functionality is brittle in a subtle way, though – while adding two values of different types will compile, unfortunately it results in undefined behaviour at runtime. This problem can be alleviated by specifying relaxed_match
concept (according to author, recently renamed to a more appropriate relaxed
name), which causes exception to be thrown. The proper way of dealing with this problem is using placeholders, as shown in Listing 2.
int array[5]; typedef mpl::vector< copy_constructible<_a>, copy_constructible<_b>, typeid_<_a>, addable<_a, _b, _a> > requirements; tuple<requirements, _a, _b> t(&array[0], 2); any<requirements, _a> x(get< 0 > (t) + get< 1 >(t)); // x now holds array + 2 |
Listing 2 |
Placeholders are used extensively throughout the library. A placeholder is a substitute for a template parameter in a concept. The library automatically replaces all placeholders with the actual wrapped types.
Furthermore, type_erasure
supports references (both const
and non-const
), as well as user-defined concepts. Listing 3 demonstrates adding stringable
concept to type_erasure
, allowing a to_string()
member function call syntax directly on an integer value wrapped in any
. Things can be simplified when implementation and interface are the same (i.e. a member of type_erasure::any
called to_string
calls a to_string
member of the contained type), BOOST_TYPE_ERASURE_MEMBER
‘shortcut’ macro can be used.
template<class F, class T> struct to_string { // conversion function static T apply(const F& from, T& to) { return to = NumberFormatter::format(from); } }; namespace boost { namespace type_erasure { template<class F, class T, class Base> // binding struct concept_interface <::to_string<F, T>, Base, F> : Base { typedef typename rebind_any<Base, T>::type IntType; T to_string(IntType arg = IntType()) { return call(::to_string<C, T>(), *this, arg); } }; } typedef any<to_string<_self, std::string>, _self&> stringable; int i = 123; stringable s(i); std::string str = s.to_string(); // s == "123" |
Listing 3 |
Internally, a void*
pointer points to the held heap-allocated value and a static equivalent of virtual table serves as a binding for attached operations as shown in Listing 4.
// storage struct storage { storage() {} template<class T> storage(const T& arg) : data(new T(arg)) {} void* data; }; // binding of concept to actual type typedef ::boost::type_erasure::binding<Concept> table_type; // actual storage ::boost::type_erasure::detail::storage data; table_type table; |
Listing 4 |
Boost.Type erasure is an interesting and valuable ‘merger’ of any
and function
features, providing dynamic-language like features within the confines of standard C++. Implementation is rather complex and use has some non-intuitive weak spots that can quickly get an inexperienced user in serious trouble. A more robust interface (even if only with _typedef_
s provided for most frequently used types) would greatly improve the usability and safety for less experienced users.
PicklingTools Val
The next reviewed solution to the dynamic typing problem is the PicklingTools [PicklingTools] Val [Saunders1], [Saunders2], [Saunders3]. The PicklingTools library is an open-source library made up of Python, C++ and Java code allowing cross language communication. The PicklingTools evolved from a need to allow Python and C++ to share Python dictionaries across language boundaries. Many modern applications are built using multiple languages: Python, C++, Java, JavaScript, Lua, Icon/Unicon. The front-end languages (JavaScript, Python, Lua) tend to be dynamic languages for handling scripting and basic data flow. The back-end languages (C++, C, Java, FORTRAN) handle the heavy-lifting of fast communications, data I/O and CPU intensive work. In these hybrid systems, front-end languages need to communicate with the back-end languages intensively. Dictionaries, supported in some form by most dynamic languages, became the currency of many systems. The PicklingTools library solution focuses on the Python dictionary and C++.
The Val Name Rationale |
Why Val and not something like dynamic or any? Three reasons:
|
While making Python dictionaries easy to express in C++ was not a primary goal, given how much users enjoyed the ease of use, the C++ PicklingTools embraced them wholeheartedly. The goal, then, became to make Python dictionaries as easy to express in C++ as they are in Python.
Consider the ease of a dynamic dictionary manipulation in Python shown in Listing 5.
# Create a Python literal >>> d = { 'a': 1, 'b':2.2, 'c': ...{ 'X':1, 'Y':[1,2,3] }} >>> print d['a'] # lookup a single key: 'a' -> 1 >>> d['b'] = 3.3 # insert into dict # Also easy to lookup/insert nested entities >>> print d['c']['X'] # lookup nested key >>> d['c']['Y'] = 0 # insert nested } |
Listing 5 |
Because C++ is a statically-typed language, it requires a compile-time type for all variables. PicklingTools use Val to indicate a dynamic value.
A side-by-side comparison of basic dynamic typing in Python and C++ using Val is shown in Listing 6.
# Python >>> a = 1 >>> a = 2.2 >>> a = 'three' # a takes three different types // C++ Val a = 1; // overload constructor a = 2.2; // overload op= a = "three"; |
Listing 6 |
The PicklingTools Val is implemented as a union and a type-tag, where the value is constructed using placement new inside the union. The destructor has to manually notice which type to destruct (for non-POD types) and explicitly call the correct constructor. The Val is really just a dynamic container, with storage as shown in Listing 7. C++11 standard introduces alignof
/alignas
to address alignment concerns; in pre-C++11, although standard does not explicitly guarantee it and some experts discourage it [GotW28], a union with a double
member is practically close enough guarantee of the alignment to the largest member of the union.
As can be concluded from Listing 7, by design Val
can only contain certain types, shown in Listing 8.
struct Val { // Flags: the ascii typetag, subtype for arrays, char tag; char subtype; char isproxy; char pad; Allocator *a; // if using shared memory or // special allocator union { int_1 s; // type tag 's' int_u1 S; // type tag 'S' int_2 i; // type tag 'i' int_u2 I; // type tag 'I' int_4 l; // type tag 'l' int_u4 L; // type tag 'L' int_8 x; // type tag 'x' int_u8 X; // type tag 'X' real_4 f; // type tag 'f' real_8 d; // type tag 'd' complex_8 F; // type tag 'F' complex_16 D; // type tag 'D' char t[sizeof(Tab)]; // type tag 't', usually 32 char n[sizeof(Arr)]; // type tag 'n', usually 32 } u; }; |
Listing 7 |
// POD types int_1, int_u1, int_2, int_u2, int_4, int_u4, int_8, int_u8, real_4, real_8, complex_8, complex_16, size_t // Non-POD types string, None, Tab (like Python dictionary), Arr (like Python list) |
Listing 8 |
There are no-user defined types by design. This allows library writers to concentrate on making the interface for C++ Python dictionaries as close to Python as possible without worrying about the problems generality brings.
Listing 9 shows how easy it is to construct a Val
from basic types and get values out by means of user-defined conversion.
// overload Val constructor on all supported types Val a = 1; Val b = 2.2; Val c = "three"; // Get a value out via user-defined conversions int_u4 i = a; |
Listing 9 |
The user-defined conversions and overloading are syntactic sugar. Of course, overloaded cast operators direct calls are really just taking advantage of ‘syntactic sugar’ for function calls as shown below:
Val v = 1; int_u4 i = v.operator int_u4();
The Val
supports user-defined conversions for all basic types. If the conversion isn’t direct, then the user-defined conversions follow the Principle of Least Surprise: convert as C++ would (example below).
Val v = 3.14159265; int_u4 i4 = v; // Which conversion? As C++ would: // int_u4 i4 = static_cast<int_u4>(3.14159265);
If the conversion doesn’t make sense, behaviour is identical to that in Python – an exception is thrown, see Listing 10.
# Python >>> a = 3.14159265 >>> i4 = int(a) # converts to 3 >>> d = dict(a) # Exception! doesn't make # sense to convert to dict // C++ Val a = 3.14159265; int_u4 i4 = a; // converts to 3 Tab d = i4; // Compile-time error, can't // construct Tab from float |
Listing 10 |
The Val
provides the basic container to support dynamic dictionaries. The C++ Tab
is equivalent to the Python dict
(think Tab == Table
). The keys of the C++ Tab
, aswell as the values of the dictionary are Val
’s, allowing us to construct dynamic dictionaries:
# PythonId-1007573"> d = {'a':1, 'b':2.2, 'c':'three'} // C++ Tab d = "{'a':1, 'b':2.2, 'c':'three'}";
Note that the C++ literal is inside a string: the library overloads the constructor of the Tab
to take a string which contains the literal. While C++11 has some great new literal constructs, it can’t quite mimic Python’s syntax.
Literal construction of a Python dictionary is supported using exactly the Python syntax: you can cut-and-paste the Python dictionary literal and paste it into the C++ quotes. There is a little Python dictionary parser embedded in the Tab
class so that it recognizes the same syntax. Rather than invent a new syntax for literal construction, library leverages the well-known Python syntax.
Facebook folly::dynamic
The Facebook folly::dynamic
[Folly.Dynamic] class is another one in the spectrum of dynamic-typing classes. The class aims to relax the static typing constraints, especially in the JSON format data manipulation scenarios; it provides a runtime dynamically typed value for C++, similar to the way languages with runtime type systems work (e.g. Python). It can hold types from a predetermined set (ints, bools, arrays of other dynamics, etc), similar to boost::variant
and PicklingTools Val
, but the syntax is intended to be more akin to using the native type directly.
Facebook String Flavour |
Facebook folly::fbstring is a drop-in replacement for std::string , providing the benefit of significantly increased performance on virtually all important primitives. This is achieved by using a three-tiered storage strategy and cooperating with the memory allocator; fbstring is designed to detect use of jemalloc [jemalloc] and cooperate with it to significantly improve speed and memory usage.
Storage strategies
Implementation highlights
Supported architectures are x86 and x64; porting fbstring to big-endian architectures would require changes. |
An example of creating a dynamic
holding most common types is shown below. Strings are stored internally as fbstring
, the Facebook drop-in replacement for std::string
.
dynamic twelve = 12; dynamic str = "string"; // fbstring dynamic nul = nullptr; dynamic boolean = false;
The library extensively uses C++11 features for both speed and syntactic advantages. For example, as shown in Listing 11, arrays can be initialized with initializer lists. This particular feature, however, also imposes a limitation – dynamic
has no default constructor. The rationale for this design decision is due to the standard requirement for the expression dynamic d = {}
to call default constructor. The conflict arises in the default construction either having to result in d.isArray()
(a) being false
for the expression dynamic d = {}
or (b) being true
for dynamic d
. The solution the authors of folly::dynamic
deemed most appropriate is to entirely disallow the default construction.
dynamic array = { "array ", "of ", 4, " elements" }; assert(array.size() == 4); dynamic emptyArray = {}; assert(array.empty()); |
Listing 11 |
Maps from dynamics to dynamics are called objects. As shown in Listing 12, the dynamic::object
constant is how an empty map fromdynamics to dynamics is created. The same listing also shows how dynamic
objects can be created by using object::operator()
.
dynamic map = dynamic::object; map["something"] = 12; map["another_something"] = map["something"] * 2; dynamic map2 = dynamic::object ("something", 12)("another_something", 24); |
Listing 12 |
The internal dynamic
storage is shown in Listing 13. Types that can be held are: null
, Array
, bool
, double
, integer
(64-bit), Object
and String
.
enum Type { NULLT, ARRAY, BOOL, DOUBLE, INT64, OBJECT, STRING, }; // ... Type type_; union Data { explicit Data() : nul(nullptr) {} void* nul; // void* used instead of // std::nullptr_t due to gcc bug Array array; bool boolean; double doubl; int64_t integer; fbstring string; typename std::aligned_storage< sizeof(std::unordered_map<int,int>), alignof(std::unordered_map<int,int>) >::type objectBuffer; } u_; |
Listing 13 |
Examples of object and string construction are shown in Listing 14. Most notably, ObjectImpl
is not a mere typedef but inherits from hash map; the reason for this to avoid undefined behavior of parameterizing std::unordered_map
with an incomplete type.
struct dynamic::ObjectImpl : std::unordered_map<dynamic, dynamic> {}; // ... inline dynamic::dynamic(ObjectMaker (*)()) : type_(OBJECT) { new (getAddress<ObjectImpl>()) ObjectImpl(); } inline dynamic::dynamic(char const* s) : type_(STRING) { new (&u_.string) fbstring(s); } |
Listing 14 |
The gist of the folly
’s conversion facilities is shown in the Listing 15. The listing shows the code involved in conversion of dynamic
to fbstring
. The actual code of converting ‘anything to anything’, as the documentation states, is in a separate header and too large for inclusion here. For binary/decimal and vice-versa conversion of IEEE doubles, the class uses V8 double-conversion [Double.Conversion].
template<> struct dynamic::GetAddrImpl<bool> { static bool* get(Data& d) { return &d.boolean; } }; template<> struct dynamic::GetAddrImpl<int64_t> { static int64_t* get(Data& d) { return &d.integer; } }; template<class T> T* dynamic::getAddress() { return GetAddrImpl<T>::get(u_); } template<class T> T* dynamic::get_nothrow() { if (type_ != TypeInfo<T>::type) { return nullptr; } return getAddress<T>(); } template<class T> T dynamic::asImpl() const { switch (type()) { case INT64: return to<T>(*get_nothrow<int64_t>()); case DOUBLE: return to<T>(*get_nothrow<double>()); case BOOL: return to<T>(*get_nothrow<bool>()); case STRING: return to<T>(*get_nothrow<fbstring>()); default: throw TypeError("int/double/bool/string", type()); } } inline fbstring dynamic::asString() const { return asImpl<fbstring>(); } |
Listing 15 |
As will be shown in one of the next installments, dynamic
provides a very nice user interface, yet also provides a lot in terms of performance. It is a class designed with definite business goal in mind and it succeeds in that endeavor. The only downside for the whole folly
library is a patchy build system which requires a significant effort to build the library. The library is also not portable, at least not in the out-of-the-box fashion.
Conclusion
In this installment, we reviewed three C++ dynamic typing solutions: Boost type_erasure
, PicklingTools Val
and Facebook folly::dynamic
. While dynamic
and Val
provide dynamically-typed storage within the confines of the standard C++, type_erasure
also ventures in a new direction by adding operations to C++ types. In the next installment, we’ll look into more similar solutions, so stay tuned …
Credits
Steven Watanabe provided valuable advice on boost::type_erasure
. The list is, of course, not inclusive - many other people, discussions, libraries and code samples were an indispensable source of help in gathering and systematizing this writing.
References and further information
[Boost.Any] http://www.boost.org/doc/libs/1_53_0/doc/html/any.html
[Boost.Function] http://www.boost.org/doc/libs/1_53_0/doc/html/function.html
[Boost.TypeErasure] http://www.boost.org/doc/libs/1_54_0/doc/html/boost_typeerasure.html
[Double.Conversion] ‘Double-conversion library’ https://code.google.com/p/double-conversion/
[Folly.Dynamic] Facebook folly library, dynamic class – https://github.com/facebook/folly/blob/master/folly/docs/Dynamic.md
[GotW28] The Fast Pimpl Idiom http://www.gotw.ca/gotw/028.htm
[jemalloc] A general-purpose scalable concurrent malloc(3) implementation http://www.canonware.com/jemalloc/
[PicklingTools] The PicklingTools Library www.picklingtools.com
[Saunders1] ‘Dynamic, Recursive, Heterogeneous Types in Statically-Typed Languages’, Clinton Jeffery, Richard Saunders, C++ Now 2013 Presentation, http://cppnow.org/session/dynamic-recursive-heterogeneous-types-in-statically-typed-languages/
[Saunders2] ‘Dynamic, Recursive, Heterogeneous Types in Statically-Typed Languages’ Clinton Jeffery, Richard Saunders http://cppnow.org/files/2013/03/saunders-jeffery.pdf
[Saunders3] C++ Now 2013 Presentation, Richard Saunders http://www.youtube.com/watch?v=W3TsQtnMtqg
[Venners04] ‘Abstraction and Efficiency: A Conversation with Bjarne Stroustrup’ by Bill Venners, February 16, 2004 http://www.artima.com/intv/abstreffi.html
[Wikipedia] Boyer–Moore string search algorithm http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
Further information
‘Dynamic C++’, ACCU 2013 Conferencehttp://www.slideshare.net/aleks-f/dynamic-caccu2013
Facebook folly library, fbstring class – https://github.com/facebook/folly/blob/master/folly/docs/FBString.md
Notes:
More fields may be available via dynamicdata ..