Journal Articles

Overload Journal #148 - December 2018 + Design of applications and programs
Browse in : All > Journals > Overload > o148 (8)
All > Topics > Design (236)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Compile-time Data Structures in C++17: Part 3, Map of Values

Author: Bob Schmidt

Date: 04 December 2018 16:25:16 +00:00 or Tue, 04 December 2018 16:25:16 +00:00

Summary: A compile time map of values allows code to be tested more easily. Bronek Kozicki demonstrates how to avoid a central repository of values.

Body: 

Overview of the previous parts

There is no such thing as ‘O(0)’ time complexity of a function (hence quotes) because time complexity implies that some action will be actually performed during the program execution. Here we are asking the compiler to perform all the actions required by the function (or more accurately, a meta-function) during the compilation itself, which allows us to use the result as an input for further compilation.

A meta-function might be a function with a constexpr specifier, but typically we will use either a template type (wrapped in a using type alias if nested inside a template type) or a constexpr static variable template (also nested inside a template type). In the former case, a result of a meta-function is a type, and in the latter, it is a value.

A tag type is a type which is meant to be used as a name – it has no data members and no useful member functions. The only purpose of objects of such types is the deduction of their type. Examples in the C++ standard library include std::nothrow_t or types defined for various overloads of std::unique_lock constructor.

A pure function is a function which has no side effects and, for any valid and identical set of inputs, always produces the same output. For example, any deterministic mathematical function is also a pure function. A function which takes no input, but always produces the same result and has no side-effect is also a pure function. Mathematical functions in C++ standard library are not pure functions, but this is about to change [P0533R3]. We can view many meta-functions as pure functions.

A limitation of meta-functions is that they do not have a stack in any traditional sense (they have template instantiation depth instead), and cannot manipulate variables. They can produce (immutable) variables or types, which means that they can be used to implement recursive algorithms. Such implementation will be typically a template type, where at least one specialisation implements the general algorithm, while another specialisation implements the recursion terminating condition. The compiler selects the required ‘execution path’ of the recursive algorithm utilising template specialisation matching.

A higher order function is a function which consumes (or produces, or both) a function. Since in our case a (meta)function is a template, we can implement a higher order (meta)function consuming a (meta)function, as a template consuming template parameter (or in other words, a ‘template template parameter’). Since template types can naturally output a template type, any meta-function which is a type can trivially produce a meta-function.

A selector is some entity mapped to another – in a C++ standard library, a selector in std::map<int, std::string> is some value of type int. One way to perform such mapping during compilation is to employ overloading, which makes tag types an obvious choice of the selector. Mapping result could be either a type (deduced with the help of decltype keyword) or actual value, returned from an overloaded function. To avoid instantiating arbitrary types, we are going to use a small wrapper template when only a type is needed.

In the previous two parts of the series [Kozicki18a] [Kozicki18b] we were introduced to a compile-time set of types and a map of types. In both of these data structures, all operations were guaranteed to have precisely zero runtime cost. This part is different, for two reasons:

// copy the definition of set in Listing 6 from
// https://accu.org/index.php/journals/2531
// to here
namespace val_impl {
template <typename T> struct wrap {
  constexpr explicit wrap() = default;
  using type = T;
};

template <template <typename> typename CheckT,
  template <typename> typename CheckV,
  typename... L> struct impl;
template <template <typename> typename CheckT,
  template <typename> typename CheckV>
struct impl<CheckT, CheckV> {
  using selectors = set<CheckT>;
  constexpr explicit impl() = default;
  constexpr static void pair() noexcept;
  constexpr static void type() noexcept;
};
template <template <typename> typename CheckT,
  template <typename> typename CheckV, typename T,
  typename V, typename... L>
struct impl<CheckT, CheckV, T, V, L...> 
  : impl<CheckT, CheckV, L...> {
  using check = typename CheckV<V>::type;
  using base = impl<CheckT, CheckV, L...>;
  using selectors = 
    typename base::selectors::template insert<T>;
  static_assert
    (not base::selectors ::template test<T>);
  constexpr impl(const impl<CheckT, CheckV, L...>&
    b, const check& v) : base(b), val_(v) {}
  using base::pair;
  constexpr const auto& pair(wrap<T>) const
    noexcept { return val_; }
  using base::type;
  constexpr auto type(wrap<T>) const noexcept {
    return wrap<V>{}; }
	
private:
  check val_;
};
}
template <template <typename> typename CheckT,
  template <typename> typename CheckV, 
  typename... L>
class val {
  template <template <typename> typename, 
    template <typename> typename, typename...>
  friend class val;
  using impl = val_impl::impl<CheckT, CheckV,
    L...>;
  impl val_;
  constexpr explicit val(const impl& v) : val_(v)
  {}
public:
  constexpr val() = default;
  template <typename T, typename V> constexpr auto
    insert(const V& v) const noexcept
  {
    using result = val<CheckT, CheckV, T, V,
      L...>;
    using rimpl = typename result::impl;
    return result(rimpl(val_, v));
  }
  using set = typename impl::selectors;
  template <typename U> using type = 
    typename decltype(impl::type
      (val_impl::wrap<U>()))::type;
  template <typename U> constexpr const 
    auto& get() const noexcept
  {
    return val_.pair(val_impl::wrap<U>());
  }
  template <typename U, typename... A> constexpr
  auto run(A&&... a) const -> decltype(auto)
  {
    return get<U>()(std::forward<A>(a)...);
  }
};
			
Listing 1

There is also the elephant in the room which needs to be addressed first: STL-style containers supporting compile-time operations. One such container library is Frozen [Frozen], but there is also an ongoing effort to make standard C++ containers compliant with the literal types requirements [Dionne18] [P0980R0] [P1004R1]. Since the topic of the series is ‘compile-time data structures’, there is an obvious overlap here. For a start, we are dealing with collections which support compile-time operations such as construction or lookup. The implementation of such containers will necessarily rely on a similar set of meta-programming techniques. Finally, we are talking about modern C++, and that’s about all the similarities. The differences are less visible, but not less significant. The STL-style containers are designed for imperative programming style, and they are homogeneous (that is, only support single prescribed data type). This homogeneity is inherent to imperative programming style because, without it, it is somewhat tricky to populate a container, pass it around as a function parameter or its return type, or write a simple for loop, iterating over the container elements. The compile-time collections presented so far in the series do not appear to support iteration at all, but as the code excerpt in the Listing 2 demonstrates, this is quite achievable (although not nice) with the help of generic lambdas.

namespace set_impl {

template <template <typename> typename Check,
  typename... L> struct unique;
template <template <typename> typename Check>
  struct unique<Check> {
// add this to set_impl::unique<Check> code :
  template <typename F> constexpr static bool
    apply(const F&) noexcept { return true; }
};

template <template <typename> typename Check,
  typename T, typename... L>
struct unique<Check, T, L...> {
// add this to set_impl::unique<Check, T, L...> :
template <typename F> constexpr static bool
    apply(const F& fn) noexcept
  {
    if constexpr (not std::is_void_v<T> && not
      contains_detail<Check, T, L...>::value) {
      if (not fn((T*)nullptr)) return false;
    }
    return unique<Check, L...>::apply(fn);
  }
};
} // namespace set_impl

template <template <typename> typename Check,
  typename... L> class set {
  using impl = set_impl::unique<Check, L...>;

public:
// add this to set<Check, L...> :
  template <typename F>
  constexpr static bool for_each(const F& fn)
  { return impl::apply(fn); }
};

int main()
{
  constexpr static auto s1 = set<PlainTypes>
    ::insert<int>::insert<Baz>::insert<Fuz>();
  decltype(s1)::for_each([](auto* d) -> bool {
    using type = decltype(*d);
    std::cout << typeid(type).name() << std::endl;
    return true;
  });
}
			
Listing 2

Similarly, passing parameters and returning heterogeneous collections imply the use of templates, which means a code similar to one in the code excerpt in the Listing 3 (note unconstrained template <typename Val> passed to foo). Templates such as this make it difficult to reason about the code because of the very few constraints attached. Speaking of which, we have seen ‘constraints’ used in our compile-time data structures in the preceding parts of the series, and they present great means of differentiation of the data structures as demonstrated in Listing 4. Since overload resolution complements template matching, it is also possible to overload based on ‘constraints’ set on the compile collection, like the code sample also demonstrates. Our heterogeneous containers offer one more feature which is unattainable for traditional containers – overloading based on content (as seen with Ver1 and Ver2, which imply one potential application – admittedly, such code is likely to be very brittle).

template <typename Val>
void foo(const Val& v)
{
  auto fuz = v.template run<Fuz>(
    v.template get<Baz>());
  // ...
}
int main()
{
  constexpr static auto m1 = 
    val<PlainTypes, PlainNotVoid>{}
    .insert<Fuz>([](int v){ return Fuzzer(v); })
    .insert<Baz>(13);
  foo(m1);
};
			
Listing 3
template <typename T> void foo(T);

template <typename... L>
void foo(const val<PlainTypes, Plain, L...>& v)
{ /* ... */ }

template <typename... L>
void foo(const val<DataSel, Plain, L...>& v)
{ /* ... */ }

template <typename... L>
auto foo(const val<PlatformSel, Plain, L...>& v)
  -> std::enable_if_t<std::decay_t<decltype(v)>
    ::set::template test<Ver1>>
{ /* ... */ }

template <typename... L>
auto foo(const val<PlatformSel, Plain, L...>& v)
  -> std::enable_if_t<std::decay_t<decltype(v)>
    ::set::template test<Ver2>>
{ /* ... */ }
			
Listing 4

Hopefully, the code demonstrated so far is enough to trigger the ‘this is interesting’ response of the reader. But is it actually useful? Here is a 10 thousand feet view at such a container:

The first two points are also available to any user-defined type in a C++ program. We use hardcoded names to refer to a field, member function, or nested type, and there is nothing new here. The last one is what makes the data structures presented here unusual. Let’s have a look at the two popular design patterns where data members, or functions, are typically used.

Abstract factory

Intent: Provide an interface for creating families of related or dependent objects without specifying their concrete classes
~ Design Patterns [GoF95].

The ‘interface’, as referred to above is in the context of object-oriented programming, which in C++ programs translates to a base class with (pure) virtual functions. The use of this design pattern imposes runtime polymorphism on our project since (by definition) there is no way for the factory implementation to convey via the interface the dynamic type of the objects being created. We can, however, take the factory pattern and transform it to the implied interface in the context of generic programming, as demonstrated in Listing 5. This approach enables us to dispose of the dynamic polymorphism since the generic programming style means that the dynamic type is available at the point of use.

template <typename Factory> 
void baz(const Factory& f) 
{
  auto a = f.logger(); // . . .
}
struct MyFactory {
  Logger logger() const;
};
int main()
{
  MyFactory factory;
  baz(factory);
}
			
Listing 5

Encapsulated context

Solution: Provide a Context container that collects data together and encapsulates common data used throughout the system.
~Allan Kelly [Kelly]

Interestingly, the definition of this pattern also uses the word ‘container’. The selector in the case of this pattern is accessor member function which may imply the use of runtime polymorphism. However, this pattern works also with data fields and generic programming, as demonstrated in Listing 6.

template <typename Context>
void baz(const Context& c)
{
  auto& a = c.logger; // ...
}
struct MyContext {
  Logger& logger;
};
int main()
{
  Logger logger;
  MyContext context {logger};
  baz(context);
}
			
Listing 6

Both design patterns impose a single, shared dependency on all the uses of the context or factory class. A programmer has to choose between dependency on the implementation indirectly via the generic interface (as demonstrated in Listings 5 and 6), or directly on the interface class in the OOP sense, hence imposing runtime polymorphism. The heterogeneous containers presented in this part of the series offer an alternative solution: since there is no concrete implementation class as such, the dependency is on the containers (a utility class), selectors and optionally also constraints. The constraints are considered to be optional because their definition can be as loose or as tight as desired by the user. One end of the spectrum was demonstrated in the preceding articles as generic PlainTypes while on the opposite end we might enforce inheritance of selectors from certain types or presence of specific markers (e.g. nested type), or even expressly list them in the form of a ‘set of types’. The selectors are not optional; however, they may be defined in the context of their use, rather than some arbitrary central location (as opposed to fields or member functions of a class). This distributed nature of selectors removes the single shared dependency we have seen earlier. Additionally, with the use of lambdas, we can define both a context and a factory, within the same instance of the map of types (demonstrated in Listing 3).

The lack of central location comes with one more benefit. Imagine we have hypothetical ‘real world’ financial software, with the following internal dependencies:

Such dependencies imply that objects have to be instantiated in a specific order. If we were to use a context pattern to carry the configuration, platform services, market price inputs, network outputs etc. then our choices are to either:

Neither is of those is very appealing. The latter (as simpler) is likely to be preferred, but it will cause brittleness of code and enable a whole category of bugs. The value map, as presented here, allows a solution which enforces the correct order of instantiation in the compile time – see Listing 7. Admittedly, multiple copies of the map and the use of pointers may seem like a ‘code smell’, but such code is also very readable. The more refined version presented in Listing 8 makes use of lambdas instead One point of note is that in this version, the bodies of constructors use v.template run<...Selector>(v) instead of template get<...Selector>(). We have also changed ClientPriceOutput to a template to own the NetworkOutput subobject and made the class NetworkOutput non-copyable, even though it is returned by value from the lambda defined for NetworkOutputSelector. Such construction by-value of non-copyable objects is guaranteed to work thanks to obligatory copy elision in C++17 (users of Visual Studio 2017 version 15.8 will have to find a workaround or downgrade to 15.7 because of a regression bug [VS]). The attentive reader will notice how this last code excerpt mixes both context and factory pattern.

struct CommandLine {
  CommandLine(int argv, char* const* argc);
};
enum ConfigurationSelector {};
struct Configuration {
  explicit Configuration(const CommandLine& cmd);
};
enum PlatformServicesSelector {};
struct PlatformServices {
  template <typename... L>
  explicit PlatformServices(const val<PlainTypes,
    PlainNotVoid, L...>& v)
  {
    const auto& conf =
      *v.template get<ConfigurationSelector>();
    // ...
  }
};
enum MarketPriceInputsSelector {};
struct MarketPriceInputs {
  template <typename... L>
  explicit MarketPriceInputs(const val<PlainTypes,
    PlainNotVoid, L...>& v)
  {
    const auto& conf =
      *v.template get<ConfigurationSelector>();
    const auto& plat =
      *v.template get<PlatformServicesSelector>();
    // ...
  }
};
enum ClientPriceCalcSelector {};
struct ClientPriceCalc {
  template <typename... L>
  explicit ClientPriceCalc(const val<PlainTypes,
    PlainNotVoid, L...>& v)
  {
    const auto& conf =
      *v.template get<ConfigurationSelector>();
    const auto& mark =
     *v.template get<MarketPriceInputsSelector>();
    // ...
  }
};
enum NetworkOutputSelector {};
struct NetworkOutput {
  template <typename... L>
  explicit NetworkOutput(const val<PlainTypes,
    PlainNotVoid, L...>& v)
  {
    const auto& conf =
      *v.template get<ConfigurationSelector>();
    // ...
  }
};

struct ClientPriceOutput {
  template <typename... L>
  explicit ClientPriceOutput(const val<PlainTypes,
    PlainNotVoid, L...>& v)
  {
    const auto& conf =
      *v.template get<ConfigurationSelector>();
    const auto& price =
      *v.template get<ClientPriceCalcSelector>();
    auto& out =
      *v.template get<NetworkOutputSelector>();
    // ...
  }
};

int main(int argv, char** argc)
{
  const auto v0 = val<PlainTypes, PlainNotVoid>{};
  try {
    const auto cmdln = CommandLine(argv, argc);
    const auto config = Configuration(cmdln);
    const auto v1 =
      v0.insert<ConfigurationSelector>(&config);
    const auto plsrv = PlatformServices(v1);
    const auto v2 =
      v1.insert<PlatformServicesSelector>(&plsrv);
    const auto input = MarketPriceInputs(v2);
    const auto v3 =
     v2.insert<MarketPriceInputsSelector>(&input);
    const auto price = ClientPriceCalc(v3);
    auto       output = NetworkOutput(v3);
    const auto v4 =
      v3.insert<ClientPriceCalcSelector>(&price)
        .insert<NetworkOutputSelector>(&output);
    const auto clout = ClientPriceOutput(v4);
    // ...
  }
  catch (...) {
    // ...
  }
  return 0;
}
			
Listing 7
enum ConfigurationSelector {};
struct Configuration {
  template <typename... L>
  explicit Configuration(const val<PlainTypes,
    PlainNotVoid, L...>& v,
    const CommandLine& cmdln)
    { /* ... */ }
  };
// use template run<...> as appropriate, e.g. :
enum NetworkOutputSelector {};
struct NetworkOutput {
  NetworkOutput(const NetworkOutput&) = delete;
  template <typename... L>
  explicit NetworkOutput(const val<PlainTypes,
    PlainNotVoid, L...>& v)
  {
    const auto& conf =
      v.template run<ConfigurationSelector>(v);
    const auto& plat =
      v.template run<PlatformServicesSelector>(v);
    // ...
  }
};
template <typename Out> struct ClientPriceOutput {
  Out out_;
  template <typename... L>
  explicit ClientPriceOutput(const val<PlainTypes,
    PlainNotVoid, L...>& v)
  : out_(v.template run<NetworkOutputSelector>(v))
  {
    const auto& conf =
      v.template run<ConfigurationSelector>(v);
    const auto& price =
      v.template run<ClientPriceCalcSelector>(v);
    // ...
  }
}
int main(int argv, char** argc)
{
  try {
    const auto cmdln = CommandLine(argv, argc);
    const auto v = val<PlainTypes, PlainNotVoid>{}
    .insert<ConfigurationSelector>([cmdln]
      (const auto& v) -> const Configuration& {
        static auto config = Configuration(
          v, cmdln);
        return config;
    }).insert<PlatformServicesSelector>([]
      (const auto& v) -> const PlatformServices& {
        static auto plsrv = PlatformServices(v);
        return plsrv;
    }).insert<MarketPriceInputsSelector>([]
      (const auto& v) -> const MarketPriceIputs& {
        static auto input = MarketPriceInputs(v);
        return input;
    }).insert<ClientPriceCalcSelector>([]
      (const auto& v) -> const ClientPriceCalc& {
        static auto price = ClientPriceCalc(v);
        return price;
    }).insert<NetworkOutputSelector>([]
      (const auto& v) -> NetworkOutput {
        return NetworkOutput(v);
    });
    const auto clout =
      ClientPriceOutput<NetworkOutput>(v);
    // ...
  }
  catch (...) {
    // ...
  }
  return 0;
}
			
Listing 8
Static variables inside lambdas

Users coming from other languages may have some difficulty with static variables defined inside a lambda. Here is how Anthony Williams explained this on the accu-general mailing list:

A lambda is an object of an unnamed type. The lambda body is the operator() member function of this type. For any given lambda expression there is only one such type and one such member function, so there is only one copy of each static object declared within the lambda body.

In practice, this means that the lambdas which make use of static variable will ‘cache’ the once-created object, making them equivalent to accessors of the context pattern with lazy evaluation. The afflictions of static objects do apply (e.g. static object lifetime, the accidental coupling of points of use if the object passed as a mutable reference) but are moderated by the fact that each lambda is its own type.

There is one more question remaining to answer ‘why bother?’ The answer is the same as for any other kind of polymorphism ‘to make the code testable’. The use of templates enables us to avoid the cost of dynamic polymorphism (not only in the runtime but also boilerplate OOP interface code), while the value map removes the need for a single central repository. With its help, our unit tests will be able to easily pass mock objects to the code being tested as demonstrated in Listing 9 (for objects created on the stack) and Listing 10 (for objects returned by lambdas). Note that the static object lifetime of MockConfiguration in Listing 10 will delay the destruction of config object until unit tests termination. A good workaround is to store our mock objects on the stack and then pass them to lambda as a reference, as demonstrated with MockPlatformServices. Readers will hopefully find other useful applications of the compile-time polymorphism offered by the map of values.

TEST("Test MarketPriceInputs")
{
  const auto config = MockConfiguration();
  auto       plsrv = MockPlatformServices();
  const auto v1 = val<PlainTypes, PlainNotVoid>{}
    .insert<ConfigurationSelector>(&config)
    .insert<PlatformServicesSelector,
      const MockPlaformServices*>(&plsrv);
  auto input = MarketPriceInputs(v1);
  plsrv.push("Some test inputs");
  // ...
}
			
Listing 9
TEST("Test MarketPriceInputs")
{
  auto       plsrv = MockPlatformServices();
  const auto v = val<PlainTypes, PlainNotVoid>{}
    .insert<ConfigurationSelector>([](auto&){
      static auto config = MockConfiguration();
      return config;
    }).insert<PlatformServicesSelector>([&plsrv]
      (auto&) -> const MockPlatformServices&
      { return plsrv; });
  auto input = MarketPriceInputs(v);
  plsrv.push("Some test inputs");
  // ...
}
			
Listing 10

We started the series with the demonstration of the most useful meta-programming techniques, in the context of functional programming where the program is run by the compiler. The ‘set of types’ presented in the first part, even though rather simple, enables some interesting uses – among them a constraint to validate that certain template parameters are within a predefined set of types (which is left as an exercise to the reader). Then we moved to the, perhaps not very exciting, ‘map of types’ and finally, in this part discussed what I like to call a ‘heterogenous compile-time container’, demonstrating how such data structure can be used to solve real problems. The work is not finished, but the code discussed here is usable and available for readers to download from the Juno library [Juno]. The library is in very early stages of development and open to significant changes, but the similar solutions to presented here have found their use in the ‘real world’ code already. The very liberal MIT license allows anyone to ‘lift’ the parts of the source and use in their projects, which is preferred over taking the dependency (due to library immaturity). Readers are invited to the discussion on the future development, design and interface, in the ‘issues’ section of the library or sending me email directly. All inputs are welcome!

References

[cppreference] ‘C++ named requirements: LiteralType’ at https://en.cppreference.com/w/cpp/named_req/LiteralType

[Dionne18] Louis Dionne ‘Compile-time programming and reflection in C++20 and beyond’ from CppCon 2018, available at https://youtu.be/CRDNPwXDVp0

[Frozen] ‘Frozen – zero cost initialization for immutable containers and various algorithms’, https://blog.quarkslab.com/frozen-zero-cost-initialization-for-immutable-containers-and-various-algorithms.html

[GoF95] Design Patterns, Elements of Reusable Object-Oriented Software, Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, Addison-Wesley 1995

[Juno] The Juno library: https://github.com/Bronek/juno/releases

[Kelly] Allan Kelly (updated 2009) ‘Encapsulated Context’ (design pattern), available from https://www.allankellyassociates.co.uk/patterns/encapsulated-context/

[Kozicki18a] Bronek Kozicki (2018) ‘Compile-time Data Structures in C++17: Part 1, Set of Types’ in Overload 146, available at https://accu.org/index.php/journals/2531

[Kozicki18b] Bronek Kozicki (2018) ‘Compile-time Data Structures in C++17: Part 2, Map of Types’ in Overload 147, available at https://accu.org/index.php/journals/2562

[P0533R3] constexpr for <cmath> and <cstdlib>, 5 August 2018, available at: https://wg21.link/p0533

[P0980R0] ‘Making std::string constexpr’ at https://wg21.link/P0980

[P1004R1] ‘Making std::vector constexpr’ at https://wg21.link/P1004

[VS] ‘MSVC 15.8 C++17 RVO regression for non-static data members: https://developercommunity.visualstudio.com/content/problem/318693/msvc-158-c17-rvo-regression.html

Bronek Kozicki Bronek Kozicki developed his lifelong programming habit at the age of 13, teaching himself Z80 assembler to create special effects on his dad’s ZX Spectrum. BASIC, Pascal, Lisp and a few other languages followed, before he settled on C++ as his career choice. In 2005, he moved from Poland to London, and promptly joined the BSI C++ panel with a secret agenda: to make C++ more like Lisp, but with more angle brackets.

Notes: 

More fields may be available via dynamicdata ..