Journal Articles

Overload Journal #131 - February 2016 + Programming Topics
Browse in : All > Journals > Overload > o131 (7)
All > Topics > Programming (877)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Defining Concepts

Author: Martin Moene

Date: 06 February 2016 18:47:45 +00:00 or Sat, 06 February 2016 18:47:45 +00:00

Summary: Concepts provide a new way of constraining code. Andrew Sutton shows us how to define and use them.

Body: 

This article is the second in a series that describe concepts and their use. In the first article, I describe how concepts are used to declare and constrain generic algorithms [Sutton15]. In this article, I discuss how to define and use concepts: the building blocks of the constraints used in the previous article. The next article will focus on systems of concepts, overloading, and specialization.

The features described in this article are based on the ISO Concepts Technical Specification (TS) [N4549], a formal extension of the C++ Programming Language. The specification is implemented in GCC and will be part of the forthcoming 6.0 release. Eric Niebler and Casey Carter are also working on a Ranges TS [N4560] that incorporates these language features and will define the base set of concepts needed for the C++ Standard Library.

Recap

In my previous article, I wrote about a simple generic algorithm, in(), which determines whether an element can be found in a range of iterators. Here is its declaration, modified slightly to suit the purposes of this article.

  template<Range R, Equality_comparable T>
    requires Same<T, Value_type<R>>()
  bool in(R const& range, T const& value);

The function in takes a range and a value as arguments. To specify the requirements on those arguments, the declaration uses three concepts:

Value_type is not a concept. It is an alias of an internal type trait:

  template<typename T>
  using Value_type = typename value_type<T>::type;

We’ll see how value_type can be defined later in this article.

Recall that the compiler internally transforms the concepts in the declaration into a single constraint. In order to use this function, any template arguments must satisfy this predicate:

  Range<R>()
  && Equality_comparable<T>()
  && Same<T, Value_type<R>>()

If this expression does not evaluate to true (given concrete template arguments for R and T), then the function cannot be called, and the compiler emits a useful error message. For example, compiling this program:

  std::vector<std::string> cities { ... };
  assert(in(cities, "Akron"));

will yield an error such as that shown in Figure 1.1

error: cannot call function ‘bool in(const R&, const T&)
[with R = std::vector<std::string>; T = char [6]]’
in(v, "Akron");
^
note: constraints not satisfied
in(R const& range, T const& value)
note: concept ‘Same<char [6], std::string>()’ was not satisfied
note: within the concept template<class T, class U> concept bool Same()
[with T = char [6]; U = std::string]
concept bool Same() { ... }
             ^~~~
note: ‘char [6]’ is not the same as ‘std::string’
			
Figure 1

What exactly are Same, Equality_comparable, and Range, and how are they defined?

Concept definitions

A concept is a predicate on template arguments. In the Concepts TS, concepts are defined as a slightly simplified form of constexpr functions. Here is the declaration of Same:

  template<typename T, typename U>
  concept bool Same() { ... }

Concepts are defined by using the concept keyword in place of constexpr, and they must return bool. In order to make concepts simple to implement, fast to compile, yet sufficient to test properties of types, we impose a few restrictions on their definition:

The language syntactically limits concepts to simple logical propositions, but this isn’t quite as restrictive as it sounds. Those propositions can evaluate any other constant expression. For example, here is the definition of the Same concept:

  template<typename T, typename U>
  concept bool Same() {
    return std::is_same<T, U>::value;
  }

This concept expresses the requirement that two types must be the same. The concept is satisfied whenever std::is_same<T, U>::value is true. Of course, this concept is so fundamental and obvious that it may as well be defined by the compiler.

Concepts can also be defined as variable templates. For example, we could have defined Same like this:

  template<typename T, typename U>
  concept bool Same = std::is_same<T, U>::value;

Variable templates [N3615] were added to C++14 at the 2013 Bristol meeting, the same meeting at which the ISO Concepts TS was formally created. A variable template declares a family of variables whose values depend on template arguments. For example, the value of Same would depend on the types given for T and U.

Variable concepts are restricted in many of the same ways that function concepts are restricted:

Defining concepts in this way means that you can leave off the extra parentheses when using concepts in a requires clause:

  template<Range R, Equality_comparable T>
    requires Same<T, Value_type<R>> // no parens!

  bool in(R const& range, T const& value)

We’ve found that some developers prefer concepts to be declared and written this way despite the lack of overloading. The Concepts TS supports variable templates specifically because of this concern. Variable concepts were added to the TS only after variable templates were added for C++14. My preference is to define concepts as functions, so I use that style throughout this and the other articles in the series.

Syntactic requirements

While every type trait is potentially a concept, the most useful concepts are much more than simple wrappers. Think about Equality_comparable. It requires its template arguments to be usable with == and != operators. In C++14, we might express those requirements using a conjunction of type traits or some other advanced mechanism. Listing 1 is a trait-based implementation. Here, has_equal and has_not_equal are type traits that rely on subtle use of language features to determine the availability of an expression for a type. Their definitions are not shown here.

template<typename T>
concept bool Equality_comparable()
{
  return has_equal<T>::value &&
    has_not_equal<T>::value;
}
			
Listing 1

This approach is both simple and powerful, yet indirect and totally inadequate to the task at hand. Using traits to state requirements obfuscates the intent, making concepts more difficult to read and write. It can also slow compilations, especially when the use of such constraints is ubiquitous throughout a library. More recent concept emulation techniques improve on readability [Niebler13], but we can do better still. The Concepts TS provides direct language support that makes writing concepts simpler, faster to compile, and allows the compiler to produce far better error messages.

To do this, we introduced a new kind of expression: the requires expression. Here is a complete definition of the Equality_comparable concept (see Listing 2). The requires keyword can be followed by a parameter list introducing names to be used to express requirements. Here, we have declarations of a and b.

template<typename T>
concept bool Equality_comparable() {
  return requires (T a, T b) {
    { a == b } -> bool;
    { a != b } -> bool;
  };
}
			
Listing 2

The body of a requires expression is a sequence of requirements, each of which specifies one or more constraints for expressions and types related to a template argument. We refer to these as a concept’s syntactic requirements.

In the Equality_comparable concept, both requirements are compound requirements, meaning they introduce multiple constraints: The expression enclosed within braces (e.g., a == b) denotes a constraint for a valid expression. When the concept is checked against a (concrete) template argument, the constraint is satisfied if the substitution of the template argument into the expression does not result in an error.

The trailing -> bool denotes an implicit conversion constraint on the result type of the instantiated expression. That constraint is satisfied only if the result is implicitly convertible to bool.

The Range concept has more interesting requirements. Let us define it in stages, starting with a first and naïve version (Listing 3). That is, a Range must supply a begin() and an end() function, each taking a Range argument. That’s correct, but not every begin() and an end() function will do.

template<typename R>
concept bool Range() {
  return requires (R range) {
    begin(range);
    end(range);
  };
}
			
Listing 3

To be a Range, they must return input iterators:

  requires (R range) {
    { begin(range) } -> Input_iterator;
    { end(range) } -> Input_iterator;
  }

Input_iterator in another useful concept. When defining new concepts, we almost always build on a library of existing ones. Input_iterator is the representation in code of what is defined in English text in the ISO C++ standard.

When the type following the -> is a concept name (or placeholder), the result type is deduced from the required expression. This is called an argument deduction constraint. If deduction fails, or if the deduced type does not satisfy the named concept, the constraint is not satisfied.

With this definition of Range, the result types of begin() and end() are deduced separately, which means that they can differ. This may not be your intent. As a general rule, if you have several operations that you intend to be the same type, give it a name:

  requires (R range) {
    typename Iterator_type<R>;
      { begin(range) } -> Iterator_type<R>;
      { end(range) } -> Iterator_type<R>;
    requires Input_iterator<Iterator_type<R>>();
  };

That is, begin() and end() must return the same type (here called Iterator_type<R>) and that type must be an Input_iterator. This last requirement is added by the nested requires clause within the body of the requires expression.

To be useful for our purposes, a Range must also name the type of its elements, its Value_type. For example, in() requires that the Value_type of its range is the same type as the type of its value argument. To complete the Range concept we require that it have a Value_type in addition to its Iterator_type (see Listing 4).

template<typename R>
concept bool Range() {
  return requires (R range) {
    typename Value_type<R>;    // Must have a 
                               // value type.
    typename Iterator_type<R>; // Must have an
                               // iterator type.
    { begin(range) } -> Iterator_type<R>;
    { end(range) } -> Iterator_type<R>;
    // The iterator type must really be an 
    // input iterator.
    requires Input_iterator<Iterator_type<R>>();
    // The value of R is the same as its
    // iterator's value type.
    requires Same<Value_type<R>,
             Value_type<Iterator_type<R>>>().
  };
}
			
Listing 4

To ensure consistency, the value type of a range and its iterators must be the Same. Beyond that, however, there are no other requirements we want to make of Value_type. Those other requirements are imposed by algorithms. For example, the in() algorithm requires equality comparison, whereas std::sort() requires a total order. A concept should include requirements for only the types and operations needed for its intended abstraction. Including extra requirements can make a concept too strict (i.e., not broadly applicable).

When defining requirements for a concept, I introduce type requirements first, then simple and compound requirements, and nested requirements last. This is because constraint checking, the substitution of arguments into constraints to test for satisfaction, follows the short-circuiting logic of the && and || operators. This means that failures detected earlier are less likely to result in unrecoverable instantiation failures later.

Ad hoc requirements

The use of alias templates to refer to associated types greatly reduces the verbosity of template declarations. Alias templates like Value_type and Iterator_type refer to facilities that compute associated types based on pattern matching on the ‘shape’ of the template argument. Listing 5 is a first naïve attempt to define Value_type.

template<typename T> struct value_type;

template<typename T>
using Value_type = typename value_type<T>::type;

// The value_type of a class is a member type.
template<typename T>
struct value_type {
  using type = typename T::value_type;
};

// The value_type of a pointer is the type of
// element pointed to.
template<typename T>
struct value_type<T*> {
  using type = T;
};

// The value_type of an array is its element type.
template<typename T, int N>
struct value_type<T[N]> {
 using type = T;
};
			
Listing 5

This seems reasonable at first glance. However, we have not constrained the primary template of the trait definition, and that can cause problems. When the compiler selects the primary template for a template argument that does not have a nested ::value_type, compilation will fail. This is an unrecoverable error that breaks concept checking.

We want to define the value_type trait so that it is instantiated if and only if there is a specialization that provides an appropriate type. To do this, we factor a new constrained specialization out of the primary template leaving it unconstrained and undefined (see Listing 6). Now, the value_type is defined only where it is meaningful. The new specialization is chosen only for classes that have a member called value_type.

template<typename T>
struct value_type;

// The value_type of a class is a member type.
template<typename T>
  requires requires { typename T::value_type; }
struct iterator_type<T> {
  using type = typename T::value_type;
};
			
Listing 6

To avoid verbosity, I did not define a new concept like Has_value_type. Instead, I used a requires expression directly within the requires clause. Yes, requires requires is syntactically correct – it is not a typo. The first requires introduces the requires clause, the second starts the requires expression.

This syntax for ad hoc constraints is not optimized (i.e., gross) on purpose. Providing a more elegant syntax for these kinds of constraints might encourage programmers to think about generic code in terms of small syntactic fragments (although these are sometimes helpful when laying the foundations of higher level abstractions). In general, useful concepts have obvious and meaningful names.

Writing fundamental concepts requires an understanding of the way the type system and other language rules interact. For example, we cannot constrain the primary template directly because constraints are checked after name lookup. Every lookup for T* would fail because pointers do not have nested members. Libraries of concepts saves us from having to consider such subtleties all the time.

When the type trait is instantiated during concept checking, the compiler considers each partial specialization, if none match (e.g., int is neither an array, nor does it have nested type names), then the compiler selects the primary template, which happens to be undefined. The result is a substitution failure that gets ‘trapped’ by the requires expression that causes the instantiation, and this causes enclosing concept to be unsatisfied.

In other words, value_type is a recipe for writing SFINAE-friendly type traits using concepts. The definition of the Iterator_type and its underlying trait have similar definitions.

Mixed-type requirements

Listing 7 is our working definition for the in() algorithm. As declared, the value type of R must be the same as T, which would make the following program ill-formed.

template<Range R, Equality_comparable T>
  requires Same<T, Value_type<R>>()
bool in(R const& range, T const& value) {
  for (Equality_comparable const& x : range) {
    if (x == value)
      return true;
  }
  return false;
}
			
Listing 7
  std::vector<std::string> cities { ... };
  assert(in(cities, "Akron"));

A string literal does not have the same type as std::string, so the constraints are not satisfied. That’s not good enough. The std::string class provides a number of overloads to make it work seamlessly with C-strings, and we should be able to use those in our generic algorithms. How can we change the algorithm to support these kinds of mixed-type operations?

We could redefine the algorithm so that value was a Value_type<R>. However, this would always require a conversion at the call site, which would almost certainly be a pessimization (converting a C-string to a std::string may require an allocation).

We could drop the Same requirement. But then the interface would not express how the elements in range are related to value, and we want our constraints to fully express the syntax used within the definition.

Our best choice is to change the Same requirement to something more permissive: a concept that supports equality comparisons between values of different types. Rather creating a concept with a different, name we can extend Equality_comparable by adding a new definition that takes two arguments instead of one. That is, we overload the Equality_comparable() function. That concept must express requirements for all the ways in which we can compare values of different types for equality (see Listing 8).

template<typename T, typename U>
concept bool Equality_comparable() {
  return requires(T t, U u) {
    { t == u } -> bool;
    { u == t } -> bool;
    { t != u } -> bool;
    { u != t } -> bool;
  };
}
			
Listing 8

This concept requires the symmetric comparison of values of type T and U.

We can now use the mixed-type Equality_comparable concept to weaken the constraints on the in().

 template<Range R, Equality_comparable T>
   requires Equality_comparable<T, Value_type<R>>()
 bool in(R const& range, T const& value);

These constraints fully specify the syntax used within the implementation, the program compiles as expected, and it does not introduce any additional runtime overhead. This is a better declaration of in(); it’s also the version we used in the first article. The ability to extend a concept to support mixed-type requirements is an essential tool for making algorithms more broadly applicable, without extra notational or runtime overheads. The Palo Alto report, for example, uses this technique for total ordered types, all binary relations, and all binary operations.

These extended definitions are not available for variable concepts because the capability is based on function overloading. This is not a limitation imposed by concepts; you simply cannot overload variables in C++.

Semantic requirements

The syntactic requirements of a concept only tells us what expressions and associated types can be used with a template argument (or template arguments). In general, we would very much like to know what those expressions and types actually mean. Just as importantly, it would be helpful for the compiler and other tools to be able to reason about the meaning of such expressions in order to support optimization and verification. Unfortunately, the Concepts TS does not provide direct language support for writing semantic requirements. Instead, we must rely on conventional forms of documentation to specify the semantics of operations operations.

C++0x concepts supported a feature called ‘axioms’, but it was added late in the development of C++11 [N2887], and their utility had not been fully explored by the time concepts were removed. Axioms were also major feature of the Palo Alto report [N3351]. However, as the proposal for Concepts Lite evolved, the Concepts Study Group (SG8) decided to leave axioms out pending further exploration. There is ongoing research related to compile-time checking of semantic requirements [DosReis09], so we hope to see axioms in the future.

Conclusions

Concepts are fundamental building blocks for our thinking and for our code; they provide the foundation upon which we design and implement software. The Concepts TS provides direct language support for the specification of concepts and their syntactic requirements. However, we must not forget or downplay the importance of the semantic aspects of concepts. A concept without semantics is merely a snippet of code.

In the next article, I will discuss systems of concepts, and how overloading and specialization based on constraints can be used to select optimal algorithms at compile time.

Acknowledgements

The design of the features in the Concepts TS was the result of collaboration with Bjarne Stroustrup and Gabriel Dos Reis. That material is based upon work supported by the National Science Foundation under Grant No. ACI-1148461. Bjarne Stroustrup also provided valuable feedback on drafts of this paper.

The WG21 Core Working group spent many, many hours over several meetings and teleconferences reviewing the Concepts TS design and wording. This work would not have been possible without their patience and attention to detail. Many people have submitted pull requests to the TS or emailed me separately to describe issues or suggest solutions. I am grateful for their contributions.

I would also like to acknowledge all of the early adopters of the GCC concepts implementation. Their feedback (often in the form of bug reports) has been invaluable.

References

[DosReis09] Dos Reis, G. ‘A System for Axiomatic Programming’ Lecture Notes in Compute Science. Vol. 7362. 2012. pp 295-309.

[N2887] Dos Reis, G., Stroustrup. B., Merideth, A. ‘Axioms: Semantics Aspects of C++ Concepts’ ISO/IEC WG21 N2887, Jun 2009.

[N3351] Stroustrup, B., Sutton, A. (eds). ‘A Concept Design for the STL’ ISO/IEC WG21 N3351, Feb 2012.

[N3615] Dos Reis, G.. ‘Constexpr Variable Templates’ ISO/IEC WG21 N3615, Mar 2013.

[N4549] Sutton, A. (ed). ISO/IEC Technical Specification 19217. ‘Programming Languages – C++ Extensions for Concepts’, Aug 2015.

[N4560] Niebler, Eric, Carter, C. Working Draft, ‘C++ Extensions for Concepts’, ISO/IEC WG21 N450. Nov 2015. pp. 213.

[Niebler13] Niebler, E. ‘Concept Checking in C++11’ 23 Nov 2013. Web.

[Sutton15] Sutton, A. ‘Introducing Concepts’ ACCU Overload. Vol 129. Oct 2015. pp. 4–8.

  1. This error is generated by GCC (compiled from trunk) with an extra patch (pending review) to improve concept checking diagnostics. The message has been modified for better presentation. Some type names have been shortened and the definition of Same is elided (...)

Notes: 

More fields may be available via dynamicdata ..