Journal Articles

Overload Journal #147 - October 2018 + Programming Topics


Title: P1063 vs Coroutines TS: Consensus on High-Level Semantics

Author: Bob Schmidt

Date: 03 October 2018 18:16:41 +01:00

Summary: Dmytro Ivanchykhin, Sergey Ignatchenko and Maxim Blashchuk argue that we need coroutines TS now to improve-based-on-experience later.

Body:

Disclaimer: This article takes for granted that readers understand what coroutines are about. If this concept is unfamiliar to you (hey, we’re speaking about standard proposals here!) make sure to take a look at [Nishanov15] and [McNellis16].

Disclaimer #2: Just to avoid any doubt, this article is not written with the help of some magical oracle or other source of infinite wisdom; rather, this article (just as any other article for that matter) merely represents an opinion of its authors (which may or may not coincide with the opinion of the Overload editor). In addition, this article is neither sanctioned nor sponsored by any government, WG21, or other official body.

Quite recently, we learned that the newly appeared [P1063R0]¹ and its ‘Core Coroutines’ proposal has led to controversy, which got in the way of voting Coroutines TS [N4760] (a.k.a. Gor-routines ☺) into C++20. As big fans of coroutines in general and asynchronous processing in particular, we became worried about this development, so we took a look at the situation from the point of view of an app-level developer (and occasional architect). In other words, we do not really care about implementation details and compiler complexities – instead, we care about things such as readability, performance, backward compatibility and code maintenance costs; and of course, another extremely important consideration is when we’ll be able to start using those exciting new C++ features (without standardization we’re not really able to use any feature on a massive scale, as the associated risks are just too high).

App-level point of view

From our app-level point of view, we can say that all the coroutine-using code we can think of, in most real-world projects, will fall into two separate categories:

code which uses co_await². Let’s call this code end-programmer code (mimicking end-user-related terminology). This code will be interspersed with business logic. It will change very frequently, and will be spread all over the code base; as a result, any change to the semantics of co_await will be crazily expensive at app-level, and most likely such a change won’t be feasible.
code which enables using co_await (for Coroutines TS, it is all the await_*() stuff; for Core Coroutines, it is overloaded operator [<-] etc.). Let’s call this code infrastructure app-level code. For all the use cases we can think of for serious projects, this code is going to be confined to some kind of framework/glue/... layer. Moreover, this layer usually doesn’t contain business logic and tends to be quite limited in size, with changes to this layer being quite rare. In fact, a similar point of view is articulated in [P1063R0]: “authors of wrapper libraries... we expect those to be relatively rare”.

Coroutines TS vs P1063: end-programmer example

Let’s take the very same piece of code and see how it can be written under both proposals.

Coroutines TS a.k.a. Gor-routines

future<int> count_bytes(Connection& connection) {
  int bytes_read = 0;
  vector<char> buffer(1024);
  while(!connection.done()) {
    bytes_read +=
      co_await connection.Read(buffer.data(),
      buffer.size());
  }
  co_return bytes_read;
}

P1063R0 a.k.a. "Core Coroutines"

auto count_bytes(Connection& connection) =>
    make_future<int>([&connection] do {
  int bytes_read = 0;
  vector<char> buffer(1024);
  while(!connection.done()) {
    bytes_read +=
      [<-]connection.Read(buffer.data(),
      buffer.size());
  }
  return bytes_read;
});

End-programmer semantics: exactly the same for Coroutines TS and Core Coroutines

Following from the ‘App-level point of view’ section above, the most important (and utterly unchangeable later) portion of any coroutines proposal is the semantics of co_await (or whatever other syntax it may have). Historically, there have been several significantly different semantics of await (for example, in a relatively recent [P0114R0], it was argued not to require a marker for a suspend point – which, BTW, was argued later to be a Bad Thing™ for app-level [NoBugs17]).

However, if we take a look at the currently competing proposals (Coroutines TS and P1063), we’ll see (to the best of our understanding) that

the semantics of co_await and the proposed operator [<-], at least at the point where co_await/[<-] is used by end-programmer code, is exactly the same.

Not only is the flow interrupted (with the possibility of being resumed) in the very same manner for both proposals, but also all properties that are observable from the business-logic level (such as enforcing calls around async call to be asynchronous) are the same too.

As noted above, such consensus on high-level semantics (compared to co_await) wasn’t the case for earlier proposals such as [P0114R0], but is the case for [P1063R0].

On end-programmer syntax

While the semantics of the proposals are exactly the same, there are a few high-level syntactic differences between P1063 and ‘Coroutines TS’:

Replacement of co_await with an identically used but differently named operator [<-]. Not that it really matters for our current discussion, but we have to mention that we have our doubts about an argument from [P1063R0] that the “co_await keyword is an overt manifestation of the TS’s preference for the asynchronous use case”.
We feel that, even when we’re writing generators, we can consider what is happening at that point as ‘awaiting’ something external to our code flow to happen (even if it is another generator). Indeed, with co_await (or [<-]) we’re interrupting the program flow – but why? To await something external to our program flow to happen (whether it is an async event, another generator, or whatever else). In addition, the concept of unwrapping is guaranteed to be alien to the vast majority of app-level developers (even more so for existing C++ app-level developers). That being said, we are quite indifferent to the choice between co_await and [<-].
Explicit designation of coroutines (vs the implicit one in Coroutines TS, where being a coroutine is derived from co_await or co_return being used). In general, there are arguments for having app-level code explicitly documented, but this is still a very minor issue. OTOH, the way it is done in P1063 is very verbose (and that’s even after relying on yet another pending proposal – and modifying it further (!) – to make the syntax more palatable) and we feel that it is at odds with the all-important “direct expression of ideas” principle which was laid out in [Stroustrup04].
Lambda-like syntax in P1063 vs traditional function syntax in Coroutines TS. Again, it doesn’t matter too much for the purposes of our current discussion, but we have to say that lambda-like syntax (a) is more error-prone (keeping all those brackets matching is yet another thing to care about while programming), and (b) as lambda syntax differs significantly from the usual function syntax, we feel that it undermines the time-honoured understanding of subroutines being “special cases of more general program components, called coroutines” [Knuth].
Replacement of co_return with return. TBH, this is the least of our syntactic concerns (not that the other syntactic concerns are significant); we explicitly do not care about it. Either way is perfectly fine with us and we have no idea why it is so important for the authors of [P1063R0].

However, the most important property of all the syntactic differences is

As the differences are purely syntactical, nothing prevents us from either (a) choosing whatever syntax is preferred right now, without delaying the whole thing for N years, and/or (b) adding syntactic alternatives later

Customization points: mostly an implementation detail that can be changed later

In fact, what we have already discussed above is only a minor part of the differences between Coroutines TS and P1063; however, all the remaining differences we’re aware of are either (a) about optimizations (which we’ll discuss a bit later), or (b) about so-called ‘customization points’ in P1063-speak, or, from our current perspective, are about what we decided to call ‘app-level infrastructure code’. Let’s take a closer look at those customization points and app-level infrastructure code.

As for app-level infrastructure code, the most important properties are:

1. it is hidden from the view of the end-user programmer
2. it is rarely changed
3. and it is small.

(BTW, as it was already noted above, P1063 itself has indications which agree with this point of view.)

As a direct result of item #1 above, from the end-user programmer point of view,

customization points/app-level infrastructure code are nothing but implementation details

Moreover, from #2 and #3 it follows that the costs of rewriting such code – if such a need ever arises – will be small; this leaves the door open to changing them later if/when it is demonstrated that such a change is necessary.

Performance and allocations

Another set of objections to Coroutines TS laid out in P1063 is about performance and lack of normative control over allocations. This one is simple – P1063 itself acknowledges that all their performance/allocation concerns can be addressed by extending Coroutines TS later: “These all appear to be pure extensions, so they could be done post-C++20 if need be.” As a result, we don’t really care about performance issues now, as optimizations (most of them already existing) can be made normative later.

This is without mentioning that the whole argument along the lines of “we don’t want allocations” becomes more and more moot as soon as we take into account that modern single-threaded allocators can perform malloc()+free() pairs in as little as 15 CPU cycles [Ignatchenko18]; with this cost being comparable to the cost of a single branch mis-prediction(!), efforts related to eliminating allocations become more and more of a ‘yet another optimization’ rather than ‘a thing we should care about a lot’.

Analysis: coroutines TS CAN be voted in, even if P1063 is right on every point

Now, we’re done with the preliminaries and can proceed to the point of this article. Let’s assume for the moment that the ISO committee and the industry follow this path:

WG21 has a short discussion on syntax for Coroutines TS (or makes a joint proposal in this regard). Our own preferences in this regard were outlined above, but TBH we will accept any kind of syntax to get coroutines into C++20 (that is, as long as end-programmer semantics remains the same).
WG21 votes Coroutines TS into C++20.
In a few years, everybody and their dog are using Coroutines TS.

Now, let’s consider all the possible scenarios with regard to the merits of P1063 in this context (keeping in mind its claims about being more generic than Gor-routines):

If by the end of the day (and as Gor currently argues), P1063 won’t be able to provide any significant improvements (that is, over an improved-over-time Coroutines TS), accepting Coroutines TS was the right thing; end of discussion.
If P1063 happens to be perfect as promised, it should be possible to rewrite the current implementation of Coroutines TS (including the code providing for await_suspend() etc.) in P1063 style. This means that: (a) at end-programmer level, there will be exactly zero changes; (b) at the level of the app-level infrastructure code: (b1) for the time being, we’ll have Coroutines TS (good enough for us), and (b2) when P1063 is standard-ready (in the very best case C++26(!)), we’ll have both ways of describing things (NB: unless P1063 is demonstrated to be superior in performance, we’re sure that lots of developers – ourselves included – will still prefer the Coroutines TS way).
If P1063 happens to be not as perfect as promised but still better than Coroutines TS, it might be impossible to rewrite the current implementation of Coroutines TS in the P1063 style. This will mean that: (a) at end-programmer level, there are still exactly zero changes; (b) at the level of the app-level infrastructure code: (b1) for the time being, we’ll have Coroutines TS, and (b2) when P1063 is standard-ready, we’ll have two separate ways of describing things. This might mean – when the project benefits from it – that a very small portion of the project code (from experience, 2–5%) may need to be rewritten; taking into account that for the vast majority of projects (90+% being a conservative estimate) Coroutines TS is expected to be ‘good enough’, we’re speaking about 0.2–0.5% of all the code using Coroutines TS being rewritten. We are confident that this is not too high a price for having Coroutines TS at least 6 years earlier (and note that this 0.2–0.5% rewrite happens only IF P1063 is better than Coroutines TS but is not as perfect as promised).
If some other way to implement customization (even better than P1063) arises meanwhile: (a) at end-programmer level, there are still exactly zero changes; (b) at the level of the app-level infrastructure code: (b1) for the time being, we’ll have Coroutines TS, and (b2) when some-other-way is standard-ready, we’ll have one or two separate ways of describing things. However, along the lines above, our estimate is that – even in the worst case – only 0.2–0.5% of the code using the Coroutines TS will have to be rewritten.
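
The 0.2–0.5% figure used in the scenarios above is simply the product of the two estimates quoted there: the 2–5% share of infrastructure code within a project, and the at most 10% of projects for which Coroutines TS would not be ‘good enough’ (the complement of the 90+% estimate). Spelled out, with both inputs taken as the authors’ estimates rather than measured data:

```latex
\[
\underbrace{(2\%\ \text{to}\ 5\%)}_{\text{infrastructure share of a project}}
\times
\underbrace{(100\% - 90\%)}_{\text{projects needing the rewrite}}
= 0.02 \times 0.10\ \text{to}\ 0.05 \times 0.10
= 0.2\%\ \text{to}\ 0.5\%
\]
```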

In other words:

In each and every conceivable scenario, including the one where P1063 is right with each and every significant claim they’re making, voting in Coroutines TS is The Right Thing To Do™.

Voting Coroutines TS into C++20 will provide two all-important benefits:

in the industry, we’ll be able to use the goodies of coroutines right now (and not 6+ years later)
even more importantly, while we’re using it, we’ll see more real-world use cases, and will be able to criticize the current implementation not from a purely abstract point of view, but based on the needs of the real world.

In a sense, what we have is a situation similar to a prima facie hearing in the criminal law of some countries; in such hearings, if the defendant is still not guilty even when all the evidence presented by the prosecution is taken at face value, there is no need to argue about the merits of the evidence, and the decision can be made in favour of the defendant without conducting a full hearing. Such cases are admittedly rare, but in our case of P1063-vs-Coroutines-TS, it is possible because of two major observations:

when considering 99+% of the relevant code, the semantics of the Coroutines TS and P1063 is exactly the same. In other words, we have consensus on end-programmer semantics.
And from the point of view of the all-important end-programmer, anything else can be seen as an implementation detail, and Coroutines TS sets the abstraction boundary for customization points to be very close to the end-user programmer, preventing app-level programmers from implementing it themselves. This, in turn, allows specifying this layer later (which is essentially what P1063 tries to do). In other words, we’re going in the direction from being under-specified to over-specified (which, unlike the other way around, is perfectly feasible).

Or, trying to approach the same thing from a different perspective: we clearly feel that the current Coroutines TS does represent ‘gradual expansion’ without degenerating into ‘opportunistic hacking’ as defined in [P0976] by Bjarne Stroustrup.

Gradual expansion, relying on feedback, is my ideal. Better an incomplete design than a poor/clumsy/bloated ‘complete solution’.

And FWIW, ‘relying on feedback’ is not really possible until co_await makes it into the standard one way or another; it means that the merits of voting in Coroutines TS right now go far beyond our simple desire to start using it ASAP: it is also important to ensure that the end-product (the C++ standard) is the best one possible. Indeed, if some over-specified stuff makes it into the standard, it will be next to impossible to replace it later – and right now we just don’t have sufficient information to say which way is the best one; in this sense, the approach taken by Coroutines TS (to hide as much as possible beyond the implementation boundary, or – in other words – ‘to under-specify rather than over-specify’) is a Good Thing™; combined with an as-early-as-possible acceptance of Coroutines TS into the standard, this allows us to get that all-important feedback Bjarne refers to in [P0976].

Conclusion

We hope that we have made a case for ‘voting for Coroutines TS right now regardless of the merits of the finer points of P1063’ (that is, points going beyond the two major observations listed above):

we’ll be able to use coroutines at end-programmer level (where consensus already exists) right away
as for customization points, even if P1063 is The Way To Go™ – it can be added later when (if) this becomes apparent. In addition, while we’re using coroutines in the wild, we’ll become much more knowledgeable about real-world use cases – and the ways that Coroutines TS needs to be improved (who knows, maybe a more-straightforward model to express ‘customization points’ arises as we learn more about coroutines from deploying Coroutines TS – and the current Coroutines TS has abstraction boundaries which leave room for different ways of specifying ‘customization points’).

In other words, we hope we have demonstrated that voting in Coroutines TS is The Right Thing To Do™ without criticizing P1063 itself.

Phew. We rest our case.

References

[Ignatchenko18] (Re)Actor Allocation At 15 CPU Cycles, Sergey Ignatchenko, Dmytro Ivanchykhin, Marcos Bracco, Overload #142, https://accu.org/index.php/journals/2533

[Knuth] The Art of Computer Programming, Donald Knuth, Vol. I

[McNellis16] Introduction to C++ Coroutines, James McNellis, CppCon2016, https://www.youtube.com/watch?v=ZTqHjjm86Bw

[Nishanov15] C++ Coroutines – a negative overhead abstraction, Gor Nishanov, CppCon2015, https://www.youtube.com/watch?v=_fu0gx-xseY

[N4760] Working Draft, C++ Extensions for Coroutines, Gor Nishanov, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/n4760.pdf

[NoBugs17] Eight Ways to Handle Non-Blocking Returns in Message-Passing Programs, ‘No Bugs’ Hare, CppCon17, http://ithare.com/eight-ways-to-handle-non-blocking-returns-in-message-passing-programs-with-script/3/

[P0114R0] Resumable Expressions, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0114r0.pdf

[P0973R0] Coroutines TS Use Cases and Design Issues, Geoff Romer, James Dennett, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0973r0.pdf

[P0976] The Evils of Paradigms Or Beware of one-solution-fits-all thinking, Bjarne Stroustrup, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0976r0.pdf

[P1063R0] Core Coroutines, Geoff Romer, James Dennett, Chandler Carruth, http://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p1063r0.pdf

[Stroustrup04] Speaking C++ as Native (Multi-paradigm Programming in Standard C++), Bjarne Stroustrup, http://ewh.ieee.org/r5/central_texas/austin_cs/presentations/2004.02.25.pdf

¹ Based on [P0973R0], with two of the three authors being the same.
² Or operator [<-], it doesn’t really matter.

Dmytro Ivanchykhin has 10+ years of development experience, and has a strong mathematical background (in the past, he taught maths at NDSU in the United States).

Sergey Ignatchenko has 15+ years of industry experience, including being a co-architect of a stock exchange, and the sole architect of a game with 400K simultaneous players. He currently holds the position of Security Researcher.

Maxim Blashchuk has substantial development experience, most of it with embedded programming. Recently he joined a team performing research on low-level C++ libraries providing properties such as determinism and memory safety.
