Design of applications and programs + Overload Journal #140 - August 2017
Browse in : All > Topics > Design
All > Journals > Overload > o140
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: A Usable C++ Dialect that is Safe Against Memory Corruption

Author: Bob Schmidt

Date: 07 August 2017 00:55:27 +01:00 or Mon, 07 August 2017 00:55:27 +01:00

Summary: Suitable allocators for (Re)Actors can speed things up. Sergey Ignatchenko continues his investigation in Allocator for (Re)Actors (Part 2).

Body: 

We have this handy fusion reactor in the sky. You don’t have to do anything, it just works.~ Elon Musk

Disclaimer: as usual, the opinions within this article are those of ‘No Bugs’ Hare, and do not necessarily coincide with the opinions of the translators and Overload editors; also, please keep in mind that translation difficulties from Lapine (like those described in [Loganberry04]) might have prevented an exact translation. In addition, the translator and Overload expressly disclaim all responsibility from any action or inaction resulting from reading this article.

As we briefly discussed in Part I of this mini-series [NoBugs17], message-passing technologies such as (Re)Actors (a.k.a. Actors, Reactors, ad hoc FSMs, and event-driven programs) have numerous advantages, ranging from being debuggable (including post-factum production debugging), to providing better overall performance.

In [NoBugs17], we discussed an approach to handling allocations for (Re)Actors – and were able to reach kinda-safety at least in what we named ‘kinda-safe’ and ‘safe with relocation’ mode. Unfortunately, kinda-safety didn’t really provide the Holy Grail™ of safety against memory corruptions. Now, we can extend our allocation model with a few additional guidelines, and as long as we’re following these rules/guidelines, our C++ programs WILL become perfectly safe against memory corruptions.

#define (Re)Actors

To make this article self-contained and make sure that we’re all on the same page with terminology, let’s repeat the definition of what we’re considering: (Re)Actors [NoBugs17].

Let’s begin with a common denominator for all our (Re)Actors: a GenericReactor. GenericReactor is just an abstract class – and has a pure virtual function react():

  class GenericReactor {
    virtual void react(const Event& ev) = 0;
  }

Let’s define what we will refer to as ‘infrastructure code’: a piece of code which calls GenericReactor’s react(). Quite often this call will be within a so-called ‘event loop’ (Listing 1).

std::unique_ptr<GenericReactor> r
  = reactorFactory.createReactor(...);
while(true) { //event loop
  Event ev = get_event();
    //from select(), libuv, ...
  r->react(ev);
}
			
Listing 1

Let’s note that the get_event() function can obtain events from wherever we want; anything from select() (which is quite typical for servers) to libraries such as libuv (which is common for clients).

Also let’s note that an event loop, such as the one above, is certainly not the only way to call react(): I’ve seen implementations of infrastructure code ranging from one running multiple (Re)Actors within the same thread, to another which deserialized (Re)Actor from DB, then called react() and then serialized (Re)Actor back to a database. What’s important, though, is that even if react() can be called from different threads, it MUST be called as if it is one single thread (if necessary, all thread sync should be done OUTSIDE of our (Re)Actor, so react() doesn’t need to bother about thread sync regardless of the infrastructure code in use).

Finally, let’s refer to any specific derivative from GenericReactor (which implements our react() function) as a SpecificReactor:

  class SpecificReactor : public GenericReactor {
    void react(const Event& ev) override;
  };

In addition, let’s observe that whenever (Re)Actor needs to communicate with another (Re)Actor – adhering to the ‘Do not communicate by sharing memory; instead, share memory by communicating’ principle – it merely sends a message, and it is only this message which will be shared between (Re)Actors. In turn, this means that we can (and should) use single-threaded allocation for all (Re)Actor purposes – except for allocation of those messages intended for inter-(Re)Actor communications.

Rules to ensure memory safety

With (Re)Actors defined, we can formulate our rules to make our (Re)Actor code (Reactor::react() and all the stuff called from it) perfectly safe.

First, let’s postulate that there are three different types of pointers in our program: ‘owning’ pointers, ‘soft’ pointers, and ‘naked’ pointers.

‘Owning’ pointers delete their contents in destructors, and within our rules, should comply with the following:

‘Soft’ pointers are obtained from ‘owning’ ones. Whenever we’re trying to access an already deleted object via a ‘soft’ pointer (or create a ‘naked’ pointer from a ‘soft’ pointer which points to an already deleted object) – we are guaranteed to get an exception. ‘Soft’ pointers should comply with the following:

‘Naked’ pointers are our usual C-style pointers – and are inherently very dangerous as a result. Apparently, we can still handle them in a safe manner, as long as the following rules are followed:

The most reliable way to enforce the ‘lifetime is never extended’ rule above is to say that all copying of ‘naked’ pointers is prohibited, except for a few well-defined cases:

On the other hand, the following constructs are known to violate the ‘lifetime is never extended’ rule, and are therefore prohibited:

Note that the respective lists of ways to create pointers are exhaustive; in other words: the ONLY way to create an ‘owning’ pointer is from operator new of the same type; the ONLY ways to create a ‘safe’ pointer is (a) from an ‘owning’ pointer of the same base type, or (b) as a copy of a ‘safe’ pointer of the same type; and the ONLY way to create a ‘naked’ pointer is from {‘owning’|‘soft’|‘naked’} pointer as long as the ‘naked’ pointer doesn’t extend the lifetime of the original pointer.

This implies prohibiting casting to pointers (and also prohibits C-style cast and static_cast<> with respect to pointers; however, implicit pointer casts and dynamic_cast<> are ok). Note that although casting from pointers won’t cause memory corruption, it is not a good idea in general.

This also implies that assigning the result of new to anything except an ‘owning’ pointer is prohibited.

Implementations for both ‘owning’ and ‘safe’ pointers should take into account that their methods may be invoked after their destructor is called (see discussion in (*) paragraph below); in this case, we’ll either guarantee that no pointer to a non-existing object will be returned, or (even better) will throw an exception.

Note that for the time being, we do NOT handle collections and arrays; in particular, we have to prohibit indexed dereferencing (a[i] is inherently dangerous unless we’re ensuring boundary checks).

That’s it – we’ve got our perfectly safe dialect of C++, and while it doesn’t deal with arrays or collections, it is a very good foundation for further refinements.

Proof sketch

The formal proof of the program under the rules above is going to be lengthy and, well, formal, but a sketch of such a proof is as follows.

First, let’s note that our rules do NOT allow the creation of any pointers, unless it is a pointer to an existing on-heap object, or an on-stack object (the latter is for ‘naked’ pointers only). NB: if we also want to deal with globals, this is trivial too, but for the time being let’s prohibit globals within (Re)Actors, which is good practice anyway.

As a result, there is no risk of the pointer pointing somewhere where there was never an object, and the only risks we’re facing are about the pointers to objects which did exist but don’t exist anymore. We have two types of such objects: on-stack objects, and on-heap ones.

For on-stack objects which don’t exist anymore:

For on-heap objects which don’t exist anymore:

(*) Note that via ‘naked’ pointers, we are still able to access objects which have already had their destructors called (but memory unreleased); this means that to ensure safety, those objects from supporting libraries which don’t follow the rules above themselves (in particular, collections) must ensure that their destructors leave the object in a ‘safe’ state (at least with no ‘dangling’ pointers left behind; more formally: there should be a firm guarantee that any operation over a destructed object cannot possibly cause memory corruption or return a pointer which is not a nullptr, though ideally it should cause an exception).

Phew. Unless I’m mistaken somewhere, it seems that we got our perfectly safe dialect of C++ (without collections, that is).

Enter collections

[Enter Romeo and Juliet]

Romeo: Speak your mind. You are as worried as the sum of yourself

and the difference between my small smooth hamster and my nose.

Speak your mind!

Juliet: Speak YOUR mind! You are as bad as Hamlet!

You are as small as the difference between the square of the difference

between my little pony and your big hairy hound

and the cube of your sorry little codpiece. Speak your mind!

[Exit Romeo]

~ Program in The Shakespeare Programming Language

As noted above, collections (including arrays) are not covered by our original rules above. However, it is relatively easy to add them, by adding a few additional rules with regards to collections.

First, we will NOT use the usual iterators (including pointers within arrays); instead, we’re using ‘safe iterators’. A ‘safe iterator’ (or ‘safe range’) is a tuple/struct/class/… which contains:

The second rule about collections is that all the access to the collections (including iterator dereferencing) MUST be written in a way which guarantees safety.

For example, if we’re trying to access an element of the array via our ‘safe iterator’, it is the job of the operator* of our ‘safe iterator’ to ensure that it stays within the array (and to throw an exception otherwise).

This is certainly possible:

This way, whenever we want to use such a ‘safe iterator’/‘safe range’, first we’ll reach the collection (relying on our usual safety guarantees for our {'owning'|'soft'|'naked'} pointers), and then the collection itself will guarantee that its own iterator is valid before dereferencing it.

Different approaches to safety in infrastructure code and Reactor code

20% of people consume 80% of beer~ Pareto principle as applied to beer consumption

An observation (*) above, as well as the discussion about optimized collections, highlights one important property of our Perfectly Safe Reactors:

we can (and often SHOULD) have different approaches to safety of the Reactor::react() and the rest of the code.

This dichotomy between infrastructure code and Reactor code is actually very important in practice.

Infrastructure code (including supporting libraries such as collections, etc.) is:

In contrast, (Re)Actor code:

As a result, we can observe that for small, never-changing, and performance-critical Infrastructure Code, it is both feasible and desirable to provide safe highly-optimized versions (which may or may not follow our rules above in the name of performance). On the other hand, for (Re)Actor Code, formal safety is usually much more important than bare performance. This is especially so as, in the case of our rules, the expected performance hit is pretty much negligible: the only two runtime checks we’re doing happen at ‘safe’ pointer to ‘naked’ pointer conversion (or at ‘safe’ pointer dereferencing), and at collection accesses; neither of them is expected to be noticeable (except in some very performance-critical code).

Generalizing this point further, we can split our code base into a small performance critical part (which we’ll handle without our safety rules, but which is small enough to be scrutinized in a less formal manner), and a large performance-agnostic part (which we’ll handle according to the safety rules above); however, in practice, these lines will be usually very close to the lines between Infrastructure Code and (Re)Actor Code.

One important thing to keep in mind when writing those Infrastructure objects which are intended to be called from (Re)Actors is ensuring that they’re safe even after their destructor is called (as discussed in the (*) paragraph above). On the other hand, if our object follows our safety rules above, this will be achieved automagically.

All our rules are very local, which enables automated checks

One further very important property of our safety rules is that

they’re very local.

Indeed, all the rules above can be validated within the scope of one single function. In other words, it is possible to find whether our function f() is compliant with our safety rules using function f() and only function f().

This not only allows for simple code reviews, but also means that this process can be automated relatively easily. Implementing such a tool is a different story (and it is still going to take a while) but is perfectly feasible (well, as long as we find a tool to parse C++ and get some kind of AST, but these days at least Clang does provide this kind of functionality).

As soon as such an automated check tool is implemented, development will become a breeze:

While this is not a real ‘silver bullet’ (nothing really is – in fact, the safety of theoretically safe languages also hinges on the safety of their compilers and standard libraries), this approach is expected to improve memory safety of the common business-level code by orders of magnitude (and even if your code is already perfectly safe, this approach will provide all the necessary peace of mind with regards to safety).

Conclusion

That’s pretty much it – we DID get a perfectly usable C++ dialect which is also 100% safe against memory corruption and against memory leaks. BTW, if necessary our approach can easily be extended to a more flexible model which relies on semantics similar to that of std:shared_ptr<> and std::weak_ptr<>; while I am not a fan of reference-counted semantics (from my experience, reference counting causes much more trouble than it is worth – and simplistic ‘owning’ pointers are more straightforward and are perfectly usable for millions of LOC projects) – it is perfectly feasible to implement shared ownership along the same lines as discussed above; the only substantial twist on this way is that as std::shared_ptr<> (unlike our model above) does allow for circular references and resulting memory leaks, we will probably need to detect them (which can be done, for example, by running some kind of incremental garbage collection at those points where we’re waiting for the input, sitting outside of Reactor::react()).

Phew. BTW, as the whole thing is quite complicated, please make sure to email me if you find any problem with the approach above (while I’m sure that it is possible to achieve safety along the lines discussed above, C++ is complicated enough we might need another restriction or two on this method).

References

[Loganberry04] David ‘Loganberry’, Frithaes! – an Introduction to Colloquial Lapine!, http://bitsnbobstones.watershipdown.org/lapine/overview.html

[NoBugs17] ‘No Bugs’ Hare, ‘Allocator for (Re)Actors with Optional Kinda-Safety and Relocation’, Overload #139, Jun 2017

Acknowledgement

Cartoon by Sergey Gordeev from Gordeev Animation Graphics, Prague

Notes: 

More fields may be available via dynamicdata ..