Journal Articles

CVu Journal Vol 27, #5 - November 2015 + Programming Topics
Browse in : All > Journals > CVu > 275 (10)
All > Topics > Programming (877)
Any of these categories - All of these categories

Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.

Title: Functional Programming in C++

Author: Martin Moene

Date: 05 November 2015 09:07:10 +00:00 or Thu, 05 November 2015 09:07:10 +00:00

Summary: Richard Falconer reports on an ACCU talk by Kevlin Henney.

Body: 

A common belief is that functional programming is constrained to the realm of Haskell and Lisp. On 14th September, Oxford Asset Management hosted a talk by Kevlin Henney showing that this is not the case, and that you can in fact follow functional programming paradigms in C++.

If we had a time machine the first thing any of us would do is go back in time and make const the default modifier. Well, maybe not the first thing. After all one must spare a thought to how the C++11 committee would deal with re-purposing the original auto as a result of such temporal meddling. The point, Kevlin explains, is pure functional programming has no side-effects.

When it is not necessary to change, it is necessary not to change
~ Lucius Cary

To move towards functional programming is to move towards a stateless world, a world where moving parts of a system should be isolated and data flows in one direction through classes. ‘Values’ are a core concept of functional programming; rather than objects with a stateful identity these are objects with unchanging values. Kevlin gives the canonical example of a poorly-written Date class (see Listing 1, based on slides 37 and 39).

class date
{
public:
  date(int year, int month, int day);
  int get_year() const;
  int get_month() const;
  int get_day() const;
  int set_year(int);
  int set_month(int);
  int set_day(int);
void set(int year, int month, int day);
private:
  ...
};
auto today = date(2015, 9, 15);
today.set_day(16);
			
Listing 1

Here it’s clear that by calling the set methods you can leave the object in an invalid state; 30th Feb never occurs unless you’re PHP (which thinks Feb 30th is basically the same as March 2nd [1]).

Instead, Kevlin points out, you should rebind state via the assignment operator rather than permuting the existing object (see Listing 2, based on slide 40).

class date
{ 
public:
  date(int year, int month, int day);
  int year() const;
  int month() const;
  int day() const;
};
auto today = date(2015, 9, 15);
today = date(2015, 9, 16);
			
Listing 2

Now this is a value-semantic class whose instances can be passed around and referenced without worrying about race conditions or other threads modifying our object through C++’s many aliasing mechanisms.

This isn’t to say that we should do away with all state, but minimising it allows you to reason about the interaction between functions more easily; there’s no need to keep your mental buffer busy tracking side-effects and then lose 20 minutes work when someone comes to your desk to ask you if you received the email they just sent.

Now that we have stateless classes (or at least, classes whose state does not change from the perception of the class API boundary) we can start to build by composition. A core requirement for functional programming is to have functions as first-class citizens; that is, functions as things you can return and accept as arguments. This function-passing allows complex behaviour to be composed by expressing ideas in terms of combinations of other functions. This composition is made syntactically easier through the use of lambdas, which are now available in C++11. (C# has boasted true lambda support since v3.0 2007, but everyone is too polite to mention they’ve been around as a concept since Alonzo Church’s 1932 journal [2] in Annals of Mathematics).

Evaluation of our functions objects does not depend on mutable state. Expressions created by composing such objects are said to be referentially-transparent; the expression could be replaced by its value without changing the behaviour of the program [3].

Asking a question should not change the answer
(Nobody tell Heisenberg.)

This is all very well for simple value types like Date, but what about something less trivial?

Persistent Data Structures [4] (not to be confused with a persistent storage) are effectively immutable (see Listing 3, based on slide 56).


template<typename T>
class vector
{
public:
  typedef const T * iterator;
  ...
  bool empty() const;
  std::size_t size() const;
  iterator begin() const;
  iterator end() const;
  const T & operator[](std::size_t) const;
  const T & front() const;
  const T & back() const;
  const T * data() const;
  vector popped_front() const;
  vector popped_back() const;
private:
  ...
  iterator from, until;
};
			
Listing 3

Note popped_front/back are const. They return the view of the data with those operations applied: “what would you look like with the front popped”. The underlying data is unpermuted, and any aliases to the original full vector can still resolve. You can then have many views on the same data, with no need to copy the full vector. (The vector returned from popped_front is just the original vector with the internal iterator modified to point to the 2nd element).

This all greatly simplifies the addition of threads. When people think ‘Threads’, they think ‘Locks’, yet locks just make your thread wait, and all computers wait at the same speed. Even shared_ptr introduces interlocked increments/decrements for all its reference counting operations, but letting threads lose on a code base without sufficient const-correctness is tantamount to releasing a pack of rabid dogs through your code. Or has results similar to the times when you introduce a void* and tell the compiler “hold my beer and watch this”.

Some people, when confronted with a problem, think “I know, I’ll use threads”, and then havthey e two prbolems.

With our immutable value-types and generous use of const this is all a lot easier.

The solution is to compose from scratch without thinking about locks, and thus be in a situation where your data is immutable or unshared, or both. Kevlin recalls a consultancy job where introduction of one-way data-flow and proper composition reduced the total number of locks used by the system from 30,000 to 6.

Kevlin closes with some thoughts on memory management, which was a problem apparently plaguing England even in the time of Shakespeare’s plays:

Hamlet: From the table of my memory I'll wipe away all trivial fond records.

Clearly Hamlet is a garbage-collection fan.

Ophelia: ’Tis in my memory locked, and you yourself shall keep the key of it.

Whereas Ophelia prefers reference counting.

Memory management is especially significant here because of the different consumers sharing the same state. Consider a referentially-transparent list class such as Listing 4 (from slide 63).


template<typename T>
class list
{
public:
  class iterator;
...
std::size_t size() const;
  iterator begin() const;
  iterator end() const;
  const T & front() const;
  list popped_front() const;
  list pushed_front() const;
private:
  struct link
  {
    link(const T & value,
         std::shared_ptr<link> next);
    T value;
    std::shared_ptr<link> next;
  };
  std::shared_ptr<link> head;
  std::size_t length;
};
			
Listing 4

This is much the same concept as the vector example but with all the operations on front instead. Multiple lists are built up to form an inverted tree structure, with the root node in the tree being the last element of all lists:

Kevlin demonstrated this with some entertaining audience participation, with Nigel doing a good job of representing the dangling pointer.

The problems start when a list’s destructor runs on the leaf node of a long chain. If the reference count of all nodes in that chain are just 1 they too must run their own destructors:

  {
    list<anything> chain;
    std::fill_n(std::front_inserter(chain),
       how_many, something);
  } // What happens when we reach this brace?

So the destructors now run recursively, blowing up the stack very quickly (even with small lists in the order of thousands of elements). As a result we have to fallback to garbage-collection when working with these referentially-transparent structures. Notably the Standard allows for garbage collection, but that is a subject for another day.

Our thanks goes out to Oxford Asset Management (especially Charlie, Charlotte, and Tom) for hosting us at their great venue and for providing the buffet. Around 50 people attended; a new ACCU Oxford record.

More information

Kevlin’s slides: http://www.slideshare.net/Kevlin/functional-c

ACCU Oxford Meetup page: http://www.meetup.com/ACCU-Oxford/events/223715720/

Any marks referenced herein are property of their respective owners and used without permission but on a good faith basis that such use herein is Fair Use. No claim of ownership or licensure of said marks is made herein. Please see the respective owners for more information regarding said marks.

References

[1] <?php echo date("M-d-Y", strtotime('2015-02-30'));, although checkdate(2015,02,30) does at least return false.

[2] A. Church, ‘A set of postulates for the foundation of logic’, Annals of Mathematics, Series 2, 33:346–366 (1932).

[3] Referential transparency – https://en.wikipedia.org/wiki/Referential_transparency_%28computer_science%29

[4] Persistent Data Structures – https://en.wikipedia.org/wiki/Persistent_data_structure

Notes: 

More fields may be available via dynamicdata ..