In Overload 25 I looked at counted_ptr<type>. Specifically I used it to implement a string class, which was able to share common state.
namespace accu
{
    class string
    {
    public:
        string(const char *literal = "");
        char &operator[](size_t index);
        ...
    private:
        struct body;
        counted_ptr<body> ptr;
    };
}
...
Towards the end of that article I hinted at a problem with counted_ptr<type> that remained unresolved...
using accu::string;
string theory("hello");
string vest(theory);
theory[0] = 'J';
cout << vest << endl; // must print "hello" and not "Jello"
The basic problem is that sometimes you have to stop sharing data. What should happen when you attempt to change the shared data? There is no right or wrong answer. It depends on the context. It depends why you are sharing. You have to decide whether the sharing is an externally visible feature or an internal implementation detail. For my fledgling string class I was sharing state because it gave lazy evaluation. It saved me having to make a deep copy of the state during a copy construction or a copy assignment. In other words it was an internal implementation detail.
There are a number of ways to solve this problem. In case you haven't got the previous article (it was a while ago) and because it's not the focus of this article I'll unwrap the counted pointer. Ok, to get things started the first thing to notice is that we can make use of const overloading:
namespace accu
{
    class string
    {
    public:
        string(const char *literal = "");
        ...
        char &operator[](size_t index);
        const char &operator[](size_t index) const;
        ...
    private:
        char *text;
        size_t *count;
    };
}
An alternative version of the const array-subscript operator could return a plain char (by value). There is not much to choose between the two, but you might consider using the one that gives the clearest (least obscure) error message when misused (if they differ). The implementation of the const version is simple and does not need to make a deep copy since the data cannot change. The implementation of the non-const version is not so simple since it may need to make a deep copy.
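To see which of the two overloads the compiler picks, here is a minimal sketch. The probe class and its last_call member are hypothetical names of my own, not part of the string class. The point it shows is that selection depends only on the constness of the object the operator is applied to, not on whether the result is then read or written.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of accu::string: it records which
// overload of operator[] was selected by the most recent call.
class probe
{
public:
    enum which { none, non_const_version, const_version };

    probe() : last_call(none) { text[0] = 'a'; text[1] = '\0'; }

    // Selected whenever the object expression is non-const.
    char &operator[](std::size_t index)
    {
        last_call = non_const_version;
        return text[index];
    }

    // Selected whenever the object expression is const.
    const char &operator[](std::size_t index) const
    {
        last_call = const_version;
        return text[index];
    }

    mutable which last_call;   // mutable so the const overload can record itself

private:
    char text[2];
};
```

Reading an element of a non-const probe still selects the non-const overload; only access through a const object or a const reference selects the const one.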
namespace accu // string
{
    string::string(const char *literal)
        : text(new char[strlen(literal)+1]), count(new size_t(1))
    {
        strcpy(text, literal);
    }

    char &string::operator[](size_t index)
    {
        bounds_check(index);
        unshare_state();
        return text[index];
    }
}
And string::unshare_state() will need to ensure its state is not shared by any other string objects. There are a number of details to remember when writing unshare_state. Firstly there is no need to make a deep copy if the state is already unshared, in other words if the reference count is one. Secondly, if a deep copy is required then a new reference count will also need to be allocated. That's two allocations inside a single function. Ensuring such a function is exception safe can be tricky...
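namespace accu
{
    void string::unshare_state()
    {
        if (*count != 1)
        {
            // new_count owns the first allocation, so it is
            // released if the second allocation throws
            auto_ptr<size_t> new_count(new size_t(1));
            char *new_text = new char[strlen(text)+1];
            strcpy(new_text, text);
            // nothing from here on can throw
            --*count;
            count = new_count.release();
            text = new_text;
        }
    }
}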
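For readers who want something they can compile, here is a minimal self-contained sketch of the unwrapped string, under a few assumptions of my own. std::unique_ptr stands in for auto_ptr (which has since been removed from the language), c_str() and shares_state_with() are hypothetical helpers added only so the sharing can be observed, and bounds checking is left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

namespace accu
{
    class string
    {
    public:
        string(const char *literal = "") : text(0), count(0)
        {
            // Guard the first allocation in case the second one throws.
            std::unique_ptr<std::size_t> new_count(new std::size_t(1));
            text = new char[std::strlen(literal) + 1];
            std::strcpy(text, literal);
            count = new_count.release();
        }
        string(const string &other) : text(other.text), count(other.count)
        {
            ++*count;                       // share the state, no deep copy
        }
        string &operator=(const string &rhs)
        {
            ++*rhs.count;                   // increment first: safe for self-assignment
            release();
            text = rhs.text;
            count = rhs.count;
            return *this;
        }
        ~string() { release(); }

        char &operator[](std::size_t index)
        {
            unshare_state();                // might be a write, so stop sharing
            return text[index];
        }
        const char &operator[](std::size_t index) const
        {
            return text[index];             // read only, sharing is harmless
        }

        // Hypothetical helpers, only here so the sharing can be observed.
        const char *c_str() const { return text; }
        bool shares_state_with(const string &other) const
        {
            return text == other.text;
        }

    private:
        void release()
        {
            if (--*count == 0) { delete[] text; delete count; }
        }
        void unshare_state()
        {
            if (*count != 1)
            {
                std::unique_ptr<std::size_t> new_count(new std::size_t(1));
                char *new_text = new char[std::strlen(text) + 1];
                std::strcpy(new_text, text);
                --*count;                   // nothing from here on can throw
                count = new_count.release();
                text = new_text;
            }
        }

        char *text;
        std::size_t *count;
    };
}
```

Copying a string and then writing through operator[] on one of the copies leaves the other copy untouched, and the two no longer share state afterwards.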
With this in place we can now check that the following example cases are well-behaved:
string writeable("hello");
string another(writeable);
writeable[0] = 'J';        // 0
cout << writeable << endl; // 1
cout << another << endl;   // 2
const string readonly("Pat");
readonly[0] = 'C';         // 3
Line 1 should print "Jello" because writeable is modified in line 0. Line 2 should print "hello". Line 3 should give a compiler error. Fine. But now consider a read through the same operator:
writeable[0] = 'J';           // 0
cout << writeable[0] << endl; // 4
The point to note is that a read access of a modifiable string will cause string::unshare_state() to be invoked. That's a pity, seeing as the statement on line 4 is not modifying the string. Many of you will have read Jim Coplien's book Advanced C++ Programming Styles and Idioms and will know a way of solving this. The trick is not to return a "real" char reference from the non-const version of string::operator[] but to return something that looks like, acts like, feels like, smells like and behaves like a char reference. A proxy. I'll call it char_reference[1].
namespace accu
{
    class string
    {
    public: // types
        class char_reference
        {
        public:
            ...
            void operator=(char new_ch);
            operator char() const;
            ...
        private:
            ...
        };

        char_reference operator[](size_t index);
        const char &operator[](size_t index) const;
        ...
    private:
        ...
    };
}
I've made the char_reference assignment operator a void function for simplicity of exposition. The return type is not the focus of this article. With this sleight of hand in place we can look at lines 0 and 4. Here's line 0 with the peel removed bit by bit...
writeable[0] = 'J';
writeable.operator[](0) = 'J';
writeable.operator[](0).operator=('J');
And here's the expression in line 4 with the peel removed bit by bit...
writeable[0]
writeable.operator[](0)
writeable.operator[](0).operator char()
Fine, but I'm going to explore this a little further: in particular, the implementation of the two char_reference methods. A first attempt might look like this (implemented inline to save space):
class char_reference
{
public:
    char_reference(char &it) : ch(it) {}
    void operator=(char new_ch) { ch = new_ch; }
    operator char() const { return ch; }
private:
    char &ch;
};
However, this is flawed. Once again we need to remember to unshare the string state when it is being modified. Changing an element of a string is something that should be done by a method of string. Here's another attempt:
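The flaw is easy to reproduce. The following is only a sketch under my own assumptions: the namespace flawed and the str() helper are hypothetical, and a std::shared_ptr<std::string> stands in for the hand-rolled text and count pair so that the compiler-generated copy constructor shares the state. Because this operator[] hands the proxy a raw char reference without unsharing, a write through the proxy lands in the shared state.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

namespace flawed   // hypothetical namespace: this version is deliberately wrong
{
    class string
    {
    public:
        class char_reference
        {
        public:
            char_reference(char &it) : ch(it) {}
            void operator=(char new_ch) { ch = new_ch; }   // writes, never unshares
            operator char() const { return ch; }
        private:
            char &ch;
        };

        // Assumption: shared_ptr<std::string> replaces text/count.
        string(const char *literal = "")
            : body(std::make_shared<std::string>(literal)) {}

        // Reads are cheap because nothing is unshared here, but the proxy
        // has no way to ask the string to unshare later when it is assigned to.
        char_reference operator[](std::size_t index)
        {
            return char_reference((*body)[index]);
        }

        std::string str() const { return *body; }          // hypothetical helper

    private:
        std::shared_ptr<std::string> body;
    };
}
```

With two strings sharing one body, assigning through the proxy on either of them changes what both of them print, which is exactly the behaviour we set out to prevent.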
class char_reference
{
public:
    char_reference(string &s, size_t index);
    void operator=(char new_char);
    operator char() const;
private:
    string &s;
    size_t index;
};
This leaves the interesting question of what methods of string the methods of char_reference should delegate to. The conversion operator can be implemented like this (in a conforming compiler):
string::char_reference::operator char() const
{
    const string &ro = s;
    return ro[index];
}
Note that s must be used as a read only string reference, to avoid infinite recursion. But, how can the assignment operator be implemented? Not like this, because we're back to infinite recursion again.
void string::char_reference::operator=(char new_ch)
{
    s[index] = new_ch;
}
One way to solve this is to create a new string method. For example:
void string::assign(size_t index, char new_ch)
{
    bounds_check(index);
    unshare_state();
    text[index] = new_ch;
}
The question is whether to make string::assign public or private[2]. There are conflicting forces. On the one hand you might want to make it private, viewing it as an implementation detail. You might also want to make it private so that a string client has just one syntax for assignment. But, how does char_reference gain access to this private method? The usual solution is as a friend.
namespace accu
{
    class string
    {
    public: // types
        class char_reference
        {
        public:
            ...
            void operator=(char new_ch)
            {
                s.assign(index, new_ch);
            }
        };
        ...
    private:
        friend class char_reference;
        void assign(size_t index, char new_ch);
        ...
    };
}
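Putting the pieces together, here is a compilable sketch of the friend-based version. As before it rests on assumptions of my own: a std::shared_ptr<std::string> stands in for the text and count members, str() and shares_state_with() are hypothetical helpers for observing the result, and bounds checking is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

namespace accu
{
    class string
    {
    public: // types
        class char_reference
        {
        public:
            char_reference(string &target, std::size_t at)
                : s(target), index(at) {}

            // A write: delegate to the private primitive, which unshares.
            void operator=(char new_ch) { s.assign(index, new_ch); }

            // A read: go through a const reference so that the const
            // operator[] is selected and nothing is unshared.
            operator char() const
            {
                const string &ro = s;
                return ro[index];
            }
        private:
            string &s;
            std::size_t index;
        };

        string(const char *literal = "")
            : body(std::make_shared<std::string>(literal)) {}

        char_reference operator[](std::size_t index)
        {
            return char_reference(*this, index);
        }
        const char &operator[](std::size_t index) const
        {
            return (*body)[index];
        }

        // Hypothetical helpers, only here so the sharing can be observed.
        std::string str() const { return *body; }
        bool shares_state_with(const string &other) const
        {
            return body == other.body;
        }

    private:
        friend class char_reference;

        void assign(std::size_t index, char new_ch)
        {
            unshare_state();
            (*body)[index] = new_ch;
        }
        void unshare_state()
        {
            if (body.use_count() != 1)
                body = std::make_shared<std::string>(*body);
        }

        std::shared_ptr<std::string> body;   // assumption: replaces text/count
    };
}
```

Reading writeable[0] now leaves the state shared, and only an assignment through the proxy triggers the deep copy.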
On the other hand you might consider that the cure is worse than the symptoms. Granting char_reference total friendship when limited friendship (to assign) was all that was required might be seen as something of a large sledge-hammer cracking a small nut. If this is your view, you'd probably make the primitive public, and accept a choice of assignment syntax.
However, there is an alternative. You can remove the friendship, and make string::assign public but uncallable! Bizarre[3]. The trick is to use an opaque type.
namespace accu
{
    class string
    {
    public: // types
        class char_reference
        {
        public:
            ...
            void operator=(char new_ch);
        };

        char_reference operator[](size_t index);
        ...
    public: // but uncallable!
        struct position; // HERE
        void assign(position index, char new_ch);
    private:
        ...
    };
}
namespace accu
{
    struct string::position
    {
        size_t index;
    };
    ...
    void string::char_reference::operator=(char new_ch)
    {
        position p = { index };
        s.assign(p, new_ch);
    }
    ...
    void string::assign(position pos, char new_ch)
    {
        bounds_check(pos.index);
        unshare_state();
        text[pos.index] = new_ch;
    }
    ...
}
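Here is the opaque type version as one compilable sketch, laid out in a single file to mimic the split between header and source file. The assumptions are mine: client_demo() and str() are hypothetical, a std::shared_ptr<std::string> again stands in for the text and count members, and bounds checking is omitted. The client function is deliberately placed before the definition of string::position, so at that point the type is incomplete, just as it would be for code that only includes the header.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// ---- What a client sees (this part would live in the header) ----
namespace accu
{
    class string
    {
    public: // types
        class char_reference
        {
        public:
            char_reference(string &target, std::size_t at);
            void operator=(char new_ch);
            operator char() const;
        private:
            string &s;
            std::size_t index;
        };

        string(const char *literal = "");
        char_reference operator[](std::size_t index);
        const char &operator[](std::size_t index) const;
        std::string str() const;             // hypothetical helper

    public: // but uncallable by clients
        struct position;                     // declared here, defined elsewhere
        void assign(position pos, char new_ch);

    private:
        void unshare_state();
        std::shared_ptr<std::string> body;   // assumption: replaces text/count
    };
}

// Hypothetical client code. position is an incomplete type at this point,
// so the client cannot create one and therefore cannot call assign.
std::string client_demo()
{
    accu::string s("hello");
    // accu::string::position p = { 0 };    // would not compile: incomplete type
    s[0] = 'J';                              // the one assignment syntax on offer
    return s.str();
}

// ---- Implementation (this part would live in the .cpp file) ----
namespace accu
{
    struct string::position { std::size_t index; };

    string::string(const char *literal)
        : body(std::make_shared<std::string>(literal)) {}

    string::char_reference::char_reference(string &target, std::size_t at)
        : s(target), index(at) {}

    void string::char_reference::operator=(char new_ch)
    {
        position p = { index };              // position is complete here
        s.assign(p, new_ch);
    }

    string::char_reference::operator char() const
    {
        const string &ro = s;                // select the const operator[]
        return ro[index];
    }

    string::char_reference string::operator[](std::size_t index)
    {
        return char_reference(*this, index);
    }

    const char &string::operator[](std::size_t index) const
    {
        return (*body)[index];
    }

    std::string string::str() const { return *body; }

    void string::assign(position pos, char new_ch)
    {
        unshare_state();
        (*body)[pos.index] = new_ch;
    }

    void string::unshare_state()
    {
        if (body.use_count() != 1)
            body = std::make_shared<std::string>(*body);
    }
}
```

No friendship is granted anywhere. The only code able to build a position, and so the only code able to call assign, is code that can see the definition of string::position.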
That's all for now. Cheers
[1] Note that char_reference will also be valuable when implementing string::iterator::operator*()
[2] There was recently a long thread on ACCU.general that essentially boiled down to this.
[3] There is also another solution. It is possible to grant limited friendship. Mark Radford showed me how. Perhaps I'll cover that in another article.
Overload Journal #29 - Dec 1998 + Programming Topics