    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Code Critique Competition 115</title>
        <link>https://members.accu.org/index.php/articles/2606</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + Student Code Critiques from CVu journal. + CVu Journal Vol 30, #6 - January 2019</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c65/">Programming</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c184/">Journal Columns</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c183/">Code Critique</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c394/">306</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c65-183-394/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c65+183+394/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Code Critique Competition 115</h1>
<p><strong>Author:</strong>&nbsp;Bob Schmidt</p>
<p>
<strong>Date:</strong> 03 January 2019 16:05:04 +00:00 or Thu, 03 January 2019 16:05:04 +00:00</p>
<p><strong>Summary:</strong>&nbsp;Set and collated by Roger Orr. A book prize is awarded for the best entry.</p>
<p><strong>Body:</strong>&nbsp;<p class="EditorIntro">Please note that participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.</p>

<p class="EditorIntro">Note: if you would rather not have your critique visible online please inform me. (Email addresses are not publicly visible.)</p>

<h2>Last issueâ€™s code</h2>

<p class="blockquote">I am trying to write a simple test for some C++17 code, but I am getting some unexpected output. I have simplified the code quite a bit, and in the resultant code below I was expecting to see the output &quot;A,B&quot; but on the latest version of both gcc and MSVC I get just &quot;AB&quot;. Where has my comma gone?! Even more oddly, if I use an older compiler I get different output: &quot;65,66&quot;, which gives me back the comma but loses the letters. Has C++17 broken something?</p>

<p>The code is in Listing 1.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#include &lt;iostream&gt;
#include &lt;string&gt;
#include &lt;vector&gt;
template &lt;typename T&gt;
using coll = std::vector&lt;T&gt;;
// Start the collection
template &lt;typename T&gt;
void start(coll&lt;T&gt; &amp;up)
{
  up.clear();
}
// End the collection
template &lt;typename T&gt;
void end(coll&lt;T&gt; &amp;up)
{
  up.push_back({});
}
// Extract the non-zero data as a string
template &lt;typename T&gt;
std::string data(coll&lt;T&gt; const &amp;up)
{
  std::string result;
  for (auto v : up)
  {
    if (v)
    {
      result += std::to_string(v);
      result += &quot;,&quot;;
    }
  }
  result.pop_back();
  return result;
}
void test_char()
{
  auto i2 = coll&lt;char&gt;();
  start(i2);
  i2.push_back('A');
  i2.push_back('B');
  end(i2);
  // actual test code elided
  std::cout &lt;&lt; data(i2) &lt;&lt; '\n';
}
int main()
{
  test_char();
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 1</td>
	</tr>
</table>

<h2>Critiques</h2>

<h3>Balog Pal</h3>

<p>The code looks simple enough so I try a pure reading, no compiler solution. Letâ€™s scan the code first:</p>

<ul>
	<li><code>coll</code> is defined as an alias for <code>vector&lt;&gt;</code> for no visible reason, but that may be part of simplification</li>
	
	<li>a function called â€˜<code>end</code>â€™ is looking for trouble, clashing with <code>begin() </code>and <code>end()</code>. Also the name makes no sense for the semantics, might be <code>finish()</code></li>
	
	<li>inside <code>end</code>, a more optimal form of insertion would be <code>emplace_back()</code></li>
	
	<li>another nonsense function name: â€˜<code>data</code>â€™</li>
	
	<li>implementation iterates by value instead of <code>auto const&amp;</code>, that is okay with the <code>char</code> type, but suboptimal for a general <code>T</code></li>
	
	<li><code>T</code> must support <code>if(v)</code> â€“ this works for <code>char</code> but might be bad, no information was provided on the intended <code>T</code> for the code</li>
	
	<li>I have to look up <code>std::to_string</code> on cppreference: it appears we can use it for numeric types and will produce string-ized version of the value. Like &quot;65&quot; for input of 'A'.</li>
	
	<li>if empty <code>v</code> is meant as a sentinel, Iâ€™d expect a break on <code>else</code></li>
	
	<li><code>pop_back</code> will be bugged if <code>up</code> was <code>empty</code> (but not executed this way by the test)</li>
	
	<li>the test starts with an <code>auto X =</code> ...; form that seems to be gaining popularity lately, Iâ€™m still undecided if it is good or bad. The traditional form is certainly a direct <code>init: coll&lt;char&gt; i2;</code></li>
</ul>

<p>And nothing else interesting, so I would expect &quot;65,66&quot; as result. And the description states it is &quot;AB&quot;? Hmm, that looks like we asked for <code>i2.data()</code>. Maybe we added another free function <code>data()</code> to the company of <code>begin</code>/<code>end</code>, <code>size</code> and friends? A look in cppreference shows that has been the case since C++17.</p>

<p>I guess telling that to the OP would result in the question: how come that is relevant to me, I never said <code>std::data</code>, not even used using namespace <code>std</code>. And we are in global scope. Shouldnâ€™t <code>::data</code> be called? </p>

<p>Well, for some interpretations of â€˜shouldâ€™, it should be alright. It seems we have stumbled on another case of computers work on commands instead of wishes. At first I thought of just passing over the issue stating that name lookup and overload resolution rules are not for ordinary humans to explain, but this particular case is not THAT complicated, and even I could figure it out without extra reading (of course ONLY after the fact/accident &#9786;) The explanation is deliberately simplified for this case. Anyone who wants the full run-down can read it at <a href="https://en.cppreference.com/w/cpp/language/overload_resolution">https://en.cppreference.com/w/cpp/language/overload_resolution</a> and the name lookup link at its beginning.</p>

<p>The first thing to explain is how <code>std::data</code> gets considered at all. That is a thing that ordinary programmers should be aware of really: we have names visible from the current scope plus any associated namespaces related by arguments. So the latter, argument-dependent lookup drags in <code>std::</code> because our argument is in reality <code>std::vector</code>. That we used it through an alias does not change its origin. If <code>coll</code> was a class template that just reproduced <code>vector</code> or subclassed it, we would not fall into this trap.</p>

<p>While we are at it, letâ€™s mention that this only applies to unqualified lookup, if spelled <code>::data()</code> in the call, again we would be calling our function.</p>

<p>The next step is more interesting: why is <code>std::data</code> better than <code>::data</code>? The compiler could still pick ours or declare it ambiguous. The rules state an ordering of candidates to find a best fit based on fewer conversions or more specialization. And at first glance ours looks winning, as <code>coll&lt;T&gt;</code> is more specialized than just <code>&lt;T&gt;</code>. But again the language drops another rule on us that has proven unintuitive and misleading to programmers in many other cases. <code>std::data</code> has not one but two overloads matching our case, one taking <code>T&amp;</code> and one taking <code>T const&amp;</code>. And our ammo, <code>i2</code>, is â€˜lvalue of type <code>coll&lt;char&gt;</code>â€™. What I usually just think of in <code>decltype</code> terms as â€˜<code>coll&lt;char&gt; &amp;</code>â€™. certainly can be bound to a <code>const&amp;</code>, but that involves a qualification conversion. If all we had to select from were the <code>std::data()</code> functions, the non-const one is used. And with <code>::data</code>, there is no such version, so that will lose the race. We would not have the problem if <code>i2</code> was const. </p>

<p>So what we get is what we ordered. Did C++17 â€˜break somethingâ€™? That is up to interpretation. The language is not supposed to make subtle changes to well-written code without making some noise. And the presented code in not obviously badly written. Strictly speaking it was 1) open to ADL from the start and 2) the standard stated from the start that it can add names to <code>std::</code> at any time. And it does. Though I didnâ€™t find a promise, the names used by the standard are all-lowercase. For practice the industry relies on that, and expects no clash from mixed-case names.</p>

<p>This code used all-lowercase names and really poor ones too. Iâ€™d rather go with the other wisdom of â€œBad names will come and bite you in the ass. If not today, then tomorrow.â€ Here it took all the way to C++17, but it happened. (Letâ€™s just kindly overlook the multitude of elements required to create the perfect storm, even be careful to assemble a fine zero-terminated string in <code>i2</code> to avoid UB.... &#9786;</p>

<p>The one thing I could not figure out is why the OP expected to see letters &quot;A&quot; and &quot;B&quot; from this, but Iâ€™m happy to leave it as mystery.</p>

<h3>Alastair Harrison</h3>

<p>Given the unexpected behaviour of the program, a good approach is to try stepping through the <code>data</code> function with a debugger. Perhaps surprisingly, this shows that the <code>data</code> function turns out to never be executed. Neither does the <code>end</code> function. In fact, you can totally remove them from the program and it will still compile and run without complaint on C++17.</p>

<p>The reason is Argument Dependent Lookup (ADL) <a href="#[1]">[1]</a>. ADL is a mechanism which allows the compiler to find free functions that live in other namespaces, without the calling code needing to specify that namespace. The example below shows why thatâ€™s (usually) helpful. The <code>accu::print</code> function is effectively treated as part of the public interface of the <code>Widget</code> class. This can be very useful when writing generic template functions when we donâ€™t know up-front which namespace the arguments live in. But sometimes ADL can unexpectedly call the wrong function, as is the case in the original program.</p>

<pre class="programlisting">
  namespace accu {
    class widget {};
    void print(widget const&amp;);
  }
  void foo(accu::widget const&amp; w) {
    ...
    // ADL allows compiler to find the
    // accu::print function, even though it's in
    // a different namespace.
    print(w);
    ...
  }</pre>
  
<p>In C++11, the <code>std::end</code> template function was introduced. You pass in a reference to an STL container, and it will return an iterator to the end of the element range. Itâ€™s an unconstrained template (ie. accepts any argument type) and ADL is choosing it instead of the <code>template &lt;typename T&gt; void end(coll&lt;T&gt; &amp;up)</code> template function that weâ€™ve defined in the global namespace.</p>

<p>In the C++ Core Guidelines, rule T.47 <a href="#[2]">[2]</a> tells us to â€œ<em>Avoid highly visible unconstrained templates with common names</em>â€ because often ADL will select them in preference to the function you might expect. The text points out that the standard library violates the rule widely. And thatâ€™s exactly what is causing problems for us now.</p>

<p>Since <code>std::end</code> doesnâ€™t mutate the container, the effect of the <code>end(i2)</code> call in the <code>test_char</code> function is to do absolutely nothing. Thus the container will be left holding only the letters <code>A</code> and <code>B</code>, without a terminating character.</p>

<p>But it gets better, because the C++17 standard introduced another unconstrained template function called <code>std::data</code>. It accepts a reference to a container and returns a pointer to the block of memory where the containerâ€™s elements are stored. So when the <code>test_char</code> function calls <code>data(i2)</code>, ADL is actually selecting the <code>std::data</code> template, which returns a <code>char*</code> pointer to the contents of the <code>i2</code> container. Thatâ€™s then being passed to the <code>char const*</code> overload of <code>operator&lt;&lt;</code>, which treats the input as if itâ€™s a null-terminated character array.</p>

<p>As we discussed above, our container doesnâ€™t actually contain a null terminator. But luckily for us, the vector... probably allocated more than two bytes of storage, and itâ€™s... probably filled with zeros that look like null terminators. So the program prints the string &quot;AB&quot; and quickly finishes before anyone notices that itâ€™s just caused some Undefined Behaviour.</p>

<p>If youâ€™re curious, you can call <code>i2.reserve(2);</code> immediately after instantiating the container, to limit the memory allocated. Then compile the code using Clangâ€™s address sanitizer <a href="#[3]">[3]</a>. The program will now blow up impressively when it tries to print to <code>std::cout</code>, whizzes right off the end of the memory allocated by the vector and plunges into the nether regions of the free store.</p>

<p>The obvious solution for the ADL problems is to rename the free functions to give them less common names. Another option is to create a strong type to implement the custom container, rather than simply making it a typedef for <code>std::vector</code>.</p>

<p>So whatâ€™s happening on older compilers? Under C++14, the <code>std::data</code> template doesnâ€™t exist, so the <code>data</code> template in the program gets used as expected, and prints some comma-separated values. Unfortunately <code>std::to_string</code> doesnâ€™t have an overload for <code>char</code>, so it ends up performing an implicit conversion to some other integral type, then returns a textual representation of the integer value as a <code>std::string</code>.</p>

<p>Sadly weâ€™re not done quite yet, as thereâ€™s another bug lurking in the <code>data</code> function. When the input container is empty or contains only null elements the <code>result</code> string wonâ€™t have any characters appended to it. The <code>result.pop_back()</code> call will then fail (with more of the dreaded Undefined Behaviour) as weâ€™ve violated its pre-condition that the string is non-empty. This is the sort of thing that unit-tests should be exercising.</p>

<h4>Other improvements</h4>

<p>Because of the code authorâ€™s simplification efforts, itâ€™s not clear how the original code is intended to be used (especially given the alarmingly terse variable names). If you delimit multiple sequences by calling <code>end</code> after each one then the <code>data</code> function wonâ€™t format them properly; itâ€™ll just remove the comma between subsets. If instead the aim is to have only a single sequence then the <code>start</code> and <code>end</code> calls are superfluous. Worse, thereâ€™s nothing to stop someone calling <code>end</code> multiple times and no mechanism to prevent elements from being inserted after a call to <code>end</code>.</p>

<p>We also need to talk about the use of a default-constructed element as the end-delimiter. Though this seems to work tolerably well when the element type is <code>char</code>, it could be a total disaster for <code>int</code>, because the caller could legitimately want to place a zero value into their container. A more robust design would ensure that the <strong>only</strong> way to insert an end delimiter is via an explicit API call. It shouldnâ€™t be possible to do it accidentally.</p>

<p>For the multiple-sequence case a better design could involve using another container as the element type, as shown below. It requires no delimiters, so the question of delimiter representations is completely avoided, as is the need for calls to <code>start</code> and <code>end</code>.</p>

<pre class="programlisting">
  template &lt;typename T&gt;
  using subset = std::vector&lt;T&gt;;
  
  template &lt;typename T&gt;
  using coll = std::vector&lt;subset&lt;T&gt;&gt;;
  
  // Used like this
  auto i2 = coll&lt;char&gt;();
  i2.push_back({'A', 'B'});
  i2.push_back({'C', 'D', 'E'});</pre>
  
<p>Finally, the <code>data</code> function in the original code uses a fairly awkward loop to print a comma-delimited list of elements. It could also be quite inefficient for non-trivial element types, as the <code>range</code>-<code>for</code> loop copies every element. An alternative would be to use an <code>ostream_iterator</code>, as shown below. If the thought of a trailing comma horrifies you, then you can use a custom iterator such as Jerry Coffinâ€™s <code>infix_iterator </code><a href="#[4]">[4]</a>. Or wait for <code>std::ostream_joiner</code> to be standardised.</p>

<pre class="programlisting">
  template &lt;typename InputIter&gt;
  std::string comma_delimit(InputIter first,
    InputIter last) {
    std::stringstream ss;
    std::copy(first, last,
      std::ostream_iterator&lt;char&gt;(ss, &quot;,&quot;));
    return ss.str();
  }</pre>
  
<h4>References</h4>

<p class="bibliomixed"><a id="[1]"></a>[1]	Argument-dependent lookup in cppreference.com:<a href="https://en.cppreference.com/w/cpp/language/adl">https://en.cppreference.com/w/cpp/language/adl</a></p>

<p class="bibliomixed"><a id="[2]"></a>[2]	C++ Core Guidelines: <a href="http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#t47-avoid-highly-visible-unconstrained-templates-with-common-names">http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#t47-avoid-highly-visible-unconstrained-templates-with-common-names</a></p>

<p class="bibliomixed"><a id="[3]"></a>[3]	AddressSanitizer in the Clang 8 documentation:<a href="https://clang.llvm.org/docs/AddressSanitizer.html">https://clang.llvm.org/docs/AddressSanitizer.html</a>)</p>

<p class="bibliomixed"><a id="[4]"></a>[4]	c++ infix_iterator code on Code Review Stack Exchange:<a href="https://codereview.stackexchange.com/questions/13176/infix-iterator-code">https://codereview.stackexchange.com/questions/13176/infix-iterator-code</a></p>

<h3>James Holland</h3>

<p>At first sight, the behaviour of the running code, as reported by the student, seems impossible. Fortunately, a little investigation reveals what is really going on. The solution to the puzzle centres on the call to <code>data()</code> in the output statement of <code>test_char()</code>. It would be reasonable to assume that the function <code>data()</code> defined in the studentâ€™s code is being called but this is not the case. In fact it is <code>std::vector::data()</code> that is being called.</p>

<p>One of the mechanisms that contribute to determine which function is invoked (when there is more than one candidate) is called Argument Dependent Look-up (ADL). The argument of <code>data()</code>, <code>i2</code>, is of type <code>std::vector&lt;char&gt;</code> and by means of a fairly complicated set of ADL rules, the compiler determines that <code>std::vector::data()</code> should be called in preference to version of <code>data()</code> defined in the studentâ€™s code sample.</p>

<p>The function <code>std::vector::data()</code> returns a pointer to the data contained within the vector <code>i2</code> which, in our case, is an array of <code>char</code>s. The array is then displayed by <code>std::cout</code>. It is fortunate that <code>end()</code> is called before being printed to the screen. The function <code>end()</code> adds an element to the end of the vector, <code>i2</code>, that has a value of zero. Without calling <code>end()</code>, the array of characters printed would not be null-terminated and so random characters would be printed after &quot;AB&quot; until the character with a value of zero was chanced upon. This also explains why there is no comma in the output, <code>std::vector::data()</code> simply prints the characters in the vector without any intervening characters.</p>

<p>The student noted that with older compilers, the printed result is &quot;65,66&quot;. This is because <code>std::vector::data()</code> was only introduced in C++11. When using a pre-C++11 compiler, ADL would not have found a function named <code>std::vector::data()</code> and so the studentâ€™s version of <code>data()</code> would have been called.</p>

<p>Unfortunately, there is a â€˜featureâ€™ of the studentâ€™s <code>data()</code> function that converts the characters to be printed to the ASCII representation. The call to <code>std::to_string()</code> performs this transformation. The ASCII representation of 'A' and 'B' is &quot;65&quot; and &quot;66&quot; respectively. The statement containing <code>std::to_string()</code> should be replaced by <code>result += v</code>, or something equivalent.</p>

<p>There are several ways to ensure the studentâ€™s version of <code>data()</code> is called. One way is to qualify the call of the function <code>data()</code> with the scope resolution operator, <code>::</code>. Doing this will disable the ADL mechanism and force the compiler to look in the global namespace for the function where it will find the correct version of the function. Another way is to call the required function with an explicit template parameter, i.e. <code>data&lt;char&gt;()</code>. Perhaps the safest way is to rename the function to something that is not in the <code>std::vector</code> namespace and is more specific to the application at hand.</p>

<p>The problems described above illustrates that an addition to the C++ standard can break existing code. In this case, the addition of <code>std::vector::data()</code> in C++11 was the cause of the trouble. It is important that when code is compiled with a new or different compiler, the software is retested before release.</p>

<h3>Marcel MarrÃ© and Jan Eisenhauer</h3>

<p>There are two aspects to the code for Code Critique 114. One is the behaviour on compilers that do not support C++ 17, and the other is the behaviour under the current standard. Finally, there are some general comments on the code.</p>

<h4>Behaviour pre-C++ 17</h4>

<p>Type <code>char</code> is actually an integer type. Therefore, when <code>result += std::to_string(v);</code> is executed, the number in <code>v</code>, which is the ASCII code of the character, is turned into a string and added to the result. Hence, the result is 65,66. Leaving out <code>std::to_string</code> might work, i.e. <code>result += v;</code>. Alternatively, using <code>std::string::append</code> certainly should work with type <code>char</code>: <code>result.append(v);</code>. Of course, the comma works as expected between the output of the ASCII codes of the characters in the vector.</p>

<h4>Behaviour under C++ 17</h4>

<p>The problem here is a combination of factors â€˜conspiringâ€™ to hide the userâ€™s <code>data()</code> function for the call made at the end of <code>test_char()</code>.</p>

<p>The new standard provides a function template <code>std::data()</code> with a very similar signature to that the user has defined. Due to argument dependent lookup, this is in scope for the call in question. Argument dependent lookup means that because the collection <code>i2</code> is in fact a <code>std::vector</code>, functions in the containing namespace, <code>std</code>, are valid candidates.</p>

<p>Since <code>std::data()</code> also offers a template for non-const parameters â€“  <code>template &lt;class C&gt; constexpr auto data(C&amp; c) -&gt; decltype(c.data());</code>, see <a href="https://en.cppreference.com/w/cpp/iterator/data">https://en.cppreference.com/w/cpp/iterator/data</a>, overload (1) â€“  this is a better fit than the userâ€™s own definition of <code>data()</code>.</p>

<p>The <code>std::data()</code> function invokes the containerâ€™s own <code>data()</code>, which in the case of a <code>std::vector&lt;char&gt;</code> returns &quot;AB&quot; as a single string.</p>

<p>This problem may also hold for <code>end()</code>; while there is no <code>std::start()</code> I am aware of, a non-member <code>std::end()</code> exists to get the iterator one past the end of a collection. In this example, the code works even if <code>end()</code> does not add a default-constructed item and the iterator returned by <code>std::end()</code> is ignored, but it does constitute a â€˜code smellâ€™.</p>

<p>To avoid such problems, it is a good idea to follow the advice of the (forthcoming) Standard Library Compatibility Guidelines (see <a href="https://isocpp.org/std/standing-documents/sd-8-standard-library-compatibility">https://isocpp.org/std/standing-documents/sd-8-standard-library-compatibility</a>). At CppCon 2018, Titus Winters introduced these in his talk â€˜Standard Library Compatibility Guidelines (SD-8)â€™ (see <a href="https://www.youtube.com/watch?v=BWvSSsKCiAw">https://www.youtube.com/watch?v=BWvSSsKCiAw</a>). Pertinent to this code is the fact that the standard library reserves the right to add new functions, so if you have any functions that can potentially take data types from the std-namespace as parameters, these should be put into a namespace and the namespace specified explicitly in the call to avoid problems.</p>

<h4>General comments on the code</h4>

<p>In its current form, <code>coll</code>, <code>start()</code> and <code>end()</code> are not a proper abstraction, because to add an item, the user has to use the underlying containerâ€™s support for adding items. Furthermore, it is assumed that <code>operator bool() const</code> of type <code>T</code> is equivalent to a function that would best be called <code>isPrintable</code> or <code>is_printable</code> (although snake case is discouraged, see again the aforementioned talk by Titus Winters).</p>

<p>This could lead to bugs, such as if a type has an <code>operator bool() const</code> that does not yield false for a default constructed object, so the assumed end marker (which is not required anyway, because the range-based for-loop prevents overstepping the vector bounds anyway) would be printed, adding an item that was added by the userâ€™s <code>end()</code> function, rather than an explicitly added item.</p>

<p>The other way around, there might be types where some values yield true, but <code>isPrintable</code> should really be false for them. And, in fact, if <code>char</code>s were not treated as numbers, <code>char</code> is such a type with many of the ASCII-values below 32 representing non-printable characters like bell and end of text.</p>

<h3>Jim Segrave</h3>

<p>This one was interesting â€“ I tried it with g++ 4.85 and clang++ 3.4.2 (the default Centos 7 distributions) and got the <code>65,66</code> output as described in the problem statement. I then switched to c++17 compilers (g++ 7.31 and clang++ 5.01) and got the <code>AB</code> output.</p>

<p>The <code>65,66</code> output is a normal result, thereâ€™s no reason to use <code>to_string()</code> if you want the ASCII character. Changing the line</p>

<pre class="programlisting">
  result += std::to_string(v);</pre>
  
<p>to</p>

<pre class="programlisting">
  result += v;</pre>
  
<p>fixed the <code>65,66</code> problem when compiled with the option <code>-std=c++11</code>.</p>

<p>To identify the cause of the <code>AB</code> output when using the <code>-std=c++17</code> option, I tried placing some output statements within the <code>data()</code> function, because I could not see why it wouldnâ€™t at least output values separated by commas. I discovered that, when using c++17 as a standard, the function was never called. Checking with the nm utility showed it was in the compiled executable with the expected return type and parameter list. Two quick experiments told me it was time to dig deeper into the changes from c++11 to c++17.</p>

<p>Changing the line:</p>

<pre class="programlisting">
  std::cout &lt;&lt; data(i2) &lt;&lt; '\n';</pre>
  
<p>to</p>

<pre class="programlisting">
  std::cout &lt;&lt; xdata(i2) &lt;&lt; '\n';</pre>
  
<p>caused the function to be invoked. It also gets invoked if that line is changed to:</p>

<pre class="programlisting">
  std::cout &lt;&lt; data&lt;char&gt;(i2) &lt;&lt; '\n';</pre>
  
<p>This tells me that the compiler isnâ€™t seeing <code>data(i2)</code> as a function call, but something else entirely. I thought it was time to look deeper still into the c++17 changes and discovered this (described in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4280.pdf) which details changes to allow getting the size of a container, testing if the container is empty and getting a pointer of the appropriate type to the containerâ€™s data all as <code>constexpr</code>â€™s rather than requiring a member function call on the container.</p>

<p>From the above document, the standard now adds:</p>

<p class="blockquote"><code>template &lt;class C&gt; constexpr auto </code></p>

<p class="blockquote"><code>data(const C&amp; c) -&gt; decltype(c.data());</code></p>

<p class="blockquote">Returns: <code>c.data()</code>.</p>

<p>In other words, the line in <code>test_char</code>:</p>

<pre class="programlisting">
  std::cout &lt;&lt; data(i2) &lt;&lt; '\n';</pre>
  

<p>does not call the function <code>data()</code>, it simply replaces <code>data(i2)</code> with a pointer to the data in the vector <code>i2</code> (equivalent to <code>i2.data()</code>). This is a pointer to <code>char</code>, pointing to <code>i2[0]</code>, <code>i2</code>â€™s values are stored contiguously in memory and, luckily, the sample program pushes an empty value onto the vector (a NUL char), so <code>cout</code> outputs the contents of <code>i2</code> in order as a C-string. Itâ€™s not clear to me that in the absence of the call to <code>end()</code> which appends a NUL to <code>i2</code> that you might not get a core dump or a collection of output for locations not a part of <code>i2</code>â€™s data at all.</p>

<p>As a side note, perhaps itâ€™s just my own personal prejudice, but the method of building a string of values separated by commas is better done by always outputting a separator and the next value. At the start, the separator is set to a null string, after outputting a value, the separator is changed to the actual string to separate the outputs:</p>

<pre class="programlisting">
  const char *sep = &quot;&quot;;
  std::string res;
  for (auto v: i2) {
    res += sep;
    res += v;
    sep = [&lt;CodeComment&gt;
whatever you want as a separator&lt;/CodeComment&gt;
]
  }</pre>
  
<p>The advantages of this method are that the string being built is always ready for use, it contains values separated as desired with no separator before the first value or after the last one. If you need to use the string while itâ€™s still being built, itâ€™s ready for that. It also means that if you decide to change the separator from a single character <code>','</code> for example, to include a space as well: <code>&quot;, &quot;</code>, you only change one line of code, you donâ€™t need to add a second <code>pop_back()</code> call. The separator could even be passed as a pointer to <code>const char</code>.</p>

<p>My submitted code makes this change, it will call <code>test_char()</code> once with the default separator, giving output <code>A,B</code>, then call it again with a separator of <code>&quot;--&gt;&quot;</code>, giving an output of <code>A--&gt;B</code>.</p>

<pre class="programlisting">
  #include &lt;iostream&gt;
  #include &lt;string&gt;
  #include &lt;vector&gt;
  template &lt;typename T&gt;
  using coll = std::vector&lt;T&gt;;
  // Start the collection
  template &lt;typename T&gt;
  void start(coll&lt;T&gt; &amp;up) {
    up.clear();
  }
  // End the collection
  template &lt;typename T&gt;
  void end(coll&lt;T&gt; &amp;up) {
    up.push_back({});
  }
  // Extract the non-zero data as a string
  template &lt;typename T&gt;
  std::string data(coll&lt;T&gt; const &amp;up,
    const char *separator) {
    std::string result;
    const char *sep = &quot;&quot;;
    for (auto v : up) {
      if (v) {
        result += sep;
        result += v;
        sep = separator;
      }
    }
    return result;
  }
  void test_char(const char *separator = &quot;,&quot;) {
    auto i2 = coll&lt;char&gt;();
    start(i2);
    i2.push_back('A');
    i2.push_back('B');
    end(i2);
    // actual test code elided
    std::cout &lt;&lt; data&lt;char&gt;(i2, separator)
      &lt;&lt; '\n';
  }
  int main() {
    test_char();
    test_char(&quot;--&gt;&quot;);
  }</pre>

<h2>Commentary</h2>

<p>This issueâ€™s critique was inspired by recent discussions and example code demonstrating how code could be broken by the addition of new functions into namespace <code>std</code>, including cases where many programmers would not be expecting trouble.</p>

<p>The second angle in this critique is the dual role of <code>char</code> in C++ to represent characters and as an arithmetic type. When a <code>char</code> argument is passed to <code>std::to_string()</code>, the best overload is selected using the <em>integral promotion</em> rules. This will typically select the <code>int</code> overload of <code>std::to_string()</code> (although it could select the <code>unsigned</code> overload on some platforms.)</p>

<p>Note, too, the final subtlety of the rules for overload resolution and ADL in that the call to <code>end() </code>picks the locally defined function template while the call to <code>data()</code> picks the one found through ADL â€“ this is because the of the different <code>const</code> qualification of the argument in the two functions.</p>

<h2>The winner of Code Critique 114</h2>

<p>The critiques between them seemed to give good coverage of the issues in the code. There were a variety of ways used to explain the name lookup issue that resulted in finding an â€˜unexpectedâ€™ overload of <code>data()</code> â€“ this is the sort of explanation that is much easier face-to-face as you can quickly adjust the explanation to the degree of understanding of the recipient!</p>

<p>There are several ways to avoid the unfortunate name lookup â€“ itâ€™s good to know more than one technique as the best one to use depends on the circumstances of the case and James provided three. Alastair pointed out that avoiding the use of a simple alias for <code>std::vector</code> side-steps the problem here, Marcel and Janâ€™s critique suggested putting the function into its own namespace, and multiple critiques mentioned renaming the function and using the scope resolution operator. </p>

<p>Alastair reported that <code>end()</code> was not called â€“ which puzzled me since it <em>should</em> be called in all versions of C++. However, he was correct that it can be removed as this then allows overload resolution to pick the less specialized <em>end()</em> function from namespace <code>std</code>. (Adding a suitable <code>static_assert</code> to the <code>end()</code> function template is one way of demonstrating whether or not the template is used.) This led him slightly astray â€“ but into an interesting discussion about whether or not the â€˜one past the endâ€™ character would be zero.</p>

<p>Pal pointed out a couple of details in the implementation where there might be optimisation opportunities: one of the challenges with writing templates in C++ is to try and consider the possible types that might be used in instantiations and ensure that good choices are made for the general case, such as avoiding unnecessary copies.</p>

<p>Jim gave a useful side-note about delimited string, or alternatively you might like one of the iterator solutions provided by Alastair.</p>

<p>Both Alastairâ€™s and Marcel and Janâ€™s critiques also gave some useful discussion about the design issues with the code as presented (this is always a bit of a challenge given the simplified nature of the critiques presented in this column).</p>

<p>Taking all the points into consideration, I decided to award the prize for this issue to Marcel and Janâ€™s critique.</p>

<h2>Postscript to Code Critique 113</h2>

<p>Balog Pal writes:</p>

<p>I almost wrote my entry at first sight â€“ it was so perfect a bait â€“ just one of my solutions didnâ€™t compile and I held backâ€¦ 2 months is plenty of time, so obviously I realized what I forgot 2 days after the deadline. I then thought still to post it as a late entry, but expected plenty of submissions with similar points, so thought it better to wait and only post if something was missed.</p>

<p>And there is such a point: there was much talk about undefined behavior and what to expect under specific ABIs, yet the first critique never mentioned it. Some may even believe that if an implementation specifies how arguments are passed and they are placed as we would like them, the example would be correct. It is not.</p>

<p>The expression <code>&amp;first + 1 + sizeof...(rest)</code> has defined behavior only when the last part is 0. Let me quote the standard at [epr.add] on p4:</p>

<p class="blockquote">When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements<sup>86</sup>, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 _ i + j _ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i - j] if 0 _ i - j _ n; otherwise, the behavior is undefined.</p>

<p class="blockquote"><sup>86)</sup> An object that is not an array element is considered to belong to a single-element array for this purpose; see 7.6.2.1. A pointer past the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical element x[n] for this purpose; see 6.7.2.</p>

<p>Now, <code>first</code> is a solo object, so the pointer <code>&amp;first</code> is good for the addition of 0 and 1. Anything beyond that is undefined behavior.</p>

<p>IME, itâ€™s a point too often overlooked. Probably because it happens to â€˜workâ€™ on most current flat-memory platforms. Those who remember the old DOS or Windows 16-bit time where you had <code>__huge</code> pointers and such a pointer had a 16-bit paragraph or selector part and a 16-bit offset part. And using a wild + (or especially -) that looked outside an â€˜objectâ€™ could easily produce broken result or even a direct crash.</p>

<h2>Code Critique 115</h2>

<p>(Submissions to scc@accu.org by Feb 1st)</p>

<p class="blockquote">I am trying to write a generic â€˜fieldâ€™ that holds a typed value, optionally with a â€˜dimensionâ€™ such as â€˜kgâ€™. One use case is to be able to hold attributes of various different types in a standard collection. I want to be able to compare two attributes to see if theyâ€™re the same (and maybe later to sort them?) â€“ two simple fields are only comparable if they are the same underlying type and value and two â€˜dimensionedâ€™ fields need to be using the same dimension too. Iâ€™ve put something together, but Iâ€™m not quite sure why in my little test program I get true for comparing house number and height:</p>

<pre class="programlisting">
    Does weight == height? false
    Does weight == house number? false
    Does house number == height? true
    Re-reading attributes
    Unchanged
    Reading new attributes
    some values were updated</pre>
	
<p class="blockquote">Can you help with the problem reported, and point out some potential flaws in the direction being taken and/or alternative ways to approach this problem?</p>

<p>The code is in Listing 2 (<span class="filename">field.h</span>) and Listing 3 (<span class="filename">field test.cpp</span>).</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#pragma once 
#include &lt;memory&gt;
#include &lt;string&gt;
// Base class of the 'field' hierarchy
class field
{
protected:
  virtual bool
  equality(field const &amp;rhs) const = 0;
public:
  // convert any field value to a string
  virtual std::string to_string() const = 0;

  // try to compare any pair of fields
  friend bool operator==(field const &amp;lhs,
    field const &amp;rhs)
  {
    return lhs.equality(rhs);
  }
};
// Holds a polymorphic field value
struct field_holder : std::unique_ptr&lt;field&gt;
{
  using std::unique_ptr&lt;field&gt;::unique_ptr;
  bool
  operator==(field_holder const &amp;rhs) const
  {
    return **this == *rhs;
  }
};
// Implementation in terms of an underlying
// representation
template &lt;typename Rep&gt;
class field_t : public field
{
public:
  field_t(Rep const &amp;t) : value_(t) {}
  std::string to_string() const override
  {
    return std::to_string(value_);
  }
protected:
  bool equality(field const &amp;f) const override
  {
    // compare iff of the same type
    auto p =
      dynamic_cast&lt;field_t const *&gt;(&amp;f);
    if (p)
      return value_ == p-&gt;value_;
    else
      return false;
  }
private:
  Rep value_;
};
// Add a dimension
#include &lt;typeinfo&gt;
template &lt;typename dimension,
  typename Rep = int&gt;
class typed_field : public field_t&lt;Rep&gt;
{
public:
  using field_t&lt;Rep&gt;::field_t;
  std::string to_string() const override
  {
    return field_t&lt;Rep&gt;::to_string()
      + dimension::name();
  }
protected:
  bool equality(field const &amp;f) const override
  {
    // compare iff the _same_ dimension
    return typeid(*this) == typeid(f) &amp;&amp;
      field_t&lt;Rep&gt;::equality(f);
  }
};
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 2</td>
	</tr>
</table>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#include &quot;field.h&quot;
#include &lt;iostream&gt;
#include &lt;map&gt;
// Example use
struct tag_kg{ static const char *name()
  { return &quot;kg&quot;; } };
struct tag_m{ static const char *name()
  { return &quot;m&quot;; } };
using kg = typed_field&lt;tag_kg&gt;;
using metre = typed_field&lt;tag_m&gt;;
using attribute_map = std::map&lt;std::string,
  field_holder&gt;;
// Dummy method for this test program
attribute_map read_attributes(int value)
{
  attribute_map result;
  result[&quot;weight&quot;]
    = std::make_unique&lt;kg&gt;(value);
  result[&quot;height&quot;]
    = std::make_unique&lt;metre&gt;(value);
  result[&quot;house number&quot;]
    = std::make_unique&lt;field_t&lt;int&gt;&gt;(value);
  return result;
}
int main()
{
  auto old_attributes = read_attributes(130);
  // height and weight aren't comparable
  std::cout &lt;&lt; &quot;Does weight == height? &quot;
    &lt;&lt; std::boolalpha &lt;&lt; 
     (old_attributes[&quot;weight&quot;] ==
      old_attributes[&quot;height&quot;])
    &lt;&lt; '\n';
  // weight and house number aren't comparable
  std::cout &lt;&lt; &quot;Does weight == house number? &quot;
    &lt;&lt; std::boolalpha &lt;&lt; 
     (old_attributes[&quot;weight&quot;] ==
      old_attributes[&quot;house number&quot;])
    &lt;&lt; '\n';
  // house number and height aren't comparable
  std::cout &lt;&lt; &quot;Does house number == height? &quot;
    &lt;&lt; std::boolalpha &lt;&lt; 
     (old_attributes[&quot;house number&quot;] ==
      old_attributes[&quot;height&quot;])
    &lt;&lt; '\n';
  std::cout &lt;&lt; &quot;Re-reading attributes\n&quot;;     
  auto new_attributes = read_attributes(130);
  if (old_attributes == new_attributes)
  {
    std::cout &lt;&lt; &quot;Unchanged\n&quot;;
  }
  else
  {
    // update ...
    std::cout &lt;&lt; &quot;Some values were updated\n&quot;;
  }

  std::cout &lt;&lt; &quot;Reading new attributes\n&quot;;
  new_attributes = read_attributes(140);
  if (old_attributes == new_attributes)
  {
    std::cout &lt;&lt; &quot;Unchanged\n&quot;;
  }
  else
  {
    // update ...
    std::cout &lt;&lt; &quot;some values were updated\n&quot;;
  }
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 3</td>
	</tr>
</table>

<p>You can also get the current problem from the accu-general mail list (next entry is posted around the last issueâ€™s deadline) or from the ACCU website (<a href="http://accu.org/index.php/journal">http://accu.org/index.php/journal</a>).</p>

<p>This particularly helps overseas members who typically get the magazine much later than members in the UK and Europe.</p>

<p class="bio"><span class="author"><b>Roger Orr</b></span>  Roger has been programming for over 20 years, most recently in C++ and Java for various investment banks in Canary Wharf and the City. He joined ACCU in 1999 and the BSI C++ panel in 2002.</p>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
