    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Code Critique Competition 117</title>
        <link>https://members.accu.org/index.php/journals/2651</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 31, #2 - May 2019 + Student Code Critiques from CVu journal.</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c398/">312</a>
                    (7)
<br />

                                            <a href="https://members.accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c184/">Journal Columns</a>

                     &gt;                         <a href="https://members.accu.org/index.php/journals/c183/">Code Critique</a>
                    (70)
<br />

                                            <a href="https://members.accu.org/index.php/journals/c398-183/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/journals/c398+183/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Code Critique Competition 117</h1>
<p><strong>Author:</strong>&nbsp;Bob Schmidt</p>
<p>
<strong>Date:</strong> 03 May 2019 23:24:43 +01:00 or Fri, 03 May 2019 23:24:43 +01:00</p>
<p><strong>Summary:</strong>&nbsp;Set and collated by Roger Orr. A book prize is awarded for the best entry.</p>
<p><strong>Body:</strong>&nbsp;<p class="EditorIntro">Please note that participation in this competition is open to all members, whether novice or expert. Readers are also encouraged to comment on published entries, and to supply their own possible code samples for the competition (in any common programming language) to scc@accu.org.</p>

<p class="EditorIntro">Note: if you would rather not have your critique visible online please inform me. (Email addresses are not publicly visible.)</p>

<h2>Last issueâ€™s code</h2>

<p class="blockquote">Iâ€™m trying to extract the count of letters in a file and while the program itself seems to work OK, Iâ€™m not getting the debugging output I expected. I was expecting to see a debug line of each letter being checked during the final loop, but I only get some of the letters shown.</p>

<p class="blockquote">For example, using <span class="filename">sample.txt</span> that consists of â€œThis is a some sample letter inputâ€ I get:</p>

<pre class="programlisting">
    letter sample.txt
    Seen T
    Seen h
    Seen n
    Seen o
    Seen r
    Seen u
    Commonest letter: e (4)
    Rarest letter: T (1)</pre>
	
<p class="blockquote">Why are only <em>some</em> of the letters being shown as <code>Seen</code>?</p>

<p>Can you solve the problem â€“ and then give some further advice?</p>

<p>Listing 1 contains  <span class="filename">logging.h</span> and Listing 2 is <span class="filename">letter.cpp</span>.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#include &lt;iostream&gt;
namespace
{
  class null_stream_buf
  : public std::streambuf
  {
  } nullbuf;
  std::ostream null(&amp;nullbuf);
}
#ifdef NDEBUG
#define WRITE_IF(X) if (false) null
#else
#define WRITE_IF(X) ((X == true) ? std::cout 
  : null)
#endif
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 1</td>
	</tr>
</table>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#include &quot;logging.h&quot;
#include &lt;fstream&gt;
#include &lt;map&gt;

// get the most and least popular chars
// in a file
int main(int argc, char **argv)
{
  WRITE_IF(argc &lt; 2)
    &lt;&lt; &quot;No arguments provided&quot;;
  if (argc &gt; 1)
  {
    std::ifstream ifs(argv[1]);
    std::map&lt;char, int&gt; letters;
    char ch;
    while (ifs &gt;&gt; ch)
      ++letters[ch];
     
    std::pair&lt;char, int&gt; most =
      *letters.begin();
    std::pair&lt;char, int&gt; least =
      *letters.begin();
    for (auto pair : letters)
    {
      WRITE_IF(pair.second) &lt;&lt; &quot;Seen &quot;
        &lt;&lt; pair.first &lt;&lt; std::endl;
      if (pair.second &gt; most.second)
      {
        most = pair;
      }
      else if (pair.second &lt; least.second)
      {
        least = pair;
      }
    }
    std::cout &lt;&lt; &quot;Commonest letter: &quot;
      &lt;&lt; most.first &lt;&lt; &quot; (&quot;
      &lt;&lt; most.second &lt;&lt; &quot;)\n&quot;;
    if (most != least)
      std::cout &lt;&lt; &quot;Rarest letter: &quot;
        &lt;&lt; least.first &lt;&lt; &quot; (&quot;
        &lt;&lt; least.second &lt;&lt; &quot;)\n&quot;;
  }
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 2</td>
	</tr>
</table>

<h2>Critique</h2>

<h3>Hans Vredeveld</h3>

<p>I can see a couple of issues with this code, one of them resulting in only part of the letters being shown as <code>Seen</code>. Letâ€™s go through the code starting with <span class="filename">logging.h</span>.</p>

<p>The first thing we encounter is a class <code>null_stream_buf</code> and variables <code>nullbuf</code> and <code>null</code> that are defined in an unnamed namespace. The idea behind an unnamed namespace is that you can define things in a translation unit for use in that translation unit only without having to think about whether a name is already used in another translation unit. The compiler will create unique names for them in each translation unit, so there is no ODR violation. The result of this is that we will have as many instances of <code>null</code> and <code>nullbuf</code> in our binary as there are translation units. A waste of space in the binary.</p>

<p>If <code>null_stream_buf</code> were a buffer for some device (or file), the result would be a messy output. Messages written to <code>null</code> in different translation units would end up in different buffers and only when the underlying library code decides to write the buffer to the device would the data be written. With different, independent, buffers, the order the data is written will appear random and the data will typically contain partial messages. This is similar to when two console applications write simultaneously to the console (or a single log file).</p>

<p>To get the expected result of the messages appearing in the output in the order they were written to the stream, we have to get rid of the unnamed namespace. The class <code>null_stream_buf</code> has to be defined in the global namespace (or in a named namespace), and the variables <code>nullbuf</code> and <code>null</code> have to be declared <code>extern</code> in the global namespace (or in a named namespace). Also,  <code>nullbuf</code> and <code>null</code> each have to be defined in just one implementation file.</p>

<p>Now we have only one output stream <code>null</code> and one buffer <code>nullbuf</code> in our application, but there is still an issue with <code>null_steam_buf</code>. It doesnâ€™t override any of the virtual functions of <code>std::streambuf</code>. In particular, it doesnâ€™t override <code>int_type overflow(int_type ch)</code> in <code>std::streambuf</code>. The default implementation in <code>std::streambuf</code> always returns end-of-file. As a result, as soon as a single character is written to <code>null</code>, the fail-bit and bad-bit will be set. Basically, <code>null</code> will tell us that itâ€™s full and canâ€™t accept anything any more, but we happily continue to stream data to it. For a null stream, this is very strange. We expect it to always accept data and just discard it. To get this behaviour, we have to implement a public override for the overflow member function:</p>

<pre class="programlisting">
  int_type overflow(int_type ch) { return ch; }</pre>
  
<p>As an aside, note that, while the code is legal C++, clang++ (version 5.0.1) with all warnings enabled warns that <code>nullbuf</code> and <code>null</code> both require global and exit-time destructors. The warnings tell us that we may have a problem when there are multiple global variables in different translation units. We could get rid of these warnings by removing the global variables.</p>

<p>Moving to the next lines in <span class="filename">logging.h</span>, we come to the <code>WRITE_IF</code> macro. When <code>NDEBUG</code> is defined, nothing is logged because of the <code>if (false)</code>. It doesnâ€™t matter what <code>ostream</code> object follows. It could be <code>std::cout</code> instead of <code>null</code>.</p>

<p>When <code>NDEBUG</code> is not defined, compiling with g++ -Wall gives us a warning on <code>WRITE_IF(argc &lt; 2)</code>, suggesting we put parentheses around the argument. The parentheses can best be put around <code>X</code> in the definition of <code>WRITE_IF</code>. The real problem with this macro can be seen in its other use in the application, <code>WRITE_IF(pair.second)</code>. After macro substitution, we have the comparison <code>(pair.second == true)</code>. However, <code>pair.second</code> is of type <code>int</code> and that means that we do an integer comparison and the <code>bool</code> constant <code>true</code> gets promoted to the <code>int</code> 1. The result is that we only print the characters that occur only once in the input file. What we need is a conversion to <code>bool</code> of <code>pair.second</code>, which we get with</p>

<pre class="programlisting">
  #define WRITE_IF(X) ((X) ? std::cout : null)</pre>
  
<p>With this modification, we will stream to <code>std::cout</code> when <code>X</code> evaluates to <code>true</code> and to <code>null</code> when it evaluates to <code>false</code>. But, why stream to a null output stream that discards everything that is streamed to it at all? When we change the definition to</p>

<pre class="programlisting">
  #define WRITE_IF(X) if (X) std::cout</pre>
  
<p>we only send data to a stream when we want to see it. If we now change the definition in the case that <code>NDEBUG</code> is defined to</p>

<pre class="programlisting">
  #define WRITE_IF(X) if (false) std::cout</pre>
  
<p>we can remove the global variables <code>null</code> and <code>nullbuf</code>, and the class <code>null_stream_buf</code>.</p>

<p>Now that we have <code>WRITE_IF</code>â€™s definition right, letâ€™s have a quick look at the statements where it is used in <span class="filename">letter.cpp</span>. The first time it is used, some text is output without a newline at the end. The second time, the newline is output through <code>std::endl</code>. While I prefer that the logging framework takes care of outputting the newline at the end, <code>WRITE_IF</code> is such a simple logging macro that this is not realistic. In this case it is up to the user to end each statement with outputting a <code>'\n'</code>, but not through <code>std::endl</code>, as Chris Sharpe described so well in <em>Overload</em> 149.</p>

<p>Letâ€™s continue with <span class="filename">letter.cpp</span> and in particular the <code>while</code>-statement that reads the file. I keep on wondering whether this <code>while</code>-statement does what the codeâ€™s author intended. The condition of the <code>while</code>-statement, <code>ifs &gt;&gt; ch</code>, uses formatted input to read a character from the file. The effect of this is that it will skip over whitespace, but not over e.g. punctuation. Do we really want to ignore spaces, but count commas and full stops? As the map containing the data is called letters, I suspect that punctuation should also be ignored. Then an extra check with <code>std::isalpha</code> is in place:</p>

<pre class="programlisting">
  while (ifs &gt;&gt; ch)
    if (isalpha(ch)) ++letters[ch];</pre>
	
<p>Next is the <code>for</code>-loop. I donâ€™t like the use of <code>pair</code> as a variable name, as it can cause confusion with the well-known <code>std::pair</code>. My first suggestion would be to name it <code>letter</code>, but whether that is a good name is also open for debate. Instead we can replace the entire <code>for</code>-loop with a call to <code>std::minmax_element</code>:</p>

<pre class="programlisting">
  auto [least, most] =
   std::minmax_element(letters.begin(), 
     letters.end(),
     [](const auto&amp; a, const auto&amp; b) {
       return a.second &lt; b.second;
     });</pre>
	 
<p>For this to work, we need to <code>#include &lt;algorithm&gt;</code>. With this change we lose the debug output of which letter we have seen, but in my opinion it has also lost its meaning.</p>

<p>Finally a note about the includes. The <code>std::cout</code> in <span class="filename">letter.cpp</span> is only made known through the include of <span class="filename">logging.h</span>. I strongly believe that any identifier used in a file should be declared in the file itself or brought in by an include in the file itself or, for an implementation file, its associated header file. This means that <span class="filename">letter.cpp</span> should <code>#include &lt;iostream&gt;</code> directly. For the same reason I would prefer to <code>#include &lt;utility&gt;</code> for <code>std::pair</code> (not needed any more when using <code>std::minmax_element</code>) in <span class="filename">letter.cpp</span>, and to <code>#include &lt;ostream&gt; </code>and <code>#include &lt;streambuf&gt;</code> in <span class="filename">logging.h</span> for <code>std::ostream</code> and <code>std::streambuf</code> (not needed any more after the rewrite of <code>WRITE_IF</code>).</p>

<h3>James Holland</h3>

<p>There are two features to the software than I thought could be the cause of the problem and, therefore, worth investigating. The first is the <code>null_stream_buff</code> class; the second is the macro <code>WRITE_IF</code>.</p>

<p>Although <code>null_stream_buff</code> may be a somewhat novel mechanism, it appeared to be working as desired, namely to produce no output when called. My attention then turned to the macro. The use of all but the simplest macros is discouraged as they can lead to some very obscure bugs that can be very difficult to diagnose. I was hopeful that the problem with the software debug output had something to do with the use of such macros.</p>

<p>I decided that the best way to proceed was to discover what the source code looked like after being preprocessed. This was accomplished by using the <code>-E</code> compiler switch and observing the resulting output. The statement of interest is the <code>WRITE_IF</code> macro within the <code>for</code>-loop of <code>main()</code> and is shown below.</p>

<pre class="programlisting">
  WRITE_IF(pair.second) &lt;&lt; &quot;Seen &quot;
    &lt;&lt; pair.first &lt;&lt; std::endl;</pre>
	
<p>After preprocessing, the code the compiler sees is as follow.</p>

<pre class="programlisting">
  ((pair.second == true) ? std::cout:null)
    &lt;&lt; &quot;Seen &quot; &lt;&lt; pair.first &lt;&lt; std::endl;</pre>
	
<p>It can now be seen that whether <code>std::cout</code> or <code>null</code> is called depends on the comparison of <code>pair.second</code> with <code>true</code>. The type of <code>pair.second</code> is <code>int</code> and represents the number of occurrences of a particular letter within the text file. The type of <code>true</code> is <code>bool</code>. The problem now boils down to how the C++ compares for equality an <code>int</code> and a <code>bool</code>. From observing the program output, it can be seen that debug information is produced only when the value of <code>pair.second</code> is 1. This suggests that, in effect, the boolean value of <code>true</code> is converted to an <code>int</code> of value 1. The comparison, therefore, consists of comparing the value of <code>pair.second</code> with 1. Output will, therefore, occur only when <code>pair.second</code> is equal to 1. This is consistent with observations.</p>

<p>The above analysis suggests ways of correcting the problem. One way is to convert <code>pair.second</code> to a boolean value with a <code>static_cast</code> before the comparison is made. A value 0 will be converted to <code>false</code> and any other value will be converted to <code>true</code>. Now, <code>false == true</code> has a value of <code>false</code> and <code>null</code> will be called; <code>true == true</code> has a value of <code>true</code> and <code>std::cout</code> will be called. This is as required and is borne out in practice by compiling and running the modified code shown below. </p>

<pre class="programlisting">
  WRITE_IF(static_cast&lt;bool&gt;(pair.second))
    &lt;&lt; &quot;Seen &quot; &lt;&lt; pair.first &lt;&lt; std::endl;</pre>
	
<p>The comparison of a boolean value with <code>true</code> is a redundant operation. This suggests a simpler way of correcting the program. Instead of casting <code>pair.second</code> to <code>bool</code>, the macro defined in <span class="filename">logging.h</span> can be changed from</p>

<pre class="programlisting">
  WRITE_IF(X)((X == true) ? std::cout:null)</pre>
  
<p>to</p>

<pre class="programlisting">
  WRITE_IF(X)((X) ? std::cout:null).</pre>
  
<p>With this arrangement <code>pair.second</code> (an <code>int</code>) will be inherently converted to <code>bool</code> with a value <code>true</code> if <code>pair.second</code> is anything other than 0. This is as required and produces the required result.</p>

<p>Taking a slightly wider view of the program, it would seem preferable to modify the <code>if</code> statement that warns that <code>pair.second</code> is zero. The modification would add an <code>else</code> clause that prints the letter and the number seen. This would remove the need for the macro.</p>

<p>It is always worth considering whether a â€˜hand-rolledâ€™ loop can be replaced with an existing algorithm from the standard library. In the code presented, a <code>for</code>-loop scans through a <code>map</code> to determine the minimum and maximum of <code>map</code>â€™s values. There is a standard library algorithm named <code>minmax_element</code> that sounds as if it could to the same job as the <code>for</code>-loop. With the use of a lambda, as shown below, it can.</p>

<pre class="programlisting">
  const auto [least, most] =
    std::minmax_element(letters.cbegin(), 
      letters.cend(),
      [](const auto &amp; a, const auto &amp; b) {
        return a.second &lt; b.second;
      });</pre>

<p>It should be noted that if more than one minimum or maximum element exists, <code>minmax_element()</code> returns the first minimum and the last maximum element. The elements are stored alphabetically within letters and therefore, in this case, <code>minmax_element()</code> returns <code>T</code> as the rarest and <code>s</code> as the commonest. This is different from the code presented which returns the first commonest, namely <code>e</code>. If the software is to find the first commonest, two other algorithms (both in the standard library) can be used in place of <code>minmax_element()</code> (with a slight loss in efficiency) namely <code>min_element()</code> and <code>max_element()</code>. I have assumed the first commonest letter is to be returned as in the original code resulting in my version of the program shown below.</p>

<pre class="programlisting">
  #include &lt;algorithm&gt;
  #include &lt;fstream&gt;
  #include &lt;iostream&gt;
  #include &lt;map&gt;
  int main(int argc, char **argv)
  {
    if (argc &lt; 2)
    {
      std::cout &lt;&lt; &quot;No arguments provided&quot;;
    }
    else if (argc &gt; 1)
    {
      std::fstream ifs(argv[1]);
      std::map&lt;char, int&gt; letters;
      char ch;
      while (ifs &gt;&gt; ch)
        ++letters[ch];
      const auto compare_function =
        [](const auto &amp; a, const auto &amp; b) {
          return a.second &lt; b.second;
        };
      const auto least =
        std::min_element(letters.cbegin(), 
          letters.cend(), compare_function);
      const auto most =
        std::max_element(letters.cbegin(),
          letters.cend(), compare_function);
      if (most != letters.cend())
      {
        std::cout &lt;&lt; &quot;Commonest letter: &quot; &lt;&lt;
          most-&gt;first &lt;&lt; &quot; (&quot; &lt;&lt; most-&gt;second &lt;&lt;
          &quot;)\n&quot;;
        if (most != least)
        {
         std::cout &lt;&lt; &quot;Rarest letter: &quot; &lt;&lt;
          least-&gt;first &lt;&lt; &quot; (&quot; &lt;&lt; least-&gt;second
          &lt;&lt; &quot;)\n&quot;;
        }
      }
    }
  }</pre>
  
<p>I have also added some protection against the input file being empty.</p>

<h3>Balog PÃ¡l</h3>

<p>Eh, stream-based logging. I hope this is one of the last specimens, we finally get a sensible text formatting library in C++ standard and can leave this monster behind along with all the related quirks. But that is besides the point, letâ€™s read the code.</p>

<ul>
	<li>We have a <span class="filename">logging.h</span> that looks like a header-only library designed for this specific purpose</li>
	
	<li>namespace <code>{}</code> in header: this is normally frowned upon, but fits with the purpose. We can live with this much bloat</li>
	
	<li>Iâ€™m puzzled on the empty subclassing. On a review, Iâ€™d require a good explanation on why we have that <code>null_stream_buf</code> class instead of just using <code>std::streambuf</code>. If it has some ADL-related reason to exist, that must appear in comments</li>
	
	<li>3 names are used up for no good reason, internal purpose only. Especially <code>null</code> stands out. A snap-in like this should use only non-clashing names beyond its public interface</li>
	
	<li>We got to the main feature, that is a macro. I postpone discussion in the idea.</li>
	
	<li>The name <code>WRITE_IF</code> looks suboptimal too</li>
	
	<li>The macro is conditional on <code>NDEBUG</code>. That is a fishy choice, as <code>NDEBUG</code> controls assert, and this feature is not related. </li>
	
	<li>The first version could be just <code>null</code>, the <code>if(false)</code> part serves to save runtime processing of the <code>&lt;&lt;</code>s that follow. The condition is completely eliminated</li>
	
	<li>The second version has a <code>== true</code> comparison that is a major blunder â€“ later we will see that it the actual source of misbehavior</li>
	
	<li>The main code starts in weird way, we have a <code>WRITE_IF</code> followed by another check of the same condition â€“ carefully phrased differently. The normal way would emit message on the <code>else</code>. And not even with this conditional facility.</li>
	
	<li>I accept the reading of the file into the map though it could be arranged with some algorithm without a loop.</li>
	
	<li>The stream-related error handling is missing, but on problem we just result an empty map that should be okay.</li>
	
	<li>We run ahead dereferencing <code>*begin</code> without first checking for empty map that is a no-no.</li>
	
	<li>A manual search for element is presented instead of using <code>std::minmax_element</code> that would just do the job.</li>
</ul>

<p>Certainly the last would kill our chance to use <code>WRITE_IF</code> and notice the misbehavior too... not sure to count that as pro or a con. The way we use it in the loop, we pass the <code>int</code> with the count. Honestly it beats me why we do that, as the method of our creating the map ensures it will not be zero. </p>

<p>The <code>int</code> is fine as a boolean expression and would work as intended if the macro just used it that way, just having <code>(X)</code>. With the <code>== true</code> added, the promotions work â€˜upwardsâ€™, instead of forcing the count to a <code>bool</code> and compare with <code>true</code>, it stays the <code>int</code> and <code>true</code> is promoted to integer 1. Then <code>==</code> will pass only letters with exactly one instance in the input. Which matches the behavior we see.</p>

<p>Beyond that the loop looks correct though massively wasting resources: for has <code>auto</code> instead of <code>const&amp;</code> making copy of every entry in the map and then making copies into <code>most</code> and <code>least</code>. With a <code>pair&lt;char, int&gt;</code> it might not be noticeable, but should be avoided. And as usual, using the stock algorithm would prevent this kind of mess-up and many others. </p>

<p>Finally, the deferred discussion on <code>WRITE_IF</code>. The design goes half-way in many points with respect to compile and runtime. Iâ€™d prefer an all-or-nothing approach. Iâ€™d rather work with a facility like </p>

<pre class="programlisting">
  TRACE_IF( bool_cond, message_string );</pre>
  
<p>that, if arranged with a macro can compile to nothing. Getting rid of both the condition and the message instructions. Or if we prefer to always have the code, can be a simple function. No switcheroo with streams and whatnot.</p>

<p>The â€˜weakâ€™ point is the string part, that in the presented version is assembled with the <code>&lt;&lt;</code> method. That some people still consider a good idea, while others not so much. My practice showed that even a <code>Sprintf() </code>function that uses the traditional <code>sprintf</code> interface with all the potentially related problems (I think we can ask a warning on all of them), and returns the favorite flavor of string is much better. But there are modern ways that combine the benefits of type safety and formatting with extras. One of them is supposedly getting into the standard too.</p>

<h3>Peter Sommerlad</h3>

<p>I will first comment the code directly line by line and then provide a more idiomatic solution.</p>

<pre class="programlisting">
  logging.h</pre>
  
<p>This header file misses an include guard. If this file is included in the same translation unit twice, the global variables with internal linkage <code>nullbuf</code> and <code>null</code> will be doubly-defined and thus the code will no longer compile.</p>

<pre class="programlisting">
  #include &lt;iostream&gt;</pre>

<p>This include is not recommended, the code in this file requires the definitions of <code>std::streambuf</code> and <code>std::ostream</code>, both are available through the <code>&lt;ostream&gt;</code> header. However, the macro will require <code>std::cout</code> in case of debug-level compilation. But the macro definition will not require the definition of this global variable, but the macro expansion eventually. I teach my students to only include <code>&lt;iostream&gt;</code> in the file where the <code>main()</code> function is defined and only use the global variables <code>std::cin</code> and <code>std::cout</code> within the <code>main</code> function and write all code that does I/O taking a parameter of <code>std::istream&amp;</code> or <code>std::ostream&amp;</code> respectively. That principle allows writing unit tests for these functions, which are almost impossible to formulate, when the global stream variables are used directly.</p>

<pre class="programlisting">
  namespace</pre>
  
<p>Defining an anonymous namespace in a header file is not recommended, because wherever that header file is included all definitions exist with internal linkage. If done within the implementation file, I could live with that.</p>

<pre class="programlisting">
  {
    class null_stream_buf
    : public std::streambuf
    {
    } nullbuf;
    std::ostream null(&amp;nullbuf);</pre>
	
<p>This is an interesting attempt to introduce the null-object pattern for <code>ostream</code>. However, using global variables to achieve that might not be the best way. Especially since formatting operations still will be performed, just no output actually happens.</p>

<p>For modern code formatting, I would always use braced-initialization for initializing variables, this removes the mental gymnastics to ensure that a variable is really initialized, since the default for trivial types without braces might mean uninitialized (for automatic storage duration variables), using uninitialized variables can lead to undefined behavior. While we have global variables here, it is still a practice recommended.</p>

<p>On the other side, there is some problem in the macro logic below.</p>

<pre class="programlisting">
  }
  #ifdef NDEBUG
  #define WRITE_IF(X) if (false) null</pre>
  
<p>defining a macro taking a parameter <code>(X)</code> and not using it can be strange. Also having a statement that is incomplete and not using braces for a block is very brittle, especially when surrounding code contains other conditional statements and a following <code>else</code> might not be attached to the <code>if</code> in the surrounding code but to the <code>if</code> from the macro.</p>

<p>On the other hand, having that <code>if</code> statement should guarantee that the output will actually never be generated, because the code path is not taken and the optimizer will eliminate that code.</p>

<pre class="programlisting">
  #else
  #define WRITE_IF(X) ((X == true) ? std::cout 
  : null)</pre>
  
<p>Here is the bug that leads to only outputting letters with the count 1 (that is the integral conversion value of <code>true</code>). Whenever I see code that compares a (potential <code>bool</code>) value with either <code>true</code> or <code>false</code> that code is unprofessional (regardless of the language).</p>

<p>If the macro parameter is meant to be a <code>bool</code> value, then it should read <code>(X)?..:..</code></p>

<p>Note that a macro parameter should always be in parenthesis when expanded to avoid passing â€˜interestingâ€™ strings that, when expanded, change the syntax of the expression or statement.</p>

<p>Again, we have a partial expression of type <code>ostream&amp;</code></p>

<p>If we want to stick with a macro for the debugging, I would suggest to pass all the output to be logged as parameter and not to introduce a conditional even in the debug case.</p>

<pre class="programlisting">
  #endif
  
  --- letter.cpp ---
  #include &quot;logging.h&quot;</pre>
  
<p>It is good practice to include the â€˜ownâ€™ header first, because that guarantees that the own header is self-contained.</p>

<pre class="programlisting">
  #include &lt;fstream&gt;
  #include &lt;map&gt;
  
  // get the most and least popular chars in a file
  int main(int argc, char **argv)
  {</pre>
  
<p>If we follow the unix pipes and filters practice, if no file is given standard input (<code>std::cin</code>) should be processed. This will make the program logic not more complex, but will actually provide a more useful program.</p>

<p>A big problem of this program is that almost all code resides within the <code>main </code>function and is thus not unit-testable. BAD! For very simple example programs that are obviously correct, that might be OK, but as we see from the bugs, this program is not simple enough.</p>

<pre class="programlisting">
  WRITE_IF(argc &lt; 2)
    &lt;&lt; &quot;No arguments provided&quot;;
  if (argc &gt; 1)
  {</pre>
  
<p>There is no check that the file named <code>argv[1]</code> actually exists. From C++17 on we have <code>std::filesystem</code> that allows us to have such a sanity check. This can still cause a race condition when a file is deleted between the check and the actually opening, but it usually allows providing better error messages. A side effect is, that on filesystems with interesting character encodings for file names the string represented by <code>argv[1]</code> can actually be converted into <code>std::path</code> to provide platform independent (unicode!).</p>

<pre class="programlisting">
  std::ifstream ifs(argv[1]);</pre>
  
<p>no error checking, but OK, just no input here.</p>

<pre class="programlisting">
  std::map&lt;char, int&gt; letters;
  char ch;
  while (ifs &gt;&gt; ch)
    ++letters[ch];</pre>
	
<p>This is quite idiomatic and automatically skips whitespace by using <code>operator&gt;&gt;</code>.</p>

<p>However, I would extract filling the map into a separate function, that returns the map by value. Such a function can be independently tested with unit tests.</p>

<p>When there is no input, and thus the map is empty, the following code dereferencing the <code>begin()</code> iterator of the map is undefined behavior.</p>

<p>It might be better to use iterators instead of the pair values for keeping track of the found elements. But using iterators then requires the check, if the iterators are valid (for example not equal to the mapâ€™s <code>end()</code> iterator.</p>

<pre class="programlisting">
  std::pair&lt;char, int&gt; most =
    *letters.begin();
  std::pair&lt;char, int&gt; least =
    *letters.begin();</pre>
	
<p>Later on, writing your own loop for a perfectly ready-made linear search algorithm is a code smell and it can be wrong. The algorithm in the standard library to use here would be <code>minmax_element</code> taking a comparison function object (a lambda) that compares the counters in the mapâ€™s <code>pair::second</code> member.</p>

<pre class="programlisting">
  for (auto pair : letters)
  {
    WRITE_IF(pair.second) &lt;&lt; &quot;Seen &quot;
      &lt;&lt; pair.first &lt;&lt; std::endl;</pre>
	  
<p>Here the debug macro is used and passing the actual count that is then compared with the value of <code>true</code>, results in only those letters to be printed that have a count of one. Using <code>std::endl</code> instead of <code>'\n'</code> is a smell, since in some cases it will result in unneeded operating system interaction, because of flushing the buffer prematurely.</p>

<p>Also test output is bad for a filter program in a pipeline, since that extra output will taint the processing of potentially later filters.</p>

<pre class="programlisting">
  if (pair.second &gt; most.second)
    {
      most = pair;
    }
    else if (pair.second &lt; least.second)
    {
      least = pair;
    }
  }</pre>
  
<p>we will get rid of the whole loop above, so that we do not need to have the debug output at all.</p>

<pre class="programlisting">
  std::cout &lt;&lt; &quot;Commonest letter: &quot;
    &lt;&lt; most.first &lt;&lt; &quot; (&quot;
    &lt;&lt; most.second &lt;&lt; &quot;)\n&quot;;
  if (most != least)</pre>
  
<p>An <code>if</code> statement where the body lasts over several lines without braces is dangerous. if the single statement is split for whatever reason, the logic might be wrong. Also the</p>

<pre class="programlisting">
    std::cout &lt;&lt; &quot;Rarest letter: &quot;
      &lt;&lt; least.first &lt;&lt; &quot; (&quot;
      &lt;&lt; least.second &lt;&lt; &quot;)\n&quot;;
    }
  }</pre>
  
<p>In contrast to my usual practice, I provide a solution without accompanying unit tests, but I have refactored the code, so that unit tests might be added easily, by splitting the code besides <code>main</code> into a separate compilation unit that can be linked to test cases.</p>

<p>As we will see, <code>main()</code> contains (almost) no branching logic and thus can not be wrong. All the other functions do just one thing, that should be testable and they are not dependent on global variables.</p>

<p>I will address the logging macro separately, since I believe it is no longer needed.</p>

<pre class="programlisting">
  #include &lt;fstream&gt;
  #include &lt;string&gt;
  #include &lt;map&gt;
  #include &lt;algorithm&gt;
  using charmap=std::map&lt;char,int&gt;;
  charmap countChars(std::istream &amp;in) {
    std::map&lt;char, int&gt; letters { };
    char ch { };
    while (in &gt;&gt; ch) {
      ++letters[ch];
    }
    return letters;
  }
  using mapiter=charmap::const_iterator;
  void printresult(
    std::ostream &amp;out,
    std::string text,
    mapiter res,
    mapiter noresult) {
    if (res != noresult) {
      out &lt;&lt; text &lt;&lt; res-&gt;first &lt;&lt; &quot; (&quot;
        &lt;&lt; res-&gt;second &lt;&lt; &quot;)\n&quot;;
    }
  }
  #include &lt;iostream&gt;
  // get the most and least popular chars in a file
  int main(int argc, char **argv) {
    std::ifstream ifs {
      argc &gt; 1 ?
      std::ifstream { argv[1] }
      : std::ifstream { 0 } };
    auto const letters { countChars(ifs) };
    auto const least_most {
      minmax_element(letters.begin(),
        letters.end(),
        [](auto l, auto r) {
          return l.second &lt; r.second;
        }) };
    printresult(std::cout,
      &quot;Commonest letter: &quot;,
      least_most.second, letters.end());
    printresult(std::cout, &quot;Rarest letter: &quot;,
      least_most.first, letters.end());
  }</pre>
  
<p>What needs to be explained?</p>

<p>I included <code>&lt;iostream&gt;</code> directly in front of <code>main</code>, to demonstrate that the other functions are independent of the global stream objects. While this is not very idiomatic, it was my test to make sure that I did not leave <code>std::cout</code> somewhere in the extracted code and thus make it untestable.</p>

<p>Let us start with <code>main()</code>. I extended the functionality to be able to operate the program as a filter. The first statement initializing <code>ifs</code> is using a trick. Since <code>iostream</code> objects in general are neither copy-able nor move-able it is tricky to provide a factory that either returns <code>std::cin</code> (or a reference to it) or otherwise a <code>std::ifstream</code> that opens the file given as an optional argument. File stream objects can be moved, so it is possible to initialize an <code>std::ifstream</code> variable from a temporary. Here I use the trick, that an <code>ifstream</code> can either open a file, given as a string or path argument or binds to an already open file descriptor. Since 0 is the file descriptor of the standard input of a program, I create an <code>ifstream</code> object using that file descriptor when no file name is given. (I should have implemented a factory with my filesystem readability check first, but I refrained from it for brevity. This is left as an exercise to the student.)</p>

<p>The code reading the stream is extracted into a separate function and thus becomes testable.</p>

<p>Instead of searching for the letter with the lowest and highest count, I use the <code>minmax_element</code> algorithm available since C++11. To avoid having to spell <code>std::pair&lt;char,int&gt;</code> I use a generic lambda doing the comparison on the counts. Note that we get a slight deviation from the logic, because the â€˜commonestâ€™ (largest) letter count is the last one of potentially similar counts. The original algorithm behaved slightly different. Nevertheless, the logic is OK, because we do not have a specification (or test case) demonstrating what should happen if we have multiple â€˜commonestâ€™ letters.</p>

<p>Since the output needs to check for validity (the input might have been empty) and do so twice, I extracted the output code into the <code>printresult</code> function.</p>

<p>For the logging functionality, I refer to a stackoverflow entry, that demonstrates different approaches to achieve logging: <a href="https://stackoverflow.com/questions/19415845/a-better-log-macro-using-template-metaprogramming">https://stackoverflow.com/questions/19415845/a-better-log-macro-using-template-metaprogramming</a></p>

<p>But my guideline, especially for beginners, when you need logging or interactive debugging to understand what your code does, your code is usually very badly structured. So, simplify it into smaller parts, write unit tests against these parts, or even better write the tests first, this will help you to grow your software according to your needs instead of guesswork and wasting time (see Pete Goodliffeâ€™s article in <em>CVu</em> that presented that problem).</p>

<p>Thatâ€™s it. I hope I did not forget anything crucial, because I was too lazy to write test cases. Shame on me.</p>

<h3>Silas S. Brown</h3>

<p>The reason why only some of the letters are being shown as <code>Seen</code> is that your <code>WRITE_IF(X)</code> macro tests for <code>(X == true)</code> when <code>X</code> is an integer. The integer value of <code>true</code> is 1, so only letters that occurred exactly once (and no more than once) are being printed as <code>Seen</code>.</p>

<p>Instead of checking that <code>X</code> is equal to <code>true</code>, check that <code>X</code> is not equal to <code>false</code>, i.e. <code>(X != false)</code>.</p>

<p>But you really need to put extra parentheses around that <code>X</code>, i.e. <code>((X) != false)</code>, because the macro is expanded without regard to C++ syntax and you could break it if you ever wrote, for example, <code>WRITE_IF(A &amp; B)</code> expecting to get the bitwise AND of <code>A</code> and <code>B</code>: you would instead get the bitwise AND of <code>A</code> and either a 0 or a 1 depending on whether <code>B</code> differs from <code>false</code>, since <code>!=</code> has a higher precedence than the bitwise <code>&amp;</code> operator.  Ouch. (This particular problem would have been flagged up by GCC if you had passed the <code>-Wall</code> (warnings all) parameter to the g++ command. I really donâ€™t understand why <code>-Wall</code> is not the default.)</p>

<p>In this particular case, we could side-step the need for extra parentheses by noting that <code>if ((X) != false)</code> is equivalent to <code>if (X)</code> (since C++, unlike languages such as Java, does let you put non-boolean types into an <code>if</code> expression, and automatically applies an <code>!= 0</code> when it sees a numeric type), but I still made the point about the extra parentheses because youâ€™ll need to know that if you do any more things with macros.</p>

<p>But ideally I donâ€™t want you to be doing any more with macros at all. To misquote Doc Brown from <em>Back To The Future</em>: â€œwhere weâ€™re going, we donâ€™t need macros.â€
 Try this:</p>

<pre class="programlisting">
  static inline auto&amp; WRITE_IF(bool X) {
  #ifdef NDEBUG
    return null;
  #else
    return X ? std::cout : null;
  #endif
  }</pre>
  
<p>If your compiler is too old for <code>auto&amp;</code> to work, then write <code>std::ostream&amp;</code> instead, but do consider upgrading your compiler because <code>auto</code> is a really useful feature of modern C++ that saves us from having to figure out exactly what type somethingâ€™s going to be when the compiler can do that job.</p>

<p>(The usual convention is that a name written in all capitals implies a macro, so really we should be changing it to lower case now that itâ€™s a function.  But Iâ€™ve left it as capitals for now so you can drop the above into the existing code without having to change anything else and itâ€™ll still work.)</p>

<p>Here are just some of the advantages of that function-based approach:</p>

<ol>
	<li>The compiler can check that our function makes sense, even before we try to use it. (Macros are not checked until they are used, and if thereâ€™s a problem the error messages are usually harder to understand, that is if theyâ€™re not missed altogether as in your case.)</li>
	
	<li>We can specify a type for the parameter <code>X</code>. In this case, we say we want a <code>bool</code>, so anything else will be converted to a <code>bool</code>. If you had written <code>X==true</code> in <em>this</em> version, it would not have been a problem.</li>
	
	<li>We donâ€™t have to worry about extra parentheses (and I havenâ€™t even mentioned multiple evaluations of the same variable).</li>
	
	<li>Itâ€™s easier to span multiple lines, and as the above shows, putting the <code>#ifdef NDEBUG</code> inside the function saves us from writing the first line twice (which is good, because writing something twice means there are two places to fix it if it turns out to be wrong, and if one of those is not even reached by the compiler due to an <code>#ifdef</code> then it could easily be missed).</li>
</ol>

<p>In the old days, a counterpoint to all of this might have been â€˜but what about the overhead of having a function callâ€™. But with modern optimising compilers, especially as we said â€˜static inlineâ€™, I would be amazed if the actual machine code generated were any different from doing it with a macro, whether or not <code>NDEBUG</code> is defined.</p>

<p>Another improvement that is possible (now that we have a function) is to hide the <code>null</code> stream inside that function, by making it a static instead of a global. What I mean is to change the start of the function to:</p>

<pre class="programlisting">
  static inline auto&amp; WRITE_IF(bool X) {
    static std::ostream null(&amp;nullbuf);</pre>
	
<p>and remove the <code>std::ostream null(&amp;nullbuf);</code> from your namespace block. This has the advantage of not creating a thing called <code>null</code> thatâ€™s just â€˜floating aroundâ€™ your fileâ€™s global namespace, which could be accidentally referred to in the wrong context. After all thereâ€™s nothing in the name to reinforce the idea that itâ€™s a null stream, rather than a null something else. Yes, you could change the name, but Iâ€™d rather show you how to encapsulate it in the function so you no longer have to worry about a good global name.</p>

<h3>Chris Trobridge</h3>

<p>Considering the output of the program:</p>

<ul>
	<li>The program is only outputting letters that occur exactly once;</li>
	<li>The program counting capitals separately. Thereâ€™s no reason to count each case separately;</li>
	<li>There is more than one most common and least common (case insensitive) letter and it is misleading to just print the first one.</li>
</ul>

<p>Considering <span class="filename">logging.h</span>â€¦</p>

<p>Line 15:</p>

<pre class="programlisting">
  #define WRITE_IF(X) ((X == true) ? std::cout 
  : null)</pre>

<p>Comparison against <code>true</code> is tautology at best but dangerous as bool <code>true</code> is promoted to an integer value of 1, when compared against an <code>int</code>, while any non-zero integer converts to bool <code>true</code>.</p>

<p>I donâ€™t like an object being named as a (conditional) verb.</p>

<p>Considering <span class="filename">letter.cpp</span>â€¦</p>

<p>Lines 10â€“11:</p>

<pre class="programlisting">
  WRITE_IF(argc &lt; 2)
  &lt;&lt; &quot;No arguments provided&quot;;</pre>

<p>This is an error message and would typically be output unconditionally to <code>std::cerr</code>.</p>

<p>Line 18:</p>

<pre class="programlisting">
  ++letters[ch];</pre>

<p>This is case sensitive and <code>ch</code> should be converted to lowercase first.</p>

<p>Line 26:</p>

<pre class="programlisting">
  WRITE_IF(pair.second ) &lt;&lt; &quot;Seen &quot;</pre>

<p>As noted previously <code>WRITE_IF</code> compares its argument against <code>true</code>, which is promoted to 1 and hence only letters that have a frequency of one are displayed.</p>

<p>This may be fixed by removing the comparison in the definition of <code>WRITE_IF</code> above by replacing <code>(X == true)</code> with <code>(X)</code>.</p>

<p>Replacing the macro invocation with <code>WRITE_IF(pair.second &gt; 0)</code> would also fix the issue and express the programmerâ€™s intention better.</p>

<p>The rest of the <code>for</code> loop looks for the most and least common letters. This most and least could be replaced with vectors of <code>char</code> to capture all the most and least popular letters.</p>

<p>Finally line 40:</p>

<pre class="programlisting">
  if (most != least)
    std::cout &lt;&lt; &quot;Rarest letter: &quot;
      &lt;&lt; least.first &lt;&lt; &quot; (&quot;
      &lt;&lt; least.second &lt;&lt; &quot;)\n&quot;;</pre>

<p>This is interesting in as much as it only prints out the rarest letter if it is not equal to the most common. Both the most common and least common letters are set to the first letter in the map and are only changed if there is at least one other letter more common/rarer than the initial letter. The rarest letter will be printed unless all the letters in the file appear with equal frequency.</p>

<p>Making the changes I have suggested I get the following output:</p>

<pre class="programlisting">
  Seen a 2
  Seen e 4
  Seen h 1
  Seen i 3
  Seen l 2
  Seen m 2
  Seen n 1
  Seen o 1
  Seen p 2
  Seen r 1
  Seen s 4
  Seen t 4
  Seen u 1
  Most common letters: e s t (4)
  Rarest letters: h n o r u (1)</pre>
  
<p>Amended <span class="filename">letter.cpp</span>:</p>

<pre class="programlisting">
  #include 'logging.h'
  #include &lt;fstream&gt;
  #include &lt;map&gt;
  #include &lt;vector&gt;

  // get the most and least popular chars in a file
  int main(int argc, char **argv)
  {
  if (argc &lt; 2)
    {
      // Require at least one argument
      std::cerr &lt;&lt; 'No arguments provided\n';
    }
    if (argc &gt; 1)
    {
      std::ifstream ifs(argv[1]);
      std::map&lt;char, int&gt; letters;
      char ch;
      while (ifs &gt;&gt; ch)
        // Ignore case of letters when counting
        ++letters[tolower(ch)];
      // The most common letter must have a
      // frequency &gt;= 0
      int mostFrequency = 0;
      std::vector&lt;char&gt; most;

      // The rarest letter must have a
      // frequency &lt;= INT_MAX
      int leastFrequency = INT_MAX;
      std::vector&lt;char&gt; least;

      for (auto pair : letters)
      {
        // Diagnostics: display the frequency of
        // all letters seen in the file
        WRITE_IF(pair.second &gt; 0) &lt;&lt; 'Seen '
          &lt;&lt; pair.first &lt;&lt; ' ' &lt;&lt; pair.second
          &lt;&lt; std::endl;

        if (pair.second &gt; mostFrequency)
        {
          // This letter is more common than any
          // seen before so reset
          // the most common list to this letter
          mostFrequency = pair.second;
          most = { pair.first };
        }
        else if (pair.second == mostFrequency)
        {
          // This letter is as common as the most
          // common seen so far
          // so add to the most common list
          most.push_back(pair.first);
        }

        if (pair.second &lt; leastFrequency)
        {
          // This letter is less common than any
          // seen before so reset the most rarest
          // list to this letter
          leastFrequency = pair.second;
          least = { pair.first };
        }
        else if (pair.second == leastFrequency)
        {
          // This letter is as rare as the rarest
          // seen so far so add to the rarest list
          least.push_back(pair.first);
        }
      }

      if (!most.empty())
      {
        // There is at least one letter so print
        // out the list of most common letters and
        // their frequency
        std::cout &lt;&lt; 'Most common letters: ';
        for (auto ch : most)
          std::cout &lt;&lt; ch &lt;&lt; ' ';
        std::cout &lt;&lt; '(' &lt;&lt; mostFrequency
          &lt;&lt;  ')\n';
      }
      if (!least.empty() &amp;&amp;
        mostFrequency != leastFrequency)
      {
        // There is at least one letter that is not
        // the most common so print out the list of
        // rarest letters and their frequency
        std::cout &lt;&lt; 'Rarest letters: ';
        for (auto ch : least)
          std::cout &lt;&lt; ch &lt;&lt; ' ';
        std::cout &lt;&lt; '(' &lt;&lt; leastFrequency
          &lt;&lt; ')\n';
      }
    }
  }</pre>
  
<h2>Commentary</h2>

<p>I think the six entrants between them covered pretty well all the ground, so there seems little left for me to add!</p>

<p>One thing that wasnâ€™t explicitly discussed was the performance downside of using a map where the input is a small discrete set of values. I suggest an <code>array</code> indexed by the (unsigned) <code>char</code> value might be simpler and faster or, in a more general case, perhaps an <code>unordered_map</code>.</p>

<p>Tying the logging to <code>NDEBUG</code>, as PÃ¡l suggested, may be a mistake: especially for the error message written if the argument count is wrong. While some of the logging frameworks out there do this, others use other ways to enable/disable the logging.</p>

<p>Note that the empty class <code>null_stream_buf</code> is required since the default constructor of <code>std::streambuf</code> is protected.</p>

<p>It might also be worth a little more explanation of the problem with an anonymous namespace in a header: everything inside the namespace will have a <strong>unique</strong> name for each translation unit that includes the header. The result can be that multiple copies of entities from the header may end up in the resultant binary, differing only in their namespace.</p>

<p>If this sort of thing is needed, in C++17 you can use a named namespace and <code>inline</code> variable declarations to ensure only one definition will be included in the built program.</p>

<p>While I like Peterâ€™s solution that uses <code>std::cin</code> if no filename is provided please note that his solution relies on a non-standard extension that allows construction of an <code>ifstream</code> from a file handle of 0.</p>

<p>Finally, in correspondence received that was not an actual <em>critique</em> as such, it was suggested you could write the below command in less time it takes to compile the C++ program:</p>

<pre class="programlisting">
  cat sample.txt |fold -w1|grep -v ' ' \
   |sort|uniq -c|sort -n -k 1 \
   |gawk 'NR==1; END{print}'</pre>
   
<h2>The winner of CC 116</h2>

<p>All the entrants correctly identified the bug in the <code>WRITE_IF</code> macro with the expression <code>X == true</code>. Explicitly checking against <code>true</code> or <code>false </code>is, as Chris said, a tautology at best. While some languages require this, C++ does not.</p>

<p>I liked Silasâ€™ approach of fixing the problem by replacing the macro with an <code>inline</code> function â€“ in C++17 there are even more places where this may be possible.</p>

<p>A couple of people ensured the code was valid if the file failed to open, Peter also suggested checking the file exists using <code>std::filesystem</code>.</p>

<p>Most of the critiques addressed the â€˜C++â€™ issues in the code; Hans noticed that the code incorrectly treated <em>punctuation</em> as letters and Chris recognised in addition to this that the code treated upper and lower case letters differently. (I suspect an <em>international</em> solution might need to deal with Unicode rather than the <code>char</code>-based code presented...).</p>

<p>While many of the critiques suggested using <code>minmax_element</code> to replace a hand-written loop (I was pleased to see this!) and some people noted that this might change <em>which</em> element was found where there was a duplication, only Chris suggested that this reflected a problem with the code itself in that it incorrectly failed to handle the case where multiple letters tied as most or least frequent.</p>

<p>After reading again through each critique and weighing the various slightly different approaches each took, in my opinion Chris gave the most helpful answers to the writer of the code and so I have awarded him the prize for this issueâ€™s critique. But thanks to <em>all</em> who took part; one of the things I appreciate is looking at the different ways that different programmers can view the same code, and the variety of possible approaches they suggest to improving it.</p>

<h2>Code Critique 117</h2>

<p>(Submissions to scc@accu.org by June 1st)</p>

<p class="blockquote">Iâ€™m trying to do some simple statistics for some experimental data but I seem to have got an error: I donâ€™t <em>quite</em> get the results I am expecting for the standard deviation. I canâ€™t see whatâ€™s wrong: Iâ€™ve checked and I donâ€™t get any compiler warnings. I ran a test with a simple data set as follows:</p>

<pre class="programlisting">
     echo 1 2 3 4 5 6 7 8 9 | statistics
     mean: 5, sd: 1.73205</pre>

<p class="blockquote">The mean is right but I think I should be getting a standard deviation of something like 2.16 for the set (without outliers this is [2,8].)</p>

<p>Can you solve the problem â€“ and then give some further advice?</p>

<p>Listing 3 contains  <span class="filename">statistics.h</span> and Listing 4  is <span class="filename">statistics.cpp</span>.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
namespace statistics
{
  // get rid of the biggest and smallest
  template &lt;typename T&gt;
  void remove_outliers(T&amp; v)
  {
    using namespace std;
    auto min = min_element(v.begin(), v.end());
    auto max = max_element(v.begin(), v.end());
    v.erase(remove_if(v.begin(), v.end(),
      [min, max](auto v) {
        return v == *min || v == *max; 
      }),
      v.end());
  }
  template &lt;typename T&gt;
  auto get_sums(T v)
  {
    typename T::value_type sum{}, sumsq{};
    for (auto i : v)
    {
      sum += i;
      sumsq += i * i;
    }
    return std::make_pair(sum, sumsq);
  }
  template &lt;typename T&gt;
  auto calc(T v)
  {
    remove_outliers(v);
    auto sums = get_sums(v);
    auto n = v.size();
    double mean = sums.first / n;
    double sd = sqrt((sums.second -
      sums.first * sums.first / n) / n - 1);
    return std::make_pair(mean, sd);
  }
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 3</td>
	</tr>
</table>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
#include &lt;algorithm&gt;
#include &lt;cmath&gt;
#include &lt;iostream&gt;
#include &lt;iterator&gt;
#include &lt;vector&gt;
#include &quot;statistics.h&quot;
void read(std::vector&lt;int&gt;&amp; v)
{
  using iter = std::istream_iterator&lt;int&gt;;
  std::copy(iter(std::cin), iter(),
    std::back_inserter(v));
}
int main()
{
  std::vector&lt;int&gt; v;
  read(v);
  auto result = statistics::calc(v);
  std::cout &lt;&lt; &quot;mean: &quot; &lt;&lt; result.first
    &lt;&lt; &quot;, sd: &quot; &lt;&lt; result.second &lt;&lt; '\n';
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 4</td>
	</tr>
</table>

<p>You can also get the current problem from the accu-general mail list (next entry is posted around the last issueâ€™s deadline) or from the ACCU website (<a href="http://accu.org/index.php/journal">http://accu.org/index.php/journal</a>).</p>

<p>This particularly helps overseas members who typically get the magazine much later than members in the UK and Europe.</p>

<p class="bio"><span class="author"><b>Roger Orr</b></span>  Roger has been programming for over 20 years, most recently in C++ and Java for various investment banks in Canary Wharf and the City. He joined ACCU in 1999 and the BSI C++ panel in 2002.</p>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
