    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Learning Standard C++ as a New Language</title>
        <link>https://members.accu.org/index.php/journals/952</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 12, #1 - Jan 2000 + Programming Topics</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Learning Standard C++ as a New Language</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Mon, 03 January 2000 13:15:34 +00:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a></h2>
</div>
<div class="sidebar">
<p>This article was originally published in the May 1999 issue of
C/C++ Users Journal. I am grateful to the author and original
publisher for allowing me to reprint it here.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e23" id="d0e23"></a>ABSTRACT</h2>
</div>
<p>To get the most out of Standard C++ [<a href=
"#Cpp1998">Cpp1998</a>], we must rethink the way we write C++
programs. An approach to such a &quot;rethink&quot; is to consider how C++
can be learned (and taught). What design and programming techniques
do we want to emphasize? What subsets of the language do we want to
learn first? What subsets of the language do we want to emphasize
in real code? This paper compares a few examples of simple C++
programs written in a modern style using the standard library to
traditional C-style solutions. It argues briefly that lessons from
these simple examples are relevant to large programs. More
generally, it argues for a use of C++ as a higher-level language
that relies on abstraction to provide elegance without loss of
efficiency compared to lower-level styles.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e31" id="d0e31"></a>1
Introduction</h2>
</div>
<p>We want our programs to be easy to write, correct, maintainable,
and acceptably efficient. It follows that we ought to use C++ - and
any other programming language - in ways that most closely
approximate this ideal. It is my conjecture that the C++ community has
yet to internalize the facilities offered by Standard C++ so that
major improvements relative to the ideal can be obtained from
reconsidering our style of C++ use. This paper focuses on the
styles of programming that the facilities offered by Standard C++
support - not the facilities themselves.</p>
<p>The key to major improvements is a reduction of the size and
complexity of the code we write through the use of libraries.
Below, I demonstrate and quantify these reductions for a couple of
simple examples such as might be part of an introductory C++
course.</p>
<p>By reducing size and complexity, we reduce development time,
ease maintenance, and decrease the cost of testing. Importantly, we
also simplify the task of learning C++. For toy programs and for
students who program only to get a good grade in a nonessential
course, this simplification would be sufficient. However, for
professional programmers efficiency is a major issue. Only if
efficiency isn't sacrificed can we expect our programming styles to
scale to be usable in systems dealing with the data volumes and
real-time requirements regularly encountered by modern services and
businesses. Consequently, I present measurements that demonstrate
that the reduction in complexity can be obtained without loss of
efficiency. Finally, I discuss the implications of this view on
approaches to learning and teaching C++.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e40" id="d0e40"></a>2
Complexity</h2>
</div>
<p>Consider a fairly typical second exercise in using a programming
language:</p>
<div class="literallayout">
<p>write a prompt &quot;Please enter your first name&quot;<br>
read the name<br>
write out &quot;Hello &lt;name&gt;&quot;</p>
</div>
<p>In Standard C++, the obvious solution is:</p>
<pre class="programlisting">
#include&lt;iostream&gt; // get standard I/O facilities
#include&lt;string&gt; // get standard string facilities
int main() {
  using namespace std; // gain access to standard library
  cout &lt;&lt; &quot;Please enter your first name:\n&quot;;
  string name;
  cin &gt;&gt; name;
  cout &lt;&lt; &quot;Hello &quot; &lt;&lt; name &lt;&lt; '\n';
}
</pre>
<p>For a real novice, we need to explain the &quot;scaffolding&quot;: What
is main()? What does #include mean? What does using do? In
addition, we need to understand all the &quot;small&quot; conventions, such
as what \n does, where semicolons are needed, etc.</p>
<p>However, the main part of the program is conceptually simple and
differs only notationally from the problem statement. We have to
learn the notation, but doing so is relatively simple: string is a
string, cout is output, &lt;&lt; is the operator we use to write to
output, etc. To compare, consider a traditional C-style
solution<sup>[<a name="d0e56" href="#ftn.d0e56" id=
"d0e56">1</a>]</sup>:</p>
<pre class="programlisting">
#include&lt;stdio.h&gt; // get standard I/O facilities
int main(){
  const int max = 20; // maximum name length is 19 characters
  char name[max] ;
  printf(&quot;Please enter your first name&quot;) ;
  scanf(&quot;%s&quot;,name) ; // read characters into name
  printf(&quot;Hello %s\n&quot;, name) ;
  return 0;
}
</pre>
<p>Objectively, the main logic here is slightly - but only slightly
- more complicated than the C++-style version because we have to
explain about arrays and the magic %s. The main problem is that
this simple C-style solution is shoddy. If someone enters a &quot;first
name&quot; that is longer than the magic number 19 (the stated number
20 minus one for a C-style string terminating zero), the program is
corrupted. It can be argued that this kind of shoddiness is
harmless as long as a proper solution is presented &quot;later on.&quot;
However, that line of argument is at best &quot;acceptable&quot; rather
than &quot;good.&quot; Ideally, a novice user isn't presented with a
program that brittle.</p>
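<p>By contrast, the C++-style version has no magic number to exceed. The following is a minimal sketch, not part of the original article: greet() and greet_with_name_of_length() are hypothetical helpers that take the streams as parameters so the behaviour can be checked without a keyboard.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: the body of the C++-style program, with the
// streams passed in instead of using cin and cout directly.
std::string greet(std::istream& in, std::ostream& out) {
    out << "Please enter your first name:\n";
    std::string name;   // grows to fit whatever is read
    in >> name;         // skips leading whitespace, reads one word
    out << "Hello " << name << '\n';
    return name;
}

// Feed a name of the given length through greet() and return what was stored.
std::string greet_with_name_of_length(std::size_t length) {
    assert(length > 0);
    std::istringstream in(std::string(length, 'x') + "\n");
    std::ostringstream out;
    return greet(in, out);
}
```

<p>A name of 19 characters and a name of 1000 characters are handled by the same code; nothing is truncated and no buffer size appears anywhere.</p>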
<p>What would a C-style program that behaved as reasonably as the
C++-style one look like? As a first attempt we could simply prevent
the array overflow by using scanf() in a more appropriate
manner:</p>
<pre class="programlisting">
#include&lt;stdio.h&gt; // get standard I/O facilities
int main(){
  const int max = 20;
  char name[max] ;
  printf(&quot;Please enter your first name:\n&quot;) ;
  scanf(&quot;%19s&quot;, name) ; // read at most 19 characters into name
  printf(&quot;Hello %s\n&quot;, name) ;
  return 0;
}
</pre>
<p>There is no standard way of directly using the symbolic form of
the buffer size, max, in the scanf() format string, so I had to use
the integer literal. That is bad style and a maintenance hazard.
The expert-level alternative is not one I'd care to explain to
novices:</p>
<pre class="programlisting">
char fmt[10] ;
sprintf(fmt,&quot;%%%ds&quot;,max-1); // create a format string: plain %s can overflow
scanf(fmt, name) ; // read at most max-1 characters into name
</pre>
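<p>To make the format-string trick concrete, here is a small sketch that can be run against fixed input. The names make_format() and read_word_bounded() are illustrative assumptions, and sscanf() stands in for scanf() so that the input comes from a string rather than the keyboard.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: build "%<max-1>s" for a buffer of max characters.
std::string make_format(int max) {
    assert(max >= 2 && max <= 10000);  // room for one character plus '\0'
    char fmt[16];
    std::snprintf(fmt, sizeof fmt, "%%%ds", max - 1);
    return fmt;
}

// Read one word from text into a buffer of max characters; at most
// max-1 characters are stored, followed by the terminating zero.
std::string read_word_bounded(const char* text, int max) {
    const std::string fmt = make_format(max);
    std::vector<char> buffer(static_cast<std::size_t>(max), '\0');
    if (std::sscanf(text, fmt.c_str(), buffer.data()) != 1)
        return std::string();          // nothing could be read
    return std::string(buffer.data());
}
```

<p>With max set to 20 the generated format is %19s, so a 31-character word is cut to 19 characters and the remainder is left unread.</p>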
<p>Furthermore, this program throws &quot;surplus&quot; characters away.
What we want is for the string to expand to cope with the input. To
achieve that, we have to descend to a lower level of abstraction
and deal with individual characters:</p>
<pre class="programlisting">
#include&lt;stdio.h&gt;
#include&lt;ctype.h&gt;
#include&lt;stdlib.h&gt;
void quit() {   // write error message and quit
  fprintf(stderr, &quot;memory exhausted\n&quot;) ;
  exit(1) ;
}
int main() {
  int max = 20;
  char* name = (char*)malloc(max);   // allocate buffer
  if (name == 0) quit() ;
  printf(&quot;Please enter your first name:\n&quot;) ;
  while (true) {     // skip leading whitespace
    int c = getchar() ;
    if (c == EOF) break; // end of file
    if (!isspace(c)) {
      ungetc(c, stdin) ;
      break;
    }
  }
  int i = 0;
  while (true) {
    int c = getchar() ;
    if (c == '\n' || c == EOF) {   // at end; add terminating zero
      name[i] = 0;
      break;
    }
    name[i] = c;
    if (i==max-1) { // buffer full
      max = max+max;
      name = (char*)realloc(name, max) ;   // get a new and larger buffer
      if (name == 0) quit() ;
    }
    i++;
  }
  printf(&quot;Hello %s\n&quot;, name) ;
  free(name) ; // release memory
  return 0;
}
</pre>
<p>Compared to the previous versions, this seems rather complex. I
feel a bit bad adding the code for skipping whitespace because I
didn't explicitly require that in the original problem statement.
However, skipping initial whitespace is the norm and the other
versions of the program skip whitespace. One could argue that this
example isn't all that bad. Most experienced C and C++ programmers
would - in a real program - probably (hopefully?) have written
something equivalent in the first place. We might even argue that
if you couldn't write that program, you shouldn't be a professional
programmer. However, consider the added conceptual load on a
novice. This variant uses nine different standard library
functions, deals with character-level input in a rather detailed
manner, uses pointers, and explicitly deals with free store.</p>
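<p>For comparison, the same character-level logic can be sketched with std::string doing the buffer management. This is an illustration, not code from the original article; read_name() and read_name_from_text() are hypothetical names.</p>

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper mirroring the C-style loop: skip leading
// whitespace, then read up to end of line or end of input.
std::string read_name(std::istream& in) {
    const int eof = std::char_traits<char>::eof();
    int c = in.get();                  // an int, so end of input is representable
    while (c != eof && std::isspace(c)) c = in.get();
    std::string name;
    while (c != eof && c != '\n') {
        name.push_back(static_cast<char>(c));  // the string enlarges itself
        c = in.get();
    }
    return name;
}

// Convenience wrapper so the helper can be tried on fixed text.
std::string read_name_from_text(const std::string& text) {
    std::istringstream in(text);
    return read_name(in);
}
```

<p>The loops and tests remain, but malloc(), realloc(), free(), the size bookkeeping, and the cast all disappear.</p>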
<p>To use realloc() while staying portable, I had to use malloc()
(rather than new). This brings the issues of sizes and
casts<sup>[<a name="d0e80" href="#ftn.d0e80" id=
"d0e80">2</a>]</sup> into the picture. It is not obvious what is
the best way to handle the possibility of memory exhaustion in a
small program like this. Here, I simply did something obvious to
avoid the discussion going off on another tangent. Someone using
the C-style approach would have to carefully consider which
approach would form a good basis for further teaching and eventual
use. To summarize, to solve the original simple problem, I had to
introduce loops, tests, storage sizes, pointers, casts, and
explicit free-store management in addition to whatever a solution
to the problem inherently needs. This style is also full of
opportunity for errors. Thanks to long experience, I didn't make
any of the obvious off-by-one or allocation errors. Having
primarily worked with stream I/O for a while, I initially made the
classical beginner's error of reading into a char (rather than into
an int) and forgetting to check for EOF. In the absence of
something like the C++ standard library, it is no wonder that many
teachers stick with the &quot;shoddy&quot; solution and postpone these
issues until later. Unfortunately, many students simply note that
the shoddy style is &quot;good enough&quot; and quicker to write than the
(non-C++ style) alternatives. Thus they acquire a habit that is
hard to break and leave a trail of buggy code behind. This last
C-style program is 41 lines compared to 10 lines for its
functionally equivalent C++-style program.</p>
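<p>The char-versus-int error mentioned above can be shown in a few lines. This sketch is an addition for illustration and uses stream get() rather than getchar() so that it can run on fixed text; the function names are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Count characters, keeping the result of get() in an int.
std::size_t count_chars(std::istream& in) {
    const int eof = std::char_traits<char>::eof();
    std::size_t n = 0;
    for (int c = in.get(); c != eof; c = in.get()) ++n;
    return n;
}

// The beginner's variant: the result is narrowed to char before the test.
// Once narrowed, the byte 0xFF and the end-of-file marker have the same
// char value on the usual two's-complement implementations.
std::size_t count_chars_narrowed(std::istream& in) {
    const char eof_as_char = static_cast<char>(std::char_traits<char>::eof());
    std::size_t n = 0;
    for (char c = static_cast<char>(in.get()); c != eof_as_char;
         c = static_cast<char>(in.get()))
        ++n;
    return n;
}

// Run either counter over fixed text.
std::size_t count_in_text(const std::string& text, bool narrowed) {
    std::istringstream in(text);
    return narrowed ? count_chars_narrowed(in) : count_chars(in);
}
```

<p>For the five-character text "ab", byte 0xFF, "xy", the int version counts all five characters while the narrowed version stops at the 0xFF byte.</p>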
<p>Excluding &quot;scaffolding,&quot; the difference is 30 lines vs 4.
Importantly, the C++-style lines are also shorter and inherently
easier to understand. The number and complexity of concepts needed
to be explained for the C++-style and C-style versions are harder
to measure objectively, but I suggest a 10-to-1 advantage for the
C++-style version.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e86" id="d0e86"></a>3
Efficiency</h2>
</div>
<p>Efficiency is not an issue in a trivial program like the one
above. For such programs, simplicity and (type) safety are what
matter. However, real systems often have parts where efficiency is
essential. For such systems, the question becomes &quot;can we afford a
higher level of abstraction?&quot; Consider a simple example of the
kind of activity that occurs in programs where efficiency
matters:</p>
<div class="literallayout">
<p>read an unknown number of elements<br>
do something to each element<br>
do something with all elements</p>
</div>
<p>The simplest specific example I can think of is a program to
find the mean and median of a sequence of double precision
floating-point numbers read from input. A conventional C-style
solution would be:</p>
<pre class="programlisting">
// C-style solution:
#include&lt;stdlib.h&gt;
#include&lt;stdio.h&gt;
int compare(const void* p, const void* q) {
        // comparison function for use by qsort()
  register double p0 = *(double*)p; // compare doubles
  register double q0 = *(double*)q;
  if (p0 &gt; q0) return 1;
  if (p0 &lt; q0) return -1;
  return 0;
}
void quit() {    // write error message and quit
  fprintf(stderr,&quot;memory exhausted\n&quot;) ;
  exit(1) ;
}
int main(int argc, char* argv[]) {
  int res = 1000; // initial allocation
  char* file = argv[2] ;
  double * buf = (double*)malloc(sizeof(double)*res) ;
  if (buf==0) quit() ; //see editors footnote<sup>[<a name="d0e97"
href="#ftn.d0e97" id="d0e97">3</a>]</sup>
  double median = 0;
  double mean = 0;
  int n = 0; // number of elements
  FILE* fin = fopen(file,&quot;r&quot;) ; // open file for reading
  double d;
  while (fscanf(fin,&quot;%lg&quot;,&amp;d)==1) { // read number, update running mean
    if (n==res) {
      res += res;
      buf = (double*)realloc(buf, sizeof(double)*res) ;
      if (buf==0) quit() ;
    }
    buf[n++] = d;
    mean = (n==1) ? d : mean+(d-mean)/n; // prone to rounding errors
  }
  qsort(buf, n, sizeof(double) , compare) ;
  if (n) {
    int mid = n/2;
    median = (n%2) ? buf[mid] : (buf[mid-1]+buf[mid])/2;
  }
  printf(&quot;number of elements = %d, median = %g, mean = %g\n&quot; 
            ,n, median, mean);
  free(buf);
}
</pre>
<p>To compare, here is an idiomatic C++ solution:</p>
<pre class="programlisting">
// Solution using the Standard C++ library:
#include&lt;vector&gt;
#include&lt;fstream&gt;
#include&lt;algorithm&gt;
using namespace std;
int main(int argc, char* argv[]) {
  char* file = argv[2] ;
  vector&lt;double&gt; buf;
  double median = 0;
  double mean = 0;
  fstream fin(file, ios::in) ; // open file for input
  double d;
  while(fin&gt;&gt;d) {
    buf.push_back(d) ;
    mean = (buf.size()==1) ? d : mean+(d-mean)/buf.size() ; 
              // prone to rounding errors
  }
  sort(buf.begin() ,buf.end()) ;
  if (buf.size()) {
    int mid = buf.size()/2;
    median = (buf.size()%2) ? buf[mid] : (buf[mid-1]+buf[mid])/2;
  }
  cout &lt;&lt; &quot;number of elements = &quot; &lt;&lt; buf.size()
       &lt;&lt; &quot;, median = &quot; &lt;&lt; median &lt;&lt; &quot;, mean = &quot; &lt;&lt; mean &lt;&lt; '\n';
}
</pre>
<p>The size difference is less dramatic than in the previous
example (43 vs 24 non-blank lines). Excluding irreducible common
elements such as the declaration of main() and the calculation of
the median (13 lines), the difference is 20 lines vs 11. The
critical input-and-store loop and the sort are both significantly
shorter in the C++-style program (9 vs 4 lines for the
read-and-store loop, and 9 lines vs 1 line for the sort). More
importantly, their logic is far simpler in the C++ version - and
therefore far easier to get right. Again, memory management is
implicit in the C++-style program; a vector grows as needed when
elements are added using push_back(). In the C-style program,
memory management is explicit using realloc(). Basically, the
vector constructor and push_back() in the C++-style program do
what malloc(), realloc(), and the code tracking the size of
allocated memory do in the C-style program. In the C++-style
program, I rely on exception handling to report memory
exhaustion. In the C-style program, I added explicit tests to avoid
the possibility of memory corruption.</p>
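<p>The implicit growth of a vector can be observed directly. The sketch below is an illustration with a hypothetical function name; how often the buffer is replaced depends on the library implementation, so no particular count is claimed.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Append n values and count how often the vector moved to a larger
// buffer, which is the work realloc() does by hand in the C-style program.
std::size_t count_regrowths(std::size_t n) {
    std::vector<double> buf;
    std::size_t regrowths = 0;
    std::size_t capacity = buf.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        buf.push_back(static_cast<double>(i));
        if (buf.capacity() != capacity) {  // the vector acquired new storage
            ++regrowths;
            capacity = buf.capacity();
        }
    }
    assert(buf.size() == n);
    return regrowths;
}
```

<p>On typical implementations the count for 100,000 elements is a few dozen at most, which matches the doubling strategy (res += res) written out explicitly in the C-style version.</p>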
<p>Not surprisingly, the C++ version was easier to get right. I
constructed this C++-style version from the C-style version by
cut-and-paste. I forgot to include &lt;algorithm&gt;, I left n in
place rather than using buf.size() twice, and my compiler didn't
support the local using-directive so I had to move it outside
main(). On the other hand, after fixing these four errors, the
program ran correctly first time.</p>
<p>To a novice, qsort() is &quot;odd.&quot; Why do you have to give the
number of elements? (because the array doesn't know it). Why do you
have to give the size of a double? (because qsort() doesn't know
that it is sorting doubles). Why do you have to write that ugly
function to compare doubles? (because qsort() needs a pointer to
function because it doesn't know the type of the elements that it
is sorting). Why does qsort()'s comparison function take const
void* arguments rather than char* arguments? (because qsort() can
sort based on non-string values). What is a void* and what does it
mean for it to be const? (&quot;Eh, hmmm, we'll get to that later&quot;).
Explaining this to a novice without getting a blank stare of
wonderment over the complexity of the answer is not easy.
Explaining sort(v.begin(), v.end()) is comparatively easy: &quot;Plain
sort(v) would have been simpler in this case, but sometimes we want
to sort part of a container so it's more general to specify the
beginning and end of what we want to sort.&quot;</p>
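<p>The two calls can be put side by side. The following is a small sketch for illustration; the wrapper names are assumptions, and sorted_prefix() shows the &quot;sort part of a container&quot; case that motivates the begin/end interface.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Comparison function in the form qsort() requires.
int compare_doubles(const void* p, const void* q) {
    const double a = *static_cast<const double*>(p);
    const double b = *static_cast<const double*>(q);
    return (a > b) - (a < b);
}

// qsort() must be told the element count and the element size.
std::vector<double> sorted_with_qsort(std::vector<double> v) {
    if (!v.empty())
        std::qsort(v.data(), v.size(), sizeof(double), compare_doubles);
    return v;
}

// sort() deduces everything it needs from the iterators.
std::vector<double> sorted_with_sort(std::vector<double> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Sort only the first count elements, leaving the rest untouched.
std::vector<double> sorted_prefix(std::vector<double> v, std::size_t count) {
    assert(count <= v.size());
    std::sort(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(count));
    return v;
}
```

<p>Both full sorts give the same result; only the amount that has to be explained differs.</p>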
<p>To compare efficiencies, I first determined how much input was
needed to make an efficiency comparison meaningful. For 50,000
numbers the programs ran in less than half a second each, so I
chose to compare runs with 500,000 and 5,000,000 input values:</p>
<div class="table"><a name="d0e113" id="d0e113"></a>
<table summary="Read, sort, and write floating-point numbers"
border="0">
<tr>
<th> </th>
<th colspan="3" align="center">unoptimized</th>
<th colspan="3" align="center">optimized</th>
</tr>
<tr>
<td>elements</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
</tr>
<tr>
<td>500 000</td>
<td>3.5</td>
<td>6.1</td>
<td>1.74</td>
<td>2.5</td>
<td>5.1</td>
<td>2.04</td>
</tr>
<tr>
<td>5 000 000</td>
<td>38.4</td>
<td>172.6</td>
<td>4.49</td>
<td>27.4</td>
<td>126.6</td>
<td>4.62</td>
</tr>
</table>
<p class="title c2">Table 1. Read, sort, and write floating-point
numbers</p>
</div>
<p>The key numbers are the ratios; a ratio larger than one means
that the C++-style version is faster. Comparisons of languages,
libraries, and programming styles are notoriously tricky, so please
do not draw sweeping conclusions from these simple tests. The
numbers are averages of several runs on an otherwise quiet machine.
The variance between different runs of an example was less than 1%.
I also ran strictly ISO C conforming versions of the C-style
programs. As expected there was no performance difference between
those and their C-style C++ equivalents. I had expected the
C++-style program to be only slightly faster. Checking other C++
implementations, I found a surprising variance in the results. In
some cases, the C-style version even outperformed the C++-style
version for small data sets. However, the point of this example is
that a higher level of abstraction and a better protection against
errors can be affordable given current technology: The
implementation I used is widely available and cheap - not a
research toy. Implementations that claim higher performance are
also available.</p>
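<p>The article does not show how the timings were taken. As one possible approach, the sketch below averages wall-clock time over several runs using the chrono facilities of later C++ standards; average_seconds() and average_sort_seconds() are hypothetical names, and the absolute numbers it yields depend entirely on machine, compiler, and library.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: average wall-clock seconds over several runs of task.
template <class Task>
double average_seconds(Task task, int runs) {
    assert(runs > 0);
    typedef std::chrono::steady_clock timer;
    double total = 0;
    for (int i = 0; i < runs; ++i) {
        const timer::time_point start = timer::now();
        task();
        total += std::chrono::duration<double>(timer::now() - start).count();
    }
    return total / runs;
}

// Example task: sort n doubles that start in descending order.
// The time to copy the input is included in each run.
double average_sort_seconds(std::size_t n, int runs) {
    std::vector<double> data(n);
    for (std::size_t i = 0; i < n; ++i) data[i] = static_cast<double>(n - i);
    return average_seconds([&data] {
        std::vector<double> copy = data;   // each run sorts a fresh copy
        std::sort(copy.begin(), copy.end());
    }, runs);
}
```

<p>Averaging over several runs on an otherwise quiet machine, as described above, is what keeps the variance between measurements small.</p>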
<p>It is not unusual to find people being willing to pay a factor
of 3, 10, or even 50 for convenience and better protection against
errors. Getting the benefit together with a doubling or quadrupling
of speed is spectacular. These figures should be the minimum that a
C++ library vendor would be willing to settle for. To get a better
idea of where the time was spent, I ran a few additional tests:</p>
<div class="table"><a name="d0e183" id="d0e183"></a>
<table summary="500 000 elements" border="0">
<tr>
<th> </th>
<th colspan="3" align="center">unoptimized</th>
<th colspan="3" align="center">optimized</th>
</tr>
<tr>
<td>task</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
</tr>
<tr>
<td>read</td>
<td>2.1</td>
<td>2.8</td>
<td>1.33</td>
<td>2.0</td>
<td>2.8</td>
<td>1.4</td>
</tr>
<tr>
<td>generate</td>
<td>0.6</td>
<td>0.3</td>
<td>0.5</td>
<td>0.4</td>
<td>0.3</td>
<td>0.75</td>
</tr>
<tr>
<td>read &amp; sort</td>
<td>3.5</td>
<td>6.1</td>
<td>1.75</td>
<td>2.5</td>
<td>5.1</td>
<td>2.04</td>
</tr>
<tr>
<td>generate &amp; sort</td>
<td>2.0</td>
<td>3.5</td>
<td>1.75</td>
<td>.9</td>
<td>2.6</td>
<td>2.89</td>
</tr>
</table>
<p class="title c2">Table 2. 500 000 elements</p>
</div>
<p>Naturally, &quot;read&quot; simply reads the data and &quot;read &amp;
sort&quot; reads the data and sorts it but doesn't produce output. To
get a better feel for the cost of input, &quot;generate&quot; produces
random numbers rather than reading.</p>
<div class="table"><a name="d0e281" id="d0e281"></a>
<table summary="5 000 000 elements" border="0">
<tr>
<th> </th>
<th colspan="3" align="center">unoptimized</th>
<th colspan="3" align="center">optimized</th>
</tr>
<tr>
<td>task</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
</tr>
<tr>
<td>read</td>
<td>21.5</td>
<td>29.1</td>
<td>1.35</td>
<td>21.3</td>
<td>28.6</td>
<td>1.34</td>
</tr>
<tr>
<td>generate</td>
<td>7.2</td>
<td>4.1</td>
<td>0.57</td>
<td>5.2</td>
<td>3.6</td>
<td>0.69</td>
</tr>
<tr>
<td>read &amp; sort</td>
<td>38.4</td>
<td>172.6</td>
<td>4.49</td>
<td>27.4</td>
<td>126.6</td>
<td>4.62</td>
</tr>
<tr>
<td>generate &amp; sort</td>
<td>24.4</td>
<td>147.1</td>
<td>6.03</td>
<td>11.3</td>
<td>100.6</td>
<td>8.90</td>
</tr>
</table>
<p class="title c2">Table 3. 5 000 000 elements</p>
</div>
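<p>The &quot;generate&quot; variant replaces input with random numbers, so that only allocation and sorting are measured. The article does not say which generator was used; the sketch below assumes the random-number library of later C++ standards, and generate() and all_in_unit_interval() are hypothetical names.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Produce n pseudo-random doubles in [0, 1) instead of reading them.
// push_back() is used on purpose, so the vector grows as it would
// while reading input of unknown length.
std::vector<double> generate(std::size_t n, unsigned seed) {
    std::mt19937 engine(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> buf;
    for (std::size_t i = 0; i < n; ++i) buf.push_back(dist(engine));
    return buf;
}

// Check that every generated value lies in [0, 1).
bool all_in_unit_interval(const std::vector<double>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] < 0.0 || v[i] >= 1.0) return false;
    return true;
}
```

<p>Using a fixed seed makes a run repeatable, which helps when the same data must be fed to two programs being compared.</p>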
<p>From other examples and other implementations, I had expected
stream I/O to be somewhat slower than stdio. That was actually the
case for a previous version of this program where I used cin rather
than a filestream. It appears that on some C++ implementations,
file I/O is much faster than cin. The reason is at least partly
poor handling of the tie between cin and cout. However, these
numbers demonstrate that C++-style I/O can be as efficient as
C-style I/O.</p>
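<p>The tie between cin and cout mentioned above means that cout is flushed before each read from cin. A program that does not interleave prompts and input can remove the tie. This is a sketch with hypothetical wrapper names around the standard tie() member; whether it changes the timing on a given implementation has to be measured.</p>

```cpp
#include <cassert>
#include <iostream>

// Untie cin from cout and return the stream cin was previously tied to.
std::ostream* untie_cin() {
    return std::cin.tie(nullptr);
}

// Restore a tie previously returned by untie_cin().
void retie_cin(std::ostream* previous) {
    std::cin.tie(previous);
}
```

<p>By default cin is tied to cout, so the first call to untie_cin() normally returns the address of cout. Interactive programs such as the name prompt in &sect;2 rely on the tie to make the prompt appear before the read.</p>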
<p>Changing the programs to read and sort integers instead of
floating-point values did not change the relative performance -
though it was nice to note that making that change was much simpler
in the C++-style program (2 edits as compared to 12 for the C-style
program). That is a good omen for maintainability. The differences
in the &quot;generate&quot; tests reflect a difference in allocation costs.
A vector plus push_back() ought to be exactly as fast as an array
plus malloc()/free(), but it wasn't. The reason appears to be
failure to optimize away calls of initializers that do nothing.
Fortunately, the cost of allocation is (always) dwarfed by the cost
of the input that caused the need for the allocation. As expected,
sort() was noticeably faster than qsort(). The main reason is that
sort() inlines its comparison operations whereas qsort() must call
a function.</p>
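<p>Both observations, the small number of edits and the inlined comparison, follow from the element type being known at compile time. The sketch below is an illustration that goes one step beyond the article's program by making the element type a template parameter; read_and_sort() and read_and_sort_text() are hypothetical names.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated values of type T and sort them.
template <class T>
std::vector<T> read_and_sort(std::istream& in) {
    std::vector<T> buf;
    T value;
    while (in >> value) buf.push_back(value);
    // sort() is instantiated for T, so the < comparison is visible to
    // the compiler and can be inlined; qsort() only sees a function pointer.
    std::sort(buf.begin(), buf.end());
    return buf;
}

// Convenience wrapper so the helper can be tried on fixed text.
template <class T>
std::vector<T> read_and_sort_text(const std::string& text) {
    std::istringstream in(text);
    return read_and_sort<T>(in);
}
```

<p>Switching from double to int is then a change of one template argument, whereas the C-style version needs the buffer type, the sizeof expressions, the format string, and the comparison function all edited consistently.</p>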
<p>It is hard to choose an example to illustrate efficiency issues.
One comment I had from a colleague was that reading and comparing
numbers wasn't realistic. I should read and sort strings. So I
tried this program:</p>
<pre class="programlisting">
#include&lt;vector&gt;
#include&lt;fstream&gt;
#include&lt;algorithm&gt;
#include&lt;iterator&gt;
#include&lt;string&gt;
using namespace std;
int main(int argc, char* argv[]){
  char* file = argv[2] ; // input file name
  char* ofile = argv[3] ; // output file name
  vector&lt;string&gt; buf;
  fstream fin(file,ios::in) ;
  string d;
  while(getline(fin,d)) buf.push_back(d) ; // add line from input to buf
  sort(buf.begin() ,buf.end()) ;
  fstream fout(ofile,ios::out) ;
  copy(buf.begin() ,buf.end() ,ostream_iterator&lt;string&gt;(fout,&quot;\n&quot;)) ; 
                  // copy to output
}
</pre>
<p>I transcribed this into C and experimented a bit to optimize the
reading of characters. The C++-style code performs well even
against hand-optimized C-style code that eliminates copying of
strings. For small amounts of output there is no significant
difference and for larger amounts of data sort() again beats
qsort() because of its better inlining:</p>
<div class="table"><a name="d0e387" id="d0e387"></a>
<table summary="Read, sort, and write strings" border="0">
<tr>
<td>elements</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
<td>C (no string copy)</td>
<td>C/C++ ratio</td>
</tr>
<tr>
<td>500 000</td>
<td>8.4</td>
<td>9.5</td>
<td>1.13</td>
<td>8.3</td>
<td>0.99</td>
</tr>
<tr>
<td>2 000 000</td>
<td>37.4</td>
<td>81.3</td>
<td>2.17</td>
<td>76.1</td>
<td>2.03</td>
</tr>
</table>
<p class="title c2">Table 4. Read, sort, and write strings</p>
</div>
<p>I used 2 million strings because I didn't have enough main
memory to cope with 5 million strings without paging.</p>
<p>To get an idea of what time was spent where, I also ran the
program with the sort() omitted:</p>
<div class="table"><a name="d0e441" id="d0e441"></a>
<table summary="Read and write strings" border="0">
<tr>
<td>elements</td>
<td>C++</td>
<td>C</td>
<td>C/C++ ratio</td>
<td>C (no string copy)</td>
<td>C/C++ ratio</td>
</tr>
<tr>
<td>500 000</td>
<td>2.5</td>
<td>3.0</td>
<td>1.2</td>
<td>2.0</td>
<td>0.80</td>
</tr>
<tr>
<td>2 000 000</td>
<td>9.8</td>
<td>12.6</td>
<td>1.29</td>
<td>8.9</td>
<td>0.91</td>
</tr>
</table>
<p class="title c2">Table 5. Read and write strings</p>
</div>
<p>The strings were relatively short (seven characters on
average).</p>
<p>Note that string is a perfectly ordinary user-defined type that
just happens to be part of the standard library. What we can do
efficiently and elegantly with a string, we can do efficiently and
elegantly with many other user-defined types.</p>
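<p>To illustrate the point about user-defined types, here is a minimal sketch in which a small type defined outside the library is stored in a vector and sorted exactly as string was. The Date type and the helper functions are hypothetical examples, not taken from the article.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// A hypothetical user-defined type with value semantics.
struct Date {
    int year;
    int month;
    int day;
};

// Ordering and equality are all that sort() and the check below need.
bool operator<(const Date& a, const Date& b) {
    return std::tie(a.year, a.month, a.day) < std::tie(b.year, b.month, b.day);
}
bool operator==(const Date& a, const Date& b) {
    return a.year == b.year && a.month == b.month && a.day == b.day;
}

// Fixed sample data for the illustration.
std::vector<Date> sample_dates() {
    std::vector<Date> v;
    v.push_back(Date{2000, 1, 3});
    v.push_back(Date{1999, 5, 1});
    v.push_back(Date{1999, 4, 30});
    return v;
}

// Same call as for doubles and strings.
std::vector<Date> sorted_dates(std::vector<Date> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

<p>The sort call is identical to the one used for doubles and strings; only the definition of &lt; for the new type had to be supplied.</p>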
<p>Why do I discuss efficiency in the context of programming style
and teaching? The styles and techniques we teach must scale to
real-world problems. C++ is - among other things - intended for
large-scale systems and systems with efficiency constraints.</p>
<p>Consequently, I consider it unacceptable to teach C++ in a way
that leads people to use styles and techniques that are effective
for toy programs only; that would lead people to failure and to
abandon what was taught. The measurements above demonstrate that a
C++ style relying heavily on generic programming and concrete types
to provide simple and type-safe code can be efficient compared to
traditional C styles. Similar results have been obtained for
object-oriented styles.</p>
<p>It is a significant problem that the performance of different
implementations of the standard library differs dramatically. For a
programmer who wants to rely on standard libraries (or widely
distributed libraries that are not part of the standard), it is
often important that a programming style that delivers good
performance on one system gives at least acceptable performance on
another. I was appalled to find examples where my test programs ran
twice as fast in the C++ style compared to the C style on one
system and only half as fast on another. Programmers should not
have to accept a variability of a factor of four between systems.
As far as I can tell, this variability is not caused by fundamental
reasons, so consistency should be achievable without heroic efforts
from the library implementors. Better optimized libraries may be
the easiest way to improve both the perceived and actual
performance of Standard C++. Compiler implementors work hard to
eliminate minor performance penalties compared with other
compilers. I conjecture that the scope for improvements is larger
in the standard library implementations.</p>
<p>Clearly, the simplicity of the C++-style solutions above
compared to the C-style solutions was made possible by the C++
standard library. Does that make the comparison unrealistic or
unfair? I don't think so. One of the key aspects of C++ is its
ability to support libraries that are both elegant and efficient.
The advantages demonstrated for the simple examples hold for every
application area where elegant and efficient libraries exist or
could exist. The challenge to the C++ community is to extend the
areas where these benefits are available to ordinary programmers.
That is, we must design and implement elegant and efficient
libraries for many more application areas and we must make these
libraries widely available.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e503" id="d0e503"></a>4
Learning C++</h2>
</div>
<p>Even for the professional programmer, it is impossible to first
learn a whole programming language and then try to use it. A
programming language is learned in part by trying out its
facilities for small examples. Consequently, we always learn a
language by mastering a series of subsets. The real question is not
''Should I learn a subset first?'' but ''Which subset should I
learn first?'' One conventional answer to the question ''Which
subset of C++ should I learn first?'' is ''The C subset of C++.''
In my considered opinion, that's not a good answer. The C-first
approach leads to an early focus on low-level details. It also
obscures programming style and design issues by forcing the student
to face many technical difficulties to express anything
interesting. The examples in &sect;2 and &sect;3 illustrate this
point. C++'s better support of libraries, better notational
support, and better type checking are decisive against a ''C
first'' approach. However, note that my suggested alternative isn't
''Pure Object-Oriented Programming first.'' I consider that the
other extreme.</p>
<p>For programming novices, learning the programming language
should support the learning of effective programming techniques.
For experienced programmers who are novices at C++, the learning
should focus on how effective programming techniques are expressed
in C++ and on techniques that are new to the programmer. For
experienced programmers, the greatest pitfall is often to
concentrate on using C++ to express what was effective in some
other language. The emphasis for both novices and experienced
programmers should be concepts and techniques. The syntactic and
semantic details of C++ are secondary to an understanding of
design and programming techniques that C++ supports.</p>
<p>Teaching is best done by starting from well-chosen concrete
examples and proceeding towards the more general and more abstract.
This is the way children learn and it is the way most of us grasp
new ideas. Language features should always be presented in the
context of their use. Otherwise, the programmer's focus shifts from
producing systems to delighting in technical obscurities. Focussing
on language-technical details can be fun, but it is not effective
education.</p>
<p>On the other hand, treating programming as merely the handmaiden
of analysis and design doesn't work either. The approach of
postponing actual discussion of code until every high-level and
engineering topic has been thoroughly presented has been a costly
mistake for many. That approach drives people away from programming
and leads many to seriously underestimate the intellectual challenge
in the creation of production-quality code.</p>
<p>The extreme opposite to the ''design first'' approach is to get
a C++ implementation and start coding. When encountering a problem,
point and click to see what the online help has to offer. The
problem with this approach is that it is completely biased towards
the understanding of individual features and facilities. General
concepts and techniques are not easily learned this way. For
experienced programmers, this approach has the added problem of
reinforcing the tendency to think in a previous language while
using C++ syntax and library functions. For the novice, the result
is a lot of if-then-else code mixed with code snippets inserted
using cut-and-paste from vendor-supplied examples. Often the
purpose of the inserted code is obscure to the novice and the
method by which it achieves its effect completely beyond
comprehension. This is the case even for clever people. This
''poking around approach'' can be most useful as an adjunct to good
teaching or a solid textbook, but on its own it is a recipe for
disaster. To sum up, I recommend an approach that</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>proceeds from the concrete to the abstract,</p>
</li>
<li>
<p>presents language features in the context of the programming and
design techniques that they exist to support,</p>
</li>
<li>
<p>presents code relying on relatively high-level libraries before
going into the lower-level details (necessary to build those
libraries),</p>
</li>
<li>
<p>avoids techniques that do not scale to real-world
applications,</p>
</li>
<li>
<p>presents common and useful techniques and features before
details, and</p>
</li>
<li>
<p>focuses on concepts and techniques (rather than language
features).</p>
</li>
</ul>
</div>
<p>No, I don't consider this particularly novel or revolutionary.
Mostly, I see it as common sense. However, common sense often gets
lost in heated discussion about more specific topics such as
whether C should be learned before C++, whether you must write
Smalltalk to really understand Object-Oriented programming, whether
you must start learning programming in a pure-OO fashion (whatever
that means), and whether a thorough understanding of the software
development process is needed before trying to write code.
Fortunately, there is some experience with approaches that meet my
criteria. My favorite approach is to teach the basic language
concepts such as variables, declarations, loops, etc. together with
a good library. The library is essential to enable the students to
concentrate on programming rather than the intricacies of, say,
C-style strings. I recommend the use of the C++ standard libraries
or a subset of those. This is the approach taken by the Computer
Science Advanced Placement course taught in American high schools
[<a href="#Horwitz1999">Horwitz1999</a>]. A more advanced version
of that approach aimed at experienced programmers has also proved
successful; for example, see [<a href=
"#Koenig1998">Koenig1998</a>].</p>
<p>A weakness of these specific approaches is the absence of
simple graphics and graphical user interfaces early on. This could
(easily?) be compensated for by a very simple interface to
commercial libraries. By ''very simple,'' I mean usable by students
on day two of a C++ course. However, no such simple graphics and
graphical user interface C++ library is widely available.</p>
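<p>To make the library-first approach concrete (text only, given the
lack of a simple graphics library just noted), an early exercise
might look like the sketch below. It is an illustration under
assumptions, not an example taken from the cited courses: the
function names sorted_words and sorted_words_of are hypothetical.
The point is that variables, a loop, and a function can be taught
while string and vector take care of storage.</p>

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated words from any input stream and return
// them in sorted order. There are no array sizes to guess and no
// character buffers to manage.
std::vector<std::string> sorted_words(std::istream& in)
{
    std::vector<std::string> words;
    std::string word;
    while (in >> word)            // the string grows to hold each word
        words.push_back(word);    // the vector grows to hold all words
    std::sort(words.begin(), words.end());
    return words;
}

// Hypothetical convenience wrapper so that the function can be tried
// on a piece of text instead of on std::cin.
std::vector<std::string> sorted_words_of(const std::string& text)
{
    std::istringstream in(text);
    return sorted_words(in);
}
```

<p>A student can call sorted_words(std::cin) from main and print the
result with a simple loop; the C-style equivalent would need
fixed-size buffers or explicit allocation before anything
interesting could be expressed.</p>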
<p>After the initial teaching/learning that relies on libraries, a
course can proceed in a variety of ways based on the needs and
interests of the students. At some point, the messier and
lower-level features of C++ will have to be examined. One way of
teaching/learning about pointers, casting, allocation, etc. is to
examine the implementation of the classes used to learn the basics.
For example, the implementations of string, vector, and list classes
are excellent contexts for discussions of language facilities from
the C subset of C++ that are best left out of the first part of a
course.</p>
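<p>As an illustration of how a container implementation can serve
as such a context, here is a deliberately simplified sketch. The
class Vec is hypothetical and is not how any particular standard
library implements vector; it holds only double, omits copying, and
exists to show pointers, free-store allocation, and a destructor in
a setting the student already knows from the user's side.</p>

```cpp
#include <cassert>
#include <cstddef>

// Vec: a simplified, hypothetical stand-in for a vector of double.
class Vec {
public:
    Vec() : elem(0), sz(0), space(0) {}
    ~Vec() { delete[] elem; }               // release the free-store array

    void push_back(double d)
    {
        if (sz == space) grow();            // no room left: get a larger array
        elem[sz++] = d;
    }

    double& operator[](std::size_t i)
    {
        assert(i < sz);                     // range check for teaching purposes
        return elem[i];
    }

    std::size_t size() const { return sz; }

private:
    Vec(const Vec&);                        // copying is left out of this sketch
    Vec& operator=(const Vec&);

    void grow()
    {
        std::size_t new_space = (space == 0) ? 8 : 2 * space;
        double* p = new double[new_space];  // allocate on the free store
        for (std::size_t i = 0; i < sz; ++i)
            p[i] = elem[i];                 // copy the old elements
        delete[] elem;                      // free the old array
        elem = p;
        space = new_space;
    }

    double* elem;        // pointer to the elements
    std::size_t sz;      // number of elements in use
    std::size_t space;   // number of elements allocated
};
```

<p>Each member raises one of the lower-level topics (a pointer, new
and delete, copying) at the moment the student can see why it is
needed.</p>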
<p>Classes, such as vector and string, that manage variable amounts
of data require the use of free store and pointers in their
implementation. Before introducing those, classes that don't
require them (concrete classes), such as a Date, a Point, and a
Complex type can be used to introduce the basics of class
implementation.</p>
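<p>A concrete class of the kind mentioned above might be sketched
as follows. The member names and the choice of operations are
assumptions made for illustration; the text names Point only as an
example. Note that the class needs neither the free store nor
pointers.</p>

```cpp
// Point: a small concrete type that behaves much like a built-in type.
class Point {
public:
    Point() : x_(0), y_(0) {}
    Point(double x, double y) : x_(x), y_(y) {}

    double x() const { return x_; }
    double y() const { return y_; }

    // Add another point to this one, coordinate by coordinate.
    Point& operator+=(const Point& p)
    {
        x_ += p.x_;
        y_ += p.y_;
        return *this;
    }

private:
    double x_, y_;   // the representation is hidden but directly contained
};

// Non-member operators built on the public interface.
inline Point operator+(Point a, const Point& b) { return a += b; }

inline bool operator==(const Point& a, const Point& b)
{
    return a.x() == b.x() && a.y() == b.y();
}
```

<p>Constructors, private data, member functions, and operator
overloading can all be introduced with a class like this; copying
and destruction need no user-written code.</p>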
<p>I tend to present abstract classes and class hierarchies after
the discussion of containers and the implementation of containers,
but there are many alternatives here. The actual ordering of topics
should depend on the libraries used. For example, a course using a
graphics library relying on class hierarchies will have to explain
the basics of polymorphism and the definition of derived classes
relatively early.</p>
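<p>The basics that such a course would need early can be sketched
in a few lines. The hierarchy below is hypothetical and is not
taken from any particular graphics library: Shape, Rectangle,
Circle, and total_area are names chosen for illustration.</p>

```cpp
#include <cstddef>
#include <vector>

// Abstract class: the interface every shape offers.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;   // each derived class supplies its own
};

// Derived classes: each defines area() for its own representation.
class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : w_(w), h_(h) {}
    double area() const { return w_ * h_; }
private:
    double w_, h_;
};

class Circle : public Shape {
public:
    explicit Circle(double r) : r_(r) {}
    double area() const { return 3.14159265358979 * r_ * r_; }
private:
    double r_;
};

// Polymorphism: this function works for any present or future Shape
// without knowing the concrete types involved.
double total_area(const std::vector<const Shape*>& shapes)
{
    double sum = 0;
    for (std::size_t i = 0; i < shapes.size(); ++i)
        sum += shapes[i]->area();      // the call is resolved at run time
    return sum;
}
```

<p>The virtual function, the derived classes, and the call through a
pointer to the base class are the three ideas a student must meet
before using a library built this way.</p>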
<p>Finally, please remember there is no one right way to learn and
teach C++ and its associated design and programming techniques. The
aims and backgrounds of students differ and so do the backgrounds
and experience of their teachers and textbook writers.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e556" id="d0e556"></a>5
Summary</h2>
</div>
<p>We want our C++ programs to be easy to write, correct,
maintainable, and acceptably efficient. To do that, we must design
and program at a higher level of abstraction than has typically
been done with C and early C++. Through the use of libraries, this
ideal is achievable without loss of efficiency compared to
lower-level styles. Thus, work on more libraries, on more
consistent implementation of widely-used libraries (such as the
standard library), and on making libraries more widely available
can yield great benefits to the C++ community.</p>
<p>Education must play a major role in this move to cleaner and
higher-level programming styles. The C++ community doesn't need
another generation of programmers who by default use the lowest
level of language and library facilities available out of misplaced
fear of inefficiencies. Experienced C++ programmers as well as C++
novices must learn to use Standard C++ as a new and higher-level
language as a matter of course and descend to lower levels of
abstraction only where absolutely necessary. Using Standard C++ as
a glorified C or glorified C with Classes only would be to waste
the opportunities offered by Standard C++.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e563" id="d0e563"></a>6
Acknowledgements</h2>
</div>
<p>Thanks to Chuck Allison for suggesting that I write a paper on
learning Standard C++. Thanks to Andrew Koenig and Mike Yang for
constructive comments on earlier drafts. My examples were compiled
using Cygnus' EGCS 1.1 and run on a Sun Ultrasparc 10. The programs
I used can be found on my homepages:</p>
<p><a href="http://www.research.att.com/~bs" target=
"_top">http://www.research.att.com/~bs</a></p>
</div>
<div class="bibliography">
<div class="titlepage">
<h2><a name="d0e571" id="d0e571"></a>References</h2>
</div>
<div class="bibliomixed"><a name="Cpp1998" id="Cpp1998"></a>
<p class="bibliomixed">[Cpp1998] X3 Secretariat: Standard - The C++
Language. ISO/IEC 14882:1998(E). Information Technology Council
(NCITS). Washington, DC, USA. (See <span class=
"bibliomisc"><a href="http://www.ncits.org/cplusplus.htm" target=
"_top">http://www.ncits.org/cplusplus.htm</a></span>).</p>
</div>
<div class="bibliomixed"><a name="Horwitz1999" id=
"Horwitz1999"></a>
<p class="bibliomixed">[Horwitz1999] Susan Horwitz:
Addison-Wesley's Review for the Computer Science AP Exam in C++.
Addison-Wesley. 1999. ISBN 0-201-35755-0.</p>
</div>
<div class="bibliomixed"><a name="Koenig1998" id="Koenig1998"></a>
<p class="bibliomixed">[Koenig1998] Andrew Koenig and Barbara Moo:
Teaching Standard C++. (parts 1, 2, 3, and 4) Journal of
Object-Oriented Programming, Vol 11 (8, 9) 1998 and Vol 12 (1, 2)
1999.</p>
</div>
<div class="bibliomixed"><a name="Stroustrup1997" id=
"Stroustrup1997"></a>
<p class="bibliomixed">[Stroustrup1997] Bjarne Stroustrup: The C++
Programming Language (Third Edition). Addison-Wesley. 1997. ISBN
0-201-88954-4.</p>
</div>
</div>
<div class="footnotes"><br>
<hr class="c3" width="100">
<div class="footnote">
<p><sup>[<a name="ftn.d0e56" href="#d0e56" id=
"ftn.d0e56">1</a>]</sup> For aesthetic reasons, I use C++ style
symbolic constants and C++ style //-comments. To get strictly
conforming ISO C programs, use #define and /* */ comments.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e80" href="#d0e80" id=
"ftn.d0e80">2</a>]</sup> I know that C allows this to be written
without explicit casts. However, that is done at the cost of
allowing unsafe implicit conversion of a void* to an arbitrary
pointer type. Consequently, C++ requires that cast.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e97" href="#d0e97" id=
"ftn.d0e97">3</a>]</sup> This line cannot appear here in strictly
conforming 1989 C; the new C standard allows mixing declarations
with other code. Yet one more place where C makes the task of
teaching harder than it need be. Francis</p>
</div>
</div>
</p>
</div>
</channel>
</rss>
