    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Efficient Exceptions?</title>
        <link>https://members.accu.org/index.php/journals/225</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">Overload Journal #61 - Jun 2004 + Programming Topics</span></div>

<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Efficient Exceptions?</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Tue, 01 June 2004 16:51:14 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<p>I recently read a comment on a code review that produced a fair
amount of discussion about exceptions. Slightly simplified, the C#
code being reviewed was:</p>
<pre class="programlisting">
public bool isNumeric(string input) {
  bool ret = true;
  bool decimalFound = false;

  if(input == null
        || input.Length &lt; 1) {
    ret = false;
  }
  else {
    for(int i = 0
        ; i &lt; input.Length
        ; i ++) {
      if(!Char.IsNumber(input[i]))
        if((input[i] == '.')
              &amp;&amp; !decimalFound) {
          decimalFound = true;
        }
        else {
          ret = false;
        }
    }
  }
  return ret;
}
</pre>
<p>The review comment that started the discussion was quite
short:</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>Your isNumeric function might be more efficient as:</p>
<pre class="programlisting">
public bool isNumeric(string input) {
  try {
    Double.Parse(input);
    return true;
  }
  catch(FormatException /*ex*/) {
    return false;
  }
}
</pre></blockquote>
</div>
<p>I thought some of the discussion inspired by this comment might
be of more general interest. There are several important issues
that are relevant here, and later in the article I will return to
some of them. However the first thing that struck me was the use of
the word &quot;efficient&quot; and it is this word that the bulk of the
article addresses.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>What Does it
Mean to be Efficient?</h2>
</div>
<p>In the coding context efficiency is usually concerned with size
and/or speed.</p>
<p>The second piece of code is more efficient in terms of source
code size. And it is probably slightly more efficient in terms of
image code size; but it is almost certainly not more efficient in
terms of runtime memory use, particularly on failure, since
exceptions in C# will allocate memory.</p>
<p>So I started wondering about runtime efficiency - which for
simplicity I will from here on refer to as 'performance'. Would the
proposed replacement function be any faster than the original
function?</p>
<div class="sidebar">
<p class="title c2">The Dangers of Performance Figures</p>
<p>Note: performance figures are very dangerous! They depend on all
sorts of factors, such as the language being used, the compiler
settings, the operating system being used and the hardware that the
program runs on.</p>
<p>Although I've done my best to produce repeatable performance
figures for this article please do not take any of the figures as
being more than indicative of the overall performance of the
languages mentioned. A small change to the programs being tested
could produce variations in the figures produced.</p>
<p>For those who care I was using Windows 2000 SP4 on a 733 MHz
single CPU machine with 768 MB of RAM. (Yes, maybe it's time I
bought a newer machine!)</p>
<p>I was using:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>C# with csc from Visual Studio .NET (version 7.00.9466), both
with and without optimising</p>
</li>
<li>
<p>Java with JDK 1.3.0 and JDK 1.4.2</p>
</li>
<li>
<p>C++ with Microsoft VC++ 6 (version 12.00.8804) both with (/Ox)
and without (/Od) optimising. In both cases I was also using /GX
/GR.</p>
</li>
<li>
<p>C++ with gcc 2.95.3-4 under Cygwin (with and without -O)</p>
</li>
</ul>
</div>
<p>(I also repeated a couple of the C++ tests with the Microsoft
.NET and .NET 2003 C++ compilers but the results did not change
enormously.)</p>
<p>It is important to note that I was <span class=
"emphasis"><em>not</em></span> principally looking to optimise the
hand written code - I was interested in the effect on performance
of using exceptions. For this reason I deliberately kept the
implementations of each function similar to ones in the other
languages - hence, for example, the use of the member function
<tt class="methodname">at()</tt> in the C++ code rather than the
more idiomatic <tt class="literal">[]</tt> notation. In fact, after
being challenged about this, I tested both methods and to my shock
found that <tt class="methodname">at()</tt> was actually faster
than <tt class="methodname">operator[]</tt> with MSVC6. If you find
this unbelievable it only goes to show how unexpected performance
measurements can be, and how dependent they are on the
optimiser!</p>
<p>I also made the <tt class="methodname">IsNumeric</tt> method an
instance method of a class in all languages for consistency and
ease of testing. Changing this would have equally affected the
performance of both the exception and the non-exception code so I
left it as it was.</p>
</div>
<p>It is often said that it is better to use library functions than
to write your own code; apart from any other considerations library
functions are often optimised by experts using a wide variety of
techniques. However, in this case using the library function adds
exception handling into the equation - would the advice still
stand?</p>
<p>I thought I'd try to get some actual performance figures.</p>
<p>I wrote a simple test harness that called the first and the
second functions 1,000,000 times.</p>
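<p>A harness of this kind might look like the following minimal Java
sketch. It is not the code used to produce the figures in this article:
the names <tt class="classname">TimingHarness</tt>,
<tt class="classname">Checker</tt> and <tt class="methodname">time</tt>
are illustrative assumptions, and it assumes a current JDK (Java 8 or
later) rather than the JDK versions measured here.</p>

```java
// Hypothetical timing harness; all names here are illustrative.
final class TimingHarness {
  /** Stand-in for either version of isNumeric. */
  interface Checker {
    boolean isNumeric(String input);
  }

  /** The exception-based check, in the style of function #2. */
  static boolean viaException(String input) {
    try {
      Double.parseDouble(input);
      return true;
    } catch (NumberFormatException ex) {
      return false;
    }
  }

  /** Calls the checker 'calls' times and returns the elapsed time in seconds. */
  static double time(Checker checker, String input, int calls) {
    int accepted = 0; // consume each result so the loop does real work
    long start = System.nanoTime();
    for (int i = 0; i < calls; i++) {
      if (checker.isNumeric(input)) {
        accepted++;
      }
    }
    long elapsedNanos = System.nanoTime() - start;
    // The same input must give the same answer on every call.
    if (accepted != 0 && accepted != calls) {
      throw new IllegalStateException("checker gave inconsistent answers");
    }
    return elapsedNanos / 1e9;
  }

  public static void main(String[] args) {
    String[] inputs = { "1", "12345", "X" };
    for (String input : inputs) {
      double seconds = time(TimingHarness::viaException, input, 1000000);
      System.out.println(input + ": " + seconds + " s");
    }
  }
}
```

<p>A single timed loop like this is only indicative: on a JIT-compiled
runtime the first calls include compilation time, so a more careful
measurement would discard some warm-up runs.</p>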
<p>(execution times in seconds)</p>
<div class="table"><a name="d0e98" id="d0e98"></a>
<p class="title c2">Table 1. Unoptimised C#</p>
<table summary="Unoptimised C#" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.19</td>
<td>0.92</td>
</tr>
<tr>
<td>12345</td>
<td>0.56</td>
<td>1.13</td>
</tr>
<tr>
<td>10 digits</td>
<td>1.01</td>
<td>1.38</td>
</tr>
<tr>
<td>20 digits</td>
<td>1.91</td>
<td>1.79</td>
</tr>
</table>
</div>
<div class="table"><a name="d0e139" id="d0e139"></a>
<p class="title c2">Table 2. Optimised C#</p>
<table summary="Optimised C#" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.10</td>
<td>0.89</td>
</tr>
<tr>
<td>12345</td>
<td>0.33</td>
<td>1.01</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.65</td>
<td>1.36</td>
</tr>
<tr>
<td>20 digits</td>
<td>1.28</td>
<td>1.76</td>
</tr>
<tr>
<td>30 digits</td>
<td>2.07</td>
<td>1.92</td>
</tr>
<tr>
<td>40 digits</td>
<td>2.55</td>
<td>2.38</td>
</tr>
</table>
</div>
<p>So the first function is considerably faster for relatively short
strings, but its cost grows with the length of the input until it is
eventually slower than the second function. The optimised build shows
the same pattern, although the 'break even' point occurs at a slightly
greater number of digits.</p>
<p>The main question I was investigating though is what happens
when a non-numeric value is supplied and an exception is
thrown.</p>
<div class="table"><a name="d0e198" id="d0e198"></a>
<p class="title c2">Table 3. Exception Results</p>
<table summary="Exception Results" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>X</td>
<td>0.20</td>
<td>147.60 (unoptimised)</td>
</tr>
<tr>
<td>X</td>
<td>0.11</td>
<td>143.24 (optimised)</td>
</tr>
</table>
</div>
<p>Yes, that's right - the decimal point is in the right place for
function #2! The code path through the exception throwing route
took almost 3 orders of magnitude longer than the raw code.</p>
<p>This is why, for this article, I'm just not interested in minor
optimisations of the source code, since the impact of exceptions
dwarfs them.</p>
<p>This was very intriguing - I wondered whether it was only a C#
issue or it was also an issue with Java and C++.</p>
<p>Here is an approximately equivalent pair of functions in
Java:</p>
<pre class="programlisting">
public boolean isNumeric(String input) {
  boolean ret = true;
  boolean decimalFound = false;
  if(input == null
        || input.length() &lt; 1) {
    ret = false;
  }
  else {
    for(int i = 0
        ; i &lt; input.length()
        ; i ++) {
      if(!Character.isDigit(
            input.charAt(i)))
        if((input.charAt(i) == '.')
              &amp;&amp; !decimalFound) {
          decimalFound = true;
        }
        else {
          ret = false;
        }
    }
  }
  return ret;
}
</pre>
<p>and:</p>
<pre class="programlisting">
public boolean isNumeric(String input) {
  try {
    Double.parseDouble(input);
    return true;
  }
  catch(NumberFormatException ex) {
    return false;
  }
}
</pre>
<p>Surely code that looks so similar must behave the same way :-)
?</p>
<p>Here are the results for the non-exception case:</p>
<div class="table"><a name="d0e243" id="d0e243"></a>
<p class="title c2">Table 4. jdk 1.3.0</p>
<table summary="jdk 1.3.0" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.13</td>
<td>0.81</td>
</tr>
<tr>
<td>12345</td>
<td>0.42</td>
<td>1.15</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.76</td>
<td>1.68</td>
</tr>
<tr>
<td>20 digits</td>
<td>1.48</td>
<td>23.16</td>
</tr>
</table>
</div>
<div class="table"><a name="d0e284" id="d0e284"></a>
<p class="title c2">Table 5. jdk 1.4.2</p>
<table summary="jdk 1.4.2" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.10</td>
<td>0.76</td>
</tr>
<tr>
<td>12345</td>
<td>0.29</td>
<td>1.12</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.51</td>
<td>1.63</td>
</tr>
<tr>
<td>20 digits</td>
<td>0.94</td>
<td>28.08</td>
</tr>
</table>
</div>
<p>The results here are comparable to the optimised C# results -
apart from the last line. What happens here? The <tt class=
"methodname">parseDouble</tt> method is much slower once you exceed
15 digits - this is to do (at least with the versions of Java I'm
using) with optimisations inside <tt class=
"methodname">Double.parseDouble</tt> when the number is small
enough to be represented as an integer value. Whether this matters
in practice of course depends on the range of input values the
program actually passes to the <tt class=
"methodname">isNumeric</tt> function.</p>
<p>The exception results look like this:</p>
<div class="table"><a name="d0e338" id="d0e338"></a>
<p class="title c2">Table 6. Exception Results</p>
<table summary="Exception Results" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>X</td>
<td>0.16</td>
<td>15.33 (jdk 1.3.0)</td>
</tr>
<tr>
<td>X</td>
<td>0.12</td>
<td>18.15 (jdk 1.4.2)</td>
</tr>
</table>
</div>
<p>Well, this is not quite as awful as the C# case - the
performance of the second function is 'only' two orders of
magnitude worse than the first function when the exception is
thrown.</p>
<p>For completeness, how about a C++ solution?</p>
<p>The roughly equivalent functions I came up with were:</p>
<pre class="programlisting">
bool IsNumeric1::isNumeric(std::string
                   const &amp; input) const {
  bool ret = true;
  bool decimalFound = false;
  if(input.length() &lt; 1) {
    ret = false;
  }
  else {
    for(int i = 0
        ; i &lt; input.length()
        ; i ++) {
      if(!isdigit(input.at(i)))
        if((input.at(i) == '.')
              &amp;&amp; !decimalFound) {
          decimalFound = true;
        }
        else {
          ret = false;
        }
    }
  }
  return ret;
}
</pre>
<p>and:</p>
<pre class="programlisting">
bool IsNumeric2::isNumeric(std::string
                   const &amp; input) const {
  try {
    convert&lt;double&gt;(input);
    return true;
  }
  catch(std::invalid_argument const &amp; ex) {
    return false;
  }
}
</pre>
<p>where convert was derived from code in <tt class=
"classname">boost::lexical_cast</tt> from <a href=
"http://www.boost.org" target="_top">www.boost.org</a> (in the
absence of a standard C++ library function with similar syntax and
semantics to the C# and Java parse functions) and looks like
this:</p>
<pre class="programlisting">
template&lt;typename Target&gt;
Target convert(std::string const &amp; arg) {
  std::stringstream interpreter;
  Target result;
  if(!(interpreter &lt;&lt; arg)
        || !(interpreter &gt;&gt; result)
        || !(interpreter &gt;&gt; std::ws).eof())
    throw std::invalid_argument( arg );
  return result;
}
</pre>
<p>I decided using a reference in C++ kept the source code looking
more equivalent although a smart pointer could have been used
instead as its behaviour is more like that of a reference in the
other two languages.</p>
<p>How did C++ fare in the comparison?</p>
<div class="table"><a name="d0e391" id="d0e391"></a>
<p class="title c2">Table 7. MSVC optimised</p>
<table summary="MSVC optimised" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.07</td>
<td>5.73</td>
</tr>
<tr>
<td>12345</td>
<td>0.21</td>
<td>7.65</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.40</td>
<td>11.25</td>
</tr>
<tr>
<td>20 digits</td>
<td>0.74</td>
<td>17.12</td>
</tr>
</table>
</div>
<p>Our initial choice for the convert function is very slow -
perhaps it is a bad choice. The cost of using <tt class=
"classname">stringstream</tt> objects seems to be very high,
although that might be a problem with my compilers'
implementations. This is not really an entirely fair comparison
either - the convert template function is generic whereas the C#
and Java code is type-specific. So let me replace the generic
<tt class="function">convert</tt> function with:</p>
<pre class="programlisting">
double convert(std::string const &amp; arg) {
  const char *p = arg.c_str();
  char *pend = 0;
  double result = strtod(p, &amp;pend);
  // pend == p means nothing was converted (for example an empty string)
  if(pend == p || *pend != '\0')
    throw std::invalid_argument(arg);
  return result;
}
</pre>
<p>This produces the following improved performance figures:</p>
<div class="table"><a name="d0e444" id="d0e444"></a>
<p class="title c2">Table 8. MSVC unoptimised</p>
<table summary="MSVC unoptimised" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.14</td>
<td>1.82</td>
</tr>
<tr>
<td>12345</td>
<td>0.46</td>
<td>2.71</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.83</td>
<td>5.1</td>
</tr>
<tr>
<td>20 digits</td>
<td>1.57</td>
<td>8.62</td>
</tr>
</table>
</div>
<div class="table"><a name="d0e485" id="d0e485"></a>
<p class="title c2">Table 9. MSVC optimised</p>
<table summary="MSVC optimised" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.07</td>
<td>1.80</td>
</tr>
<tr>
<td>12345</td>
<td>0.21</td>
<td>2.71</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.40</td>
<td>4.94</td>
</tr>
<tr>
<td>20 digits</td>
<td>0.74</td>
<td>8.39</td>
</tr>
</table>
</div>
<p>And finally I recompiled the C++ code with gcc under Cygwin.</p>
<div class="table"><a name="d0e528" id="d0e528"></a>
<p class="title c2">Table 10. gcc unoptimised</p>
<table summary="gcc unoptimised" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.29</td>
<td>0.63</td>
</tr>
<tr>
<td>12345</td>
<td>0.98</td>
<td>0.70</td>
</tr>
<tr>
<td>10 digits</td>
<td>1.79</td>
<td>0.85</td>
</tr>
<tr>
<td>20 digits</td>
<td>3.44</td>
<td>3.87</td>
</tr>
</table>
</div>
<div class="table"><a name="d0e569" id="d0e569"></a>
<p class="title c2">Table 11. gcc optimised</p>
<table summary="gcc optimised" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>1</td>
<td>0.05</td>
<td>0.33</td>
</tr>
<tr>
<td>12345</td>
<td>0.11</td>
<td>0.40</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.17</td>
<td>0.55</td>
</tr>
<tr>
<td>20 digits</td>
<td>0.27</td>
<td>3.55</td>
</tr>
</table>
</div>
<p>However, what about the exception throwing case (which is after
all the motivating example) ?</p>
<div class="table"><a name="d0e612" id="d0e612"></a>
<p class="title c2">Table 12. Exception Results</p>
<table summary="Exception Results" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>X</td>
<td>0.17</td>
<td>11.69 (MSVC unoptimised)</td>
</tr>
<tr>
<td>X</td>
<td>0.08</td>
<td>11.03 (MSVC optimised)</td>
</tr>
<tr>
<td>X</td>
<td>0.40</td>
<td>4.15 (gcc unoptimised)</td>
</tr>
<tr>
<td>X</td>
<td>0.06</td>
<td>3.17 (gcc optimised)</td>
</tr>
</table>
</div>
<p>Even discounting the cost of solution #2 there is one to two
orders of magnitude difference between the return code and
exception throwing cases, but with some significant differences
between the two compilers.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>What is the
Cost of an Exception?</h2>
</div>
<p>Exceptions tend to be expensive for a number of reasons,
described below:</p>
<div class="variablelist">
<dl>
<dt><span class="term">The exception object itself must be
created.</span></dt>
<dd>
<p>This is not usually very expensive in C++, although it does
obviously depend on the exact class being used. In Java and in C#
the exception object contains a call stack, and the runtime
environment has to create this before the exception is thrown. This
may be quite expensive, particularly if the function call stack is
deep.</p>
</dd>
<dt><span class="term">The act of throwing the exception can be
expensive.</span></dt>
<dd>
<p>For example, when using Microsoft compilers under Windows,
throwing a C++ exception involves calling the OS kernel to raise an
operating system exception, which includes capturing the state of
the thread for passing to the exception handler. This approach is
by no means universal - gcc under Cygwin does not use native
operating system exceptions for its C++ exceptions, which seems to
have as a consequence that the execution time cost of an exception
is lower.</p>
<p>Then, in C++, a copy of the supplied object is thrown, which can
impose some overhead for non-trivial exception objects.</p>
</dd>
<dt><span class="term">There is the cost of catching the
exception.</span></dt>
<dd>
<p>This in general involves unwinding the stack and finding
suitable catch handlers, using run time type identification to
match the types of the thrown object to each potential catch
handler. For example, if I throw a <tt class=
"classname">std::invalid_argument</tt> object in C++ this might be
caught by:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>catch(std::invalid_argument const &amp;)</p>
</li>
<li>
<p>catch(std::exception)</p>
</li>
<li>
<p>catch(...)</p>
</li>
</ul>
</div>
<p>with different behaviour in each case. The cost of this rises
with both the depth of the exception class hierarchy and the number
of catch statements that there are between the throw and the
successful catch.</p>
<p>Note that some experts in compiler and library implementation
claim that high performance exception handling is theoretically
possible; however in practice it seems that many of the popular
compilers out there do have less than optimal performance in this
area.</p>
</dd>
</dl>
</div>
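<p>The matching of a thrown object against successive catch handlers
can be illustrated with a Java counterpart to the C++ list above. This
is only a sketch: the class <tt class="classname">HandlerMatching</tt>
and the method <tt class="methodname">handlerFor</tt> are hypothetical
names. The handlers are tried in source order and the first one whose
type matches the runtime class of the thrown object is chosen.</p>

```java
// Illustrative only: reports which handler is selected for a thrown object.
final class HandlerMatching {
  static String handlerFor(RuntimeException thrown) {
    try {
      throw thrown;
    } catch (NumberFormatException ex) {     // most derived type first
      return "NumberFormatException";
    } catch (IllegalArgumentException ex) {  // its direct superclass
      return "IllegalArgumentException";
    } catch (RuntimeException ex) {          // any other unchecked exception
      return "RuntimeException";
    }
  }
}
```

<p>Here a <tt class="classname">NumberFormatException</tt> is caught by
the first handler, a plain
<tt class="classname">IllegalArgumentException</tt> by the second, and
an unrelated <tt class="classname">IllegalStateException</tt> only by
the third, after the first two have been rejected.</p>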
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Should I Care
How Slow Exceptions Are?</h2>
</div>
<p>Let's take stock of where we have reached. I've investigated the
'efficiency' claims for the proposed replacement code and found
that it is almost always slower for numeric input and very
significantly slower for non-numeric input.</p>
<p>On examining the two functions you can quickly see that they do
not produce the same answers for all inputs; this is probably much
more significant than which function runs faster since in most
applications 'right' is better than 'fast but wrong'.</p>
<p>Consider the results the two C# implementations give for the
following inputs:</p>
<div class="table"><a name="d0e707" id="d0e707"></a>
<p class="title c2">Table 13. Optimised C#</p>
<table summary="Optimised C#" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
</tr>
<tr>
<td>&quot;+1&quot;</td>
<td>False</td>
<td>True</td>
</tr>
<tr>
<td>&quot;-1&quot;</td>
<td>False</td>
<td>True</td>
</tr>
<tr>
<td>&quot;.&quot;</td>
<td>True</td>
<td>False</td>
</tr>
<tr>
<td>&quot; 1&quot;</td>
<td>False</td>
<td>True</td>
</tr>
<tr>
<td>&quot;1 &quot;</td>
<td>False</td>
<td>True</td>
</tr>
<tr>
<td>&quot;1e3&quot;</td>
<td>False</td>
<td>True</td>
</tr>
<tr>
<td>&quot;1,000&quot;</td>
<td>False</td>
<td>True</td>
</tr>
<tr>
<td>&quot;Infinity&quot;</td>
<td>False</td>
<td>True</td>
</tr>
<tr>
<td>null</td>
<td>False</td>
<td>Exception</td>
</tr>
</table>
</div>
<p>The library function understands a much broader range of numeric
input than the hand-crafted code does. And that's leaving aside all
discussion about locales (should ',' be a decimal point or a
thousands separator?), which the library function takes in its
stride. This probably provides an explanation of why our own
conversion function is faster than the library call - it isn't a
complete solution!</p>
<p>The problem with the initial code review was the word
'efficient'; I would like to make use of the library call to take
advantage of its rich functionality despite the loss of efficiency.
However I'd like to reduce the expense of the exception - is this
possible?</p>
<p>The exception is being thrown when the input is not numeric so
its cost only matters in this case. Ideally I'd like to find out
how many times the function returns false in typical use;
unfortunately a simple profiler will only tell me how many times
the function is called and not differentiate on return code. I
either need to use a better profiler or to add some instrumentation
to my program.</p>
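<p>The instrumentation need not be elaborate. As a minimal sketch in
Java, a wrapper can count the two outcomes separately; the class
<tt class="classname">CountingIsNumeric</tt> and its accessor names are
assumptions made for illustration, not part of the reviewed code.</p>

```java
// Hypothetical instrumentation wrapper; names are illustrative.
final class CountingIsNumeric {
  private final java.util.concurrent.atomic.AtomicLong successes =
      new java.util.concurrent.atomic.AtomicLong();
  private final java.util.concurrent.atomic.AtomicLong failures =
      new java.util.concurrent.atomic.AtomicLong();

  /** Same check as function #2, but records the outcome of every call. */
  boolean isNumeric(String input) {
    boolean result;
    try {
      Double.parseDouble(input);
      result = true;
    } catch (NumberFormatException ex) {
      result = false;
    }
    (result ? successes : failures).incrementAndGet();
    return result;
  }

  long successCount() { return successes.get(); }
  long failureCount() { return failures.get(); }
}
```

<p>Logging the two counters at the end of a representative run gives
the proportion of calls that take the expensive exception path.</p>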
<p>In the best case I might find that the function usually succeeds
and then I probably don't mind taking a performance hit on the rare
failures. However I might find that the function is called a lot
and is roughly evenly divided between success and failure - in this
case I will want to reduce the cost.</p>
<p>As it happens, it is fairly easy to do this in the C# case.
Closer investigation of the Double class reveals a TryParse method
that has exactly the behaviour we require in IsNumeric. It needs a
couple of additional arguments but the resultant code is clear:</p>
<pre class="programlisting">
using System.Globalization;
...
public bool isNumeric(string input) {
  double ignored;
  return Double.TryParse(input,
        NumberStyles.Float |
              NumberStyles.AllowThousands,
        NumberFormatInfo.CurrentInfo,
        out ignored);
}
</pre>
<p>The results of running this function are:</p>
<div class="table"><a name="d0e797" id="d0e797"></a>
<p class="title c2">Table 14. Optimised C#</p>
<table summary="Optimised C#" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
<th>Function #3</th>
</tr>
<tr>
<td>1</td>
<td>0.10</td>
<td>0.89</td>
<td>1.08</td>
</tr>
<tr>
<td>12345</td>
<td>0.33</td>
<td>1.01</td>
<td>1.29</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.65</td>
<td>1.36</td>
<td>1.55</td>
</tr>
<tr>
<td>20 digits</td>
<td>1.28</td>
<td>1.76</td>
<td>1.95</td>
</tr>
<tr>
<td>X</td>
<td>0.11</td>
<td>143.24</td>
<td>1.90</td>
</tr>
</table>
</div>
<p>Unfortunately <tt class="methodname">Double.Parse(string)</tt>
is slightly faster than <tt class="methodname">TryParse</tt> for
the 'good' case but this is outweighed by the drastic improvement
in speed on 'bad' inputs. In the absence of specific measurements
of performance I would prefer this solution.</p>
<p>The Java case is more difficult - there is no direct equivalent
to <tt class="methodname">TryParse</tt>. I tried the following:</p>
<pre class="programlisting">
public boolean isNumeric(String input) {
  java.text.NumberFormat numberFormat
        = java.text.NumberFormat.getInstance();
  java.text.ParsePosition parsePosition
        = new java.text.ParsePosition(0);
  Number value = numberFormat.parse(
        input, parsePosition);
  return ((value != null)
        &amp;&amp; (parsePosition.getIndex()
        == input.length()));
}
</pre>
<p>However the performance is 'disappointing'. The new method is
indeed faster for non-numeric input, where function #2 throws an
exception - but an order of magnitude slower when the input is in
fact numeric. The biggest cost is creating the numberFormat object -
caching it makes the method a lot faster, but additional coding work
would be needed to make the cached object thread-safe (see the JDK 1.4
documentation for <tt class=
"methodname">NumberFormat</tt>).</p>
<div class="table"><a name="d0e877" id="d0e877"></a>
<p class="title c2">Table 15. jdk 1.3.0</p>
<table summary="jdk 1.3.0" border="1" cellspacing="0">
<tr>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
<th>Function #3</th>
<th>Funct'n #3 + caching</th>
</tr>
<tr>
<td>1</td>
<td>0.13</td>
<td>0.81</td>
<td>14.54</td>
<td>2.39</td>
</tr>
<tr>
<td>12345</td>
<td>0.42</td>
<td>1.15</td>
<td>16.00</td>
<td>3.77</td>
</tr>
<tr>
<td>10 digits</td>
<td>0.76</td>
<td>1.68</td>
<td>18.11</td>
<td>5.88</td>
</tr>
<tr>
<td>20 digits</td>
<td>1.48</td>
<td>23.16</td>
<td>51.19</td>
<td>34.70</td>
</tr>
<tr>
<td>X</td>
<td>0.16</td>
<td>15.33</td>
<td>11.79</td>
<td>0.85</td>
</tr>
</table>
</div>
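<p>One way the cached <tt class="classname">NumberFormat</tt> could be
made thread-safe is to keep one instance per thread. The following is a
sketch only, not the code that was measured for the caching column: the
class name <tt class="classname">ThreadLocalNumberFormat</tt> is
hypothetical, and <tt class="methodname">ThreadLocal.withInitial</tt>
assumes Java 8 or later rather than the JDK versions tested here.</p>

```java
// Hypothetical per-thread cache of the NumberFormat used for parsing.
final class ThreadLocalNumberFormat {
  // NumberFormat instances are not safe for concurrent use,
  // so each thread lazily creates and then reuses its own.
  private static final ThreadLocal<java.text.NumberFormat> FORMAT =
      ThreadLocal.withInitial(java.text.NumberFormat::getInstance);

  static boolean isNumeric(String input) {
    if (input == null) {
      return false;
    }
    java.text.ParsePosition position = new java.text.ParsePosition(0);
    Number value = FORMAT.get().parse(input, position);
    // Accept only if something was parsed and the whole input was consumed.
    return value != null && position.getIndex() == input.length();
  }
}
```

<p>The cost of creating the format is then paid once per thread instead
of once per call, at the price of one retained object for each thread
that ever calls the method.</p>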
<p>The decision is much harder here - can I do anything else? One
option is to check for common failure cases before passing the
string into <tt class="methodname">Double.parseDouble</tt>. This means
measuring or guessing what the 'common failures' are - an example
of such a guess would be to check if the first character is
alphabetic.</p>
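<p>Such a pre-check might be sketched as follows; the class name
<tt class="classname">PrecheckedIsNumeric</tt> is hypothetical and the
code has not been timed. One wrinkle is that
<tt class="methodname">Double.parseDouble</tt> accepts the strings
"NaN" and "Infinity", so a leading 'N' or 'I' still has to go through
the parser if the results are to stay the same.</p>

```java
// Hypothetical cheap rejection of a 'common failure' before parsing.
final class PrecheckedIsNumeric {
  static boolean isNumeric(String input) {
    if (input == null || input.isEmpty()) {
      return false;
    }
    char first = input.charAt(0);
    // A leading letter can only begin "NaN" or "Infinity";
    // any other leading letter is rejected without an exception.
    if (Character.isLetter(first) && first != 'N' && first != 'I') {
      return false;
    }
    try {
      Double.parseDouble(input);
      return true;
    } catch (NumberFormatException ex) {
      return false; // the slow path remains for less common failures
    }
  }
}
```

<p>Whether this helps depends entirely on how often the guessed failure
actually occurs in the program's real input.</p>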
<p>Moving on, the C++ case is easier - I can simply return failure
from <tt class="function">strtod</tt> by using a return code rather
than throwing an exception.</p>
<pre class="programlisting">
bool try_convert(std::string const &amp; arg) {
  const char *p = arg.c_str();
  char *pend = 0;
  (void)strtod(p, &amp;pend);
  // pend == p means nothing was converted (for example an empty string)
  return (pend != p) &amp;&amp; (*pend == '\0');
}
</pre>
<div class="table"><a name="d0e961" id="d0e961"></a>
<p class="title c2">Table 16. C++ (MSVC)</p>
<table summary="" border="1" cellspacing="0">
<colgroup>
<col />
<col />
<col />
<col />
<col /></colgroup>
<thead>
<tr>
<th>Build</th>
<th>Argument</th>
<th>Function #1</th>
<th>Function #2</th>
<th>Function #3</th>
</tr>
</thead>
<tbody>
<tr>
<td>MSVC unoptimised</td>
<td>X</td>
<td>0.17</td>
<td>11.69</td>
<td>0.31</td>
</tr>
<tr>
<td>MSVC optimised</td>
<td>X</td>
<td>0.08</td>
<td>11.03</td>
<td>0.26</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Anything
Else?</h2>
</div>
<p>There are a couple of other points worth noting about using
exceptions.</p>
<p>It can be hard to correctly identify which exceptions should be
caught, and mismatches can cause other problems.</p>
<p>Firstly, catching too much. Catching too broad a category of
exceptions (for example &quot;<tt class="literal">catch
(Exception)</tt>&quot;, or &quot;<tt class="literal">catch (...)</tt>&quot; in
C++) can mean that error cases other than the one you are expecting
are caught and so never reach the higher-level handler where they
could be dealt with correctly. This can be even more of an issue in
some C++ environments, such as MSVC, where non-C++ exceptions are
also swallowed by <tt class="literal">catch (...)</tt>.</p>
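<p>A minimal C++ sketch of the narrower alternative follows. The
functions <tt class="function">parse_port</tt> and
<tt class="function">port_or_default</tt> are hypothetical, and the
<tt class="literal">"boom"</tt> input is just a device to simulate an
unrelated failure. The handler names only the exception type it knows
how to deal with, so anything else still reaches the caller.</p>

```cpp
#include <stdexcept>
#include <string>

// Hypothetical parser: throws std::invalid_argument for bad input, but may
// also fail for unrelated reasons (modelled here by std::runtime_error).
int parse_port(const std::string& text) {
    if (text == "boom") {
        throw std::runtime_error("unrelated failure");
    }
    std::size_t used = 0;
    int value = std::stoi(text, &used);  // may throw invalid_argument or out_of_range
    if (used != text.size()) {
        throw std::invalid_argument("trailing characters");
    }
    return value;
}

// Catches only the failure this function knows how to handle; anything
// else keeps propagating to a higher-level handler.
int port_or_default(const std::string& text, int fallback) {
    try {
        return parse_port(text);
    } catch (const std::invalid_argument&) {
        return fallback;
    }
}
```

<p>Had the handler been written as <tt class="literal">catch
(...)</tt>, the simulated unrelated failure would have been silently
turned into the default value.</p>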
<p>Conversely, failing to make the exception net wide enough can
lead to exceptions leaking out of the function and causing a
failure higher up. This has happened to me when using JDBC in Java
where the exception types thrown for data conversion errors, such
as invalid date format, seem to vary depending on the driver being
used.</p>
<p>Debugging exceptions can be a problem. Many debuggers cannot
easily filter exceptions, so if your program throws many exceptions
it can make the debugging process slow or unwieldy, or swamp output
with spurious warnings.</p>
<p>In some environments you can stop when an exception is about to
be thrown, but it is very hard to follow the flow of control after
that point. The standard flow-of-control mechanisms are usually
easier to trace.</p>
<p>Finally, the code you write must be exception safe - when
exceptions occur you must make sure that the unwinding of the stack
up to the catch handler doesn't leave work undone. The main dangers
to avoid are leaving objects in inconsistent states and neglecting
to release resources. This includes, but is not restricted to,
dynamically allocated memory - don't fall for the popular
misconception that exception safety is only an issue for C++
programs (see, for example, [<a href=
"#Griffiths">Griffiths</a>]).</p>
</div>
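<p>On the exception safety point, one common C++ technique for making
sure resources are released during stack unwinding is to tie the
release to a destructor. The sketch below is illustrative only: the
<tt class="literal">open_handles</tt> counter and the
<tt class="literal">HandleGuard</tt> class are hypothetical stand-ins
for a real resource such as a file handle or a lock. It does not
address the other danger mentioned above, objects left in
inconsistent states.</p>

```cpp
#include <stdexcept>

// Hypothetical resource counter standing in for file handles, locks, etc.
int open_handles = 0;

// Guard object: acquires in the constructor, releases in the destructor,
// so the release also runs while the stack unwinds.
class HandleGuard {
public:
    HandleGuard() { ++open_handles; }
    ~HandleGuard() { --open_handles; }
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;
};

void do_work(bool fail) {
    HandleGuard guard;  // released on every exit path
    if (fail) {
        throw std::runtime_error("work failed");
    }
}

// Returns the number of handles still open after a failed call.
int leaked_after_failure() {
    try {
        do_work(true);
    } catch (const std::runtime_error&) {
        // the guard's destructor has already run by this point
    }
    return open_handles;
}
```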
<div class="sect1" lang="en">
<div class="titlepage">
<h2>When Are
Exceptions Exceptional?</h2>
</div>
<p>Let's go back to first principles - what are exceptions for?</p>
<p>The exception mechanism can be seen as a way to provide
separation of concerns for error handling. It is particularly
useful when the code detecting the error is distant from the code
handling the error; exceptions provide a structured way of passing
information about the error up the call stack to the error
handler.</p>
<p>Exceptions also make errors non-ignorable by default, since
uncaught exceptions terminate the process. More traditional
alternatives such as error return values are easily ignored, and
the flow of error information has to be coded explicitly, which
increases code complexity.</p>
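<p>To make the separation of detection and handling concrete, here is
a small C++ sketch. All function names and the placeholder value are
hypothetical. The lowest layer detects the error, the middle layers
contain no error-handling code at all, and only the top layer decides
what to do.</p>

```cpp
#include <stdexcept>
#include <string>

// Lowest layer: detects the error but has no idea how to handle it.
int read_setting(const std::string& name) {
    if (name.empty()) {
        throw std::invalid_argument("setting name is empty");
    }
    return 42;  // placeholder value for the sketch
}

// Middle layers: no error-handling code; the exception passes through.
int load_section(const std::string& name) { return read_setting(name) + 1; }
int load_config(const std::string& name) { return load_section(name) + 1; }

// Top layer: the only place that knows what to do about the failure.
std::string start(const std::string& name) {
    try {
        return "loaded " + std::to_string(load_config(name));
    } catch (const std::invalid_argument& e) {
        return std::string("error: ") + e.what();
    }
}
```

<p>With return codes, both middle functions would need an extra status
result and a check after each call.</p>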
<p>Exceptions can, in principle, be viewed as no more than a
mechanism to transfer control within a program. However, unless
care is taken, using exceptions as a flow of control mechanism can
produce obscure code. Stroustrup wrote: &quot;<span class=
"quote">Whenever reasonable, one should stick to the &quot;exception
handling is error handling&quot; view</span>&quot; [<a href=
"#Stroustrup">Stroustrup</a>].</p>
<p>If exceptions are being used for handling errors that need
nonlocal processing then the possible runtime overhead is unlikely
to be an issue. Typical uses of exceptions of this type, where
errors are relatively uncommon and the performance impact is
secondary, include:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>signalling errors which require aborting an entire unit of work,
for example an unexpected network disconnection</p>
</li>
<li>
<p>support for pre/post conditions and asserts</p>
</li>
</ul>
</div>
<p>Grey areas, where programmers disagree about the validity of
using exceptions because they are thrown for 'non-exceptional' or
'non-error' conditions, include:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>dealing with invalid user input</p>
</li>
<li>
<p>handling uncommon errors in a recursive algorithm - for example
a parse failure for a SQL statement or numeric overflow in a
calculation</p>
</li>
<li>
<p>handling end of file (or, more generally, handling any kind of
'end of data' condition)</p>
</li>
</ul>
</div>
<p>Examples of abuse include:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>using exceptions to handle missing optional fields</p>
</li>
<li>
<p>using exceptions to give early return for common special
cases</p>
</li>
</ul>
</div>
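<p>For the optional-field case, an alternative that avoids exceptions
altogether is to report absence through the return type. The sketch
below assumes C++17 for <tt class="literal">std::optional</tt>; the
<tt class="literal">Record</tt> alias, the helper names and the
"nickname" field are hypothetical.</p>

```cpp
#include <map>
#include <optional>
#include <string>

using Record = std::map<std::string, std::string>;

// Exception-based lookup would be record.at(key), which throws
// std::out_of_range when the key is absent.
// Alternative: report absence through the return type instead.
std::optional<std::string> find_field(const Record& record,
                                      const std::string& key) {
    auto it = record.find(key);
    if (it == record.end()) {
        return std::nullopt;  // a missing optional field is not an error
    }
    return it->second;
}

// Usage: supply a default when the field is absent.
std::string nickname_or(const Record& record, const std::string& fallback) {
    return find_field(record, "nickname").value_or(fallback);
}
```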
<p>My own rough guideline for the 'grey areas' is that if all
exceptions became fatal then most programs should still run at
least four days out of five.</p>
<p>Others have a more flexible approach and use exceptions more
widely than this, sometimes unaware of the consequences of this
decision.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e1073" id=
"d0e1073"></a>Conclusion</h2>
</div>
<p>It is important to recognise that using exceptions has an
associated cost in C# and to a slightly lesser extent in Java and
C++.</p>
<p>Using exceptions in the main execution path through the program
may have major performance implications. Their use in time-critical
software, in particular to deal with non-exceptional cases, should
be carefully justified and the impact on performance measured.</p>
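<p>A measurement along these lines need not be elaborate. The
following C++ sketch is not the harness used for the tables in this
article. It times the same failure reported by return value and by
exception, using <tt class="literal">std::chrono::steady_clock</tt>.
The function names and the <tt class="literal">Timing</tt> structure
are hypothetical. An optimising compiler may remove much of the
return-code loop, so treat the figures as indicative only.</p>

```cpp
#include <chrono>
#include <stdexcept>

// Two hypothetical ways of reporting the same failure.
bool check_with_code(int value) { return value >= 0; }

void check_with_throw(int value) {
    if (value < 0) {
        throw std::invalid_argument("negative value");
    }
}

struct Timing {
    long long code_ns;    // elapsed time for the return-code loop
    long long throw_ns;   // elapsed time for the exception loop
    int code_failures;
    int throw_failures;
};

// Times `iterations` failing calls through each mechanism.
Timing measure(int iterations) {
    using Clock = std::chrono::steady_clock;
    using std::chrono::duration_cast;
    using std::chrono::nanoseconds;
    Timing t{0, 0, 0, 0};

    auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        if (!check_with_code(-1)) ++t.code_failures;
    }
    t.code_ns = duration_cast<nanoseconds>(Clock::now() - start).count();

    start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        try {
            check_with_throw(-1);
        } catch (const std::invalid_argument&) {
            ++t.throw_failures;
        }
    }
    t.throw_ns = duration_cast<nanoseconds>(Clock::now() - start).count();
    return t;
}
```

<p>Comparing <tt class="literal">code_ns</tt> with
<tt class="literal">throw_ns</tt> for a realistic failure rate gives a
first estimate of whether the exception path matters in your
program.</p>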
<p>When this is an issue, alternative techniques that may be faster
include using return values instead of exceptions to indicate
'expected' error conditions, and checking for common failures
before risking the exception.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e1082" id=
"d0e1082"></a>Acknowledgements</h2>
</div>
<p>Thanks to Phil Bass and Alan Griffiths for reviewing earlier
drafts of this article.</p>
</div>
<div class="bibliography">
<div class="titlepage">
<h2><a name="d0e1087" id=
"d0e1087"></a>References</h2>
</div>
<div class="bibliomixed"><a name="Griffiths" id="Griffiths"></a>
<p class="bibliomixed">[Griffiths] Alan Griffiths, &quot;More
Exceptional Java&quot;, <span class="citetitle"><i class=
"citetitle">Overload</i></span>, June 2002</p>
</div>
<div class="bibliomixed"><a name="Stroustrup" id="Stroustrup"></a>
<p class="bibliomixed">[Stroustrup] Bjarne Stroustrup, <span class=
"citetitle"><i class="citetitle">The C++ Programming Language, 3rd
edition</i></span>, p. 375</p>
</div>
</p>
</div>
</channel>
</rss>
