    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Statically Checking Exception Specifications</title>
        <link>https://members.accu.org/index.php/journals/340</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">Overload Journal #57 - Oct 2003 + Programming Topics</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Statically Checking Exception Specifications</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Thu, 02 October 2003 22:56:15 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e16" id="d0e16"></a></h2>
</div>
<p>The C++ newsgroups occasionally have threads about &quot;fixing&quot;
exception specifications (hereafter &quot;ES&quot;) so that they can be
checked at build-time. One practical problem is maintenance; an ES
depends on all the callees of a function as well as the function
itself. A more fundamental problem is that the exceptions that can
be thrown from templated code can vary with the instantiation
parameters and these are clearly unknowable when the programmer
writes the template. Perhaps the programmer is the wrong person to
be writing the ES.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a>Outline of a
Solution</h2>
</div>
<p>As the language stands right now, a function with no ES can
throw anything. My basic change is to say that when a function has
no explicit ES, the programmer wants the build system to deduce
one.</p>
<p>Except where dynamic linking is used, the compiler can determine
the most restrictive ES that the programmer could have written for
the function, by noting which exceptions are thrown and caught and
which functions are called. Since source code isn't generally
available for the called functions, the calculation cannot be
completed, but it can be reduced to an ES-expression. For example,
for this function the compiler might emit <tt class=
"computeroutput">ES(Foo)=ES(Baz)-Quux+Oink</tt>.</p>
<pre class="programlisting">
void Foo() {
  try { if(Baz()==42) throw Oink(); }
  catch (Quux&amp; b) { /*stuff*/ }
}
</pre>
<p>The linker then considers each function in turn, replacing
expressions with an absolute ES wherever possible. If not every
expression is resolved on the first pass, it makes another pass and
so on until completed. When compiling templates, the compiler can
emit ES-expressions that depend on template parameters. When
instantiating those templates (perhaps in a pre-link phase) the
expressions can be converted to non-dependent expressions.</p>
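<p>The multi-pass resolution described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not a
description of any real linker: an ES-expression is modelled as a list of
terms applied left to right, exception types are plain strings, catching by
base class is ignored, and the names <tt class="literal">Term</tt>,
<tt class="literal">resolve</tt>, <tt class="literal">Grunt</tt> and
<tt class="literal">Caller</tt>, as well as the contents of ES(Baz), are
hypothetical.</p>

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One term of an ES-expression; terms are applied left to right.
struct Term {
    enum Kind { AddType, RemoveType, AddEsOf } kind;
    std::string name;  // an exception type, or the function whose ES is referenced
};
using Expr = std::vector<Term>;
using TypeSet = std::set<std::string>;

// Sweep the unresolved expressions repeatedly, resolving each one whose
// referenced functions already have an absolute ES. The loop stops when a
// whole pass makes no progress; expressions left over (for example those
// involved in recursion) are simply absent from the result.
std::map<std::string, TypeSet> resolve(const std::map<std::string, Expr>& exprs) {
    std::map<std::string, TypeSet> done;
    bool progress = true;
    while (progress) {
        progress = false;
        for (const auto& entry : exprs) {
            if (done.count(entry.first)) continue;
            TypeSet es;
            bool ready = true;
            for (const Term& t : entry.second) {
                if (t.kind == Term::AddType) {
                    es.insert(t.name);
                } else if (t.kind == Term::RemoveType) {
                    es.erase(t.name);
                } else {
                    auto it = done.find(t.name);
                    if (it == done.end()) { ready = false; break; }
                    es.insert(it->second.begin(), it->second.end());
                }
            }
            if (ready) {
                done.emplace(entry.first, std::move(es));
                progress = true;
            }
        }
    }
    return done;
}

// ES(Foo) = ES(Baz) - Quux + Oink, with an assumed ES(Baz) = Quux + Grunt
// and a hypothetical Caller whose ES is just ES(Foo).
std::map<std::string, TypeSet> exampleFoo() {
    std::map<std::string, Expr> exprs;
    exprs["Baz"] = Expr{{Term::AddType, "Quux"}, {Term::AddType, "Grunt"}};
    exprs["Foo"] = Expr{{Term::AddEsOf, "Baz"}, {Term::RemoveType, "Quux"},
                        {Term::AddType, "Oink"}};
    exprs["Caller"] = Expr{{Term::AddEsOf, "Foo"}};
    return resolve(exprs);
}
```

<p>In this sketch <tt class="literal">Caller</tt> is visited before
<tt class="literal">Foo</tt>, so it can only be resolved on the second pass,
which is the situation the repeated passes are meant to handle.</p>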
<p>The ES-expressions may be stored in a separate file, in the
object file using some extension, or in the object file as an
un-nameable data item that the linker is sure to discard. The first
is cumbersome, the second might conflict with an ABI and the third
is a filthy hack, but all three are workable. For static libraries,
library code is no different from our own as far as the linker is
concerned.</p>
<p>To eliminate false positives we need a new cast: the <tt class=
"function">nothrow_cast</tt>. It operates on pointers to functions;
so it modifies individual calls rather than the definitions. It
tells the compiler that this invocation of the function will not
throw the specified type. As usual with casts, if you lie to the
compiler then it will get its revenge in the form of implementation
defined or undefined behaviour.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e41" id="d0e41"></a>Four
complications</h2>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e45" id="d0e45"></a>1) Pointers to
Functions</h3>
</div>
<p>We expect to be able to write an expression for the minimal ES
involving only class names and the ES-es of named functions. With
function pointers, we don't know which function they point to. We
can, however, identify the pointer itself. It must be one of the
following 4 cases, and each can be named.</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>A global variable, such as <tt class="varname">baz</tt> (in the
example below).</p>
</li>
<li>
<p>A struct or class member, such as <tt class=
"function">Quux::m_pfn</tt>.</p>
</li>
<li>
<p>A function parameter or return value, such as <i class=
"parameter"><tt>Foo(4th)</tt></i>.</p>
</li>
<li>
<p>A local (automatic) variable, such as <tt class=
"varname">Foo()::name</tt>.</p>
</li>
</ul>
</div>
<p>For structure members, and function parameters, no attempt is
normally made to distinguish between different instances or
invocations. The local variable case can be eliminated by the
compiler because ES(Foo()::name) can always be replaced with the ES
of whatever was used to initialise name, but that's just an
optimisation.</p>
<p>What then? Well firstly, we can write the minimal ES for a
function that uses such a pointer, simply referring to the ES of
this named item.</p>
<pre class="programlisting">
extern int (*baz)();
void Foo() {
  try { if((*baz)()==42) throw Oink(); }
  catch (Quux&amp; b) { }
}
</pre>
<p>For this function, <tt class="literal">ES(Foo) =
ES(*baz)-Quux+Oink</tt>. Great, as long as the linker can figure
out <tt class="function">ES(*baz)</tt>. That's slightly harder than
<tt class="function">ES(Baz)</tt>, because, informally, <tt class=
"constant">Baz</tt> is a constant but <tt class="varname">baz</tt>
is a variable. However, we can model <tt class=
"function">(*baz)()</tt> as a function that calls all of the
functions that are assigned to <tt class="varname">baz</tt>
throughout the program, and each of those assignments will be seen
by the compiler. For each assignment, the compiler can spit out
<tt class="function">ES(*baz) += ... ,</tt> where the right hand
side is either <tt class="function">ES(Function)</tt> or <tt class=
"function">ES(*another_pointer)</tt>.</p>
<p>All the linker has to do is join all the pieces together. The
linker has an ES for a named object that possibly depends on the
ES-es of other named objects. Pointers to functions are now no
different from the functions themselves, and they all get thrown
into the pot and resolved together.</p>
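<p>Joining the pieces for a function pointer might look something like the
sketch below. It assumes that each assignment has been recorded as a
<tt class="literal">ES(*target) += ES(source)</tt> pair and that the ES of
every ordinary function is already known; the names
<tt class="literal">PlusEq</tt>, <tt class="literal">esOfPointer</tt>,
<tt class="literal">AltBaz</tt> and <tt class="literal">*fallback</tt>, and
the ES values given to the functions, are hypothetical.</p>

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using TypeSet = std::set<std::string>;

// One "ES(*target) += ES(source)" record, as might be emitted for each
// assignment to a function pointer. Names are plain strings in this sketch.
struct PlusEq {
    std::string target;
    std::string source;
};

// The ES of a pointer is the union of the ES of everything assigned to it.
// A source that is itself a pointer is followed in turn; the 'seen' set
// stops pointers that are assigned to each other from looping forever.
TypeSet esOfPointer(const std::string& ptr,
                    const std::vector<PlusEq>& records,
                    const std::map<std::string, TypeSet>& functionEs) {
    TypeSet result;
    std::set<std::string> seen;
    seen.insert(ptr);
    std::vector<std::string> work(1, ptr);
    while (!work.empty()) {
        const std::string cur = work.back();
        work.pop_back();
        for (const PlusEq& r : records) {
            if (r.target != cur) continue;
            auto fn = functionEs.find(r.source);
            if (fn != functionEs.end()) {
                result.insert(fn->second.begin(), fn->second.end());
            } else if (seen.insert(r.source).second) {
                work.push_back(r.source);  // another pointer: expand it later
            }
        }
    }
    return result;
}

// baz is assigned Baz in one place and a second pointer elsewhere.
TypeSet exampleBaz() {
    std::vector<PlusEq> records = {
        {"*baz", "Baz"},
        {"*baz", "*fallback"},
        {"*fallback", "AltBaz"},
        {"*fallback", "*baz"},  // pointers assigned to each other
    };
    std::map<std::string, TypeSet> functionEs;
    functionEs["Baz"] = TypeSet{"Quux"};     // assumed for illustration
    functionEs["AltBaz"] = TypeSet{"Oink"};  // assumed for illustration
    return esOfPointer("*baz", records, functionEs);
}
```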
<p>Pointers to pointers to functions, such as virtual function
tables, add little new to the problem. Instead of tracking
everything that <tt class="literal">(*p)()</tt> might point to we
have to track everything that <tt class="literal">(**pp)()</tt>
might point to.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e123" id="d0e123"></a>2)
Recursion</h3>
</div>
<p>In the presence of recursion, the call graph of a program has
cycles. For such functions we might have <tt class=
"literal">ES(Foo) = +Quux - Oink + ES(Foo)</tt>. Now, a function
neither increases nor decreases its ES by calling itself, so when
ES(Foo) appears on the right hand side of an expression
<span class="emphasis"><em>for itself</em></span> it can be
ignored. This allows us to break the cycles in recursive
systems.</p>
<p>Though we can ignore <tt class="function">ES(Foo)</tt> for
itself, we cannot ignore it elsewhere in the cycle. Part of the
cycle might throw exceptions that are caught by other parts of the
cycle, so the minimal ES exposed from a recursive cycle depends on
where you enter it.</p>
<p>In fact, it is this property that lets us evaluate ES-expressions in
any order we like. We can just pick one and recursively replace
every term on the right hand side with its expansion. Eventually we
will have a long expression with either absolute classes or
repetitions of the left-hand side. We remove the latter and we have
our absolute ES.</p>
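<p>One way to picture this expansion is the sketch below. It makes the same
simplifying assumptions as the rest of this illustration (terms applied left
to right, exception types as plain strings, no catching by base class) and it
drops a reference to any function that is already being expanded, which
covers the repetitions of the left-hand side. The names
<tt class="literal">Term</tt>, <tt class="literal">expand</tt>,
<tt class="literal">esOf</tt>, <tt class="literal">Ping</tt> and
<tt class="literal">Pong</tt> are hypothetical.</p>

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// One term of an ES-expression; terms are applied left to right.
struct Term {
    enum Kind { AddType, RemoveType, AddEsOf } kind;
    std::string name;  // an exception type, or the function whose ES is referenced
};
using Expr = std::vector<Term>;
using TypeSet = std::set<std::string>;

// Expand ES(fn) by substitution. A reference to a function that is already
// being expanded adds nothing new, so it is dropped. Results for inner
// functions are deliberately not cached: what escapes from a cycle depends
// on where the cycle was entered. Throws std::out_of_range if fn is unknown.
TypeSet expand(const std::string& fn,
               const std::map<std::string, Expr>& exprs,
               std::set<std::string>& inProgress) {
    TypeSet es;
    inProgress.insert(fn);
    for (const Term& t : exprs.at(fn)) {
        if (t.kind == Term::AddType) {
            es.insert(t.name);
        } else if (t.kind == Term::RemoveType) {
            es.erase(t.name);
        } else if (inProgress.count(t.name) == 0) {
            TypeSet inner = expand(t.name, exprs, inProgress);
            es.insert(inner.begin(), inner.end());
        }
    }
    inProgress.erase(fn);
    return es;
}

TypeSet esOf(const std::string& fn, const std::map<std::string, Expr>& exprs) {
    std::set<std::string> inProgress;
    return expand(fn, exprs, inProgress);
}

// Two mutually recursive functions, each catching what the other throws:
//   ES(Ping) = ES(Pong) - Oink + Quux
//   ES(Pong) = ES(Ping) - Quux + Oink
std::map<std::string, Expr> exampleCycle() {
    std::map<std::string, Expr> exprs;
    exprs["Ping"] = Expr{{Term::AddEsOf, "Pong"}, {Term::RemoveType, "Oink"},
                         {Term::AddType, "Quux"}};
    exprs["Pong"] = Expr{{Term::AddEsOf, "Ping"}, {Term::RemoveType, "Quux"},
                         {Term::AddType, "Oink"}};
    return exprs;
}
```

<p>With these inputs, entering the cycle at <tt class="literal">Ping</tt>
leaves only <tt class="literal">Quux</tt>, while entering at
<tt class="literal">Pong</tt> leaves only <tt class="literal">Oink</tt>.
Re-expanding from every entry point is simple but potentially slow; a real
implementation would presumably want something smarter.</p>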
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e141" id="d0e141"></a>3) Shared
Libraries</h3>
</div>
<p>For shared libraries, the linker doesn't see the actual code
when it is linking the client application. Whether this is a
problem depends on how the linking is achieved. I'm familiar with
Windows, so I'll treat the two cases on that platform and then
ignorantly assert that other platforms add nothing new.</p>
<p>The first case is load-time dynamic linking. The linker is given
an &quot;import library&quot; which describes the data and functions exposed
by the DLL. Any references to those are replaced with placeholder
items in the linked application and the operating system loader
&quot;fills in the blanks&quot; when the application is loaded into memory to
run. The provision of an import library makes this case very
similar to the static case. I believe it is sufficient for the
import library to contain ES-expressions for just those items
mentioned in the library's header file, since that is all the
compiler sees and so that is all that can appear in client object
files. This is certainly the case in my extended example.</p>
<p>The second case is run-time linking, where the program uses some
magic to conjure an address out of the ether. To take a slightly
non-trivial example...</p>
<pre class="programlisting">
extern IFoo* CreateSuperFoo();
        // in external library
IFoo* (*pfn)() = /*magic*/;
        // in client code
IFoo* p = (pfn)();
</pre>
<p>The details of <tt class="classname">CreateSuperFoo</tt>, and
of whatever <tt class="classname">IFoo</tt>-derived class this
library actually offers, are a complete mystery to the build
system. The library may be written in a different language, so it is
quite possible that neither the compiler nor the linker ever sees it.
Here we have the one place where I think a programmer has to write an
ES.</p>
<p>The two main objections to ES that I noted at the beginning of
the article don't apply. A dynamically loaded extension cannot be a
template, though it may be an instantiation. Neither is it likely
to change often and even if it does, all knock-on effects on the
rest of the system are now the compiler's problem.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e162" id="d0e162"></a>4) False
Positives</h3>
</div>
<p>The final problem is that a function might throw an exception in
the case of bad input, and carry an ES to that effect, but many
callers might never feed bad parameters into the function. This
partly depends on one's programming style. Consider a caller that
checks its input first...</p>
<pre class="programlisting">
if(x&lt;0) return false;
x = Sqrt(x); // assuming Sqrt() throws
             // when x&lt;0
</pre>
<p>There are certainly situations where one should write one of the
following...</p>
<pre class="programlisting">
Type Sqrt(Type x) { if(x&lt;0) abort(); ... }
Type Sqrt(Type x) { assert(x&gt;=0); ... }
</pre>
<p>...and happily spit in the faces of irate clients whose programs
were aborted, saying &quot;Don't do that then!&quot;. However, if we choose
to throw an exception instead then our clients will either be faced
with link-time ES errors or be forced to write such abominations
as</p>
<pre class="programlisting">
if(x&lt;0) return false;
try { x = Sqrt(x); } catch (...) {
/*unreachable*/ }
</pre>
<p>Not only does this look bad, but it probably incurs run-time
penalties (space and time). As with the function pointers, we know
something that the compiler doesn't, so we tell it with a cast.</p>
<pre class="programlisting">
if(x&lt;0) return false;
x = nothrow_cast&lt;std::logic_error&gt;(Sqrt)(x);
</pre>
<p>The <tt class="literal">nothrow_cast</tt> tells the compiler
that the function does not throw the mentioned type.</p>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e186" id="d0e186"></a>Costs and
Benefits</h2>
</div>
<p>I think it is worth confessing at this point that I've only
spent the time and energy on this because I wanted static checking.
Showing that it could be done with minimal impact on existing
source code seemed like a good way to argue the case. It all turned
out a little harder than I expected, so is static checking worth
this effort?</p>
<p>First, I note that the current standard allows ES violations at
runtime, so any ES violations detected by this system can only
result in linker warnings. The linker must still generate a working
executable.</p>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e193" id="d0e193"></a>Costs and
Limitations</h3>
</div>
<p>The scheme derives the minimal ES from whatever source code is
presented to it, so the same program might &quot;fail&quot; if compiled
against StlPort rather than Dinkumware, or if compiler settings
change. A debug build might give false positives that an optimising
build can rule out as a by-product of its analysis.</p>
<p>You and your library vendors will all have to run all the code
through the new compiler. The scheme adds no new compile-time
errors, so if the library vendors are still in business then
they shouldn't have much of a problem with this.</p>
<p>If you don't modify the code then you may get warnings from the
ES checking phase, which will disable the various optimisations
mentioned below. You've lost nothing except for the extra build
costs.</p>
<p>If you are able to modify your code, you can eliminate the
warnings caused by run-time linking and by false positives using
explicit ES and <tt class="literal">nothrow_cast</tt>,
respectively. In both cases you can let the diagnostics guide you.
There is no problem of figuring out what changes to make, simply
the time involved in actually doing it.</p>
<p>The extension does require more complicated compilers and
linkers. I can't judge how much more complicated because I've never
written a compiler or linker, let alone one for C++. There is also
a cost in build time which I don't feel qualified to estimate, but
I have already noted that we don't need to reduce ES expressions in
any particular order.</p>
</div>
<div class="sect2" lang="en">
<div class="titlepage">
<h3><a name="d0e209" id="d0e209"></a>Benefits</h3>
</div>
<p>Having to treat ES violations as warnings actually yields a
couple of migration paths. A vendor could just ignore the whole
idea, implementing the <tt class="literal">nothrow_cast</tt> as a
do-nothing template function. Equally, since there is no new
run-time behaviour, the whole thing could be done by a tool like
lint.</p>
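<p>A do-nothing <tt class="literal">nothrow_cast</tt> of the kind just
mentioned could be written along these lines. This is a sketch only: it uses
variadic templates, which postdate the language this article discusses, it
handles plain (non-overloaded, non-member) functions only, and the
<tt class="literal">Sqrt</tt> and <tt class="literal">safeRoot</tt>
functions are hypothetical stand-ins for the earlier fragment.</p>

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Do-nothing nothrow_cast: E names the exception type that the caller
// promises will not escape this call. The function pointer is returned
// unchanged, so the only effect is to document the promise in the source.
template <class E, class R, class... Args>
auto nothrow_cast(R (*fn)(Args...)) -> R (*)(Args...) {
    return fn;
}

// Hypothetical Sqrt that throws on bad input, as assumed in the article.
double Sqrt(double x) {
    if (x < 0) throw std::domain_error("Sqrt of a negative value");
    return std::sqrt(x);
}

// The caller rules out bad input itself, then says so with the cast.
bool safeRoot(double& x) {
    if (x < 0) return false;
    x = nothrow_cast<std::logic_error>(Sqrt)(x);
    return true;
}
```

<p>Because the cast compiles away to nothing, code written against it
behaves identically whether or not the build system understands the
promise.</p>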
<p>If we can spot violations of <tt class="literal">throw()</tt> at
build-time rather than run-time, with any tool, the Abrahams
exception safety guarantees are easier to police. The cast may be
useful to the compiler even without the link-time checking, since
it can optimise more strongly if it believes exceptions can't
happen.</p>
<p>However, we get maximum benefit if the compiler and linker do
the checking. The cost of exceptions that cannot ever occur can be
reduced to zero in both time and space and any function that can't
throw (and all its immediate callers) can be recompiled with that
knowledge. With these optimisations, Standard C++ is more
attractive for embedded systems and vendors needn't include
compiler options to disable exceptions.</p>
<div class="sidebar">
<p>Consider the common scenario of an interface header file...</p>
<pre class="programlisting">
struct IFoo {             // struct IFoo::vtbl {
  virtual void Bar() = 0; // void (*pBar)(IFoo*);
  virtual int Quux() = 0; // int (*pQuux)(IFoo*);
};                        // };
void AddFoo(IFoo*);
void DoStuff();
</pre>
<p>...used by a shared library source file...</p>
<pre class="programlisting">
IFoo* global;            // IFoo::vtbl** global;
void AddFoo(IFoo* foo)
  { global = foo; }      // ES(AddFoo) = 0
                         // ES(*IFoo::vtbl.pBar) += ES((*AddFoo 1st).pBar)
                         // ES(*IFoo::vtbl.pQuux) += ES((*AddFoo 1st).pQuux)
void DoStuff()
  { global-&gt;Bar(); }     // ES(DoStuff) = ES(*IFoo::vtbl.pBar)
</pre>
<p>...and implemented in an application source file...</p>
<pre class="programlisting">
class Foo : public IFoo {
  virtual void Bar()     // ES(*IFoo::vtbl.pBar) += ES(Foo::Bar)
    { throw 1; }         // ES(Foo::Bar) = int
  virtual int Quux()     // ES(*IFoo::vtbl.pQuux) += ES(Foo::Quux)
    { return 0; }        // ES(Foo::Quux) = 0
};
int main() {
  AddFoo(new Foo);       // ES( *(*(AddFoo 1st).pBar) ) += ES(Foo::Bar)
                         // ES( *(*(AddFoo 1st).pQuux) ) += ES(Foo::Quux)
  DoStuff();
}                        // ES(main) = ES(AddFoo) + ES(DoStuff)
</pre>
<p>In the application, the compiler can see that the <i class=
"parameter"><tt>IFoo*</tt></i> parameter to <tt class=
"methodname">AddFoo</tt> is actually a &quot;<tt class="literal">new
Foo</tt>&quot;. Had that detail not been visible, the compiler could
only have written...</p>
<pre class="programlisting">
ES( *(*(AddFoo 1st).pBar) ) += ES(*IFoo::vtbl.pBar)
ES( *(*(AddFoo 1st).pQuux) ) += ES(*IFoo::vtbl.pQuux)
</pre>
<p>We bring all this together in the linker. Our raw data from
compiling the library is...</p>
<pre class="programlisting">
ES(*(IFoo::vtbl-&gt;pBar)) += ES(*(AddFoo 1st)-&gt;pBar)
ES(*(IFoo::vtbl-&gt;pQuux)) += ES(*(AddFoo 1st)-&gt;pQuux)
ES(DoStuff) = ES(*IFoo::vtbl.pBar)
ES(AddFoo) = 0
</pre>
<p>That from compiling the application is...</p>
<pre class="programlisting">
ES(*IFoo::vtbl.pBar) += ES(Foo::Bar)
ES(Foo::Bar) = int
ES(*IFoo::vtbl.pQuux) += ES(Foo::Quux)
ES(Foo::Quux) = 0
ES(*(AddFoo 1st)-&gt;pBar) += ES(*IFoo::vtbl.pBar)
ES(*(AddFoo 1st)-&gt;pQuux) += ES(*IFoo::vtbl.pQuux)
ES(main) = ES(AddFoo) + ES(DoStuff)
</pre>
<p>Bringing it all together and substituting yields...</p>
<pre class="programlisting">
ES(main) = 0 + ES(DoStuff)
         = ES(*(AddFoo 1st)-&gt;pBar)
         = ES(*IFoo::vtbl.pBar)
         = ES(Foo::Bar)
         = int
</pre></div>
</div>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
</div>
</channel>
</rss>
