    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Professionalism in Programming #25</title>
        <link>https://members.accu.org/index.php/articles/209</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Professionalism in Programming, from CVu journal + CVu Journal Vol 16, #2 - Apr 2004</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c184/">Journal Columns</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c182/">Professionalism</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c103/">162</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c182-103/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c182+103/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Professionalism in Programming #25</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> 01 April 2004 22:53:48 +01:00 or Thu, 01 April 2004 22:53:48 +01:00</p>
<p><strong>Summary:</strong>&nbsp;<p>In the first part of this series, we looked at what it means to
optimise code, and saw the cases for and against optimisation. In
this article, we'll look at the process of optimisation. We'll see
the correct, methodical approach that will lead to solid,
worthwhile code optimisations.</p>
</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2>The nuts and
bolts</h2>
</div>
<p>So how do you optimise? Rather than learn a list of specific
code optimisations, it's far more important to understand the
correct <span class="emphasis"><em>approach</em></span> to
optimising. Don't panic; we will see some programming techniques
later, but they must be read in the context of this wider
optimisation process.</p>
<p>The six steps for speeding up a program are:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Determine that it's too slow, and prove you do need to
optimise.</p>
</li>
<li>
<p>Identify the slowest code. Target this point.</p>
</li>
<li>
<p>Test the performance of the optimisation target.</p>
</li>
<li>
<p>Optimise the code.</p>
</li>
<li>
<p>Test that the optimised code still works (very important).</p>
</li>
<li>
<p>Test the speed increase, and decide what to do next.</p>
</li>
</ol>
</div>
<p>This sounds like quite a performance, but without it you'll
waste time and effort, and end up with crippled code that runs no
faster. If you're not trying to improve execution speed, adjust
this process accordingly; for example tackle memory consumption
problems by identifying which data structures are consuming all the
memory and target those.</p>
<p>This optimisation procedure is essential; you cannot
successfully optimise as you go along. First strive for clear,
correct code. Only later look at improving it. But do pay
continuous attention to efficiency as you write.</p>
<div class="sidebar">
<p class="title c3">Performance requirements</p>
<p>Certain kinds of system have very strict requirements for
program performance; this is especially true of embedded systems
controlling critical pieces of hardware.</p>
<p>General purpose software also has performance requirements,
although they tend to be specified less rigorously. You won't see
many mandated performance requirements - the software is expected
to execute in 'reasonable' time on a 'reasonable' computer. The
problem comes when you need to know the value of 'reasonable'. A
user with a modest PC will have a different opinion than a
developer with a cutting edge machine.</p>
<p>The adequate level of performance is usually captured in a
product's <span class="emphasis"><em>Requirements
Specification</em></span>. This contains a list of all
requirements, including those for the program's quality of service.
These specifications translate operational requirements into more
technical language that describes:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>what is correct behaviour, and</p>
</li>
<li>
<p>how we will determine if the software is acceptable.</p>
</li>
</ul>
</div>
<p>As well as the expected system behaviour, performance
requirements will describe the operational environment (in what
conditions the code should work, and when it doesn't matter).
Sometimes a UI Specification (describing how the user expects the
system to respond) indirectly covers a large number of performance
requirements.</p>
<p>A good performance specification should be quantitative, not
qualitative. This helps us to design acceptance tests based on
performance criteria. If a requirement is not verifiable, it is
useless.</p>
<p>The requirements specification should not be too prescriptive.
If a system is required to position a mechanism accurate to within
one millimetre, it is specified. However, the requirements
specification does not detail how this criterion is met - it might
be a function of the mechanical hardware, of the electronic motor
control, or of the controlling software (or more likely, a
combination of the three). This is a design decision, often taken
much later when the appropriate tradeoffs can be made.</p>
</div>
<p>Ensure that you don't write needlessly bloated code. This alone
will not lead to code that performs well, but it means that you'll
start optimising from a more reasonable place, and will not have to
address tedious code issues before getting on to more serious
optimisations.</p>
<p>It's important to begin optimisation with a clear goal in sight
- the more optimisation you perform, the less readable the code
becomes. Know the level of performance you require, and stop when
it's sufficiently fast. It's tempting to keep going, trying to
continually squeeze out a little extra performance.</p>
<p>To stand any chance of optimising correctly, you must take great
care to prevent external factors changing the way your code works.
When the world is changing under your feet, you can't compare
measurements realistically. There are two essential techniques that
help here:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Optimise your code separately from any other work, so the
outcome of one task doesn't cloud the other.</p>
</li>
<li>
<p>Optimise release builds of your program, not development builds.
The development builds may run in a very different manner to
release builds, due to the inclusion of debugging trace
information, object file symbols, etc.</p>
</li>
</ol>
</div>
<p>Now we'll look at each of these optimisation steps in more
detail:</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Prove you
need to optimise</h2>
</div>
<p>The first thing to do is make sure you really do need to
optimise. If the code's performance is acceptable then there's no
point in tinkering with it. Knuth said (himself quoting C.A.R.
Hoare): &quot;<span class="quote">We should forget about small
efficiencies, say about 97% of the time: premature optimisation is
the root of all evil.</span>&quot; There are so many compelling reasons
<span class="emphasis"><em>not</em></span> to optimise that the
quickest and safest optimisation technique is to prove that you
don't need to do it.</p>
<p>You make this decision based on program requirements or
usability studies. With this information you can establish a
commercial imperative for optimisation, and determine whether it
takes priority over adding new features.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Identify the
slowest code</h2>
</div>
<p>This is the part that most programmers get wrong. If you're
going to spend time optimising, you need to target the places where
it will make a difference. Investigations show that the average
program spends more than 80% of its time in less than 20% of the
code [<a href="#Boehm87">Boehm87</a>]. This is known as the 80/20
rule<sup>[<a name="d0e125" href="#ftn.d0e125" id=
"d0e125">1</a>]</sup>. That's a relatively small target; it's very
easy to miss, and spend time optimising code that's rarely run.</p>
<p>You might notice that a part of your program has some relatively
easy optimisations, but if it's seldom executed, then there's no
point in doing so - clear code is better than faster code in this
situation.</p>
<p>So how do you work out where to focus your attention? The most
popular - and often the most effective - technique is to use a
profiler. This times the flow of control around your program. It
shows where that 80% of execution time is going, so you know where
to concentrate your effort.</p>
<p>A profiler <span class="emphasis"><em>doesn't</em></span> tell
you which parts of the code are slowest; this is a common
misconception. It actually tells you where the CPU spends most of
its time. This is subtly different<sup>[<a name="d0e138" href=
"#ftn.d0e138" id="d0e138">2</a>]</sup>. You have to interpret these
results, and use your brain. The program might spend most of its
execution time in a few perfectly valid functions which cannot be
improved at all - it's a sad fact that you can't always optimise.
Sometimes the laws of physics win.</p>
<p>There are plenty of benchmarking programs around: many excellent
commercial programs, and a number of freely available tools. It's
worth spending money on a decent profiler - optimisation can easily
eat into your time; this is also an expensive commodity. If you
don't have a profiler available, there are a few other timing
techniques you can try:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>Put manual timing tests throughout your code. Make sure you use
an accurate clock source, and that the time taken to read the clock
will not affect program performance too much.</p>
</li>
<li>
<p>Count how often each function is called (some debug libraries
provide support for this kind of activity).</p>
</li>
<li>
<p>Exploit compiler-supplied 'hooks' to insert your own accounting
code when each function is entered/exited. Many compilers provide a
means to do this - some profilers are implemented using such a
mechanism.</p>
</li>
<li>
<p>Sample the program counter; interrupt your program periodically
in a debugger to see where control is. This is harder in
multithreaded programs, and is a very slow, manual approach. If you
have control over the execution environment, you can write
scaffolding to automate this kind of test - effectively writing
your own form of profiler.</p>
</li>
<li>
<p>Test an individual function's impact on the total program
execution time by making it slower. If you suspect that a
particular function is causing a slowdown, try replacing its call
with two calls in succession, and measure how it affects execution
time<sup>[<a name="d0e160" href="#ftn.d0e160" id=
"d0e160">3</a>]</sup>. If the program takes 10% longer to run, then
the function consumes approximately 10% of execution time. You can
use this as a very basic timing test.</p>
</li>
</ul>
</div>
<p>Whilst a profiler (or equivalent) is a good starting point to
choose optimisation targets, you can easily miss quite fundamental
problems. The profiler only shows how the code in the current
design executes - and encourages you to perform code-level
improvement only. Look at larger design issues, too. The lack of
performance may not be due to a single function, but a more
pervasive design flaw. If it is, then you'll have to work harder to
remedy the problem. This shows how important it is to get the
initial code design right, with knowledge of established
performance requirements. Don't rely solely on a profiler to find
the causes of program inefficiency - you might miss important
problems.</p>
<p>When profiling make sure that you use realistic input data,
simulating Real World events. The way your code executes may be
drastically affected by the kind of input you feed it, or by the
way it is driven, so make sure that you provide true representative
input sets. If possible, capture a set of real input data from a
live system.</p>
<p>Try profiling several different data sets, to see what
difference this makes. Select a very 'basic' set, a 'heavy use'
set, and a number of 'general use' sets. This will prevent you
optimising for the particular quirks of one input data set. Select
profiling test data carefully to represent Real World program use.
Otherwise, you might optimise parts of the program that are not
normally run.</p>
<div class="sidebar">
<p class="title c3">Using a profiler</p>
<p>Owning a hammer or a saw doesn't instantly make you an expert
carpenter; owning a profiler doesn't make you an expert optimiser.
You have to know how to use it to perform the task - any tool
increases your productivity, it doesn't give you ability.</p>
<p>Although each profiler is different, they follow the same
general principles. A profiler is intimately entwined with a
particular compiler - it has to reach into the object files and
extract the compiler's 'debugging' information to work out which
function is currently being run. The basic use of a profiler
is:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Rebuild your program to support profiling. This usually means
linking the executable <span class=
"emphasis"><em>without</em></span> stripping out debug information.
When you do this, you're no longer probing a release build of the
code; this contravenes one of the previous optimisation laws.
Sigh.</p>
</li>
<li>
<p>Inevitably using a profiler will interfere with running code,
but generally this interference will not affect the way the code
runs, or skew the timing results too much. We have to live with
this.</p>
</li>
<li>
<p>Run the program in the profiler. This is probably an option in
your IDE, or a simple profiler command.</p>
</li>
<li>
<p>The profiler will spit out its results when the program
finishes. You can browse the profiler's output file, or use a neat
graphical tool to interpret this information and display nice
graphs.</p>
</li>
<li>
<p>Work out what the results are telling you, and act on them.</p>
</li>
</ol>
</div>
<p>The most interesting part of the profiler output is the list of
functions that the CPU spent most time in. Use this as a starting
point for your optimisation. Read the list, taking into account the
number of times each function was called:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>If a function runs very quickly but gets called frequently, try
to reduce the number of times it is invoked.</p>
</li>
<li>
<p>If a function is only ever called once but takes an age to
complete, look for ways to speed up its execution.</p>
</li>
</ul>
</div>
</div>
<p>Having completed this step, you've found the areas of your code
where a performance improvement will have the most benefit. Now
it's time to attack them...</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Testing the
code</h2>
</div>
<p>We recognised three testing phases in the optimisation
procedure. For each piece of code targeted, we test:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>its performance before optimisation,</p>
</li>
<li>
<p>that it still works correctly once optimised, and then</p>
</li>
<li>
<p>its performance after optimisation.</p>
</li>
</ul>
</div>
<p>Programmers often forget the second check: that the optimised
code still works correctly in <span class=
"emphasis"><em>all</em></span> possible situations. It's easy to
check the 'normal' mode of operation, but it's not in our nature to
test each and every corner case. This can be the cause of weird
bugs late in the day, so be very rigorous about this.</p>
<p>You <span class="emphasis"><em>must</em></span> measure the
code's performance before and after modification, to make sure that
you have made a real difference - and to make sure that it is a
change for the better; sometimes an 'optimisation' can be an
unwitting <span class="emphasis"><em>pessimisation</em></span>. You
can perform these timing tests with your profiler, or by inserting
timing instrumentation by hand. Never try to optimise code without
performing some kind of before and after measurement.</p>
<p>These are some very important things to think about when running
your timing tests:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>Run both the 'before' and 'after' tests with exactly the same
set of input data, so that you're testing exactly the same thing.
Otherwise your tests are meaningless; you're not comparing like
with like. An automated test suite is best<sup>[<a name="d0e241"
href="#ftn.d0e241" id="d0e241">4</a>]</sup> - with the same kind of
'live' representative data, we used in the profiling step.</p>
</li>
<li>
<p>Run all tests under identical prevailing conditions, so that
factors like the CPU load or amount of free memory don't affect
your measurements.</p>
</li>
<li>
<p>Ensure that your tests don't rely on user input. Humans can
cause timings to fluctuate wildly. Automate every possible aspect
of the test procedure.</p>
</li>
</ul>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Optimising
the code</h2>
</div>
<p>We'll investigate some specific optimisation techniques in the
next article. Speed-ups vary from the simple refactoring of small
sections of code to more serious design-level alterations. The
trick is to optimise without totally destroying the code.</p>
<p>Determine how many different ways exist to optimise the
identified code, and pick the best. Only perform one change at a
time; it's less risky, and you'll have a better idea of what
improved performance the most. Sometimes it's the least expected
things that have the most significant optimisation effects.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>After
optimisation</h2>
</div>
<p>We already mentioned performance testing after an optimisation.
It's really important to run these before and after performance
benchmarks - how else do you know you've made a successful
modification?</p>
<p>If an optimisation proves to be unsuccessful, remove it. Back
out your changes. This is where a source control system is useful,
helping you to revert to the previous code version.</p>
<p>You should also remove the <span class=
"emphasis"><em>slightly</em></span> successful optimisations.
Prefer clear code to modest optimisations (unless you're absolutely
desperate for an improvement, and there are no other avenues to
explore).</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2>Next
time</h2>
</div>
<p>We'll round off this series by looking at some specific
optimisation techniques, and discover how to avoid optimisation
altogether.</p>
</div>
<div class="bibliography">
<div class="titlepage">
<h2>
</div>
<div class="bibliomixed"><a name="Boehm87" id="Boehm87"></a>
<p class="bibliomixed">[Boehm87] Boehm, Barry, &quot;Improving Software
Productivity&quot;, 1987, <span class="citetitle"><i class=
"citetitle">IEEE Computer</i></span>, Vol 20, No 9.</p>
</div>
</div>
<div class="footnotes"><br>
<hr class="c4" width="100">
<div class="footnote">
<p><sup>[<a name="ftn.d0e125" href="#d0e125" id=
"ftn.d0e125">1</a>]</sup> Although some go so far as to claim this
should be the 90/10 rule.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e138" href="#d0e138" id=
"ftn.d0e138">2</a>]</sup> All code runs at a fixed rate, based on
the speed of the CPU clock, the number of other processes being
juggled by the OS, and this thread's priority.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e160" href="#d0e160" id=
"ftn.d0e160">3</a>]</sup> This won't necessarily make the function
run twice as slowly. File system buffers, or CPU memory caches can
enhance the performance of repeated code sections. Treat this as a
very rough guide - more qualitative than quantitative.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e241" href="#d0e241" id=
"ftn.d0e241">4</a>]</sup> Perhaps you can even use a part of your
regression test suite?</p>
</div>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
