    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Garbage Collection Implementation Considerations</title>
        <link>https://members.accu.org/index.php/journals/547</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Overload Journal #30 - Feb 1999 + Design of applications and programs</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Garbage Collection Implementation Considerations</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Fri, 26 February 1999 16:50:51 +00:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a></h2>
</div>
<p class="c2"><span class="remark">Francis Glassborow: It is
possible that in copy-editing Henrik's submission I may have
inadvertently changed the meaning. His English is much better than
my German. His grasp of Garbage Collection is also much better than
mine. If any specialist in this area would like to take over
editing future submissions in this area I am sure all concerned
will be grateful.</span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e23" id="d0e23"></a>Abstract</h2>
</div>
<p>I have been writing compilers for more than 15 years. Six years
ago I developed my own programming language. It is a pure
functional language with interfaces to C, C++ and Java. This
language needs a garbage collector (GC). This is extremely
difficult because each language has its own memory management and
so the requirements of the GC differ. In this article I will give a
general overview of the problems of implicit and explicit GCs. In a
future Overload I will write about the GC of my programming
language. To give you an easy start I begin with the GC issues of C
and C++.</p>
<p>First you must decide whether to implement the GC in the
language itself (implicit) or in an external library (explicit). A
further possibility is to put the GC in a tool such as a debugger. The
differences shown in this article are those between the
implicit and explicit kinds of GC.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e30" id="d0e30"></a>What has to be
Done?</h2>
</div>
<p>The development time is likely to be less if you use implicit GC
rather than writing an explicit GC. The reason is that the implicit
version is much more specific to the language and the applications
written in the language.</p>
<p>The explicit version has many more requirements because you
do not know the memory structure of the application on which you
want to use the explicit GC. A typical example of a small implicit
implementation of GC is the pairing of C++ constructors and
destructors. You can find examples of explicit GC in any GC library.
Generally you can say that from the technological side it is much
better to use an implicit GC than an explicit one. You can use the
implicit GC of a programming language for your application.</p>
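<p>The constructor and destructor pairing mentioned above can be
sketched as follows. This is only an illustration: the
<code>Buffer</code> class, its <code>live_count</code> counter and the
function <code>live_buffers_inside_scope</code> are hypothetical names
chosen for this example and do not come from any library.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical buffer type: the constructor acquires memory and the
// destructor releases it, so cleanup is tied to scope by the language.
class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(new char[size]), size_(size) {
        assert(size > 0);
        ++live_count;
    }
    ~Buffer() {
        delete[] data_;
        --live_count;
    }
    // Copying is disabled so that each block has exactly one owner.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t size() const { return size_; }

    static int live_count;  // number of Buffer objects currently alive

private:
    char* data_;
    std::size_t size_;
};

int Buffer::live_count = 0;

// Returns the number of live buffers observed inside the scope.
// Both buffers are released automatically when the function returns.
int live_buffers_inside_scope() {
    Buffer a(64);
    Buffer b(128);
    return Buffer::live_count;
}
```

<p>Calling <code>live_buffers_inside_scope()</code> sees two live
buffers, and once it has returned the counter is back at zero without
any release call in the application code.</p>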
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e37" id="d0e37"></a>Write Once -
Use Often</h2>
</div>
<p>If you write an explicit tool such as a debugger that contains a GC,
you have the advantage that you can use the GC with every application
written in the language the debugger understands. The
disadvantage is that there is no certainty that the GC works
correctly. In some cases the GC may seem to work incorrectly, and in
such a situation you cannot determine whether it is the GC or the
application that is at fault. You have to test the application in
several different ways to find out what is going wrong. If you
can use an implicit GC you can be sure that the GC works
correctly. In addition, in such an application you can also work
with an explicit GC. In this way you see what is going wrong
because you can compare the results of the implicit GC with those of the
explicit one.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e42" id="d0e42"></a>Does it work -
or Doesn't it?</h2>
</div>
<p>Certain requirements must be met for the GC to do useful work.
In particular, you should write your application so that it is
independent of the GC used. The result is that during the test
phase in which you use the GC, it is much easier to find errors. In
addition the performance of the application improves. Strings and
large numbers in particular must be checked very
carefully. In some cases it is necessary to test with GC packages
that are specialised for handling strings and large numbers. This
approach is also preferred for other language constructs that
handle special kinds of data.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e47" id="d0e47"></a>Safe or not
Safe<sup>[<a name="d0e50" href="#ftn.d0e50" id=
"d0e50">1</a>]</sup></h2>
</div>
<p>In most cases an implicit GC is much safer than an explicit
one. The reason is that an implicit GC can be used on
every application written in the language that the debugger
understands. So the GC places far fewer requirements on the
application than an explicit GC does. An implicit GC depends on the GC of
the programming language being used. Because of this, it is much
more effective from the point of view of safety to use an implicit
GC.</p>
<p>The handling of pointers is much easier with an implicit GC than
with an explicit one. A particular difficulty for an explicit GC is the
handling of pointers that are cast. Such problems only arise in
object-oriented languages, not in structured ones, and so the GC of an
OO language is much more complex. In general it is advisable to
use an implicit GC for handling casts and pointers in OO languages.
The pointers and casts in a structured language are less
complicated and can be handled by any explicit GC, because a
structured language offers fewer possibilities for
handling pointers and casts than an OO language does.</p>
<p>Finally, there are two things that make GC unsafe. The first is
heavily optimised code and the second is threaded code. In the
first case it is difficult to locate a problem in the code
because the optimisation removes a lot of code, and the behaviour of
the management systems changes between unoptimised and optimised
code. The problem in threaded code is to locate the GC in the right
place. In most cases you do not know where the GC happens. There
can be two locations: the local one and the
remote one. It is too difficult for an explicit GC to handle
remote code. Remote code can only be handled by an implicit GC which
&quot;follows&quot; the code. Local code can be handled by both kinds of GC.</p>
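<p>One concrete way in which casts complicate matters for a
library-based collector in C++ is multiple inheritance: converting a
pointer to a base class can produce an address that is not the start of
the allocated object, so the collector has to recognise pointers into
the middle of a block. The sketch below only illustrates that effect.
The types <code>Shape</code>, <code>Named</code> and
<code>Circle</code> and the helpers <code>points_into</code> and
<code>base_views_differ</code> are hypothetical names invented for this
example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Shape { int id = 1; };
struct Named { int name_code = 2; };
struct Circle : Shape, Named { int radius = 3; };

// True if p lies inside the block [start, start + size).
// A collector that only knows block start addresses needs a check like
// this to recognise a pointer that was adjusted by a cast.
bool points_into(const void* p, const void* start, std::size_t size) {
    assert(size > 0);
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    auto base = reinterpret_cast<std::uintptr_t>(start);
    return addr >= base && addr - base < size;
}

// True if the two base-class views of the same object have different
// addresses, i.e. at least one cast moved the pointer.
bool base_views_differ(Circle& c) {
    Shape* as_shape = &c;
    Named* as_named = &c;
    return static_cast<void*>(as_shape) != static_cast<void*>(as_named);
}
```

<p>Both base-class pointers still refer to the same
<code>Circle</code>, but they cannot both equal the address that the
allocator originally handed out.</p>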
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e57" id="d0e57"></a>Bugs, Bugs,
Bugs</h2>
</div>
<p>In the sections above I have often mentioned the use of GC
inside a debugger. This makes sense because checking an
application with a GC is something like finding a bug, so it is
natural to combine the GC with the functionality of a
debugger. Of course, if you use very complicated constructs the
debugger may not be able to handle them. The consequence
is that the GC does not give correct results: a GC that is included
in a debugger depends on the debugger's results. So you have to
decide whether to put the GC in a debugger or in a separate
library. The advantage of a library is that you can put
the GC statements in the code of the application wherever
you want. With a debugger you cannot do that; the
placement of GC statements will in most cases be automated by the
debugger itself. The decision whether to put the GC in a debugger
depends on what you want to do.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e62" id="d0e62"></a>Less or More
Memory</h2>
</div>
<p>The biggest difference between implicit and explicit GC is that
explicit GC needs much more memory than the implicit one. The
disadvantage of the implicit GC in C or C++ is that you can only
handle small memory structures. With the explicit one you can
handle really big ones. The reason for that behaviour is clear:
the explicit GC needs much more memory because it handles
large memory areas, and the implicit GC needs less because it has
less memory to handle. A good solution for this problem is
for the explicit GC to handle large memory in small sections. In this
way it can allocate part by part as necessary. That is, it works
like an extended implicit GC.</p>
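<p>Handling a large area in small sections, allocated part by part as
needed, might look roughly like the sketch below. It is an assumption
about how such a scheme could be organised, not a description of any
particular GC library; the class name <code>SectionedArena</code> and
its members are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical arena that grows in fixed-size sections instead of
// reserving one large area up front.
class SectionedArena {
public:
    explicit SectionedArena(std::size_t section_size)
        : section_size_(section_size) {
        assert(section_size > 0);
    }

    // Hands out `bytes` bytes. A new section is allocated only when the
    // current one cannot satisfy the request. Returns nullptr for empty
    // requests and for requests larger than one section.
    void* allocate(std::size_t bytes) {
        // Round up so that every returned pointer stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        bytes = (bytes + align - 1) / align * align;
        if (bytes == 0 || bytes > section_size_) return nullptr;
        if (sections_.empty() || used_ + bytes > section_size_) {
            sections_.push_back(
                std::make_unique<unsigned char[]>(section_size_));
            used_ = 0;
        }
        void* p = sections_.back().get() + used_;
        used_ += bytes;
        return p;
    }

    std::size_t section_count() const { return sections_.size(); }

private:
    std::size_t section_size_;
    std::size_t used_ = 0;  // bytes used in the newest section
    std::vector<std::unique_ptr<unsigned char[]>> sections_;
};
```

<p>All sections are released together when the arena is destroyed, so
the memory in use at any time is bounded by the sections actually
touched rather than by the largest area the application might
need.</p>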
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e67" id="d0e67"></a>Fast or
Slow</h2>
</div>
<p>The handling of memory is not the only problem. The performance
of the application is also affected. Implicit GC is the best
solution from that viewpoint. A special case occurs if the
application runs in a multithreaded environment, where the GC is
much more complex.</p>
<p>Of course, using implicit GC brings the disadvantages described in
the section above, but at the moment there is no other solution if the
application is speed-limited.</p>
<p>A future solution will be to include an explicit GC in a
programming language. In this way the explicit GC becomes an
implicit one: it becomes part of the programming
language<sup>[<a name="d0e76" href="#ftn.d0e76" id=
"d0e76">2</a>]</sup>.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e80" id="d0e80"></a>Time
Enough?</h2>
</div>
<p>The time factor for an explicit GC is much higher than for an
implicit one. An allocation with an implicit GC is about 60%
faster than with an explicit one. If you have to use an explicit GC
and want to avoid this disadvantage you have to ensure that the
compiler supports the explicit GC. One solution is for the GC to
provide virtual memory. Modern compilers can handle this kind
of memory. A much more complex technology is necessary if the GC
is to work only during idle time. This cannot be realised in real-time
applications, only in applications that depend heavily on
user interaction.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e85" id="d0e85"></a>Which Page
Please?</h2>
</div>
<p>As I mentioned in the last section, virtual memory (VM) helps to
minimise time. Any kind of dynamic memory should work together with
virtual memory. The advantage is that the access time to the disk
for local memory paging is much less. This is recommended for any
kind of application, whether or not it is real-time software and
whether it is a small or a large application. Connecting GC with
virtual memory is always a good solution.
This kind of memory can be used in both implicit and explicit GC
systems. In particular, the handling of pointers to very large memory
sections can be done in VM. In some cases it is possible to
allocate a very big memory area with only one pointer by using VM.
The final question is where the VM comes from.</p>
<p>There are two possibilities. First you can use real RAM. This is
very fast but you need a lot of it because the VM needs it as well as
the application. A much better solution is to use a part of a hard
disk. UNIX systems can set aside normal hard-disk space as virtual
memory. Access to the resulting VM is then much faster than access to
the normal hard disk. The size of the VM depends on the operating
system used and on the requirements of the application. In general 128MB is
often enough for a normal business application.</p>
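<p>The earlier remark that VM lets a very big memory area be reached
through a single pointer can be illustrated on a POSIX system with
<code>mmap</code>. This is a minimal sketch under the assumption that
anonymous private mappings (<code>MAP_ANONYMOUS</code>) are available,
as they are on Linux and the BSDs; the wrapper names
<code>reserve_region</code> and <code>release_region</code> are
hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Asks the operating system for `bytes` of zero-filled virtual memory.
// Pages are typically backed by physical memory only when first touched.
// Returns nullptr on failure.
void* reserve_region(std::size_t bytes) {
    assert(bytes > 0);
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Returns the whole region to the operating system.
void release_region(void* p, std::size_t bytes) {
    if (p != nullptr) {
        munmap(p, bytes);
    }
}
```

<p>A collector or allocator built on such a region can hand out pieces
of it by offset, while the operating system decides which pages live in
RAM and which are paged out to disk.</p>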
</div>
<div class="footnotes"><br>
<hr class="c3" width="100">
<div class="footnote">
<p><sup>[<a name="ftn.d0e50" href="#d0e50" id=
"ftn.d0e50">1</a>]</sup> Henrik seemed to have inverted implicit
and explicit in this section. If GC experts think the section is
wrong, the error is probably mine. FG</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e76" href="#d0e76" id=
"ftn.d0e76">2</a>]</sup> GC is done automatically in my own
programming language.</p>
</div>
</div>
</p>
</div>
</channel>
</rss>
