    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Software Engineers Toolbox</title>
        <link>https://members.accu.org/index.php/articles/727</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 8, #2 - Apr 1996</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Software Engineers Toolbox</h1>
<p>
<strong>Date:</strong> Wed, 03 April 1996 13:15:27 +01:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a>The
Preprocessor - Part 1</h2>
</div>
<p>When we think about compiling a program, we tend to think of the
whole process (starting with a file of C source code and finishing
with an executable program image) as a single, monolithic process.
This view is reinforced by the fact that often we need only issue a
single command (such as <tt class="literal">cl</tt>, <tt class=
"literal">cc</tt> or <tt class="literal">bcc</tt>) to go from the
source file to an executable program. (I gave a detailed overview
of the whole process in an earlier, six-part series, starting in
CVu, Vol 5 Issue 5.) In fact, the process of compilation
comprises many distinct activities performed in series. The C
standard defines eight phases of compilation. This is a bit much
for our purpose here, so I will simply divide it into three major
blocks: preprocessing, translation and linking. Translation is the
principal activity, which converts a text file of valid C tokens
into an <span class="emphasis"><em>object file</em></span> (a
partial binary program image).</p>
<p>The linker joins these object files into a single executable
unit. The file which the translator takes as its input is called a
<span class="emphasis"><em>translation unit</em></span>. This is
not (in general) the same as the file of C code - the <span class=
"emphasis"><em>source file</em></span> - that you created with your
editor. Although the source file and translation unit are both text
files, they can be quite different. The purpose of the preprocessor
is, as we shall see, to convert the <span class=
"emphasis"><em>source file</em></span> that you wrote into a
<span class="emphasis"><em>translation unit</em></span> that the
translator can handle.</p>
<p>Provided an ISO C compiler obeys the rules of translation, it
does not have to provide a separate preprocessor. The functionality
of the preprocessor and the translator (and the linker, for that
matter) could be built into a single, monolithic program. In
practice, this is both unnecessary and undesirable.</p>
<p>Usually, the compiler will be implemented as a set of three,
four or even more programs that run sequentially. Each program
creates a temporary output file which is used as the input to the
next program in the chain. Normally, these temporary files are
transparent to the user but most compilers have options which allow
you to save some of them (or at least a representation of them) to
a text file.</p>
<p>This can be a very useful way to learn more about the way the
preprocessor works. Most compilers let you create a preprocessor
output file, which is more-or-less the <span class=
"emphasis"><em>translation unit</em></span> that the translator
sees. By comparing this with the original <span class=
"emphasis"><em>source file</em></span>, you can see just what
transformations the preprocessor makes.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e61" id="d0e61"></a>Why a
Preprocessor?</h2>
</div>
<p>Now that I've shown where the preprocessor fits into the scheme
of things and broadly hinted at what it does, the obvious question
must be, why do we need one in the first place? The glib answer
would be 'To make up for some shortcomings in the language.'</p>
<p>That would be at least half true. Some of the features that the
preprocessor is used for can, and probably should, be done in the
translator. (Some may even make it into the next revision of the
standard.) However, given that we have to live with a certain core
language definition, let's see what a preprocessor can do for
us.</p>
<p>Probably the single most common use of the preprocessor is file
inclusion. There are often lines of code, such as function
prototypes, type definitions and global variable declarations,
which we wish to include in more than one file of a project. It is
impractical (and dangerous) to simply add the same lines to every
file. Maintenance would be (is!) a nightmare. The answer is to put
all the common code in a file of its own, then simply include the
text of that file wherever it is needed in other files.</p>
<p>Another common use is the definition of manifest constants. A manifest constant is one where a constant
value is associated with a name. The code then uses the name
instead of the value. This is important for several reasons. First,
numbers by themselves give no clue to their purpose. By using a
name instead, you can provide meaning. Seeing an array definition
with a size of 10 tells you little.</p>
<p>Seeing a size of MAX_BUTTONS tells you much more. Second, you
may need to use a certain value in many places. If you use the
'raw' number, changing it becomes very difficult. You can't just
change all occurrences of 10 to 12, say, because 10 may have been
used for many different reasons. You would have to laboriously
check each one to see if you should change it. If you use a
manifest constant, you only change the value associated with the
name and the new value is automatically used in all the places
where the name occurs. const qualified variables provide a solution
in some contexts, but there are problems. A const variable always
allocates storage. This is usually unnecessary and may waste space
if there are many constants. Also, a const variable cannot always
be used in place of a true constant. The preprocessor can be used
to get around this.</p>
<p>An important requirement in many applications is speed and code
can often be speeded up considerably by avoiding function call
overheads. The solution to this is to create some sort of
function which expands to inline code. Some languages allow this by
applying a keyword (such as <tt class="literal">inline</tt>) to a
normal function. In C, we can only do this by using the
preprocessor.</p>
<p>Sometimes, it is necessary to change the code depending on
certain conditions. Examples are compiling for different
environments or building debug or production versions of code. It
is useful to be able to include code for all possibilities and then
select which code will be used at compile time. This is called
conditional compilation.</p>
<p>These are the major requirements which cannot be adequately met
with just the core language definition. In C, they are met using
preprocessor facilities. There are a few other, minor tasks which
the preprocessor can do, which I will discuss later.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e83" id="d0e83"></a>So how is it
done?</h2>
</div>
<p>As a programmer, you control the actions of the preprocessor by
including <span class="emphasis"><em>preprocessor
directives</em></span> in the source file. These are commands which
the preprocessor (and only the preprocessor) knows how to execute.
There are thirteen preprocessor directives, counting the null directive
(a '#' on a line by itself, which has no effect). The other twelve are
listed in the following table.</p>
<div class="table"><a name="d0e91" id="d0e91"></a>
<p class="title c2">Table 1. Preprocessor directives</p>
<table summary="" border="1">
<tr>
<td>Directive</td>
<td>Action</td>
</tr>
<tr>
<td>#include</td>
<td>Include a file of text</td>
</tr>
<tr>
<td>#define</td>
<td>Define a macro</td>
</tr>
<tr>
<td>#undef</td>
<td>Undefine a macro</td>
</tr>
<tr>
<td>#if</td>
<td>Conditional test</td>
</tr>
<tr>
<td>#elif</td>
<td>Conditional else-if test</td>
</tr>
<tr>
<td>#else</td>
<td>Conditional else</td>
</tr>
<tr>
<td>#ifdef</td>
<td>Conditional test - macro is defined</td>
</tr>
<tr>
<td>#ifndef</td>
<td>Conditional test - macro is not defined</td>
</tr>
<tr>
<td>#endif</td>
<td>End of conditional test</td>
</tr>
<tr>
<td>#error</td>
<td>Abort compilation</td>
</tr>
<tr>
<td>#line</td>
<td>Set line number and file name</td>
</tr>
<tr>
<td>#pragma</td>
<td>Implementation specific command</td>
</tr>
</table>
</div>
<p>Each of these directives must be entered so that the '#' is the
first non-blank character on a line. Earlier compilers often
required the '#' to be in the first column. ISO C allows any amount
of white space (space and horizontal tab only) before the '#' or
between the '#' and the directive name. This allows preprocessor
directives to be indented in the same way as other code to improve
readability. This is particularly useful when using conditional
directives.</p>
<p>There are also three preprocessing operators: defined, # and
##.</p>
<div class="table"><a name="d0e164" id="d0e164"></a>
<p class="title c2">Table 2. Preprocessing operators</p>
<table summary="" border="1">
<tr>
<td>Operator</td>
<td>Action</td>
</tr>
<tr>
<td>defined</td>
<td>evaluates true if macro is defined</td>
</tr>
<tr>
<td>#</td>
<td>stringising</td>
</tr>
<tr>
<td>##</td>
<td>token pasting</td>
</tr>
</table>
</div>
<p>Now let's look at how these are used.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e190" id="d0e190"></a>Defining
Macros</h2>
</div>
<p>The term macro is often used to mean a sequence of instructions to be
executed, as in shell scripts or DOS batch files. C macros are
rather different. A C macro is a form of programmable text
substitution. There are two forms of macro: object-like macros and
function-like macros. Where it is necessary to make a clear
distinction between them, I will refer to them as o-macros and
f-macros. The simpler form of macro is the o-macro. This has the
syntax:</p>
<pre class="programlisting">
#define  identifier replacement-list new-line
</pre>
<p>The identifier defines the name of the macro. It must conform to
the usual rules for naming C identifiers, so a macro name can look
just like a variable or function name. The replacement list is all
the text following the identifier (including spaces), except the
white space between identifier and replacement-list is discarded,
as is any trailing white space at the end of the line. Once the
preprocessor has processed this line, it will look for the macro
name anywhere in the C code except inside string constants. If it
finds one, it will replace the name by all the text in the
replacement list.</p>
<p>For example, consider a macro directive such as <tt class="literal">#define MAX_PORTS 4</tt>.</p>
<p>Once the preprocessor has seen this macro definition, it
will change any (non-string) occurrence of MAX_PORTS to 4. So if we
had the following program fragment in our source file,</p>
<pre class="programlisting">
void func(void) {
  int i;
  for(i=0; i&lt;MAX_PORTS; i++)
    init_port(i);
  puts(&quot;MAX_PORTS\n&quot;);
}
</pre>
<p>it would be changed by the preprocessor to</p>
<pre class="programlisting">
void func(void) {
  int i;
  for(i=0; i&lt;4; i++)      /* that MAX_PORTS was changed */
    init_port(i);
  puts(&quot;MAX_PORTS\n&quot;);    /* that one wasn't; it's in a string */
}
</pre>
<p>If you have been reading attentively, you should realise by now
that this is using an o-macro to provide a manifest constant.
(There are other applications, as we shall see in a minute.)
Although this example uses an integer, this is not the only way it
can be used. You can just as easily create manifest constants for
other types of constant. For example,</p>
<pre class="programlisting">
#define  DEFAULT_DIR &quot;/usr/bin&quot;
</pre>
<p>and</p>
<pre class="programlisting">
#define  EPSILON  1.2E-10
</pre>
<p>create manifest constants for a string and a real number
respectively.</p>
<p>O-macros can be used for purposes other than just manifest
constants. They can be used for any straight text substitution. A
typical example which you may sometimes see is something like:</p>
<pre class="programlisting">
#define  LOOP     while(1)
</pre>
<p>This is then used in the code as, say:</p>
<pre class="programlisting">
LOOP
{
 /*  continuous process */
}
</pre>
<p>Another example that I have seen is to create aliases for
complex variable expressions. I worked on a package that used
complex communications message formats. The structure into which
the data was stored had several levels of nested structs and
unions. If the element names were used in full, you ended up with
long lines of code full of names like <tt class=
"literal">Q931.Infomation.Codeset6.UserID.String</tt>. To make the
code more readable (and more manageable), these were aliased using
o-macros such as:</p>
<pre class="programlisting">
#define  Msg_Username Q931.Infomation.Codeset6.UserID.String
</pre>
<p>Now the code could use the much shorter (and more meaningful)
alias name to refer to the data object.</p>
<p>So far, I have only given examples of replacement lists which
are a single token. They can, of course, be much more complex than
this. They can even include references to other macros. Say, for
example, that we wish to create a manifest constant that yields a
number which is two bigger than a default buffer size, BUF_SIZE.
Somewhere in the program is a definition</p>
<pre class="programlisting">
#define  BUF_SIZE  100
</pre>
<p>At a point after this definition, we can write</p>
<pre class="programlisting">
#define  NEW_BUF  BUF_SIZE + 2
</pre>
<p>When the preprocessor encounters NEW_BUF, it will simply replace
it with the replacement list text 'BUF_SIZE + 2'. After the
substitution, the preprocessor rescans the line looking for more
macros. It now sees the macro BUF_SIZE, and replaces it with its
replacement list, 100. The next rescan sees no more macro names, so
makes no more changes. If we had started with the line</p>
<pre class="programlisting">
int  buf[NEW_BUF];
</pre>
<p>then the final result in the translation unit would be</p>
<pre class="programlisting">
int buf[100 + 2];
</pre>
<p>Now this example would work as expected, but there is a
potential problem. What if you wrote a declaration</p>
<pre class="programlisting">
int  buf[NEW_BUF * 2];
</pre>
<p>Since you expect NEW_BUF to be 102, the expected answer is that
the array will have 204 elements, but it doesn't. Remember, these
are not true manifest constants, simply textual replacements. What
you end up with is</p>
<pre class="programlisting">
int buf[100 + 2 * 2];
</pre>
<p>which evaluates to 104, not 204. This is a very common precedence
problem with this type of complex expression in replacement lists.
To avoid such problems, you must adopt the practice of ALWAYS
putting the expression in parentheses. This then ensures that the
expressions are evaluated as you would expect. The NEW_BUF
definition should be written as:</p>
<pre class="programlisting">
#define  NEW_BUF  (BUF_SIZE + 2)
</pre>
<p>Although the examples I have given so far contain complete
expressions, this is not necessary. A macro replacement list can
contain just about any text, including no text. A macro with no
replacement list is called an empty macro and has several uses. We
will see uses for empty macros when we look at conditional
compilation. The only restriction on the replacement list text is
that it must consist of valid preprocessor tokens (pp-tokens). The
set of valid pp-tokens includes identifiers, character constants,
string literals, pp-numbers (integers and floating numbers),
operators and punctuators. What you cannot have is a partial
token. For example,</p>
<pre class="programlisting">
#define  STRING  &quot;text string&quot;
</pre>
<p>is acceptable,</p>
<pre class="programlisting">
#define STRING1  &quot;text
#define STRING2   string&quot;
</pre>
<p>are not, even if they would appear to create a valid token after
processing. For example,</p>
<pre class="programlisting">
puts( STRING1 STRING2);  
</pre>
<p>would appear to give a valid line of code after substitution,
but it is not allowed. Each replacement list must contain complete
tokens. They do not, however, have to be complete expressions.</p>
</div>
</p>
</div>
</channel>
</rss>
