    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Recursive Make Considered Harmful</title>
        <link>https://members.accu.org/index.php/articles/2004</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + Overload Journal #71 - Feb 2006</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Recursive Make Considered Harmful</h1>
<p><strong>Author:</strong>&nbsp;Martin Moene</p>
<p>
<strong>Date:</strong> Wed, 01 February 2006 21:22:40 +00:00</p>
<p><strong>Body:</strong>&nbsp;<h2>Abstract</h2>
<p>For large UNIX projects, the traditional method of building the project is to use recursive <code>make</code>. On some projects, this results in build times which are unacceptably large, when all you want to do is change one file. In examining the source of the overly long build times, it became evident that a number of apparently unrelated problems combine to produce the delay, but on analysis all have the same root cause.</p>
<p>This paper explores a number of problems regarding the use of recursive <code>make</code>, and shows that they are all symptoms of the same problem. Symptoms that the UNIX community have long accepted as a fact of life, but which need not be endured any longer. These problems include recursive <code>make</code>s which take “forever” to work out that they need to do nothing, recursive <code>make</code>s which do too much, or too little, recursive <code>make</code>s which are overly sensitive to changes in the source code and require constant <code>Makefile</code> intervention to keep them working.</p>
<p>The resolution of these problems can be found by looking at what <code>make</code> does, from first principles, and then analyzing the effects of introducing recursive <code>make</code> to this activity. The analysis shows that the problem stems from the artificial partitioning of the build into separate subsets. This, in turn, leads to the symptoms described. To avoid the symptoms, it is only necessary to avoid the separation; to use a single <code>make</code> session to build the whole project, which is not quite the same as a single <code>Makefile</code>.</p>
<p>This conclusion runs counter to much accumulated folk wisdom in building large projects on UNIX. Some of the main objections raised by this folk wisdom are examined and shown to be unfounded. The results of actual use are far more encouraging, with routine development performance improvements significantly faster than intuition may indicate, and without the intuitively expected compromise of modularity. The use of a whole project <code>make</code> is not as difficult to put into practice as it may at first appear.</p>
<h2>Introduction</h2>
<p>For large UNIX software development projects, the traditional methods of building the project use what has come to be known as “recursive <code>make</code>.” This refers to the use of a hierarchy of directories containing source files for the modules which make up the project, where each of the sub-directories contains a <code>Makefile</code> which describes the rules and instructions for the <code>make</code> program. The complete project build is done by arranging for the top-level <code>Makefile</code> to change directory into each of the sub-directories and recursively invoke <code>make</code>.</p>
<p>This paper explores some significant problems encountered when developing software projects using the recursive <code>make</code> technique. A simple solution is offered, and some of the implications of that solution are explored.</p>
<p>Recursive make results in a directory tree which looks something like figure 1. This hierarchy of modules can be nested arbitrarily deep. Real-world projects often use two- and three-level structures.</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:45%" src="/content/images/journals/ol71/Miller/Miller-Figure-1.png"></p></td>
</tr>
<tr>
<td class="title">Figure 1</td>
</tr>
</table>

<h2>Assumed Knowledge</h2>
<p>This paper assumes that the reader is familiar with developing software on UNIX, with the <code>make</code> program, and with the issues of C programming and include file dependencies.</p>
<p>This paper assumes that you have installed GNU Make on your system and are moderately familiar with its features. Some features of <code>make</code> described below may not be available if you are using the limited version supplied by your vendor.</p>
<h2>The Problem</h2>
<p>There are numerous problems with recursive <code>make</code>, and they are usually observed daily in practice. Some of these problems include:</p>
<ul>
<li>
<p>It is very hard to get the order of the recursion into the sub-directories correct. This <em>order</em> is very unstable and frequently needs to be manually “tweaked.” Increasing the number of directories, or increasing the depth in the directory tree, causes this order to be increasingly unstable.</p>
</li>
<li>
<p>It is often necessary to do more than one pass over the sub-directories to build the whole system. This, naturally, leads to extended build times.</p>
</li>
<li>
<p>Because the builds take so long, some dependency information is omitted, otherwise development builds take unreasonable lengths of time, and the developers are unproductive. This usually leads to things not being updated when they need to be, requiring frequent “clean” builds from scratch, to ensure everything has actually been built.</p>
</li>
<li>
<p>Because inter-directory dependencies are either omitted or too hard to express, the <code>Makefiles</code> are often written to build <em>too much</em> to ensure that nothing is left out.</p>
</li>
<li>
<p>The inaccuracy of the dependencies, or the simple lack of dependencies, can result in a product which is incapable of building cleanly, requiring the build process to be carefully watched by a human.</p>
</li>
<li>
<p>Related to the above, some projects are incapable of taking advantage of various “parallel make” implementations, because the build does patently silly things.</p>
</li>
</ul>
<p>Not all projects experience all of these problems. Those that do experience the problems may do so intermittently, and dismiss the problems as unexplained “one off” quirks. This paper attempts to bring together a range of symptoms observed over long practice, and presents a systematic analysis and solution.</p>
<p>It must be emphasized that this paper does not suggest that <code>make</code> itself is the problem. This paper is working from the premise that <code>make</code> does <em>not</em> have a bug, that <code>make</code> does <em>not</em> have a design flaw. The problem is not in <code>make</code> at all, but rather in the input given to <code>make</code> – the way <code>make</code> is being used.</p>
<h2>Analysis</h2>
<p>Before it is possible to address these seemingly unrelated problems, it is first necessary to understand what <code>make</code> does and how it does it. It is then possible to look at the effects recursive <code>make</code> has on how <code>make</code> behaves.</p>
<h2>Whole Project Make</h2>
<p><code>make</code> is an expert system. You give it a set of rules for how to construct things, and a target to be constructed. The rules can be decomposed into pair-wise ordered dependencies between files. <code>make</code> takes the rules and determines how to build the given target. Once it has determined how to construct the target, it proceeds to do so.</p>
<p><code>make</code> determines how to build the target by constructing a <em>directed acyclic graph</em>, the DAG familiar to many Computer Science students. The vertices of this graph are the files in the system, the edges of this graph are the inter-file dependencies. The edges of the graph are directed because the pair-wise dependencies are ordered; resulting in an <em>acyclic</em> graph – things which look like loops are resolved by the direction of the edges.</p>
<p>This paper will use a small example project for its analysis. While the number of files in this example is small, there is sufficient complexity to demonstrate all of the above recursive <code>make</code> problems. First, however, the project is presented in a non-recursive form (figure 2).</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:30%" src="/content/images/journals/ol71/Miller/Miller-Figure-2.png"></p></td>
</tr>
<tr>
<td class="title">Figure 2</td>
</tr>
</table>

<p>The Makefile in this small project looks like this:</p>
<pre><code>OBJ = main.o parse.o
prog: $(OBJ)
	$(CC) -o $@ $(OBJ)
main.o: main.c parse.h
	$(CC) -c main.c
parse.o: parse.c parse.h
	$(CC) -c parse.c
</code></pre>

<p>Some of the implicit rules of <code>make</code> are presented here explicitly, to assist the reader in converting the <code>Makefile</code> into its equivalent DAG.</p>
<p>The above <code>Makefile</code> can be drawn as a DAG in the form shown in figure 3.</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:50%" src="/content/images/journals/ol71/Miller/Miller-Figure-3.png"></p></td>
</tr>
<tr>
<td class="title">Figure 3</td>
</tr>
</table>

<p>This is an <em>acyclic</em> graph because of the arrows which express the ordering of the relationship between the files. If there <em>was</em> a circular dependency according to the arrows, it would be an error.</p>
<p>Note that the object files (<code>.o</code>) are dependent on the include files (<code>.h</code>) even though it is the source files (<code>.c</code>) which do the including. This is because if an include file changes, it is the object files which are out-of-date, not the source files.</p>
<p>The second part of what <code>make</code> does is to perform a <em>postorder</em> traversal of the DAG. That is, the dependencies are visited first. The actual order of traversal is undefined, but most <code>make</code> implementations work down the graph from left to right for edges below the same vertex, and most projects implicitly rely on this behaviour. The last-time-modified of each file is examined, and higher files are determined to be out-of-date if any of the lower files on which they depend are younger. Where a file is determined to be out-of-date, the action associated with the relevant graph edge is performed (in the above example, a compile or a link).</p>
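<p>The traversal described above can be observed with a small stand-in for the example <code>Makefile</code>. This is only a sketch: the compile and link actions are replaced by <code>echo</code> and <code>touch</code> so that it runs without any C sources, and the last rule, which creates empty placeholder files, is an assumption added for the demonstration and is not part of the example project.</p>
<pre><code># Stand-in for the example project: each action only reports itself
# and touches its target, so no compiler is needed.
prog: main.o parse.o
	@echo "link prog"; touch $@
main.o: main.c parse.h
	@echo "compile main.c"; touch $@
parse.o: parse.c parse.h
	@echo "compile parse.c"; touch $@
# Hypothetical helper rule: create empty placeholder sources.
main.c parse.c parse.h:
	@touch $@
</code></pre>

<p>Run in an empty directory, the first <code>make</code> reports both compiles before the link, because the dependencies are visited first. After a later <code>touch parse.h</code>, both object files and <code>prog</code> are out-of-date and are redone; after <code>touch parse.c</code>, only <code>parse.o</code> and <code>prog</code> are.</p>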
<p>The use of recursive <code>make</code> affects both phases of the operation of <code>make</code>: it causes <code>make</code> to construct an inaccurate DAG, and it forces <code>make</code> to traverse the DAG in an inappropriate order.</p>
<h2>Recursive Make</h2>
<p>To examine the effects of recursive <code>make</code>s, the above example will be artificially segmented into two modules, each with its own <code>Makefile</code>, and a top-level <code>Makefile</code> used to invoke each of the module <code>Makefiles</code>.</p>
<p>This example is intentionally artificial, and thoroughly so. However, all “modularity” of all projects is artificial, to some extent. Consider: for many projects, the linker flattens it all out again, right at the end. The directory structure is as shown in figure 4.</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:30%" src="/content/images/journals/ol71/Miller/Miller-Figure-4.png"></p></td>
</tr>
<tr>
<td class="title">Figure 4</td>
</tr>
</table>

<p>The top-level <code>Makefile</code> often looks a lot like a shell script:</p>
<pre><code>MODULES = ant bee
all:
	for dir in $(MODULES); do \
		(cd $$dir; ${MAKE} all); \
	done
</code></pre>

<p>The <code>ant/Makefile</code> looks like this:</p>
<pre><code>all: main.o
main.o: main.c ../bee/parse.h
	$(CC) -I../bee -c main.c
</code></pre>

<p>and the equivalent DAG looks like figure 5.</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:50%" src="/content/images/journals/ol71/Miller/Miller-Figure-5.png"></p></td>
</tr>
<tr>
<td class="title">Figure 5</td>
</tr>
</table>

<p>The <code>bee/Makefile</code> looks like this:</p>
<pre><code>OBJ = ../ant/main.o parse.o
all: prog
prog: $(OBJ)
	$(CC) -o $@ $(OBJ)
parse.o: parse.c parse.h
	$(CC) -c parse.c
</code></pre>

<p>and the equivalent DAG looks like figure 6.</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:50%" src="/content/images/journals/ol71/Miller/Miller-Figure-6.png"></p></td>
</tr>
<tr>
<td class="title">Figure 6</td>
</tr>
</table>

<p>Take a close look at the DAGs. Notice how neither is complete – there are vertices and edges (files and dependencies) missing
from both DAGs. When the entire build is done from the top level,
everything will work.</p>
<p>But what happens when small changes occur? For example, what
would happen if the <code>parse.c</code> and <code>parse.h</code> files were generated from a <code>parse.y</code> yacc grammar? This would add the following lines to the <code>bee/Makefile</code>:</p>
<pre><code>parse.c parse.h: parse.y
	$(YACC) -d parse.y
	mv y.tab.c parse.c
	mv y.tab.h parse.h
</code></pre>

<p>And the equivalent DAG changes to look like figure 7.</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:50%" src="/content/images/journals/ol71/Miller/Miller-Figure-7.png"></p></td>
</tr>
<tr>
<td class="title">Figure 7</td>
</tr>
</table>

<p>This change has a simple effect: if <code>parse.y</code> is edited, <code>main.o</code> will <em>not</em> be constructed correctly. This is because the DAG for <code>ant</code> knows about only some of the dependencies of <code>main.o</code>, and the DAG for <code>bee</code> knows none of them.</p>
<p>To understand why this happens, it is necessary to look at the actions <code>make</code> will take <em>from the top level</em>. Assume that the project is in a self-consistent state. Now edit <code>parse.y</code> in such a way that the generated <code>parse.h</code> file will have non-trivial differences. However, when the top-level <code>make</code> is invoked, first <code>ant</code> and then <code>bee</code> is visited. But <code>ant/main.o</code> is <em>not</em> recompiled, because <code>bee/parse.h</code> has not yet been regenerated and thus does not yet indicate that <code>main.o</code> is out-of-date. It is not until <code>bee</code> is visited by the recursive <code>make</code> that <code>parse.c</code> and <code>parse.h</code> are reconstructed, followed by <code>parse.o</code>. When the program is linked, <code>main.o</code> and <code>parse.o</code> are non-trivially incompatible. That is, the program is <em>wrong</em>.</p>
<h2>Traditional Solutions</h2>
<p>There are three traditional fixes for the above “glitch.”</p>
<h2>Reshuffle</h2>
<p>The first is to manually tweak the order of the modules in the top-level <code>Makefile</code>. But why is this tweak required at all? Isn’t <code>make</code> supposed to be an expert system? Is <code>make</code> somehow flawed, or did something else go wrong?</p>
<p>To answer this question, it is necessary to look, not at the graphs, but the <em>order of traversal</em> of the graphs. In order to operate correctly, <code>make</code> needs to perform a <b><em>postorder</em></b> traversal, but in separating the DAG into two pieces, <code>make</code> has not been <em>allowed</em> to traverse the graph in the necessary order – instead the project has dictated an order of traversal. An order which, when you consider the original graph, is plain <em>wrong</em>. Tweaking the top-level <code>Makefile</code> corrects the order to one similar to that which <code>make</code> could have used. Until the next dependency is added...</p>
<p>Note that <code>make -j</code> (parallel build) invalidates many of the ordering assumptions implicit in the reshuffle solution, making it useless. And then there are all of the sub-makes all doing their builds in parallel, too.</p>
<h2>Repetition</h2>
<p>The second traditional solution is to <code>make</code> more than one pass in the top-level <code>Makefile</code>, something like this:</p>
<pre><code>MODULES = ant bee
all:
	for dir in $(MODULES); do \
		(cd $$dir; ${MAKE} all); \
	done
	for dir in $(MODULES); do \
		(cd $$dir; ${MAKE} all); \
	done
</code></pre>

<p>This doubles the length of time it takes to perform the build. But that is not all: there is no guarantee that two passes are enough! The upper bound of the number of passes is not even proportional to the number of modules, it is instead proportional to the number of graph edges which cross module boundaries.</p>
<h2>Overkill</h2>
<p>We have already seen an example of how recursive <code>make</code> can build too little, but another common problem is to build too much. The third traditional solution to the above glitch is to add even <em>more</em> lines to <code>ant/Makefile</code>:</p>
<pre><code>.PHONY: ../bee/parse.h
../bee/parse.h:
	cd ../bee; \
	make clean; \
	make all
</code></pre>

<p>This means that whenever <code>main.o</code> is made, <code>parse.h</code> will always be considered to be out-of-date. All of <code>bee</code> will always be rebuilt including <code>parse.h</code>, and so <code>main.o</code> will always be rebuilt, <em>even if everything was self-consistent</em>.</p>
<p>Note that <code>make -j</code> (parallel build) invalidates many of the ordering assumptions implicit in the overkill solution, making it useless, because all of the sub-makes are all doing their builds (“clean” then “all”) in parallel, constantly interfering with each other in non-deterministic ways.</p>
<h2>Prevention</h2>
<p>The above analysis is based on one simple action: the DAG was artificially separated into incomplete pieces. This separation resulted in all of the problems familiar to recursive make builds.</p>
<p>Did <code>make</code> get it wrong? No. This is a case of the ancient GIGO principle: <em>Garbage In, Garbage Out</em>. Incomplete <code>Makefiles</code> are <em>wrong</em> <code>Makefiles</code>.</p>
<p>To avoid these problems, don’t break the DAG into pieces; instead, use one <code>Makefile</code> for the entire project. It is not the recursion itself which is harmful, it is the crippled <code>Makefiles</code> which are used in the recursion which are <em>wrong</em>. It is not a deficiency of <code>make</code> itself that recursive <code>make</code> is broken, it does the best it can with the flawed input it is given.</p>
<p><em>“But, but, but... You can’t do that!”</em> I hear you cry. <em>“A single
Makefile is too big, it’s unmaintainable, it’s too hard to write the rules, you’ll run out of memory, I only want to build my little bit, the build will take too long. It’s just not practical.”</em></p>
<p>These are valid concerns, and they frequently lead <code>make</code> users to the conclusion that re-working their build process does not have any short- or long-term benefits. This conclusion is based on ancient, enduring, false assumptions.</p>
<p>The following sections will address each of these concerns in turn.</p>
<h2>A Single Makefile is Too Big</h2>
<p>If the entire project build description were placed into a single <code>Makefile</code> this would certainly be true, however modern <code>make</code> implementations have include statements. By including a relevant fragment from each module, the total size of the <code>Makefile</code> and its include files need be no larger than the total size of the <code>Makefile</code>s in the recursive case.</p>
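<p>As a rough sketch of the include approach, a top-level <code>Makefile</code> for the ant/bee example might look like the following. It assumes GNU Make and a compiler that accepts <code>-o</code> together with <code>-c</code>; the fragment name <code>module.mk</code> and the variable <code>SRC</code> are illustrative choices rather than anything prescribed here, and include file dependencies are left out for brevity.</p>
<pre><code># Top-level Makefile (sketch). Assumed layout: each module directory
# holds a small fragment, here called module.mk, for example
#   ant/module.mk:  SRC += ant/main.c
#   bee/module.mk:  SRC += bee/parse.c
MODULES := ant bee
# Let every source file find the other modules' include files.
CPPFLAGS += $(patsubst %,-I%,$(MODULES))
# Each fragment appends its own sources to SRC.
SRC :=
include $(patsubst %,%/module.mk,$(MODULES))
OBJ := $(patsubst %.c,%.o,$(SRC))
prog: $(OBJ)
	$(CC) -o $@ $(OBJ)
</code></pre>

<p>Each module still owns the description of its own files, but <code>make</code> reads all of the fragments in one session and so works from a single DAG.</p>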
<h2>A Single Makefile Is Unmaintainable</h2>
<p>Using a single top-level <code>Makefile</code> which includes a fragment from each module is no more complex than the recursive case. Because the DAG is not segmented, this form of <code>Makefile</code> becomes less complex, and thus <em>more</em> maintainable, simply because fewer “tweaks” are required to keep it working.</p>
<p>Recursive <code>Makefile</code>s have a great deal of repetition. Many projects solve this by using include files. By using a single <code>Makefile</code> for the project, the need for the “common” include files disappears – the single <code>Makefile</code> is the common part.</p>
<h2>It’s Too Hard To Write The Rules</h2>
<p>The only change required is to include the directory part in filenames in a number of places. This is because the <code>make</code> is performed from the top level directory; the current directory is not the one in which the file appears. Where the output file is explicitly stated in a rule, this is not a problem.</p>
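<p>Applied to the ant/bee example, the change might look like the sketch below, with every file name carrying its directory part. It assumes GNU Make and a compiler that accepts <code>-o</code> together with <code>-c</code>. The yacc rule is written here as a pattern rule with two targets, which GNU Make treats as one action producing both files; that form is a substitution made for this sketch, not the form used in the recursive example.</p>
<pre><code># Whole-project version of the ant/bee example, run from the top level.
OBJ = ant/main.o bee/parse.o
prog: $(OBJ)
	$(CC) -o $@ $(OBJ)
ant/main.o: ant/main.c bee/parse.h
	$(CC) -Ibee -c ant/main.c -o $@
bee/parse.o: bee/parse.c bee/parse.h
	$(CC) -c bee/parse.c -o $@
# One yacc run makes both bee/parse.c and bee/parse.h.
bee/%.c bee/%.h: bee/%.y
	$(YACC) -d $&lt;
	mv y.tab.c bee/$*.c
	mv y.tab.h bee/$*.h
</code></pre>

<p>Because <code>ant/main.o</code> and <code>bee/parse.y</code> now appear in the same DAG, editing <code>parse.y</code> causes <code>parse.h</code> to be regenerated before <code>main.o</code> is considered.</p>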
<p>GCC allows a <code>-o</code> option in conjunction with the <code>-c</code> option, and GNU Make knows this. This results in the implicit compilation rule placing the output in the correct place. Older and dumber C compilers, however, may not allow the <code>-o</code> option with the <code>-c</code> option, and will leave the object file in the top-level directory (i.e. the wrong directory). There are three ways for you to fix this: get GNU Make and GCC, override the built-in rule with one which does the right thing, or complain to your vendor.</p>
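<p>For the second of these options, one possible override is sketched below. It is an assumption about what “the right thing” could look like for a compiler that rejects <code>-o</code> with <code>-c</code>: the object file is left in the current directory and then moved next to its source.</p>
<pre><code># Sketch of an override for compilers that reject "-c -o".
# Assumes all objects live in sub-directories and that no two
# source files in the project share a base name.
%.o: %.c
	$(CC) $(CPPFLAGS) $(CFLAGS) -c $&lt;
	mv $(@F) $@
</code></pre>

<p>Here <code>$(@F)</code> is the file part of the target name, so <code>ant/main.o</code> is first produced as <code>main.o</code> at the top level and then moved into <code>ant</code>. Under a parallel build, two sources with the same base name would collide, which is why the sketch assumes unique names.</p>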
<p>Also, K&amp;R C compilers will start the double-quote include path (<code>#include &quot;filename.h&quot;</code>) from the current directory. This will not do what you want. ANSI C compliant C compilers, however, start the double-quote include path from the directory in which the source file appears; thus, no source changes are required. If you don’t have an ANSI C compliant C compiler, you should consider installing GCC on your system as soon as possible.</p>
<h2>I Only Want To Build My Little Bit</h2>
<p>Most of the time, developers are deep within the project tree and they edit one or two files and then run <code>make</code> to compile their changes and try them out. They may do this dozens or hundreds of times a day. Being forced to do a full project build every time would be absurd.</p>
<p>Developers always have the option of giving <code>make</code> a specific target. This is always the case, it’s just that we usually rely on the default target in the <code>Makefile</code> in the current directory to shorten the command line for us. Building “my little bit” can still be done with a whole project <code>Makefile</code>, simply by using a specific target, and an alias if the command line is too long.</p>
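<p>One way to provide such targets is sketched below. It assumes a whole-project <code>Makefile</code> that already contains rules for <code>ant/main.o</code> and <code>bee/parse.o</code>; the per-module target names are hypothetical conveniences.</p>
<pre><code># Hypothetical convenience targets, one per module. They are declared
# phony because directories with the same names exist.
.PHONY: ant bee
ant: ant/main.o
bee: bee/parse.o
</code></pre>

<p>From the top level, <code>make ant</code> then brings only <code>ant/main.o</code> and whatever it depends on up to date. With GNU Make, a developer working inside <code>ant</code> could run <code>make -C .. ant</code> to the same effect.</p>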
<p>Is doing a full project build every time so absurd? If a change made in a module has repercussions in other modules, because there is a dependency the developer is unaware of (but the <code>Makefile</code> is aware of), isn’t it better that the developer find out as early as possible? Dependencies like this <em>will</em> be found, because the DAG is more complete than in the recursive case.</p>
<p>The developer is rarely a seasoned old salt who knows every one of the million lines of code in the product. More likely the developer is a short-term contractor or a junior. You don’t want implications like these to blow up after the changes are integrated with the master source, you want them to blow up on the developer in some nice safe sand-box far away from the master source.</p>
<p>If you want to make “just your little bit” because you are concerned that performing a full project build will corrupt the project master source, due to the directory structure used in your project, see the “Projects <em>versus</em> Sand-Boxes” section below.</p>
<h2>The Build Will Take Too Long</h2>
<p>This statement can be made from one of two perspectives. First, that a whole project <code>make</code>, even when everything is up-to-date, inevitably takes a long time to perform. Secondly, that these inevitable delays are unacceptable when a developer wants to quickly compile and link the one file that they have changed.</p>
<h2>Project Builds</h2>
<p>Consider a hypothetical project with 1000 source (<code>.c</code>) files, each of which has its calling interface defined in a corresponding include (<code>.h</code>) file with defines, type declarations and function prototypes. These 1000 source files include their own interface definition, plus the interface definitions of any other module they may call. These 1000 source files are compiled into 1000 object files which are then linked into an executable program. This system has some 3000 files which <code>make</code> must be told about, and be told about the include dependencies, and also explore the possibility that implicit rules (<code>.y -&gt; .c</code> for example) may be necessary.</p>
<p>In order to build the DAG, <code>make</code> must “stat” 3000 files, plus an additional 2000 files or so, depending on which implicit rules your <code>make</code> knows about and your <code>Makefile</code> has left enabled. On the author’s humble 66MHz i486 this takes about 10 seconds; on native disk on faster platforms it goes even faster. With NFS over 10MB Ethernet it takes about 10 seconds, no matter what the platform.</p>
<p>This is an astonishing statistic! Imagine being able to do a single file compile, out of 1000 source files, in only 10 seconds, plus the time for the compilation itself.</p>
<p>Breaking the set of files up into 100 modules, and running it as a recursive <code>make</code> takes about 25 seconds. The repeated process creation for the subordinate <code>make</code> invocations takes quite a long time.</p>
<p>Hang on a minute! On real-world projects with less than 1000 files, it takes an awful lot longer than 25 seconds for <code>make</code> to work out that it has nothing to do. For some projects, doing it in only 25 minutes would be an improvement! The above result tells us that it is not the number of files which is slowing us down (that only takes 10 seconds), and it is not the repeated process creation for the subordinate <code>make</code> invocations (that only takes another 15 seconds). So just what <em>is</em> taking so long?</p>
<p>The traditional solutions to the problems introduced by recursive <code>make</code> often increase the number of subordinate <code>make</code> invocations beyond the minimum described here; e.g. to perform multiple repetitions (see ‘Repetition’, above), or to overkill cross-module dependencies (see ‘Overkill’, above). These can take a long time, particularly when combined, but do not account for some of the more spectacular build times; what else is taking so long?</p>
<p>Complexity of the <code>Makefile</code> is what is taking so long. This is covered, below, in the ‘Efficient Makefiles’ section.</p>
<h2>Development Builds</h2>
<p>If, as in the 1000 file example, it only takes 10 seconds to figure out which one of the files needs to be recompiled, there is no serious threat to the productivity of developers if they do a whole project <code>make</code> as opposed to a module-specific <code>make</code>. The advantage for the project is that the module-centric developer is reminded at relevant times (and only relevant times) that their work has wider ramifications.</p>
<p>By consistently using C include files which contain accurate interface definitions (including function prototypes), this will produce compilation errors in many of the cases which would result in a defective product. By doing whole-project builds, developers discover such errors very early in the development process, and can fix the problems when they are least expensive.</p>
<h2>You’ll Run Out Of Memory</h2>
<p>This is the most interesting response. Once long ago, on a CPU far, far away, it may even have been true. When Feldman [1] first wrote <code>make</code> it was 1978 and he was using a PDP11. Unix processes were limited to 64KB of data.</p>
<p>On such a computer, the above project with its 3000 files detailed in the whole-project <code>Makefile</code>, would probably not allow the DAG and rule actions to fit in memory.</p>
<p>But we are not using PDP11s any more. The physical memory of modern computers exceeds 10MB for <em>small</em> computers, and virtual memory often exceeds 100MB. It is going to take a project with hundreds of thousands of source files to exhaust virtual memory on a <em>small</em> modern computer. As the 1000 source file example takes less than 100KB of memory (try it, I did) it is unlikely that any project manageable in a single directory tree on a single disk will exhaust your computer’s memory.</p>
<h2>Why Not Fix The DAG In The Modules?</h2>
<p>It was shown in the above discussion that the problem with recursive <code>make</code> is that the DAGs are incomplete. It follows that by adding the missing portions, the problems would be resolved without abandoning the existing recursive <code>make</code> investment. In practice, this approach runs into several difficulties:</p>
<ul>
<li>
<p>The developer needs to remember to do this. The problems will not affect the developer of the module, it will affect the developers of <em>other</em> modules. There is no trigger to remind the developer to do this, other than the ire of fellow developers.</p>
</li>
<li>
<p>It is difficult to work out where the changes need to be made. Potentially every <code>Makefile</code> in the entire project needs to be examined for possible modifications. Of course, you can wait for your fellow developers to find them for you.</p>
</li>
<li>
<p>The include dependencies will be recomputed unnecessarily, or will be interpreted incorrectly. This is because <code>make</code> is string based, and thus “<code>.</code>” and “<code>../ant</code>” are two different places, even when you are in the <code>ant</code> directory. This is of concern when include dependencies are automatically generated – as they are for all large projects.</p>
</li>
</ul>
<p>By making sure that each <code>Makefile</code> is complete, you arrive at the point where the <code>Makefile</code> for at least one module contains the equivalent of a whole-project <code>Makefile</code> (recall that these modules form a single project and are thus inter-connected), and there is no need for the recursion anymore.</p>
<h2>Efficient Makefiles</h2>
<p>The central theme of this paper is the <em>semantic</em> side-effects of artificially separating a <code>Makefile</code> into the pieces necessary to perform a recursive <code>make</code>. However, once you have a large number of <code>Makefile</code>s, the speed at which <code>make</code> can interpret this multitude of files also becomes an issue.</p>
<p>Builds can take “forever” for both these reasons: the traditional fixes for the separated DAG may be building too much <em>and</em> your <code>Makefile</code> may be inefficient.</p>
<h2>Deferred Evaluation</h2>
<p>The text in a <code>Makefile</code> must somehow be read from a text file and understood by <code>make</code> so that the DAG can be constructed, and the specified actions attached to the edges. This is all kept in memory.</p>
<p>The input language for <code>Makefile</code>s is deceptively simple. A crucial distinction that often escapes both novices and experts alike is that <code>make</code>’s input language is <em>text based</em>, as opposed to token based, as is the case for C or AWK. <code>make</code> does the very least possible to process input lines and stash them away in memory.</p>
<p>As an example of this, consider the following assignment:</p>
<pre><code>OBJ = main.o parse.o
</code></pre>

<p>Humans read this as the variable OBJ being assigned two filenames <code>main.o</code> and <code>parse.o</code>. But <code>make</code> does not see it that way. Instead OBJ is assigned the <em>string</em> “<code>main.o parse.o</code>”. It gets worse:</p>
<pre><code>SRC = main.c parse.c
OBJ = $(SRC:.c=.o)
</code></pre>

<p>In this case humans expect <code>make</code> to assign two filenames to OBJ, but <code>make</code> actually assigns the string “<code>$(SRC:.c=.o)</code>”. This is because it is a <em>macro</em> language with deferred evaluation, as opposed to one with variables and immediate evaluation.</p>
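<p>A minimal sketch of deferred evaluation in action, assuming GNU Make 3.81 or later for the <code>$(info)</code> function. The redefinition of SRC and the <code>all</code> target are illustrative additions, not part of the example above:</p>
<pre><code>SRC = main.c parse.c
OBJ = $(SRC:.c=.o)
SRC = lex.c

# OBJ still holds the text "$(SRC:.c=.o)". It is expanded only
# here, so it sees the later value of SRC.
$(info OBJ is "$(OBJ)")

all: ;
</code></pre>

<p>Running <code>make</code> on this prints <code>OBJ is "lex.o"</code>, because the text stored in OBJ is expanded when it is used, by which time SRC has changed.</p>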
<p>If this does not seem too problematic, consider the following <code>Makefile</code>:</p>
<pre><code>SRC = $(shell echo 'Ouch!' \
    1&gt;&amp;2 ; echo *.[cy])
OBJ = \
    $(patsubst %.c,%.o,\
    $(filter %.c,$(SRC))) \
    $(patsubst %.y,%.o,\
    $(filter %.y,$(SRC)))
test: $(OBJ)
	$(CC) -o $@ $(OBJ)
</code></pre>

<p>How many times will the shell command be executed? <strong>Ouch!</strong> It will be executed <em>twice</em> just to construct the DAG, and a further <em>two</em> times if the rule needs to be executed.</p>
<p>If this shell command does anything complex or time consuming (and it usually does) it will take <em>four</em> times longer than you thought.</p>

<p>But it is worth looking at the other portions of that OBJ macro. Each time it is named, a huge amount of processing is performed:</p>
<ul>
<li>
<p>The argument to <code>shell</code> is a single string (all built-in functions take a single string argument). The string is executed in a sub-shell, and the standard output of this command is read back in, translating new lines into spaces. The result is a single string.</p>
</li>
<li>
<p>The argument to <code>filter</code> is a single string. This argument is broken into two strings at the first comma. These two strings are then each broken into sub-strings separated by spaces. The first set are the patterns, the second set are the filenames. Then, for each of the pattern substrings, if a filename sub-string matches it, that filename is included in the output. Once all of the output has been found, it is re-assembled into a single space-separated string.</p>
</li>
<li>
<p>The argument to <code>patsubst</code> is a single string. This argument is broken into three strings at the first and second commas. The third string is then broken into sub-strings separated by spaces, these are the filenames. Then, for each of the filenames which match the first string it is substituted according to the second string. If a filename does not match, it is passed through unchanged. Once all of the output has been generated, it is re-assembled into a single space-separated string.</p>
</li>
</ul>
<p>Notice how many times those strings are disassembled and re-assembled. Notice how many ways that happens. <em>This is slow</em>. The example here names just two files but consider how inefficient this would be for 1000 files. Doing it <em>four</em> times becomes decidedly inefficient.</p>
<p>If you are using a dumb <code>make</code> that has no substitutions and no built-in functions, this cannot bite you. But a modern <code>make</code> has lots of built-in functions and can even invoke shell commands on-the-fly. The semantics of <code>make</code>’s text manipulation are such that string manipulation in <code>make</code> is very CPU intensive, compared to performing the same string manipulations in C or AWK.</p>
<h2>Immediate Evaluation</h2>
<p>Modern <code>make</code> implementations have an immediate evaluation := assignment operator. The above example can be re-written as</p>
<pre><code>SRC := $(shell echo 'Ouch!' \
    1&gt;&amp;2 ; echo *.[cy])
OBJ := \
    $(patsubst %.c,%.o,\
    $(filter %.c,$(SRC))) \
    $(patsubst %.y,%.o,\
    $(filter %.y,$(SRC)))
test: $(OBJ)
	$(CC) -o $@ $(OBJ)
</code></pre>

<p>Note that <em>both</em> assignments are immediate evaluation assignments. If the first were not, the shell command would always be executed twice. If the second were not, the expensive substitutions would be performed at least twice and possibly four times.</p>
<p>As a rule of thumb: always use immediate evaluation assignment unless you knowingly want deferred evaluation.</p>
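<p>The exception in this rule of thumb can be sketched as follows; the macro name LINK is hypothetical. Automatic variables such as <code>$@</code> have no value while the <code>Makefile</code> is being read, so a macro that mentions them has to be deferred:</p>
<pre><code># Immediate: each is expanded once, while the Makefile is read.
SRC := $(wildcard *.c)
OBJ := $(SRC:.c=.o)

# Deliberately deferred: $@ is only known when a recipe runs.
LINK = $(CC) -o $@

prog: $(OBJ)
	$(LINK) $(OBJ)
</code></pre>

<p>Had LINK been assigned with <code>:=</code>, the <code>$@</code> would have been expanded to an empty string at the point of assignment.</p>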
<h2>Include Files</h2>
<p>Many <code>Makefiles</code> perform the same text processing (the filters above, for example) for every single <code>make</code> run, but the results of the processing rarely change. Wherever practical, it is more efficient to record the results of the text processing into a file, and have the <code>Makefile</code> include this file.</p>
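<p>One way this might look is sketched below. The file name <code>sources.mk</code> is hypothetical, and the sketch relies on GNU Make building a missing include file from a rule and then re-reading the <code>Makefile</code>:</p>
<pre><code># sources.mk is a cache holding a single line of the form
#   SRC := main.c parse.c ...
include sources.mk
OBJ := $(SRC:.c=.o)

prog: $(OBJ)
	$(CC) -o $@ $(OBJ)

# Runs only when sources.mk does not exist.
sources.mk:
	echo "SRC := `echo *.c`" &gt; $@
</code></pre>

<p>The trade-off is that the cached list does not notice source files being added or removed; in this sketch the cache has to be deleted by hand to refresh it.</p>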
<h2>Dependencies</h2>
<p>Don’t be miserly with include files. They are relatively inexpensive to read, compared to $(shell), so more rather than less doesn’t greatly affect efficiency.</p>
<p>As an example of this, it is first necessary to describe a useful feature of GNU Make: once a <code>Makefile</code> has been read in, if any of its included files were out-of-date (or do not yet exist), they are re-built, and then <code>make</code> starts again, which has the result that <code>make</code> is now working with up-to-date include files. This feature can be exploited to obtain automatic include file dependency tracking for C sources. The obvious way to implement it, however, has a subtle flaw.</p>
<pre><code>SRC := $(wildcard *.c)
OBJ := $(SRC:.c=.o)
test: $(OBJ)
	$(CC) -o $@ $(OBJ)
include dependencies
dependencies: $(SRC)
	depend.sh $(CFLAGS) \
	$(SRC) &gt; $@
</code></pre>

<p>The <code>depend.sh</code> script prints lines of the form:</p>
<pre><code>file.o: file.c include.h ...
</code></pre>

<p>The simplest implementation of this is to use GCC, but you will need an equivalent awk script or C program if you have a different compiler:</p>
<pre><code>#!/bin/sh
gcc -MM -MG &quot;$@&quot;
</code></pre>

<p>This implementation of tracking C include dependencies has several serious flaws, but the one most commonly discovered is that the <code>dependencies</code> file does not, itself, depend on the C include files. That is, it is not re-built if one of the include files changes. There is no edge in the DAG joining the <code>dependencies</code> vertex to any of the include file vertices. If an include file changes to include another file (a nested include), the dependencies will not be recalculated, and potentially the C file will not be recompiled, and thus the program will not be re-built correctly.</p>
<p>A classic build-too-little problem, caused by giving <code>make</code> inadequate information, and thus causing it to build an inadequate DAG and reach the wrong conclusion.</p>
<p>The traditional solution is to build too much:</p>
<pre><code>SRC := $(wildcard *.c)
OBJ := $(SRC:.c=.o)
test: $(OBJ)
	$(CC) -o $@ $(OBJ)
include dependencies
.PHONY: dependencies
dependencies: $(SRC)
	depend.sh $(CFLAGS) \
	$(SRC) &gt; $@
</code></pre>

<p>Now, even if the project is completely up-to-date, the dependencies will be re-built. For a large project, this is very wasteful, and can be a major contributor to <code>make</code> taking “forever” to work out that nothing needs to be done.</p>
<p>There is a second problem, and that is that if any <em>one</em> of the C files changes, <em>all</em> of the C files will be re-scanned for include dependencies. This is as inefficient as having a <code>Makefile</code> which reads</p>
<pre><code>prog: $(SRC)
	$(CC) -o $@ $(SRC)
</code></pre>

<p>What is needed, in exact analogy to the C case, is to have an intermediate form. This is usually given a <code>.d</code> suffix. By exploiting the fact that more than one file may be named in an include line, there is no need to “link” all of the <code>.d</code> files together:</p>
<pre><code>SRC := $(wildcard *.c)
OBJ := $(SRC:.c=.o)
test: $(OBJ)
	$(CC) -o $@ $(OBJ)
include $(OBJ:.o=.d)
%.d: %.c
	depend.sh $(CFLAGS) $*.c &gt; $@
</code></pre>

<p>This has one more thing to fix: just as the object (<code>.o</code>) files depend on the source files and the include files, so do the dependency (<code>.d</code>) files.</p>
<pre><code>file.d file.o: file.c include.h
</code></pre>

<p>This means tinkering with the <code>depend.sh</code> script again:</p>
<pre><code>#!/bin/sh
gcc -MM -MG &quot;$@&quot; |
sed -e 's@^\(.*\)\.o:@\1.d \1.o:@'
</code></pre>

<p>This method of determining include file dependencies results in the <code>Makefile</code> including more files than the original method, but opening files is less expensive than rebuilding all of the dependencies every time. Typically a developer will edit one or two files before re-building; this method will rebuild the <em>exact</em> dependency file affected (or more than one, if you edited an include file). On balance, this will use less CPU, and less time.</p>
<p>In the case of a build where nothing needs to be done, <code>make</code> will actually do nothing, and will work this out very quickly.</p>
<p>However, the above technique assumes your project fits entirely within the one directory. For large projects, this usually isn’t the case.</p>
<p>This means tinkering with the <code>depend.sh</code> script again:</p>
<pre><code>#!/bin/sh
DIR=&quot;$1&quot;
shift 1
case &quot;$DIR&quot; in
&quot;&quot; | &quot;.&quot;)
gcc -MM -MG &quot;$@&quot; |
sed -e 's@^\(.*\)\.o:@\1.d \1.o:@'
;;
*)
gcc -MM -MG &quot;$@&quot; |
sed -e &quot;s@^\(.*\)\.o:@$DIR/\1.d $DIR/\1.o:@&quot;
;;
esac
</code></pre>

<p>And the rule needs to change, too, to pass the directory as the first argument, as the script expects.</p>
<pre><code>%.d: %.c
	depend.sh `dirname $*.c` $(CFLAGS) $*.c &gt; $@
</code></pre>

<p>Note that the <code>.d</code> files will be relative to the top level directory. Writing them so that they can be used from any level is possible, but beyond the scope of this paper.</p>
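<p>As an alternative sketch, and not the method used in this paper: compilers that accept GCC’s <code>-MT</code> option can be told to name both targets directly, which removes the need for the <code>sed</code> step and the directory argument. This assumes <code>$(CC)</code> is GCC or a compatible compiler:</p>
<pre><code># -MT sets the target text of the generated rule exactly as given,
# so the .d file and the .o file can be named together.
%.d: %.c
	$(CC) -MM -MG $(CFLAGS) -MT '$*.d $*.o' $&lt; &gt; $@
</code></pre>

<p>Here <code>$*</code> is the stem including any directory part, so <code>ant/main.c</code> would produce a rule whose targets are <code>ant/main.d ant/main.o</code>.</p>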
<h2>Multiplier</h2>
<p>All of the inefficiencies described in this section compound together. If you do 100 <code>Makefile</code> interpretations, once for each module, checking 1000 source files can take a very long time - if the interpretation requires complex processing or performs unnecessary work, or both. A whole project <code>make</code>, on the other hand, only needs to interpret a single <code>Makefile</code>.</p>
<h2>Projects versus Sand-boxes</h2>
<p>The above discussion assumes that a project resides under a single directory tree, and this is often the ideal. However, the realities of working with large software projects often lead to weird and wonderful directory structures in order to have developers working on different sections of the project without taking complete copies and thereby wasting precious disk space.</p>
<p>It is possible to see the whole-project <code>make</code> proposed here as impractical, because it does not match the evolved methods of your development process.</p>
<p>The whole-project <code>make</code> proposed here does have an effect on development methods: it can give you cleaner and simpler build environments for your developers. By using <code>make</code>’s VPATH feature, it is possible to copy only those files you need to edit into your private work area, often called a <em>sandbox</em>.</p>
<p>The simplest explanation of what VPATH does is to make an analogy with the include file search path specified using <code>-Ipath</code> options to the C compiler. This set of options describes where to look for files, just as VPATH tells <code>make</code> where to look for files.</p>
<p>By using VPATH, it is possible to “stack” the sand-box <em>on top of</em> the project master source, so that files in the sand-box take precedence, but it is the union of all the files which <code>make</code> uses to perform the build (see Figure 8).</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:50%" src="/content/images/journals/ol71/Miller/Miller-Figure-8.png"></p></td>
</tr>
<tr>
<td class="title">Figure 8</td>
</tr>
</table>

<p>In this environment, the sand-box has the same tree structure as the project master source. This allows developers to safely change things across separate modules, e.g. if they are changing a module interface. It also allows the sand-box to be physically separate – perhaps on a different disk, or under their home directory. It also allows the project master source to be read-only, if you have (or would like) a rigorous check-in procedure.</p>
<p>Note: in addition to adding a <code>VPATH</code> line to your development <code>Makefile</code>, you will also need to add <code>-I</code> options to the <code>CFLAGS</code> macro, so that the C compiler uses the same path as <code>make</code> does. This is simply done with a 3-line <code>Makefile</code> in your work area – set a macro, set the VPATH, and then include the <code>Makefile</code> from the project master source.</p>
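<p>Such a work-area <code>Makefile</code> might look like the following sketch. The path <code>/project/master</code> and the macro name MASTER are hypothetical, and the master <code>Makefile</code> is assumed to consult MASTER, when it is set, to add the matching <code>-I</code> options to CFLAGS:</p>
<pre><code># 1. Set a macro naming the project master source (hypothetical path).
MASTER := /project/master
# 2. Tell make to look there for files missing from the sand-box.
VPATH := $(MASTER)
# 3. Use the master description of the project.
include $(MASTER)/Makefile
</code></pre>

<p>Files present in the work area are found first; anything not found there is then searched for under the VPATH directory.</p>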
<h2>VPATH Semantics</h2>
<p>For the above discussion to apply, you need to use GNU Make 3.76 or later. For versions of GNU Make earlier than 3.76, you will need Paul Smith’s VPATH+ patch. This may be obtained from ftp://ftp.wellfleet.com/netman/psmith/gmake/.</p>
<p>The POSIX semantics of VPATH are slightly brain-dead, so many other <code>make</code> implementations are too limited. You may want to consider installing GNU Make.</p>
<h2>The Big Picture</h2>
<p>This section brings together all of the preceding discussion, and presents the example project with its separate modules, but with a whole-project <code>Makefile</code>. The directory structure is changed little from the recursive case, except that the deeper <code>Makefile</code>s are replaced by module specific include files (see Figure 9).</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:30%" src="/content/images/journals/ol71/Miller/Miller-Figure-9.png"></p></td>
</tr>
<tr>
<td class="title">Figure 9</td>
</tr>
</table>

<p>The <code>Makefile</code> looks like this:</p>
<pre><code>MODULES := ant bee
#look for include files in
# each of the modules
CFLAGS += $(patsubst %,-I%,\
    $(MODULES))

#extra libraries if required
LIBS :=

#each module will add to this
SRC :=

#include the description for
# each module
include $(patsubst %,\
    %/module.mk,$(MODULES))

#determine the object files
OBJ := \
    $(patsubst %.c,%.o, \
    $(filter %.c,$(SRC))) \
    $(patsubst %.y,%.o, \
    $(filter %.y,$(SRC)))

#link the program

prog: $(OBJ)
	$(CC) -o $@ $(OBJ) $(LIBS)

#include the C include
# dependencies
include $(OBJ:.o=.d)

#calculate C include
# dependencies
%.d: %.c
	depend.sh `dirname $*.c` $(CFLAGS) $*.c &gt; $@
</code></pre>

<p>This looks absurdly large, but it has all of the common elements in the one place, so that each of the modules’ <code>make</code> includes may be small.</p>
<p>The <code>ant/module.mk</code> file looks like:</p>
<pre><code>SRC += ant/main.c
</code></pre>

<p>The <code>bee/module.mk</code> file looks like:</p>
<pre><code>SRC += bee/parse.y
LIBS += -ly
%.c %.h: %.y
	$(YACC) -d $*.y
	mv y.tab.c $*.c
	mv y.tab.h $*.h
</code></pre>

<p>Notice that the built-in rules are used for the C files, but we need special yacc processing to get the generated <code>.h</code> file.</p>
<p>The savings in this example look irrelevant, because the top-level <code>Makefile</code> is so large. But consider if there were 100 modules, each with only a few non-comment lines, and those specifically relevant to the module. The savings soon add up to a total size often <em>less</em> than the recursive case, without loss of modularity.</p>
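<p>Adding a module under this scheme can be sketched as follows; the module name <code>cat</code> and its source files are hypothetical:</p>
<pre><code># In the top-level Makefile, name the new module:
MODULES := ant bee cat

# cat/module.mk, in its entirety:
SRC += cat/util.c cat/io.c
</code></pre>

<p>Everything else (include paths, object list, dependency files) follows from the common rules in the top-level <code>Makefile</code>.</p>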
<p>The equivalent DAG of the <code>Makefile</code> after all of the includes looks like Figure 10.</p>
<table class="sidebartable">
<tr>
<td><p><img style="max-width:65%" src="/content/images/journals/ol71/Miller/Miller-Figure-10.png"></p></td>
</tr>
<tr>
<td class="title">Figure 10</td>
</tr>
</table>

<p>The vertices and edges for the include file dependency files are also present as these are important for <code>make</code> to function correctly.</p>
<h2>Side Effects</h2>
<p>There are a couple of desirable side-effects of using a single <code>Makefile</code>.</p>
<ul>
<li>
<p>The GNU Make <code>-j</code> option, for parallel builds, works even better than before. It can find even more unrelated things to do at once, and no longer has some subtle problems.</p>
</li>
<li>
<p>The general make <code>-k</code> option, to continue as far as possible even in the face of errors, works even better than before. It can find even more things to continue with.</p>
</li>
</ul>
<h2>Literature Survey</h2>
<p>How can it be possible that we have been mis-using make for 20 years? How can it be possible that behaviour previously ascribed to make’s limitations is in fact a result of mis-using it?</p>
<p>The author only started thinking about the ideas presented in this paper when faced with a number of ugly build problems on utterly different projects, but with common symptoms. By stepping back from the individual projects, and closely examining the thing they had in common, <code>make</code>, it became possible to see the larger pattern. Most of us are too caught up in the minutiae of just getting the rotten build to work that we don’t have time to spare for the big picture. Especially when the item in question “obviously” works, and has done so continuously for the last 20 years.</p>
<p>It is interesting that the problems of recursive <code>make</code> are rarely mentioned in the very books Unix programmers rely on for accurate, practical advice.</p>
<h2>The Original Paper</h2>
<p>The original <code>make</code> paper [1] contains no reference to recursive <code>make</code>, let alone any discussion as to the relative merits of whole project <code>make</code> over recursive <code>make</code>.</p>
<p>It is hardly surprising that the original paper did not discuss recursive <code>make</code>; Unix projects at the time usually did fit into a single directory.</p>
<p>It may be this which set the “one <code>Makefile</code> in every directory” concept so firmly in the collective Unix development mind-set.</p>
<h2>GNU Make</h2>
<p>The GNU Make manual [2] contains several pages of material concerning recursive <code>make</code>, however its discussion of the merits or otherwise of the technique are limited to the brief statement that</p>
<blockquote>
<p>This technique is useful when you want to separate makefiles for various subsystems that compose a larger system.</p>
</blockquote>
<p>No mention is made of the problems you may encounter.</p>
<h2>Managing Projects with Make</h2>
<p>The Nutshell Make book [3] specifically promotes recursive <code>make</code> over whole project <code>make</code> because:</p>
<blockquote>
<p>The cleanest way to build is to put a separate description file in each directory, and tie them together through a master description file that invokes make recursively. While cumbersome, the technique is easier to maintain than a single, enormous file that covers multiple directories. (p. 65)</p>
</blockquote>
<p>This is despite the book’s advice only two paragraphs earlier that</p>
<blockquote>
<p>make is happiest when you keep all your files in a single directory. (p. 64)</p>
</blockquote>
<p>Yet the book fails to discuss the contradiction in these two statements, and goes on to describe one of the traditional ways of treating the symptoms of incomplete DAGs caused by recursive <code>make</code>.</p>
<p>The book may give us a clue as to why recursive <code>make</code> has been used in this way for so many years. Notice how the above quotes confuse the concept of a directory with the concept of a <code>Makefile</code>.</p>
<p>This paper suggests a simple change to the mind-set: directory trees, however deep, are places to store files; <code>Makefiles</code> are places to describe the relationships between those files, however many.</p>
<h2>BSD Make</h2>
<p>The tutorial for BSD Make [4] says nothing at all about recursive <code>make</code>, but it is one of the few which actually described, however briefly, the relationship between a <code>Makefile</code> and a DAG (p. 30). There is also a wonderful quote</p>
<blockquote>
<p>If make doesn’t do what you expect it to, it’s a good chance the makefile is wrong. (p. 10)</p>
</blockquote>
<p>Which is a pithy summary of the thesis of this paper.</p>
<h2>Summary</h2>
<p>This paper presents a number of related problems, and demonstrates that they are not inherent limitations of <code>make</code>, as is commonly believed, but are the result of presenting incorrect information to <code>make</code>. This is the ancient <em>Garbage In, Garbage Out</em> principle at work. Because <code>make</code> can only operate correctly with a complete DAG, the error is in segmenting the <code>Makefile</code> into incomplete pieces.</p>
<p>This requires a shift in thinking: directory trees are simply a place to hold files, <code>Makefile</code>s are a place to remember relationships between files. Do not confuse the two because it is as important to accurately represent the relationships between files in different directories as it is to represent the relationships between files in the same directory. This has the implication that there should be exactly one <code>Makefile</code> for a project, but the magnitude of the description can be managed by using a <code>make</code> include file in each directory to describe the subset of the project files in that directory. This is just as modular as having a <code>Makefile</code> in each directory.</p>
<p>This paper has shown how a project build and a development build can be equally brief for a whole-project <code>make</code>. Given this parity of time, the gains provided by accurate dependencies mean that this process will, in fact, be faster than the recursive <code>make</code> case, and more accurate.</p>
<h2>Inter-dependent Projects</h2>
<p>In organizations with a strong culture of re-use, implementing whole-project <code>make</code> can present challenges. Rising to these challenges, however, may require looking at the bigger picture.</p>
<ul>
<li>
<p>A module may be shared between two programs because the programs are closely related. Clearly, the two programs plus the shared module belong to the same project (the module may be self-contained, but the programs are not). The dependencies must be explicitly stated, and changes to the module must result in both programs being recompiled and re-linked as appropriate. Combining them all into a single project means that whole-project <code>make</code> can accomplish this.</p>
</li>
<li>
<p>A module may be shared between two projects because they must inter-operate. Possibly your project is bigger than your current directory structure implies. The dependencies must be explicitly stated, and changes to the module must result in both projects being recompiled and re-linked as appropriate. Combining them all into a single project means that whole-project <code>make</code> can accomplish this.</p>
</li>
<li>
<p>It is the normal case to omit the edges between your project and the operating system or other installed third party tools. So normal that they are ignored in the <code>Makefile</code>s in this paper, and they are ignored in the built-in rules of <code>make</code> programs.<br><br>
Modules shared between your projects may fall into a similar category: if they change, you will deliberately re-build to include their changes, or quietly include their changes whenever the next build may happen. In either case, you do not explicitly state the dependencies, and whole-project <code>make</code> does not apply.</p>
</li>
<li>
<p>Re-use may be better served if the module were used as a template, and divergence between two projects is seen as normal. Duplicating the module in each project allows the dependencies to be explicitly stated, but requires additional effort if maintenance is required to the common portion.</p>
</li>
</ul>
<p>How to structure dependencies in a strong re-use environment thus becomes an exercise in <em>risk management</em>. What is the danger that omitting chunks of the DAG will harm your projects? How vital is it to rebuild if a module changes? What are the consequences of <em>not</em> rebuilding automatically? How can you tell when a rebuild is necessary if the dependencies are not explicitly stated? What are the consequences of forgetting to rebuild?</p>
<h2>Return On Investment</h2>
<p>Some of the techniques presented in this paper will improve the speed of your builds, even if you continue to use recursive <code>make</code>. These are not the focus of this paper, merely a useful detour.</p>
<p>The focus of this paper is that you will get more accurate builds of your project if you use whole-project <code>make</code> rather than recursive <code>make</code>.</p>
<ul>
<li>
<p>The time for <code>make</code> to work out that nothing needs to be done will not be more, and will often be less.</p>
</li>
<li>
<p>The size and complexity of the total <code>Makefile</code> input will not be more, and will often be less.</p>
</li>
<li>
<p>The total <code>Makefile</code> input is no less modular than in the recursive case.</p>
</li>
<li>
<p>The difficulty of maintaining the total <code>Makefile</code> input will not be more, and will often be less.</p>
</li>
</ul>
<p>The disadvantages of using recursive <code>make</code> rather than whole-project <code>make</code> are often un-measured. How much time is spent figuring out why <code>make</code> did something unexpected? How much time is spent figuring out that <code>make</code> <em>did</em> something unexpected? How much time is spent tinkering with the build process? These activities are often thought of as “normal” development overheads.</p>
<p>Building your project is a fundamental activity. If it is performing poorly, so are development, debugging and testing. Building your project needs to be so simple the newest recruit can do it immediately with only a single page of instructions. Building your project needs to be so simple that it rarely needs any development effort at all. Is your build process this simple?</p>
<p>Peter Miller
miller@canb.auug.org.au</p>
<h2>References</h2>
<p>[1] Feldman, Stuart I. (1978). Make - A Program for Maintaining Computer Programs. Bell Laboratories Computing Science Technical Report 57.<br>
[2] Stallman, Richard M. and Roland McGrath (1993). GNU Make: A Program for Directing Recompilation. Free Software Foundation, Inc.<br>
[3] Talbott, Steve (1991). Managing Projects with Make, 2nd Ed. O’Reilly &amp; Associates, Inc.<br>
[4] de Boor, Adam (1988). PMake - A Tutorial. University of California, Berkeley.</p>
<h2>About the Author</h2>
<p>Peter Miller has worked for many years in the software R&amp;D industry, principally on UNIX systems. In that time he has written tools such as Aegis (a software configuration management system) and Cook (yet another make-oid), both of which are freely available on the Internet. Supporting the use of these tools at many Internet sites provided the insights which led to this paper.</p>
<p>Please visit http://www.canb.auug.org.au/~millerp/ if you would like to look at some of the author’s free software.</p>
</p>
</div>
</channel>
</rss>
