    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU :: The Development of a BBC BASIC to Acorn ANSI C Translator</title>
        <link>https://members.accu.org/index.php/journals/960</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 12, #1 - Jan 2000</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;The Development of a BBC BASIC to Acorn ANSI C
Translator</h1>
<p>
<strong>Date:</strong> Mon, 03 January 2000 13:15:35 +00:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a></h2>
</div>
<p><span class="bold"><b>History</b></span></p>
<p>From 1978 to 1989 I worked for an engineering company in
Bracknell, Berkshire. During the latter phases of my time there I
was in the Computer Services Department and was put on a project to
artificially generate CINCOM Comprehensive Retrieval language to
satisfy requests for various types of report. The descriptions of
the reports and requests for prints were held on a TOTAL database.
The next project I was put on was to take raw COBOL programs and
make them conform to the site standard for layout. The suite of
programs also produced a structure diagram of the program
concerned. The technique was to reduce the program to a set of
tokens then pass the tokens on from one subroutine to another via a
pipeline which performed various operations on them. The tokens
were finally strung back together again to form the massaged COBOL
program. The site was closed at the end of 1989.</p>
<p>The second job I entered was in the field of migration, i.e.
converting programs from one language on one machine to another
language on another machine. This also involved using conversion
programs (and a lot of debugging).</p>
<p>I was made redundant, returned home to West Yorkshire and
started working on a program on an Archimedes to find straight
lines of ancient sites (ley lines). I was working in BBC BASIC but
trying to learn C and wanted to convert the program to C. I managed
to obtain a copy of Bison (a program that generates code to parse a
language) and had had the necessary experience, so I decided to see
what I could do with it.</p>
<p>The general technique was to develop a lexical analyser that
divided up the program into tokens that were then passed to a
parser described in Bison's own language. The parser would identify
various components of the language and gradually string the
translated program together. Finally the complete string of
characters would be written to the output file. Partial strings
were passed up from one phase of the grammar to another. In the
process I decoded both BBC BASIC files and also GW-BASIC files. I
prepared notes on the coding that I was passing on to interested
parties.</p>
<p>I also made use of Acorn's FrontEnd tool to turn a command line
program into a WIMP application.</p>
<p><span class="bold"><b>Techniques Used in the
Translator</b></span></p>
<p>I decided to strip out the line numbers from the converted C
program and label only those lines that were referenced in GOTO
and GOSUB statements. The translator was made to perform a pre-scan
to compile a list of referenced lines, ignoring line numbers
referenced in RESTORE statements. I had also decided not to deal
with EVAL statements or variable GOTOs, and intended only to
detect bursts of assembler and comment them out.</p>
<p>It was fairly easy to detect programming structures and indent
the code accordingly. In later versions of the translator, this was
made user definable.</p>
<p>The main trick used by the lexical analyser was to keep a linked
list of C structs that held the attributes of each encountered
identifier - whether it was a FN, PROC or variable and its data
type (integer, floating point, character string or file pointer).
The identifier names were converted to lower case in line with C
conventions. Additionally, integer variables were postfixed with _i
and string variables with _s. In later versions these postfixes
were made user definable. The variable @% was converted to
_at_i.</p>
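<p>The identifier list and the renaming rules described above might look something like the following in C. This is a sketch under assumptions: the struct layout, the enum names and the functions c_name_for and intern are all hypothetical, the postfixes are fixed rather than user definable, and the file-pointer type is left to be set from usage elsewhere because it cannot be inferred from the name.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum sym_kind { SYM_VARIABLE, SYM_FN, SYM_PROC };
enum sym_type { TYPE_INTEGER, TYPE_FLOAT, TYPE_STRING, TYPE_FILE };

/* One node per identifier met by the lexical analyser (hypothetical layout). */
struct symbol {
    char c_name[64];          /* name as it will appear in the C output */
    enum sym_kind kind;
    enum sym_type type;
    struct symbol *next;
};

static struct symbol *symbols = NULL;   /* head of the linked list */

/* Build the C name: lower case, '%' becomes "_i", '$' becomes "_s",
 * and '@' becomes "_at", so that "@%" yields "_at_i". */
static void c_name_for(const char *basic, char *out, size_t size)
{
    size_t n = 0;

    assert(size > 0);
    for (; *basic != '\0'; basic++) {
        char single[2] = { '\0', '\0' };
        const char *piece;

        if (*basic == '%') {
            piece = "_i";
        } else if (*basic == '$') {
            piece = "_s";
        } else if (*basic == '@') {
            piece = "_at";
        } else {
            single[0] = (char)tolower((unsigned char)*basic);
            piece = single;
        }

        size_t len = strlen(piece);
        if (n + len >= size)
            break;                       /* truncate rather than overflow */
        memcpy(out + n, piece, len);
        n += len;
    }
    out[n] = '\0';
}

/* Find the identifier in the list, or add it. Note that lower-casing
 * makes names differing only in case share one entry in this sketch. */
static struct symbol *intern(const char *basic, enum sym_kind kind)
{
    char name[64];

    c_name_for(basic, name, sizeof name);
    for (struct symbol *s = symbols; s != NULL; s = s->next)
        if (s->kind == kind && strcmp(s->c_name, name) == 0)
            return s;

    struct symbol *s = malloc(sizeof *s);
    assert(s != NULL);
    strcpy(s->c_name, name);
    s->kind = kind;

    size_t len = strlen(basic);
    char last = len > 0 ? basic[len - 1] : '\0';
    s->type = last == '%' ? TYPE_INTEGER
            : last == '$' ? TYPE_STRING
            : TYPE_FLOAT;
    s->next = symbols;
    symbols = s;
    return s;
}
```

<p>With these rules, Count% becomes count_i, NAME$ becomes name_s and @% becomes _at_i.</p>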
<p>The C struct also held the dimensions of arrays and the
arguments of FNs and PROCs. The complete list of identifiers,
correctly initialised, was written out at the head of the program.
Arguments of FNs and PROCs were listed, but commented out. At the
same time I found some arguments being used globally, so each
argument was given a list of routines to which it was local. If it
was found to be used in a routine to which it was not local, it was
not commented out.</p>
<p>Also I found that BASIC did not mind if an outer programming
structure closed before the inner ones did. BASIC just closed down
the inner ones. Because of this I had to keep a list of programming
structures and inject appropriate C symbols to close the inner
structures if this occurred.</p>
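<p>One way to keep such a list is a simple stack, sketched below in C. The names (open_block, close_block, MAX_DEPTH) are hypothetical, and the text emitted for an abandoned REPEAT is an assumption made for this example, not necessarily what the translator writes.</p>

```c
#include <assert.h>
#include <stdio.h>

enum block { BLK_FOR, BLK_REPEAT, BLK_WHILE };

#define MAX_DEPTH 32   /* hypothetical nesting limit */

static enum block open_blocks[MAX_DEPTH];
static int depth = 0;

/* Called when FOR, REPEAT or WHILE is translated. */
static void open_block(enum block b)
{
    assert(depth < MAX_DEPTH);
    open_blocks[depth++] = b;
}

/* Called when NEXT, UNTIL or ENDWHILE is translated. Any inner blocks
 * still open above the matching one are closed first by writing C
 * closers to `out`. Returns the number of inner blocks closed this way,
 * or -1 if no block of kind `b` is open. The matching block is popped,
 * but its own closer is left for the caller to emit. */
static int close_block(enum block b, FILE *out)
{
    int match = depth - 1;

    while (match >= 0 && open_blocks[match] != b)
        match--;
    if (match < 0)
        return -1;

    int implicit = 0;
    while (depth - 1 > match) {
        depth--;
        /* Assumption: an abandoned REPEAT ran its body once only. */
        fputs(open_blocks[depth] == BLK_REPEAT ? "} while (0);\n" : "}\n",
              out);
        implicit++;
    }
    depth--;               /* pop the matching block itself */
    return implicit;
}
```

<p>If a FOR is opened, then a WHILE, then a REPEAT, and the next closing keyword is NEXT, close_block writes the closers for the REPEAT and the WHILE before the FOR loop's own closing brace is emitted.</p>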
<p>I found it impossible to write a perfect grammar, so in some
cases the BASIC tokens were reformatted to conform to the grammar.
The current grammar has 73 shift/reduce conflicts and no
reduce/reduce conflicts and seems to pass about 80% of programs. It
does not like mixed logical and numeric expressions, e.g. A%=B%&lt;C%
or A%=B%=C%, but if these are enclosed in brackets it doesn't
object, i.e. A%=(B%&lt;C%) or A%=(B%=C%).</p>
<p>[ I suspect this could be fixed with some precedence
declarations in the grammar - Tom ]</p>
<p>A check is kept of the headers that are needed, and only those
necessary are output at the head of the C program. Also #defines
are issued for PI, TRUE and FALSE, if needed. The translator cannot
deal with HIMEM, LOMEM or ERL.</p>
<p>ON ERROR statements result in error handling C functions being
defined before the main program, and an attempt is made to turn
lines referred to in GOSUB statements into C functions.</p>
<p>DIM statements were made to result in definitions of arrays at
the head of the routine concerned or malloc calls in the case of
memory reservation. DIMs in the main program were moved to global
storage, as were all the variables in the main program. Finally
DATA statements were moved to a special array of character strings
at the bottom of the global storage declarations, ready for any
READs or RESTOREs. LOCAL DATA statements could not be
translated.</p>
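<p>The DATA arrangement can be pictured with the following C sketch. The array contents are invented sample values, and the names (data_items, read_item, read_int, restore_to) are hypothetical; LeafLib's actual READ and RESTORE equivalents may look quite different. The sketch assumes that the translator resolves a RESTORE line number to an index into the array at translation time.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Every DATA item in the BASIC program, in source order, held as text.
 * Sample values only. */
static const char *const data_items[] = { "10", "20", "LONDON", "30" };
static const size_t data_count = sizeof data_items / sizeof data_items[0];

static size_t data_pos = 0;    /* next item that READ will return */

/* READ into a string variable. */
static const char *read_item(void)
{
    assert(data_pos < data_count);   /* BASIC would report running out of DATA */
    return data_items[data_pos++];
}

/* READ into an integer variable. */
static int read_int(void)
{
    return (int)strtol(read_item(), NULL, 10);
}

/* RESTORE: index 0 for a bare RESTORE, otherwise the index of the first
 * item on the DATA line named in the BASIC source. */
static void restore_to(size_t index)
{
    assert(index <= data_count);
    data_pos = index;
}
```

<p>Holding every item as a character string lets the same array serve both numeric and string READs, with the conversion done at the point of reading.</p>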
<p>Three facilities were introduced later. The first was to prevent
BASIC variables clashing with C reserved words by turning the first
letter to upper case if this occurred. The second was to turn
conditional ENDWHILEs and NEXTs into the C statement continue if they
were at a higher level of conditionality than the programming
structure they were in. The third was to take the
variables in variable array definitions and turn them into #define
declarations (in upper case) ready for the user to supply
appropriate fixed values.</p>
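<p>The first of these facilities, the reserved word check, can be sketched in C as below. The function name avoid_keyword is hypothetical, and the table lists the 32 keywords of ANSI C; whether the translator checks exactly this list is an assumption.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* The 32 keywords of ANSI C. */
static const char *const c_keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default",
    "do", "double", "else", "enum", "extern", "float", "for", "goto",
    "if", "int", "long", "register", "return", "short", "signed",
    "sizeof", "static", "struct", "switch", "typedef", "union",
    "unsigned", "void", "volatile", "while"
};

/* If the already lower-cased name is a C keyword, turn its first letter
 * to upper case in place. Returns 1 if the name was changed, else 0. */
static int avoid_keyword(char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof c_keywords / sizeof c_keywords[0]; i++) {
        if (strcmp(name, c_keywords[i]) == 0) {
            name[0] = (char)toupper((unsigned char)name[0]);
            return 1;
        }
    }
    return 0;
}
```

<p>A BASIC floating point variable called float would therefore come out as Float, while a name such as total is left alone.</p>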
<p><span class="bold"><b>Problems with the
Translator</b></span></p>
<p>Users of the translator might conceive of using it to compile
their BBC BASIC program without any knowledge of Acorn C. This
would have two advantages: it would make it more difficult
for anyone to pirate their software, and it might speed the program up.
However, the generated C still requires some debugging, which they might
not have the knowledge to do.</p>
<p>One instance is that BBC BASIC strings are terminated by a
carriage return and C strings by a NUL byte. Users also
might want to compile the generated C on some other machine, but I
found commercial programs make great play of bursts of Acorn
Assembler and SWIs (SoftWare Interrupts), for which there are no
direct equivalents on other machines.</p>
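<p>The string terminator difference can be bridged with a pair of small helpers, sketched here in C. The names cr_to_c and c_to_cr are hypothetical and are not taken from LeafLib or bbc.h.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Copy a carriage-return-terminated BASIC string into a NUL-terminated
 * C buffer of `size` bytes, truncating if necessary. Returns the number
 * of characters copied, not counting the terminator. */
static size_t cr_to_c(const char *basic, char *out, size_t size)
{
    size_t n = 0;

    assert(size > 0);
    while (basic[n] != '\r' && n + 1 < size) {
        out[n] = basic[n];
        n++;
    }
    out[n] = '\0';
    return n;
}

/* The reverse: copy a C string into a buffer in BASIC's form, ending
 * with a carriage return instead of a NUL byte. */
static size_t c_to_cr(const char *c, char *out, size_t size)
{
    size_t n = 0;

    assert(size > 0);
    while (c[n] != '\0' && n + 1 < size) {
        out[n] = c[n];
        n++;
    }
    out[n] = '\r';
    return n;
}
```

<p>Conversions like these are only needed where string data crosses between the two conventions, for example when a buffer filled in BASIC's format is handed to a C library routine.</p>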
<p>In particular there are literally hundreds of SWIs. The
translator also has to make use of a library, LeafLib, which
supplies routines not supplied by Acorn with its bbc.h library. Add
to this the numerous VDU commands, *FX commands and calls to the
operating system and you might have some difficulty
translating.</p>
<p>The C compiler is very particular about data types, but they are
freely mixed in BBC BASIC, which sometimes causes compiler
warnings. By being able to get into the nuts and bolts of the
language, it has generally been possible to correctly type
expressions.</p>
<p><span class="bold"><b>Future Developments</b></span></p>
<p>Having written a grammar for BBC BASIC, which must be one of the
most complex forms of BASIC around, similar things could be done
for other forms of BASIC by using the same or similar grammars. It
would be quite feasible to detect the type of BASIC file and direct
the translation to different parts of the grammar by injecting an
initial symbol from the lexical analyser to the grammar. Equally
well a structure diagram of the program could be built up as a
linked tree in memory for subsequent printing or rendering as a
Draw file. The COBOL pretty printer mentioned above dealt with
GOTOs by numbering them and putting in arrows out from the
structure and return arrows at the arrival point. Incidentally
Bison was running out of storage on the heap so I embedded it in a
FrontEnd application to give it more storage. I also had to upgrade
my RAM to 16MB. The Acorn C compilation of !BBC_C now uses over 5MB
and takes over 8 minutes on my A540.</p>
<p>Also I have succeeded in producing a command line version
(bbc_cpc) with accompanying switches that does the same job on an
IBM PC, but takes up over 500K. There is no reason why conversions
to other languages could not also be implemented with a little
effort.</p>
<p><span class="bold"><b>Conclusion</b></span></p>
<p>Having spent nearly 7 years on this project and maybe 7000
computer hours I now feel for both personal and financial reasons
that I do not wish to continue. LeafLib is now almost complete, and
in particular I have now written equivalents to PRINT, INPUT,
PRINT#, INPUT#, READ and RESTORE. The only one missing now is
ENVELOPE. I am now getting programs to compile and run. Anyone out
there interested?</p>
<p>[ If anyone is interested, contact me at the usual
address and I'll put you in touch with Martin - Tom ]</p>
</div>
</p>
</div>
</channel>
</rss>
