    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU :: Security Implications of Running a Web Gateway</title>
        <link>https://members.accu.org/index.php/journals/756</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 11, #1 - Nov 1998 + Internet Topics</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Security Implications of Running a Web Gateway</h1>
<p>
<strong>Date:</strong> Tue, 03 November 1998 13:15:28 +00:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e22" id="d0e22"></a></h2>
</div>
<p class="c2"><span class="remark">While running a gateway does not
directly impact on C/C++ programming, many of us have to turn our
hand to such things. Silas is well known to many of you and so I am
delighted to let him share his thoughts on this subject. If you do
not already know, Silas is not only ACCU's Disabilities Officer but
is also sight-impaired himself.</span></p>
<p>By &quot;Web gateway&quot; I mean a CGI program that can be told to get
any web page, do something to it, and return it to the requester.
The following lists some of the possible attacks that may be made
on such a program, and attacks that can be made on any CGI program,
and the measures that I have taken against them in my own. If
anybody has anything to add, then please do send it in.</p>
<p>Firstly, something that has nothing to do with security, but is
nevertheless perhaps the most important thing when writing any CGI
program: make sure that you get the &quot;Content-Length&quot; right! The
Content-Length header can be omitted, but doing so will slow down
some browsers, and in any event Content-Length is usually required
if your server supports the &quot;keep alive&quot; protocol (which is a good
thing, especially if your pages load lots of small images, because
the browser can get the lot over one single HTTP connection). But if
you do specify a Content-Length, IT MUST BE CORRECT! In other
words, you must not print ANYTHING without it being accounted for
in the Content-Length. Don't forget to check that when you
print newlines (\n) your system really does pass that single character
to the server; if it passes \r\n instead then each newline must be
counted as two bytes. Getting the Content-Length too big will cause the
connection to stall, and getting it too small will, in most cases,
crash the Web server (and you won't be too popular if it is someone
else's server). Get it right.</p>
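<p>As a minimal sketch of the point above (not the author's own code, and in Python rather than C/C++), the body can be encoded to bytes first so that the header counts exactly what is sent. The function names build_cgi_response and write_response are hypothetical, as is the choice of UTF-8.</p>

```python
def build_cgi_response(body_text, content_type="text/plain; charset=utf-8"):
    """Return a complete CGI response as bytes with a matching Content-Length."""
    # Encode first: the header must count bytes on the wire, not characters.
    body = body_text.encode("utf-8")
    headers = (
        "Content-Type: {}\r\n"
        "Content-Length: {}\r\n"
        "\r\n"
    ).format(content_type, len(body))
    return headers.encode("ascii") + body


def write_response(binary_stream, body_text):
    """Write the response to a binary stream, so the text layer cannot
    rewrite newlines after the length has been computed."""
    binary_stream.write(build_cgi_response(body_text))
    binary_stream.flush()
```

<p>In a CGI script this would typically be called as write_response(sys.stdout.buffer, page), because writing bytes to the underlying binary buffer avoids the newline translation that a text-mode stream may apply.</p>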
<p><span class="bold"><b>And now for the security
considerations:</b></span></p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Sending rubbish CGI input. This can happen to any CGI program.
If it gets rubbish (or something that nearly means something but is
not quite correctly formatted), it's all right to give back a
rubbish response (because it should not happen during normal
use), but it's not all right to hang the program, cause an access
violation, corrupt a database, or otherwise mess things up on the
host computer. I was very careful with my exception system and
tried to make no assumptions at all about the input, so if anything
is wrong then it throws an exception before anything serious
happens. The exception handling is very simple: it just says
&quot;Something has gone wrong with my program&quot;, the nature of the
exception, contact details, and so on.</p>
</li>
<li>
<p>Getting rubbish Web pages. An attacker could easily serve a Web
page with some almost-but-not-quite-valid HTML in it and then ask
the gateway to get that page, hoping that the errors in it would
crash the gateway program. Again you need to be on your guard,
although this time it is best not to say &quot;Something has gone wrong
with my program&quot; at every mistake, because many real Web pages are
not quite right anyway.</p>
</li>
<li>
<p>Obtaining the executable. If you want to make sure that nobody
can do this, then you need to check the access permissions very
carefully. Some Web servers have bugs that let you obtain the
executable - for example, some NT web servers will return the
script if you put its name followed by ::$DATA, because of NTFS's
stream system. Other servers have bugs that let you read files from
the parent directory (by typing ../) or add extra commands on the
command line. Usually if you are running a binary executable as the
CGI the risks are not so great. I did lots of testing by trying to
crack into my own system, just to make sure.</p>
</li>
<li>
<p>Accessing internal data files, especially writing to them. This
is the same sort of thing. In my case I don't really mind who gets
the data files as long as they don't modify them - they're only
public domain mapping tables and a help file - but there are other
applications where this would be more serious. Sometimes it is easy
to get a CGI data file if that file is placed in the scripts
directory, because some servers will return the file itself if they
don't know how to execute it. Keep your data files in a different
directory if possible (and not a subdirectory of the scripts
directory), and it is a good idea to test your system by simply
trying to retrieve a data file with a browser - you may be
surprised. Some web servers insist that the FTP users, Web users
and CGI scripts all have the same set of rights, and if you have
one of these then you need to change it. Small, fast and reliable
Web servers are freely available for many platforms. When I found
that a bug in Internet Information Services had permanently allowed
universal access to one of my directories, I uninstalled it, found
a nice little program called Xitami, and was up and running within
ten minutes (although there are still some things to watch out
for).</p>
<p>A variation on this theme is asking the gateway to retrieve a
local file that would not normally be accessible from the Web. You
need to write precautions against this into the gateway, along with
blocking non-Web URL types such as telnet and email (if you are using a
web-getting system that is not your own).</p>
</li>
<li>
<p>Some security sites have a &quot;Click here to launch a test attack
against your system&quot; link. Anyone who wants to attack the gateway
computer need only ask it to get that URL. You need to have already
done the test attacks yourself and corrected any problems.</p>
</li>
<li>
<p>Using your system as a puppet. A Web gateway can be asked to get
virtually any URL, and if another system can be attacked by sending
it a funny URL then the attacker can ask the gateway to send it in
an attempt to cover tracks (or hoping to get you into trouble).
This will mostly involve sending a rogue CGI query to the other
system (via your gateway), and there is no code you can write to
reliably detect such things. You can at least explain yourself if
anybody takes any action, and offer to use your logs to help them
track down the real attacker, but it seems rather futile to try to
stop this from happening in the first place, especially given that
they can always use a public terminal or whatever rather than using
you as a puppet, and the only real reason for using you is to get
you into trouble. Some of the most silly things you can do with CGI
(such as setting a variable to an empty string) will already be
covered by your exception handlers if they are like mine, so in the
general case it is quite difficult to do one of these attacks
anyway.</p>
</li>
<li>
<p>Creating sharing violations. Web servers like Xitami are
multi-threaded, which means that several people can call your
script simultaneously. If your program were not written on this
assumption, then an attacker could use it to create problems. One
particular nasty is when the operating system or compiler routines
are untrustworthy. For example, if two calls to tmpnam() execute
simultaneously in independent processes, then not every compiler's
library guarantees that they will generate different names. I found
it helpful to get the program's process ID, which would be unique at
that time, and use it with things like temporary files.</p>
</li>
<li>
<p>Overloading the service. It doesn't matter if it takes a long
time before a remote server responds to a page-getting request,
because this does not take processor resources on the gateway
computer (the getting process is blocked until data is available).
However, if somebody asked the gateway to get a very large web page
and do extensive processing on it, and then sent many such requests
per second, they could easily slow things down for everybody else.
They could also possibly slow things down for the person actually
using the gateway computer, especially with servers like Xitami
where the default priority of the Web service is set to high to
ensure maximum response. Normally a high priority is acceptable,
but it can cause problems with CGI scripts. In my case, I left it
set to high, because I often run legacy DOS applications in Normal
priority, and NT doesn't always know when they are idling. If the
web server were set to low priority then it would not service
requests while I am running DOS applications.</p>
</li>
</ol>
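<p>Points 1 and 4 in the list above (rubbish CGI input, and requests for local files or non-Web URLs) can be sketched together. The following is an illustration only, assuming a hypothetical query field called "url" and an arbitrary length bound; it checks the URL scheme and nothing more, so it does not by itself stop requests aimed at hosts on a private network.</p>

```python
from urllib.parse import parse_qs, urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: the gateway only fetches Web pages
MAX_URL_LENGTH = 2048                # assumption: an arbitrary sanity bound


class GatewayInputError(ValueError):
    """Raised for any request the gateway refuses to act on."""


def parse_gateway_request(query_string):
    """Return the target URL from a CGI query string, or raise GatewayInputError."""
    try:
        fields = parse_qs(query_string, strict_parsing=True)
    except ValueError as exc:
        raise GatewayInputError("malformed query string") from exc

    values = fields.get("url", [])
    if len(values) != 1:
        raise GatewayInputError("expected exactly one 'url' field")
    target = values[0].strip()
    if not target or len(target) > MAX_URL_LENGTH:
        raise GatewayInputError("missing or over-long URL")

    try:
        parts = urlsplit(target)
    except ValueError as exc:
        raise GatewayInputError("unparseable URL") from exc
    if parts.scheme not in ALLOWED_SCHEMES:
        # Rejects file:, telnet:, mailto: and anything else not listed.
        raise GatewayInputError("scheme not allowed: " + repr(parts.scheme))
    if not parts.hostname:
        raise GatewayInputError("URL has no host")
    return target
```

<p>A single exception type keeps the top-level handler simple: it can catch GatewayInputError, report what went wrong, and exit before any fetching starts.</p>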
</div>
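<p>For the temporary-file problem in point 7, here is a minimal sketch in Python, under the assumption that the platform's library offers an atomic create-if-absent call. The process ID is kept in the name, as the article suggests, but uniqueness is left to tempfile.mkstemp; the helper names and the "gateway-" prefix are hypothetical.</p>

```python
import os
import tempfile


def create_work_file(directory=None):
    """Create a private temporary file and return (file_object, path)."""
    # The PID prefix follows the article's approach and makes stray files easy
    # to attribute; uniqueness itself comes from mkstemp, which creates the
    # file atomically and retries with a new name on a collision.
    prefix = "gateway-{}-".format(os.getpid())
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".tmp", dir=directory)
    return os.fdopen(fd, "w+b"), path


def remove_work_file(handle, path):
    """Close and delete a file made by create_work_file."""
    handle.close()
    try:
        os.remove(path)
    except FileNotFoundError:
        pass  # already gone; nothing to clean up
```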
<p>In my case, I do not like the idea of setting arbitrary limits
(such as size limits) on the gateway's use (and if you do set a
size limit, then you should not trust the HTTP header's
&quot;Content-Length&quot;, because it could be artificial if the attacker has
asked for a page from their own suitably-reprogrammed phoney Web
server). People may sometimes legitimately request large pages. My
application is a gateway that sorts Web pages out for visually
impaired and international users (one needs things like frames and
tables re-arranged, the other needs conversions of non-Roman
characters), and to impose artificial limitations for security
would be against its principles (it's an &quot;access&quot; gateway, not a
&quot;let's limit your use&quot; gateway). However, most browsers will set a
time limit when retrieving pages, and it is reasonable to set a
generous time limit on CGI scripts (say five minutes) beyond which
the server will stop the script if it is still executing (because
the browser has probably already given up anyway). This will at
least mean that if somebody sent an infinite CGI request, or a
request for an infinitely long Web page (by perpetually writing
data down a TCP/IP stream), the gateway will not be blocked
forever, but it is not an ideal solution.</p>
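<p>If a limit is imposed after all, the idea of not trusting the declared length can be sketched as follows: count the bytes that actually arrive and check a deadline between reads. This is an illustration with hypothetical names, not the gateway's own code. A read that blocks can still overrun the deadline, so a real fetcher would also need a timeout on the socket itself.</p>

```python
import time


class FetchLimitExceeded(Exception):
    """Raised when a response is larger or slower than the gateway allows."""


def read_with_limits(stream, max_bytes, max_seconds,
                     chunk_size=8192, clock=time.monotonic):
    """Read a response body, counting the bytes actually received
    rather than believing any length the remote server declared."""
    deadline = clock() + max_seconds
    received = bytearray()
    while True:
        if clock() > deadline:
            raise FetchLimitExceeded("time limit reached")
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(received)  # end of stream
        received.extend(chunk)
        if len(received) > max_bytes:
            raise FetchLimitExceeded("size limit reached")
```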
<p>For one thing, if you are relying on the server &quot;killing&quot; your
process, then, if you can't catch the &quot;kill&quot; signal, it will leave
behind its temporary files. This might be the least of your worries
- it is possible that on some systems the processes owned by yours
(e.g. web getting) will not stop; you need to check for this if
possible, and if so make sure that those processes will stop by
themselves if necessary. Also, attackers can still overload your
system by sending large numbers of simultaneous requests of
moderate size. Assuming that an attacker is not doing &quot;IP
spoofing&quot;, you can keep track of the requesting IPs and refuse
service if more than a certain number of requests arrive from one
address within a given period (bearing in mind that several requests
can happen in a multiframe document etc), although this does introduce
some overhead for legitimate users. Blocking would usually need manual
intervention, since there will be cases where numerous people can
appear to be at the same IP address (e.g. a Unix server and/or a
dial-up ISP), but automatic blocking can still be programmed if it
is known that attacks are most likely to come from a particular
location. For example, I could quite easily write code to recognise
when a request is coming from within Cambridge University, and this
is the most likely source of attack on my program (if a bunch of
drunken students can send me anonymous derogatory messages about
blindness through a Web-based remailer that required re-writing the
HTML if you wanted to send to arbitrary addresses - except they
didn't think its postmaster would be on my side - then they can
probably launch an attack or two on my CGI next time). There is
still the trouble that IP monitoring adds overhead to normal users,
especially if you are using the CGI paradigm (for portability) and
have to do all that file access. Also, it would not be good to
permanently block a shared or assignable IP address (as in the
BOOTSERVs used in Cambridge).</p>
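<p>The request counting described above might look something like the following sliding-window sketch. It assumes a long-running process that can keep the history in memory; a plain CGI program starts afresh for every request and would have to keep this state in a file, which is the overhead mentioned above. The class name and its parameters are hypothetical.</p>

```python
import time
from collections import defaultdict, deque


class RequestRateMonitor:
    """Track recent request times per client address in a sliding window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.history = defaultdict(deque)

    def allow(self, client_ip):
        """Record a request; return False if the client is over the limit."""
        now = self.clock()
        recent = self.history[client_ip]
        # Discard timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

<p>Because several people can share one address, a False result is probably better treated as a reason to slow down or flag the client for manual review than as grounds for a permanent block.</p>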
<p>I do make a point of checking the logs. This will not stop
attacks as they happen, but it will at least make sure that they
come to my notice within 24 hours (earlier if I notice them while
using the computer or the gateway myself), and if they are
sufficiently infrequent then this would be adequate (although not
ideal). I usually casually glance through the logs to check
everything's all right, i.e. no obvious malfunctions or attacks,
but I don't use the logs to pry on people - it's an access gateway,
not a surveillance gateway. I always delete old logs, I make a
point of doing my looking-through while half-thinking about
something else and just before an interesting task that makes me
forget anything I might have noticed, and I don't bother to look at
all the details if everything seems fine anyway. The logs are
useful if somebody reports a problem because I don't have to ask
them for dozens of details, and I have not yet had anybody complain
about privacy - if they are concerned then they may be surprised at
just how many logs there are around the Internet anyway. The other
problem with checking the logs is that it takes so long, especially
when you add up all the time I would spend on it over a year. I've
been toying with the idea of automatic checking (I already have
batch files to filter out my own access and so on), but developing
a good enough fully automatic program would be too time-consuming
(at least in the short and medium term) and I might as well just
read the stuff. Unattended speech synthesis doesn't help either -
it might seem a good idea to have the thing babble away while I eat
breakfast (or whatever), but all too often I need to skip a few
lines when it's obvious what's going on, otherwise the process
takes far too long. Also the synthesiser isn't very configurable -
I usually work with my residual sight, and I wouldn't like to do
serious stuff with my ancient bundled copy of an early version of
TextAssist that I've just managed to actually install.</p>
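<p>A small filter in the spirit of the batch files mentioned above could be sketched like this. It assumes, purely for illustration, that each log line begins with the client address followed by whitespace; the real server's log format may differ. It only shortens the reading and is no substitute for it.</p>

```python
from collections import Counter


def summarise_clients(log_lines, ignore=(), threshold=100):
    """Count requests per client and return those at or above the threshold,
    busiest first."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        client = fields[0]  # assumption: address is the first field
        if client in ignore:
            continue  # e.g. the administrator's own accesses
        counts[client] += 1
    return [(client, n) for client, n in counts.most_common() if n >= threshold]
```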
<p>I am not sure if there is a catch-all solution to overloading.
Doing the logs is a bit of a chore, but any potential attackers can
be assured that I will continue to be alert to them until I find a
better method (and I know where the managerial offices are). If you
have a limited number of users then you could set up a password
system (and then the attacker would first have to sniff the
connection to get the password), but in mine I ruled out that
possibility by introducing the character conversion options, with
potentially thousands of local students who might need them. I
write this before returning to Cambridge and putting the
internationalised version online, and I'm beginning to think to
myself &quot;you're really going to put your foot in it this time&quot;. I
wish I could stick my cane into this course of action and see if
it's clear. Then I think of all those students struggling to get
the computers to display their native characters - no, I'm not
backing out that easily. Let's have some interesting times. I do
hope this thing actually works.</p>
</div>
</p>
</div>
</channel>
</rss>
