handouts/ho03.tex
author cu
Wed, 18 Oct 2017 14:38:25 +0100 (2017-10-18)
changeset 552 c1e9a435e16f
parent 546 3d1f65e43065
permissions -rw-r--r--
updated
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../graphics}
\usepackage{../data}

%https://pentestn00b.wordpress.com/safe-to-hack-sites/
%http://pastebin.com/raw/0SNSvyjJ
%http://seclists.org/oss-sec/2016/q1/645

% relative new buffer-overflow attack
%https://www.trustwave.com/Resources/SpiderLabs-Blog/How-I-Cracked-a-Keylogger-and-Ended-Up-in-Someone-s-Inbox/
% using an exploit from 2010
% https://cve.mitre.org/cgi-bin/cvename.cgi?name=cve-2010-3333


% https://github.com/cs01/gdbgui/
% gdb frontend

\begin{document}
\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017}

\section*{Handout 3 (Buffer Overflow Attacks)}

\begin{flushright}
\begin{minipage}{9cm}
\it ``We took a network that was designed to be resilient to nuclear war and
we made it vulnerable to toasters.''\\\mbox{}\hfill\small --- Eben Upton, 2017, RPi co-founder
\end{minipage}
\end{flushright}\bigskip


\noindent
By far the most popular attack method on computers are buffer
overflow attacks or variations thereof. The first Internet
worm (Morris) exploited exactly such an attack. The popularity
is unfortunate because we nowadays have technology in place to
prevent them effectively. But these kind of attacks are still
very relevant even today since there are many legacy systems
out there and also many modern embedded systems often do not
take any precautions to prevent such attacks. The plot below
shows the percentage of buffer overflow attacks listed in the
US National Vulnerability Database.\footnote{Search for
``Buffer errors'' in the advanced serach tab at
\url{http://web.nvd.nist.gov/view/vuln/statistics}.}

\begin{center}
\begin{tikzpicture}
\begin{axis}[
    xlabel={year},
    ylabel={\% of total attacks},
    ylabel style={yshift=-1em},
    enlargelimits=false,
    xtick={1997,1999,2001,...,2017},
    xmin=1996.5,
    xmax=2018,
    ymax=21,
    ytick={0,5,...,20},
    scaled ticks=false,
    axis lines=left,
    width=12cm,
    height=5cm,
    ybar,
    nodes near coords=
     {\footnotesize
      $\pgfmathprintnumber[fixed,fixed zerofill,precision=1,use comma]{\pgfkeysvalueof{/data point/y}}$},
    x tick label style={font=\footnotesize,/pgf/number format/1000 sep={}}]
\addplot
  table [x=Year,y=Percentage] {bufferoverflows.data};
\end{axis}
\end{tikzpicture}
\end{center}

\noindent This statistics shows that in the last seven years or so the
number of buffer overflow attacks is around 10\% of all attacks
(whereby the absolute numbers of attacks grow each year). So you can
see buffer overflow attacks are very relevant today. For example, very
recently (February 2016) a buffer overflow attack was discovered in the glibc
library:\footnote{See \url{http://goo.gl/De2mA8}}

\begin{quote}\it
``Since 2008, a vulnerability has left apps and hardware open to remote
  hijacking: Researchers have discovered a potentially catastrophic flaw in
  one of the Internet's core building blocks that leaves hundreds or
  thousands of apps and hardware devices vulnerable to attacks that can take
  complete control over them.  The vulnerability was introduced in 2008 in
  GNU C Library, a collection of open source code that powers thousands of
  standalone applications and most distributions of Linux, including those
  distributed with routers and other types of hardware. A function known as
  getaddrinfo() that performs domain-name lookups contains a buffer overflow
  bug that allows attackers to remotely execute malicious code. It can be
  exploited when vulnerable devices or apps make queries to
  attacker-controlled domain names or domain name servers or when they're
  exposed to man-in-the-middle attacks where the adversary has the ability
  to monitor and manipulate data passing between a vulnerable device and the
  open Internet. All versions of glibc after 2.9 are vulnerable.''
\end{quote}


To understand how buffer overflow attacks work, we have to have
a look at how computers work ``under the hood'' (on the
machine level) and also understand some aspects of the C/C++
programming language. This might not be everyday fare for
computer science students, but who said that criminal hackers
restrict themselves to everyday fare? ...not to mention the
free-riding script-kiddies who use this technology without
even knowing what the underlying ideas are. If you want to be
a good security engineer who needs to defend against such attacks, 
then better you get to know the details too.
 
For buffer overflow attacks to work, a number of innocent
design decisions, which are really benign on their own, have
to come together. All these decisions were taken at a time
when there was no Internet: C was introduced around 1973; the
Internet TCP/IP protocol was standardised in 1982 by which
time there were maybe 500 servers connected (and all users
were well-behaved, mostly academics); Intel's first 8086 CPUs
arrived around 1977. So nobody of the ``forefathers'' can
really be blamed, but as mentioned above we should already be
way beyond the point that buffer overflow attacks are worth a
thought. Unfortunately, this is far from the reality. I let you
ponder why?

One such ``benign'' design decision is how the memory is laid
out into different regions for each process. 
 
\begin{center}
  \begin{tikzpicture}[scale=0.7]
  %\draw[step=1cm] (-3,-3) grid (3,3);
  \draw[line width=1mm] (-2, -3) rectangle (2,3);
  \draw[line width=1mm] (-2,1) -- (2,1);
  \draw[line width=1mm] (-2,-1) -- (2,-1);
  \draw (0,2) node {\large\tt text};
  \draw (0,0) node {\large\tt heap};
  \draw (0,-2) node {\large\tt stack};

  \draw (-2.7,3) node[anchor=north east] {\tt\begin{tabular}{@{}l@{}}lower\\ address\end{tabular}};
  \draw (-2.7,-3) node[anchor=south east] {\tt\begin{tabular}{@{}l@{}}higher\\ address\end{tabular}};
  \draw[->, line width=1mm] (-2.5,3) -- (-2.5,-3);

  \draw (2.7,-2) node[anchor=west] {\tt grows};
  \draw (2.7,-3) node[anchor=south west] {\tt\footnotesize older};
  \draw (2.7,-1) node[anchor=north west] {\tt\footnotesize newer};
  \draw[|->, line width=1mm] (2.5,-3) -- (2.5,-1);
  \end{tikzpicture}
\end{center}

\noindent The text region contains the program code (usually
this region is read-only). The heap stores all data the
programmer explicitly allocates. For us the most interesting
region is the stack, which contains data mostly associated
with the control flow of the program. Notice that the stack
grows from higher addresses to lower addresses (i.e.~from the
back to the front). That means that older items on the stack
are stored behind, or after, newer items. Let's look a bit
closer what happens with the stack when a program is running.
Consider the following simple C program.

\begin{minipage}{\textwidth}
\lstinputlisting[language=C]{../progs/example1.c} 
\end{minipage}

\noindent The \code{main} function calls in Line 7 the
function \code{foo} with three arguments. \code{Foo} creates
two (local) buffers, but does not do anything interesting with
them. The only purpose of this program is to illustrate what
happens behind the scenes with the stack. The interesting
question is what will the stack look like after Line 3 has
been executed? The answer is illustrated in Figure~\ref{stack}.
 
\begin{figure} 
\begin{center} 
 \begin{tikzpicture}[scale=0.65]
  \draw[gray!20,fill=gray!20] (-5, 0) rectangle (-3,-1);
  \draw[line width=1mm] (-5,-1.2) -- (-5,0.2);
  \draw[line width=1mm] (-3,-1.2) -- (-3,0.2);
  \draw (-4,-1) node[anchor=south] {\tt main};
  \draw[line width=1mm] (-5,0) -- (-3,0);

  \draw[gray!20,fill=gray!20] (3, 0) rectangle (5,-1);
  \draw[line width=1mm] (3,-1.2) -- (3,0.2);
  \draw[line width=1mm] (5,-1.2) -- (5,0.2);
  \draw (4,-1) node[anchor=south] {\tt main};
  \draw[line width=1mm] (3,0) -- (5,0);
 
   %\draw[step=1cm] (-3,-1) grid (3,8);
  \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1);
  \draw[line width=1mm] (-1,-1.2) -- (-1,7.4);
  \draw[line width=1mm] ( 1,-1.2) -- ( 1,7.4);
  \draw (0,-1) node[anchor=south] {\tt main};
  \draw[line width=1mm] (-1,0) -- (1,0);
  \draw (0,0) node[anchor=south] {\tt arg$_3$=3};
  \draw[line width=1mm] (-1,1) -- (1,1);
  \draw (0,1) node[anchor=south] {\tt arg$_2$=2};
  \draw[line width=1mm] (-1,2) -- (1,2);
  \draw (0,2) node[anchor=south] {\tt arg$_1$=1};
  \draw[line width=1mm] (-1,3) -- (1,3);
  \draw (0,3.1) node[anchor=south] {\tt ret};
  \draw[line width=1mm] (-1,4) -- (1,4);
  \draw (0,4) node[anchor=south] {\small\tt last sp};
  \draw[line width=1mm] (-1,5) -- (1,5);
  \draw (0,5) node[anchor=south] {\tt buf$_1$};
  \draw[line width=1mm] (-1,6) -- (1,6);
  \draw (0,6) node[anchor=south] {\tt buf$_2$};
  \draw[line width=1mm] (-1,7) -- (1,7);

  \draw[->,line width=0.5mm] (1,4.5) -- (1.8,4.5) -- (1.8, 0) -- (1.1,0); 
  \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5);
  \draw (2.6,3.1) node[anchor=south west] {\tt back to main()};
\end{tikzpicture}
\end{center}
\caption{The stack layout for a program where the main
function calls an auxiliary function with three arguments
(1,2 and 3). The auxiliary function has two local
buffer variables {\tt buf}$_1$ and {\tt buf}$_2$.\label{stack}} 
\end{figure}

On the left is the stack before \code{foo} is called; on the
right is the stack after \code{foo} finishes. The before and
after of the stack looks the same.
The function
call to \code{foo} in Line 7 (in the C program above) pushes
the arguments onto the stack in reverse order---shown in the
middle. Therefore first 3 then 2 and finally 1. Then it pushes
the return address onto the stack where execution should
resume once \code{foo} has finished. The last stack pointer
(\code{sp}) is needed in order to clean up the stack to the
last level---in fact there is no cleaning involved, but just
the top of the stack will be set back to this address. So the
last stack pointer also needs to be stored. The two buffers
inside \pcode{foo} are on the stack too, because they are
local data within \code{foo}. Consequently the stack in the
middle of Figure~\ref{stack} is a snapshot after Line 3 has
been executed. 

In case you are familiar with assembly instructions you can
also read off this behaviour from the machine code that the
\code{gcc} compiler generates for the program
above:\footnote{You can make \pcode{gcc} generate assembly
instructions if you call it with the \pcode{-S} option, for
example \pcode{gcc -S out in.c}\;. Or you can look at this
code by using the debugger. How to do this will be explained
in the last section.} It generates the following code for the
\pcode{main} and \pcode{foo} functions.

\begin{center}\small
\begin{tabular}[t]{p{11cm}}
{\lstinputlisting[language={[x86masm]Assembler},
  morekeywords={movl},xleftmargin=5mm]
  {../progs/example1a.s}}
\end{tabular}
\end{center}

\noindent Again you can see how the function \pcode{main}
prepares in Lines 2 to 7 the stack before calling the function
\pcode{foo}. You can see that the numbers 3, 2, 1 are stored
on the stack (the register \code{\%esp} refers to the top of
the stack; \pcode{$0x1}, \pcode{$0x2} \pcode{$0x3} are the
hexadecimal encodings for \pcode{1} to \pcode{3}). The code
for the foo function is as follows:

\begin{center}\small
\begin{tabular}[t]{p{11cm}}
{\lstinputlisting[language={[x86masm]Assembler},
  morekeywords={movl,movw},xleftmargin=5mm]
  {../progs/example1b.s}}  
\end{tabular}
\end{center}

\noindent The code for function \pcode{foo} stores
first the last stack pointer onto the stack and then
calculates the new stack pointer to have enough space for the
two local buffers (Lines 2 - 4). Then it puts the two local
buffers onto the stack and initialises them with the given
data (Lines 5 to 9). Since there is no real computation going
on inside \pcode{foo}, the function then just restores the
stack to its old state (Line 10) and crucially sets the return
address where the computation should resume (Line 10). The
instruction \code{ret} then transfers control back to the
function \pcode{main} to the instruction just after the call
to \pcode{foo}, that is Line 10.
 
Another part of the ``conspiracy'' of buffer overflow attacks
is that library functions in C look typically as follows:
 
\begin{center}
\lstinputlisting[language=C,numbers=none]{../progs/app5.c}
\end{center} 

\noindent This function copies data from a source \pcode{src}
to a destination \pcode{dst}. The important point is that it
copies the data until it reaches a zero-byte (\code{"\\0"}). 
This is a convention of the C language which assumes all
strings are terminated by such a zero-byte.

The central idea of the buffer overflow attack is to overwrite
the return address on the stack. This address decides where
the control flow of the program should resume once the
function at hand has finished its computation. So if we 
can control this address, then we can modify the control
flow of a program. To launch an attack we need 
somewhere in a function a local a buffer, say

\begin{center}
\code{char buf[8];}
\end{center}

\noindent which is filled by some user input. The
corresponding stack of such a function will look as follows

\begin{center}
 \begin{tikzpicture}[scale=0.65]
  %\draw[step=1cm] (-3,-1) grid (3,8);
  \draw[line width=1mm] (-1,1.2) -- (-1,6.4);
  \draw[line width=1mm] ( 1,1.2) -- ( 1,6.4);
  \draw (0,2) node[anchor=south] {\ldots};
  \draw[line width=1mm] (-1,3) -- (1,3);
  \draw (0,3.1) node[anchor=south] {\tt ret};
  \draw[line width=1mm] (-1,4) -- (1,4);
  \draw (0,4) node[anchor=south] {\small\tt last sp};
  \draw[line width=1mm] (-1,5) -- (1,5);
  \draw (0,5.1) node[anchor=south] {\tt buf};
  \draw[line width=1mm] (-1,6) -- (1,6);
  \draw (2,5.1) node[anchor=south] {\code{$esp}};
  \draw[<-,line width=0.5mm] (1.1,6) -- (2.5,6);

  \draw[->,line width=0.5mm] (1,4.5) -- (2.5,4.5);
  \draw (2.6,4.1) node[anchor=south west] {\code{??}};
  
  \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5);
  \draw (2.6,3.1) node[anchor=south west] {\tt jump to \code{\\x080483f4}};
\end{tikzpicture}
\end{center}

\noindent We need to fill this buffer over its limit of 8
characters so that it overwrites the stack pointer and then
also overwrites the return address. If, for example, we want
to jump to a specific address in memory, say,
\pcode{\\x080483f4} then we can fill the buffer with the data

\begin{center}
\code{char buf[8] = "AAAAAAAABBBB\\xf4\\x83\\x04\\x08";}
\end{center}
 
\noindent The first eight \pcode{A}s fill the buffer to the
rim; the next four \pcode{B}s overwrite the stack pointer
(with what data we overwrite this part is usually not
important); then comes the address we want to jump to. Notice
that we have to give the address in the reverse order. All
addresses on Intel CPUs need to be given in this way. Since
the string is enclosed in double quotes, the C convention is
that the string internally will automatically be terminated by
a zero-byte. If the programmer uses functions like
\pcode{strcpy} for filling the buffer \pcode{buf}, then we can
be sure it will overwrite the stack in this manner---since it
will copy everything up to the zero-byte. Notice that this
overwriting of the buffer only works since the newer
item---the buffer---is stored on the stack before the older
items, like return address and arguments. If it had be the
other way around, then such an overwriting by overflowing a
local buffer would just not work. Had the designers of C 
been able to foresee what headaches their way of
arranging the stack will cause, how different could be
the IT-World today?

What the outcome of such an attack is can be illustrated with
the code shown in Figure~\ref{C2}. Under ``normal operation''
this program ask for a login-name and a password. Both of
which are stored in \code{char} buffers of length 8. The
function \pcode{match} tests whether two such buffers contain
the same content. If yes, then the function lets you ``in''
(by printing \pcode{Welcome}). If not, it denies access (by
printing \pcode{Wrong identity}). The vulnerable function is
\code{get_line} in Lines 11 to 19. This function does not take
any precautions about the buffer of 8 characters being filled
beyond its 8-character-limit. Let us suppose the login name is
\pcode{test}. Then the buffer overflow can be triggered with a
specially crafted string as password (remember
\pcode{get\_line} requires a \pcode{\\n} at the end of the
input):

\begin{center}
\code{AAAAAAAABBBB\\x2c\\x85\\x04\\x08\\n}
\end{center}

\noindent The address at the end happens to be the one for the
function \pcode{welcome()}. This means even with this input
(where the login name and password clearly do not match) the
program will still print out \pcode{Welcome}. The only
information we need for this attack to work is to know where
the function \pcode{welcome()} starts in memory. This
information can be easily obtained by starting the program
inside the debugger and disassembling this function. 

\begin{lstlisting}[numbers=none,language={[x86masm]Assembler},
  morekeywords={movl,movw}]
$ gdb C2
GNU gdb (GDB) 7.2-ubuntu
(gdb) disassemble welcome
\end{lstlisting}

\noindent \pcode{C2} is the name of the program and
\pcode{gdb} is the name of the debugger. The output will be
something like this

\begin{lstlisting}[numbers=none,language={[x86masm]Assembler},
  morekeywords={movl,movw}]
0x0804852c <+0>:     push   %ebp
0x0804852d <+1>:     mov    %esp,%ebp
0x0804852f <+3>:     sub    $0x4,%esp
0x08048532 <+6>:     movl   $0x8048690,(%esp)
0x08048539 <+13>:    call   0x80483a4 <puts@plt>
0x0804853e <+18>:    movl   $0x0,(%esp)
0x08048545 <+25>:    call   0x80483b4 <exit@plt>
\end{lstlisting}

\noindent indicating that the function \pcode{welcome()}
starts at address \pcode{0x0804852c} (top address in the 
left column).

\begin{figure}[p]
\lstinputlisting[language=C]{../progs/C2.c}
\caption{A vulnerable login implementation. The use of the `own'
  \code{get\_line} function makes this program vulnerable. The
  developer should have used \emph{safe} library functions
  instead.\label{C2}}
\end{figure}

This kind of attack was very popular with commercial programs
that needed a key to be unlocked. Historically, hackers first 
broke the rather weak encryption of these locking mechanisms.
After the encryption had been made stronger, hackers used
buffer overflow attacks as shown above to jump directly to
the part of the program that was intended to be only available
after the correct key was typed in. 

\subsection*{Payloads}

Unfortunately, much more harm can be caused by buffer overflow
attacks. This is achieved by injecting code that will be run
once the return address is appropriately modified. Typically
the code that will be injected starts a shell. This gives the
attacker the ability to run programs on the target machine and
to have a good look around in order to obtain also full root
access (normally the program that is attacked would run with
lesser rights and any shell injected would also only run with
these lesser access rights). If the attacked program was 
already running as root, then the attacker can congratulate
him or herself to another computer under full control\ldots
no more work to be done.

In order to be send as part of the string that is overflowing
the buffer, we need the code for starting the shell to be
represented as a sequence of characters. For example

\lstinputlisting[language=C,numbers=none]{../progs/o1.c}

\noindent These characters represent the machine code for
opening a shell. It seems obtaining such a string requires
``higher-education'' in the architecture of the target system.
But it is actually relatively simple: First there are many
such strings ready-made---just a quick Google query away.
A nice selection of ready-made shell-codes can be found 
for example at

\begin{center}
\url{http://shellblade.net/shellcode.html}
\end{center}


Second, tools like the debugger can help us again. We can just
write the code we want in C, for example this would be the
program for starting a shell:

\lstinputlisting[language=C,numbers=none]{../progs/shell.c} 

\noindent Once compiled, we can use the debugger to obtain the
machine code, or even get the ready-made encoding as character
sequence. 

\lstinputlisting[language=C,numbers=none]{../progs/o2.c}

\noindent
While not too difficult, obtaining this string is not entirely
trivial using \pcode{gdb}. Remember the functions in C that
copy or fill buffers work such that they copy everything until
the zero byte is reached. Unfortunately the ``vanilla'' output
from the debugger for the shell-program above contains such
zero bytes. So a post-processing phase is needed to rewrite
the machine code in a way that it does not contain any zero
bytes. This is like some works of literature that have been
written so that the letter e, for example, is avoided. The
technical term for such a literature work is
\emph{lipogram}.\footnote{The most famous example of a
lipogram is a 50,000 words novel titled Gadsby, see
\url{https://archive.org/details/Gadsby}, which avoids the
letter `e' throughout.} For rewriting the machine code, you
might need to use clever tricks like

\begin{lstlisting}[numbers=none,language={[x86masm]Assembler}]
xor %eax, %eax
\end{lstlisting}

\noindent This instruction does not contain any zero-byte when
encoded as string, but produces a zero-byte on the stack when
run. 

Having removed the zero-bytes we can craft the string that
will be send to the target computer. This of course requires
that the buffer we are trying to attack can at least contain
the shellcode we want to run. But as you can see this is only
47 bytes, which is a very low bar to jump over. Actually there
are optimised versions which only need 24 bytes. More
formidable is the problem of finding the right address to jump
to. The string is typically of the form

\begin{center}
  \begin{tikzpicture}[scale=0.6]
  \draw[line width=1mm] (-2, -1) rectangle (2,3);
  \draw[line width=1mm] (-2,1.9) -- (2,1.9);
  \draw (0,2.5) node {\large\tt shell code};
  \draw[line width=1mm,fill=black] (0.3, -1) rectangle (2,-0.7);
  \draw[->,line width=0.3mm] (1.05, -1) -- (1.05,-1.7) --
  (-3,-1.7) -- (-3, 3.7) -- (-1.9, 3.7) -- (-1.9, 3.1);
  \draw (-2, 3) node[anchor=north east] {\LARGE \color{codegreen}{``}};
  \draw ( 2,-0.9) node[anchor=west] {\LARGE\color{codegreen}{''}};
  \end{tikzpicture}
\end{center}

\noindent where we need to be very precise with the address
with which we will overwrite the buffer (indicated as a black
rectangle). It has to be precisely the first byte of the
shellcode. While this is easy with the help of a debugger (as
seen before), we typically cannot run anything, including a
debugger, on the machine yet we target. And the address is
very specific to the setup of the target machine. One way of
finding out what the right address is is to try out one-by-one
every possible address until we get lucky. With the large
memories available today, however, the odds are long for this. And if
we try out too many possible candidates too quickly, we might
be detected by the system administrator of the target system.

We can improve our odds considerably by making use of a very
clever trick. Instead of adding the shellcode at the beginning
of the string, we should add it at the end, just before we
overflow the buffer, for example

\begin{center}
  \begin{tikzpicture}[scale=0.6]
  \draw[gray!50,fill=gray!50] (-2,0.3) rectangle (2,3);
  \draw[line width=1mm] (-2, -1) rectangle (2,3);
  \draw[line width=1mm] (-2,0.3) -- (2,0.3);
  \draw[line width=1mm] (-2,-0.7) -- (2,-0.7);
  \draw (0,-0.2) node {\large\tt shell code};
  \draw[line width=1mm,fill=black] (0.3, -1) rectangle (2,-0.7);
  \draw [line width=0.5,decoration={brace,amplitude=2mm},decorate] 
    (2.3,3) -- (2.3,0.3);
  \draw[line width=0.3mm] (1.05, -1) -- (1.05,-1.7) --
  (3,-1.7) -- (3,1.65) -- (2.6, 1.65);
  \draw (-2, 3) node[anchor=north east] {\LARGE \color{codegreen}{``}};
  \draw ( 2,-0.9) node[anchor=west] {\LARGE\color{codegreen}{''}};
  \end{tikzpicture}
\end{center}

\noindent Then we can fill up the grey part of the string with
\pcode{NOP} operations. The code for this operation is
\code{\\0x90} on Intel CPUs. It is available on every
architecture and its purpose in a CPU is to do nothing apart
from waiting a small amount of time. If we now use an address
that lets us jump to any address in the grey area we are done.
The target machine will execute these \pcode{NOP} operations
until it reaches the shellcode. That is why this NOP-part is
often called \emph{NOP-sledge}. A moment of thought should
convince you that this trick can hugely improve our odds of
finding the right address---depending on the size of the
buffer, it might only take a few tries to get the shellcode to
run. And then we are in. The code for such an attack is shown
in Figure~\ref{C3}. It is directly taken from the original
paper about ``Smashing the Stack for Fun and Profit'' (see
pointer given at the end).

\begin{figure}[p]
\lstinputlisting[language=C]{../progs/C3.c}
\caption{Overwriting a buffer with a string containing a
  payload. Lines 14 and 15 write the address of the buffer into
  \code{large\_string}. The payload is copied in Lines 17 and 18. Line
  20 copies the (too large) string into the buffer.\label{C3}}
\end{figure}

By the way you might now have the question how do attackers
find out about vulnerable systems in the first place? Well,
the automated version uses \emph{fuzzers}, which throw
randomly generated user input at applications and observe the
behaviour. If an application segfaults (throws a segmentation
error) then this is a good indication that a buffer overflow
vulnerability can be exploited.


\subsubsection*{Format String Attacks}

Another question might arise, where do we get all this
information about addresses necessary for mounting a buffer
overflow attack without having yet access to the system? The
answer are \emph{format string attacks}. While technically
they are programming mistakes (and they are pointed out as
warning by modern compilers), they can be easily made and
therefore an easy target. Let us look at the simplest version
of a vulnerable program.

\lstinputlisting[language=C]{../progs/C4.c}

\noindent The intention of this program is to print out the
first argument given on the command line. The ``secret
string'' is never to be printed. The problem is that the C
function \pcode{printf} normally expects a format string---a
schema that directs how a string should be printed. This would
be for example a proper invocation of this function:

\begin{lstlisting}[numbers=none,language=C]
long n = 123456789;
printf("This is a long %lu!", n);
\end{lstlisting}

\noindent In the program above, instead, the format string
has been forgotten and only \pcode{argv[1]} is printed.
Now if we give on the command line a string such as

\begin{center}
\code{"foo \%s"}
\end{center}

\noindent then \pcode{printf} expects a string to 
follow. But there is no string that follows, and how
the argument resolution works in C will in fact print out 
the secret string! This can be handily exploited by 
using the format string \code{"\%x"}, which reads out the 
stack. So \code{"\%x....\%x"} will give you as much 
information from the stack as you need and over the 
Internet.

While the program above contains clearly a programming 
mistake (forgotten format string), things are not as simple
when the application reads data from the user and prompts
responses containing the user input. Consider the slight
variant of the program above

\lstinputlisting[language=C]{../progs/C5.c}

\noindent Here the programmer actually tried to take extra
care to not fall prey to a buffer overflow attack, but in the
process made the program susceptible to a format string
attack. Clearly the \pcode{printf} function in Line 7 contains
now an explicit format string, but because the commandline
input is copied using the function \pcode{snprintf} the result
will be the same---the string can be exploited by embedding
format strings into the user input. Here the programmer really
cannot be blamed (much) because by using \pcode{snprintf} he
or she tried to make sure only 10 characters get copied into
the local buffer---in this way avoiding the obvious buffer
overflow attack.

\subsubsection*{Caveats and Defences}

How can we defend against these attacks? Well, a reflex could
be to blame programmers. Precautions should be taken by them
so that buffers cannot be overfilled and format strings
should not be forgotten. This might actually be slightly
simpler to achieve by programmers nowadays since safe versions
of the library functions exist, which always specify the
precise number of bytes that should be copied. Compilers also
nowadays provide warnings when format strings are omitted. So
proper education of programmers is definitely a part of a
defence against such attacks. However, if we leave it at that,
then we have the mess we have today with new attacks
discovered almost daily. 

There is actually a quite long record of publications
proposing defences against buffer overflow attacks. One method
is to declare the stack data as not executable. In this way it
is impossible to inject a payload as shown above which is then
executed once the stack is smashed. But this needs hardware
support which allows one to declare certain memory regions to
be not executable. Such a feature was not introduced before
the Intel 386, for example. Also if you have a JIT
(just-in-time) compiler it might be advantageous to have
the stack containing executable data. So it is definitely a 
trade-off.

Anyway attackers have found ways around this defence: they
developed \emph{return-to-lib-C} attacks. The idea is to not
inject code, but already use the code that is present at the
target computer. The lib-C library, for example, already
contains the code for spawning a shell. With
\emph{return-to-lib-C} one just has to find out where this
code is located. But attackers can make good guesses. 

Another defence is called \emph{stack canaries}. The advantage
is that they can be automatically inserted into compiled code
and do not need any hardware support. Though they will make
your program run slightly slower. The idea behind \emph{stack
canaries} is to push a random number onto the stack just
before local data is stored. For our very first function
\pcode{foo} the stack would with a \emph{stack canary} look as
follows

\begin{center}
\begin{tikzpicture}[scale=0.65]
  %\draw[step=1cm] (-3,-1) grid (3,8);
  \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1);
  \draw[line width=1mm] (-1,-1.2) -- (-1,7.4);
  \draw[line width=1mm] ( 1,-1.2) -- ( 1,7.4);
  \draw (0,-1) node[anchor=south] {\tt main};
  \draw[line width=1mm] (-1,0) -- (1,0);
  \draw (0,0) node[anchor=south] {\tt arg$_3$=3};
  \draw[line width=1mm] (-1,1) -- (1,1);
  \draw (0,1) node[anchor=south] {\tt arg$_2$=2};
  \draw[line width=1mm] (-1,2) -- (1,2);
  \draw (0,2) node[anchor=south] {\tt arg$_1$=1};
  \draw[line width=1mm] (-1,3) -- (1,3);
  \draw (0,3.1) node[anchor=south] {\tt ret};
  \draw[line width=1mm] (-1,4) -- (1,4);
  \draw (0,4) node[anchor=south] {\small\tt last sp};
  \draw[line width=1mm] (-1,5) -- (1,5);
  \draw (0,5.1) node[anchor=south] {\tt\small\textcolor{red}{\textbf{random}}};
  \draw[line width=1mm] (-1,6) -- (1,6);
  \draw (0,6) node[anchor=south] {\tt buf};
  \draw[line width=1mm] (-1,7) -- (1,7);
  \end{tikzpicture}
\end{center}

\noindent The idea behind this random number is that when the
function finishes, it is checked that this random number is
still intact on the stack. If not, then a buffer overflow has
occurred. Although this is quite effective, it requires 
suitable support for generating random numbers. This is always
hard to get right and attackers are happy to exploit the 
resulting weaknesses.

Another defence is \emph{address space randomisation}. This
defence tries to make it harder for an attacker to guess 
addresses where code is stored. It turns out that addresses
where code is stored is rather predictable. Randomising the
place where programs are stored mitigates this problem 
somewhat.

As mentioned before, modern operating systems have these
defences enabled by default and make buffer overflow attacks
harder, but not impossible. Indeed, I---as an amateur
attacker---had to explicitly switch off these defences. 
A real attacker would be more knowledgeable and not need this
shortcut.

To explain BoAs, I run my examples under an Ubuntu version ``Maverick
Meerkat'' from October 2010 and the gcc 4.4.5. I have not
tried whether newer versions would work as well. I tested all
examples inside a virtual
box\footnote{\url{https://www.virtualbox.org}} insulating my
main system from any harm. When compiling the programs I
called the compiler with the following options:

\begin{center}
\begin{tabular}{l@{\hspace{1mm}}l}
\pcode{/usr/bin/gcc} & \pcode{-ggdb -O0}\\
                     & \pcode{-fno-stack-protector}\\
                     & \pcode{-mpreferred-stack-boundary=2}\\
                     & \pcode{-z execstack} 
\end{tabular}
\end{center}

\noindent The first two options are innocent as they instruct the
compiler to include debugging information and also produce
non-optimised code (the latter makes the output of the code a
bit more predictable). The third is important as it switches
off defences like the stack canaries. The fourth again makes
it a bit easier to read the code. The final option makes the
stack executable, thus the example in Figure~\ref{C3} works as
intended. While this might be considered as a complete cheat....since I
explicitly switched off all defences; I hope I was able convey
the point that this is actually not too far from realistic
scenarios. I have shown you the classic version of the buffer
overflow attacks. Updated and more advanced variants do exist.

With the standard defences switched on, you might want to
argue buffer-overflow attacks have been solved on computers
(desktops and servers) but the computing landscape of today is
much wider than that. The main problem today are embedded
systems against which attacker can equally cause a lot of harm
and which are much less defended. Anthony Bonkoski makes a
similar argument in his security blog:

\begin{center}
\url{http://jabsoft.io/2013/09/25/are-buffer-overflows-solved-yet-a-historical-tale/}
\end{center}


There is one more rather effective defence against buffer 
overflow attacks: Why not using a safe language? Java at its 
inception was touted as a safe language because it hides
all explicit memory management from the user. This definitely
incurs a runtime penalty, but for bog-standard user-input
processing applications, speed is not of such an essence 
anymore. There are of course also many other programming 
languages that are safe, i.e.~immune to buffer overflow
attacks.
\bigskip

\subsubsection*{Further Reading}

If you want to know more about buffer overflow attacks, the
original Phrack article ``Smashing The Stack For Fun And
Profit'' by Elias Levy (also known as Aleph One) is an
engaging read:

\begin{center}
\url{http://phrack.org/issues/49/14.html}
\end{center} 

\noindent This is an article from 1996 and some parts are
not up-to-date anymore. The article called
``Smashing the Stack in 2010''

\begin{center}
\url{http://www.mgraziano.info/docs/stsi2010.pdf}
\end{center}

\noindent updates, as the name says, most information to 2010.
There is another Phrack article about return-into-lib(c) exploits 
from 2012:

\begin{center}
\url{http://phrack.org/issues/58/4.html}
\end{center}

\noindent
The main topic is about getting around the non-executability of stack
data (in case it is protected).  This article gives some further
pointers into the recent literature about buffer overflow attacks.

Buffer overflow attacks are not just restricted to Linux and 
``normal'' computers. There is a book

\begin{quote}\rm 
``iOS Hacker's Handbook'' by Miller et al, Wiley, 2012
\end{quote}

\noindent
which seem to describe buffer overflow attacks on iOS. A book from the
same publisher exists also for Android (from 2014) which seem to also
feature buffer overflow attacks. Alas I do not own copies of these
books.


\subsubsection*{A Crash-Course for GDB}

If you want to try out the examples from KEATS it might be
helpful to know about the following commands of the GNU 
Debugger:

\begin{itemize}
\item \texttt{(l)ist n} -- lists the source file from line 
\texttt{n}, the number can be omitted 
\item \texttt{disassemble fun-name} -- shows the assembly code 
of a function
\item \texttt{info registers} -- prints out the current 
content of all registers
\item \texttt{run args} -- starts the program, potential 
arguments can be given
\item \texttt{(b)reak line-number} -- sets break point
\item \texttt{(c)ontinue} -- continue execution until next 
breakpoint
\item \texttt{x/nxw addr} -- prints out \texttt{n} words starting 
from address \pcode{addr}, the address could be \code{$esp} 
for looking at the content of the stack
\item \texttt{x/nxb addr} -- prints out \texttt{n} bytes 
\item \texttt{q} -- quits the debugger
\item \texttt{tui enable} or \texttt{C-x C-a} --- switches into
TUI mode 
\end{itemize}

 
\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: