\documentclass{article}\usepackage{../style}\usepackage{../langs}\begin{document}\section*{Handout 3 (Buffer Overflow Attacks)}By far the most popular attack method on computers are bufferoverflow attacks or variations thereof. The popularity isunfortunate because we nowadays have technology in place toprevent them effectively. But these kind of attacks are stillvery relevant even today since there are many legacy systemsout there and also many modern embedded systems often do nottake any precautions to prevent such attacks.To understand how buffer overflow attacks work, we have to havea look at how computers work ``under the hood'' (on themachine level) and also understand some aspects of the C/C++programming language. This might not be everyday fare forcomputer science students, but who said that criminal hackersrestrict themselves to everyday fare? Not to mention thefree-riding script-kiddies who use this technology withouteven knowing what the underlying ideas are. If you want to bea good security engineer who needs to defend such attacks, then better you get to know the details.For buffer overflow attacks to work, a number of innocentdesign decisions, which are really benign on their own, needto conspire against you. All these decisions were taken at atime when there was no Internet: C was introduced around 1973;the Internet TCP/IP protocol was standardised in 1982 by whichtime there were maybe 500 servers connected (and all userswere well-behaved, mostly academics); Intel's first 8086 CPUsarrived around 1977. So nobody of the ``forefathers'' canreally be blamed, but as mentioned above we should already beway beyond the point that buffer overflow attacks are worth athought. Unfortunately, this is far from the truth. I let youponder why?One such ``benign'' design decision is how the memory is laidout into different regions for each process. \begin{center} \begin{tikzpicture}[scale=0.7] %\draw[step=1cm] (-3,-3) grid (3,3); \draw[line width=1mm] (-2, -3) rectangle (2,3); \draw[line width=1mm] (-2,1) -- (2,1); \draw[line width=1mm] (-2,-1) -- (2,-1); \draw (0,2) node {\large\tt text}; \draw (0,0) node {\large\tt heap}; \draw (0,-2) node {\large\tt stack}; \draw (-2.7,3) node[anchor=north east] {\tt\begin{tabular}{@{}l@{}}lower\\ address\end{tabular}}; \draw (-2.7,-3) node[anchor=south east] {\tt\begin{tabular}{@{}l@{}}higher\\ address\end{tabular}}; \draw[->, line width=1mm] (-2.5,3) -- (-2.5,-3); \draw (2.7,-2) node[anchor=west] {\tt grows}; \draw (2.7,-3) node[anchor=south west] {\tt\footnotesize older}; \draw (2.7,-1) node[anchor=north west] {\tt\footnotesize newer}; \draw[|->, line width=1mm] (2.5,-3) -- (2.5,-1); \end{tikzpicture}\end{center}\noindent The text region contains the program code (usuallythis region is read-only). The heap stores all data theprogrammer explicitly allocates. For us the most interestingregion is the stack, which contains data mostly associatedwith the control flow of the program. Notice that the stackgrows from a higher addresses to lower addresses. That meansthat older items on the stack will be stored behind, or after,newer items. Let's look a bit closer what happens with thestack when a program is running. Consider the following simpleC program.\lstinputlisting[language=C]{../progs/example1.c} \noindent The \code{main} function calls in Line 7 thefunction \code{foo} with three arguments. \code{Foo} createstwo (local) buffers, but does not do anything interesting withthem. The only purpose of this program is to illustrate whathappens behind the scenes with the stack. The interestingquestion is what will the stack be after Line 3 has beenexecuted? The answer can be illustrated as follows:\begin{center} \begin{tikzpicture}[scale=0.65] \draw[gray!20,fill=gray!20] (-5, 0) rectangle (-3,-1); \draw[line width=1mm] (-5,-1.2) -- (-5,0.2); \draw[line width=1mm] (-3,-1.2) -- (-3,0.2); \draw (-4,-1) node[anchor=south] {\tt main}; \draw[line width=1mm] (-5,0) -- (-3,0); \draw[gray!20,fill=gray!20] (3, 0) rectangle (5,-1); \draw[line width=1mm] (3,-1.2) -- (3,0.2); \draw[line width=1mm] (5,-1.2) -- (5,0.2); \draw (4,-1) node[anchor=south] {\tt main}; \draw[line width=1mm] (3,0) -- (5,0); %\draw[step=1cm] (-3,-1) grid (3,8); \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1); \draw[line width=1mm] (-1,-1.2) -- (-1,7.4); \draw[line width=1mm] ( 1,-1.2) -- ( 1,7.4); \draw (0,-1) node[anchor=south] {\tt main}; \draw[line width=1mm] (-1,0) -- (1,0); \draw (0,0) node[anchor=south] {\tt arg$_3$=3}; \draw[line width=1mm] (-1,1) -- (1,1); \draw (0,1) node[anchor=south] {\tt arg$_2$=2}; \draw[line width=1mm] (-1,2) -- (1,2); \draw (0,2) node[anchor=south] {\tt arg$_1$=1}; \draw[line width=1mm] (-1,3) -- (1,3); \draw (0,3.1) node[anchor=south] {\tt ret}; \draw[line width=1mm] (-1,4) -- (1,4); \draw (0,4) node[anchor=south] {\small\tt last sp}; \draw[line width=1mm] (-1,5) -- (1,5); \draw (0,5) node[anchor=south] {\tt buf$_1$}; \draw[line width=1mm] (-1,6) -- (1,6); \draw (0,6) node[anchor=south] {\tt buf$_2$}; \draw[line width=1mm] (-1,7) -- (1,7); \draw[->,line width=0.5mm] (1,4.5) -- (1.8,4.5) -- (1.8, 0) -- (1.1,0); \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5); \draw (2.6,3.1) node[anchor=south west] {\tt back to main()};\end{tikzpicture}\end{center} \noindent On the left is the stack before \code{foo} iscalled; on the right is the stack after \code{foo} finishes.The function call to \code{foo} in Line 7 pushes the argumentsonto the stack in reverse order---shown in the middle.Therefore first 3 then 2 and finally 1. Then it pushes thereturn address onto the stack where execution should resumeonce \code{foo} has finished. The last stack pointer(\code{sp}) is needed in order to clean up the stack to thelast level---in fact there is no cleaning involved, but justthe top of the stack will be set back. So the last stackpointer also needs to be stored. The two buffers inside\pcode{foo} are on the stack too, because they are local datawithin \code{foo}. Consequently the stack in the middle is asnapshot after Line 3 has been executed. In case you arefamiliar with assembly instructions you can also read off thisbehaviour from the machine code that the \code{gcc} compilergenerates for the program above:\footnote{You can make\pcode{gcc} generate assembly instructions if you call it withthe \pcode{-S} option, for example \pcode{gcc -S out in.c}\;.Or you can look at this code by using the debugger. How to dothis will be explained later.}.\begin{center}\small\begin{tabular}[t]{@{}c@{\hspace{8mm}}c@{}}{\lstinputlisting[language={[x86masm]Assembler}, morekeywords={movl},xleftmargin=5mm] {../progs/example1a.s}} &{\lstinputlisting[language={[x86masm]Assembler}, morekeywords={movl,movw},xleftmargin=5mm] {../progs/example1b.s}} \end{tabular}\end{center}\noindent On the left you can see how the function\pcode{main} prepares in Lines 2 to 7 the stack before callingthe function \pcode{foo}. You can see that the numbers 3, 2, 1are stored on the stack (the register \code{$esp} refers tothe top of the stack). On the right you can see how thefunction \pcode{foo} stores the two local buffers onto thestack and initialises them with the given data (Lines 2 to 9).Since there is no real computation going on inside\pcode{foo}, the function then just restores the stack to itsold state and crucially sets the return address where thecomputation should resume (Line 9 in the code on the left-handside). The instruction \code{ret} then transfers control backto the function \pcode{main} to the the instruction just afterthe call to \pcode{foo}, that is Line 9.Another part of the ``conspiracy'' is that library functionsin C look typically as follows:\begin{center}\lstinputlisting[language=C,numbers=none]{../progs/app5.c}\end{center} \noindent This function copies data from a source \pcode{src}to a destination \pcode{dst}. The important point is that itcopies the data until it reaches a zero-byte (\code{"\\0"}). The central idea of the buffer overflow attack is to overwritethe return address on the stack which designates where thecontrol flow of the program should resume once the function athand has finished its computation. So if we have somewhere ina function a local a buffer, say\begin{center}\code{char buf[8];}\end{center}\noindentthen the corresponding stack will look as follows\begin{center} \begin{tikzpicture}[scale=0.65] %\draw[step=1cm] (-3,-1) grid (3,8); \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1); \draw[line width=1mm] (-1,-1.2) -- (-1,6.4); \draw[line width=1mm] ( 1,-1.2) -- ( 1,6.4); \draw (0,-1) node[anchor=south] {\tt main}; \draw[line width=1mm] (-1,0) -- (1,0); \draw (0,0) node[anchor=south] {\tt arg$_3$=3}; \draw[line width=1mm] (-1,1) -- (1,1); \draw (0,1) node[anchor=south] {\tt arg$_2$=2}; \draw[line width=1mm] (-1,2) -- (1,2); \draw (0,2) node[anchor=south] {\tt arg$_1$=1}; \draw[line width=1mm] (-1,3) -- (1,3); \draw (0,3.1) node[anchor=south] {\tt ret}; \draw[line width=1mm] (-1,4) -- (1,4); \draw (0,4) node[anchor=south] {\small\tt last sp}; \draw[line width=1mm] (-1,5) -- (1,5); \draw (0,5.1) node[anchor=south] {\tt buf}; \draw[line width=1mm] (-1,6) -- (1,6); \draw (2,5.1) node[anchor=south] {\code{$esp}}; \draw[<-,line width=0.5mm] (1.1,6) -- (2.5,6); \draw[->,line width=0.5mm] (1,4.5) -- (2.5,4.5); \draw (2.6,4.1) node[anchor=south west] {\code{??}}; \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5); \draw (2.6,3.1) node[anchor=south west] {\tt jump to \code{\\x080483f4}};\end{tikzpicture}\end{center}\noindent We need to fill this buffer over its limit of 8characters so that it overwrites the stack pointer and thenalso overwrites the return address. If, for example, we wantto jump to a specific address in memory, say,\pcode{\\x080483f4} then we can fill the buffer with the data\begin{center}\code{char buf[8] = "AAAAAAAABBBB\\xf4\\x83\\x04\\x08";}\end{center}\noindent The first eight \pcode{A}s fill the buffer to therim; the next four \pcode{B}s overwrite the stack pointer(with what data we overwrite this part is usually notimportant); then comes the address we want to jump to. Noticethat we have to give the address in the reverse order. Alladdresses on Intel CPUs need to be given in this way. Sincethe string is enclosed in double quotes, the C convention isthat the string internally will automatically be terminated bya zero-byte. If the programmer uses functions like\pcode{strcpy} for filling the buffer \pcode{buf}, then we canbe sure it will overwrite the stack in this manner---since itwill copy everything up to the zero-byte. Notice that thisoverwriting of the buffer only works since the newer item, thebuffer, is stored on the stack before the older items, likereturn address and arguments. If it had be the other wayaround, then such an overwriting by overflowing a local bufferwould just not work.What the outcome of such an attack is can be illustrated withthe code shown in Figure~\ref{C2}. Under ``normal operation''this program ask for a login-name and a password. Both ofwhich are stored in \code{char} buffers of length 8. Thefunction \pcode{match} tests whether two such buffers containthe same. If yes, then the function lets you ``in'' (byprinting \pcode{Welcome}). If not, it denies access (byprinting \pcode{Wrong identity}). The vulnerable function is\code{get_line} in Lines 11 to 19. This function does not takeany precautions about the buffer of 8 characters being filledbeyond this 8-character-limit. Let us suppose the login nameis \pcode{test}. Then the buffer overflow can be triggeredwith a specially crafted string as password:\begin{center}\code{AAAAAAAABBBB\\x2c\\x85\\x04\\x08\\n}\end{center}\noindent The address at the end happens to be the one for thefunction \pcode{welcome()}. This means even with this input(where the login name and password clearly do not match) theprogram will still print out \pcode{Welcome}. The onlyinformation we need for this attack is to know where thefunction \pcode{welcome()} starts in memory. This informationcan be easily obtained by starting the program inside thedebugger and disassembling this function. \begin{lstlisting}[numbers=none,language={[x86masm]Assembler}, morekeywords={movl,movw}]$ gdb C2GNU gdb (GDB) 7.2-ubuntu(gdb) disassemble welcome\end{lstlisting}\noindent \pcode{C2} is the name of the program and\pcode{gdb} is the name of the debugger. The output will besomething like this\begin{lstlisting}[numbers=none,language={[x86masm]Assembler}, morekeywords={movl,movw}]0x0804852c <+0>: push %ebp0x0804852d <+1>: mov %esp,%ebp0x0804852f <+3>: sub $0x4,%esp0x08048532 <+6>: movl $0x8048690,(%esp)0x08048539 <+13>: call 0x80483a4 <puts@plt>0x0804853e <+18>: movl $0x0,(%esp)0x08048545 <+25>: call 0x80483b4 <exit@plt>\end{lstlisting}\noindent indicating that the function \pcode{welcome()}starts at address \pcode{0x0804852c} (top address in the left column).\begin{figure}[p]\lstinputlisting[language=C]{../progs/C2.c}\caption{A suspicious login implementation.\label{C2}}\end{figure}This kind of attack was very popular with commercial programsthat needed a key to be unlocked. Historically, hackers first broke the rather weak encryption of these locking mechanisms.After the encryption had been made stronger, hackers usedbuffer overflow attacks as shown above to jump directly tothe part of the program that was intended to be only availableafter the correct key was typed in. \subsection*{Paylods}Unfortunately, much more harm can be caused by buffer overflowattacks. This is achieved by injecting code that will be runonce the return address is appropriately modified. Typicallythe code that will be injected is for running a shell. Thisgives the attacker the ability to run programs on the targetmachine and have a good look around, provided the attackedprocess was not already running as root.\footnote{In that casethe attacker would do already congratulate him or herself toanother computer under full control.} In order to be send aspart of the string that is overflowing the buffer, we need thecode to be represented as a sequence of characters. Forexample\lstinputlisting[language=C,numbers=none]{../progs/o1.c}\noindent These characters represent the machine code foropening a shell. It seems obtaining such a string requireshigher-education in the architecture of the target system. Butit is actually relatively simple: First there are many suchstring ready-made---just a quick Google query away. Second,tools like the debugger can help us again. We can just writethe code we want in C, for example this would be the programfor starting a shell\lstinputlisting[language=C,numbers=none]{../progs/shell.c} \noindent Once compiled, we can use the debugger to obtain the machine code, or even the ready-made encoding as charactersequence. While easy, obtaining this string is not entirely trivial.Remember the functions in C that copy or fill buffers worksuch that they copy everything until the zero byte is reached.Unfortunately the ``vanilla'' output from the debugger for theshell-program above will contain such zero bytes. So apost-processing phase is needed to rewrite the machine code ina way that it does not contain any zero bytes. This is likesome works of literature that have been written so that theletter 'i', for example, is avoided. For rewriting the machinecode, you might need to use clever tricks like\begin{lstlisting}[numbers=none,language={[x86masm]Assembler}]xor %eax, %eax\end{lstlisting}\noindent This instruction does not contain any zero byte whenencoded, but produces a zero byte on the stack when run. Having removed the zero bytes we can craft the string that will be send to the target computer. It is typically of the form\begin{center} \begin{tikzpicture}[scale=0.7] \draw[line width=1mm] (-2, -1) rectangle (2,3); \draw[line width=1mm] (-2,1.9) -- (2,1.9); \draw (0,2.5) node {\large\tt shell code}; \draw[line width=1mm,fill=black] (0.3, -1) rectangle (2,-0.7); \draw[->,line width=0.3mm] (1.05, -1) -- (1.05,-1.7) -- (-3,-1.7) -- (-3, 3.7) -- (-1.9, 3.7) -- (-1.9, 3.1); \draw (-2, 3) node[anchor=north east] {\LARGE \color{codegreen}{``}}; \draw ( 2,-0.9) node[anchor=west] {\LARGE\color{codegreen}{''}}; \end{tikzpicture}\end{center}\noindent This of course requires that the buffer we aretrying to attack can at least contain the shellcode we want torun. But as you can see this is only 47 bytes, which is a verylow bar to jump over. More formidable is the choice of findingthe right address to jump to. As indicated in the picture weneed to be very precise with the address with which we willoverwrite the buffer. It has to be precisely the first byte ofthe shellcode. While this is easy with the help of a debugger(as seen before), we typically cannot run anything on themachine yet we target. And the address is very specific to thesetup of the target machine. One way of finding out what theright address is is to try out one by one until we get lucky.With the large memories available today, however, the odds arelong. And if we try out too many possible candidates tooquickly, we might be detected by the system administrator ofthe target system.We can improve our odds considerably by following a clever trick. Instead of adding the shellcode at the beginning of thestring, we should add it at the end, just before we overflow the buffer, for example\begin{center} \begin{tikzpicture}[scale=0.7] \draw[line width=1mm] (-2, -1) rectangle (2,3); \draw[line width=1mm] (-2,1.9) -- (2,1.9); \draw (0,2.5) node {\large\tt shell code}; \draw[line width=1mm,fill=black] (0.3, -1) rectangle (2,-0.7); \draw (-2, 3) node[anchor=north east] {\LARGE \color{codegreen}{``}}; \draw ( 2,-0.9) node[anchor=west] {\LARGE\color{codegreen}{''}}; \end{tikzpicture}\end{center}\noindent Then we can fill up the gray part of the string witha \pcode{NOP} operation. The code for this operation is\code{\\0x90}. It is available on every architecture and itspurpose it to to nothing apart from waiting a small amount oftime. If we now use an address that lets us jump to anyaddress in the gray area we are done. The target machine will execute these \pcode{NOP} operations until it reaches theshellcode. A moment of thought can convince you that thistrick can hugely improve our odds of finding the right address---depending on the size of the buffer, it mightonly take a few tries to get the shellcode to run.\bigskip\bigskip\subsubsection*{A Crash-Course for GDB}\begin{itemize}\item \texttt{(l)ist n} -- listing the source file from line \texttt{n}\item \texttt{disassemble fun-name}\item \texttt{run args} -- starts the program, potential arguments can be given\item \texttt{(b)reak line-number} -- set break point\item \texttt{(c)ontinue} -- continue execution until next breakpoint in a line number\item \texttt{x/nxw addr} -- print out \texttt{n} words starting from address \pcode{addr}, the address could be \code{$esp} for looking at the content of the stack\item \texttt{x/nxb addr} -- print out \texttt{n} bytes \end{itemize}\bigskip\bigskip \noindent If you want to know more aboutbuffer overflow attacks, the original Phrack article``Smashing The Stack For Fun And Profit'' by Elias Levy (alsoknown as Aleph One) is an engaging read:\begin{center}\url{http://phrack.org/issues/49/14.html}\end{center} \noindent This is an article from 1996 and some parts arenot up-to-date anymore. The article called``Smashing the Stack in 2010''\begin{center}\url{http://www.mgraziano.info/docs/stsi2010.pdf}\end{center}\noindent updates, as the name says, most information to 2010.\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: