sen-material: comparison handouts/ho03.tex

-:40efc28963af
+:b784175a69dc
 \begin{document}
 \section*{Handout 3 (Buffer Overflow Attacks)}
 By far the most popular attack method on computers are buffer
-overflow attacks or variations thereof. The popularity is
+overflow attacks or variations thereof. The first Internet
-unfortunate because we nowadays have technology in place to
+worm (Morris) exploited exactly such an attack. The popularity
+is unfortunate because we nowadays have technology in place to
 prevent them effectively. But these kinds of attacks are still
 very relevant even today since there are many legacy systems
 out there and also many modern embedded systems often do not
 take any precautions to prevent such attacks.
 \lstinputlisting[language=C]{../progs/C3.c}
 \caption{Overwriting a buffer with a string containing a
 payload.\label{C3}}
 \end{figure}
+By the way, you might wonder how attackers find out about
+vulnerable systems. Well, the automated approach uses
+\emph{fuzzers}, which throw randomly generated user input at
+applications and observe their behaviour. If an application
+seg-faults (crashes with a segmentation fault) then this is a
+good indication that a buffer overflow vulnerability can be
+exploited.
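The fuzzing idea can be sketched in a few lines of C. This is only a minimal illustration and not a real fuzzer: it assumes a POSIX system (\texttt{fork} and \texttt{waitpid}), and the function \texttt{target} is a hypothetical stand-in for the application under test. To keep the sketch well-defined, \texttt{target} simulates the crash with \texttt{raise(SIGSEGV)} instead of actually overflowing its buffer.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BUF_SIZE 16

/* Hypothetical program under test: it "crashes" on every input that
   would not fit into its 16-byte buffer. The crash is simulated with
   raise() so that the sketch itself never overflows anything. */
void target(const char *input) {
    char buf[BUF_SIZE];
    if (strlen(input) >= sizeof buf)
        raise(SIGSEGV);            /* stands in for a real overflow */
    strcpy(buf, input);            /* safe here: length checked above */
    (void)buf;
}

/* Runs target on input in a child process. Returns 1 if the child
   was killed by a signal (for example a segmentation fault) and
   0 otherwise. */
int crashes(const char *input) {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {                /* child: run the target and exit */
        target(input);
        _exit(0);
    }
    int status = 0;
    waitpid(pid, &status, 0);      /* parent: observe the behaviour */
    return WIFSIGNALED(status) ? 1 : 0;
}

/* Throws `rounds` random lower-case strings of length 0..63 at the
   target and returns how many of them caused a crash. */
int fuzz(unsigned seed, int rounds) {
    char input[64];
    int found = 0;
    srand(seed);
    for (int i = 0; i < rounds; i++) {
        size_t len = (size_t)rand() % sizeof input;
        for (size_t j = 0; j < len; j++)
            input[j] = (char)('a' + rand() % 26);
        input[len] = '\0';
        found += crashes(input);
    }
    return found;
}
```

Calling \texttt{fuzz(42, 50)}, for instance, tries 50 random inputs; every input of 16 or more characters is reported as a crash. A real fuzzer would start the actual binary and keep the crashing inputs for later inspection.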
 \subsubsection*{Format String Attacks}
-A question might arise, where do we get all this information
+Another question might arise, where do we get all this
-about addresses necessary for mounting a buffer overflow
+information about addresses necessary for mounting a buffer
-attack without having yet access to the system? The answer are
+overflow attack without yet having access to the system? The
-\emph{format string attacks}. While technically they are
+answer is \emph{format string attacks}. While technically
-programming mistakes (and they are pointed out as warning by
+they are programming mistakes (and they are pointed out as
-modern compilers), they can be easily made and therefore an
+warnings by modern compilers), they can be easily made and
-easy target. Let us look at the simplest version of a
+are therefore an easy target. Let us look at the simplest version
-vulnerable program.
+of a vulnerable program.
 \lstinputlisting[language=C]{../progs/C4.c}
 \noindent The intention is to print out the first argument
 given on the command line. The ``secret string'' is never to
 avoiding the obvious buffer overflow attack.
 \subsubsection*{Caveats and Defences}
 How can we defend against these attacks? Well, a reflex could
-be to blame programmers. Precautions should be taken that
+be to blame programmers. Precautions should be taken so that
 buffers cannot be overfilled and format strings should not
-be forgotten.
+be forgotten. This might actually be slightly simpler nowadays
+since safe versions of the library functions exist, which
-\bigskip\bigskip
+always specify the precise number of bytes that should be
-\subsubsection*{A Crash-Course for GDB}
+copied. Compilers also nowadays provide warnings when format
+strings are omitted. So proper education of programmers is
-\begin{itemize}
+definitely a part of a defence against such attacks. However,
-\item \texttt{(l)ist n} -- listing the source file from line
+if we leave it at that, then we have the mess we have today
-\texttt{n}
+with new attacks discovered almost daily.
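As a hedged illustration of both precautions, the following sketch copies a string with an explicit size limit and passes user input only as an argument to \texttt{"\%s"}, never as the format string itself. The helper name \texttt{copy\_bounded} is made up for this example; \texttt{snprintf} is the standard C99 function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies src into dst, writing at most dst_size bytes including the
   terminating '\0'. The user-supplied string is passed as an argument
   to "%s" and is never used as the format string itself.
   Returns 1 if src had to be truncated and 0 otherwise. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed < 0 || (size_t)needed >= dst_size;
}
```

With an 8-byte buffer, a longer input is cut off after 7 characters instead of overwriting neighbouring memory, and an input such as \texttt{\%x \%x \%s} is copied literally rather than interpreted as formatting directives.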
-\item \texttt{disassemble fun-name}
-\item \texttt{run args} -- starts the program, potential
+There is actually a quite long record of publications
-arguments can be given
+proposing defences against buffer overflow attacks. One method
-\item \texttt{(b)reak line-number} -- set break point
+is to declare the stack data as not executable. In this way it
-\item \texttt{(c)ontinue} -- continue execution until next
+is impossible to inject a payload as shown above which is then
-breakpoint in a line number
+executed once the stack is smashed. But this needs hardware
+support which allows one to declare certain memory regions to
-\item \texttt{x/nxw addr} -- print out \texttt{n} words starting
+be not executable. Such a feature was not introduced before
-from address \pcode{addr}, the address could be \code{$esp}
+the Intel 386, for example. Also if you have a JIT
-for looking at the content of the stack
+(just-in-time) compiler it might be advantageous to have
-\item \texttt{x/nxb addr} -- print out \texttt{n} bytes
+the stack containing executable data. So it is definitely a
-\end{itemize}
+trade-off.
+Anyway attackers have found ways around this defence: they
-\bigskip\bigskip \noindent If you want to know more about
+developed \emph{return-to-lib-C} attacks. The idea is to not
-buffer overflow attacks, the original Phrack article
+inject code, but to reuse code that is already present on the
-``Smashing The Stack For Fun And Profit'' by Elias Levy (also
+target computer. The lib-C library, for example, already
-known as Aleph One) is an engaging read:
+contains the code for spawning a shell. With
+\emph{return-to-lib-C} one just has to find out where this
+code is located. But attackers can make good guesses. In my
+examples I took a shortcut and always made the stack
+executable.
+Another defence is called \emph{stack canaries}. The advantage
+is that they can be automatically inserted into compiled code
+and do not need any hardware support, though they will make
+your program run slightly slower. The idea behind \emph{stack
+canaries} is to push a random number onto the stack just
+before local data is stored. For our very first function the
+stack with a \emph{stack canary} would look as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.65]
+%\draw[step=1cm] (-3,-1) grid (3,8);
+\draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1);
+\draw[line width=1mm] (-1,-1.2) -- (-1,7.4);
+\draw[line width=1mm] ( 1,-1.2) -- ( 1,7.4);
+\draw (0,-1) node[anchor=south] {\tt main};
+\draw[line width=1mm] (-1,0) -- (1,0);
+\draw (0,0) node[anchor=south] {\tt arg$_3$=3};
+\draw[line width=1mm] (-1,1) -- (1,1);
+\draw (0,1) node[anchor=south] {\tt arg$_2$=2};
+\draw[line width=1mm] (-1,2) -- (1,2);
+\draw (0,2) node[anchor=south] {\tt arg$_1$=1};
+\draw[line width=1mm] (-1,3) -- (1,3);
+\draw (0,3.1) node[anchor=south] {\tt ret};
+\draw[line width=1mm] (-1,4) -- (1,4);
+\draw (0,4) node[anchor=south] {\small\tt last sp};
+\draw[line width=1mm] (-1,5) -- (1,5);
+\draw (0,5.1) node[anchor=south] {\tt\small\textcolor{red}{\textbf{random}}};
+\draw[line width=1mm] (-1,6) -- (1,6);
+\draw (0,6) node[anchor=south] {\tt buf};
+\draw[line width=1mm] (-1,7) -- (1,7);
+\end{tikzpicture}
+\end{center}
+\noindent The idea behind this random number is that when the
+function finishes, it is checked that this random number is
+still intact on the stack. If not, then a buffer overflow has
+occurred. Although this is quite effective, it requires
+suitable support for generating random numbers. This is always
+hard to get right and attackers are happy to exploit the
+resulting weaknesses.
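The check can be imitated in plain C. The following is only a simulation under stated assumptions: the struct \texttt{frame} and the function \texttt{copy\_and\_check} are hypothetical, the ``stack frame'' is an ordinary struct, and the secret is passed in as a parameter, whereas a real compiler inserts the check itself and the value is chosen at random when the program starts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulated stack frame: the canary sits directly after the buffer,
   like the "random" cell next to buf in the picture above. */
struct frame {
    char     buf[8];
    uint32_t canary;
};

/* Copies len bytes into the frame without checking them against the
   size of buf (the programming mistake), then performs the check
   that would run just before the function returns.
   Returns 1 if the canary is intact and 0 if it was overwritten. */
int copy_and_check(const char *input, size_t len, uint32_t secret) {
    struct frame f;
    f.canary = secret;
    assert(len <= sizeof f);       /* keeps the simulation in bounds */
    memcpy(&f, input, len);        /* may run past buf into canary */
    return f.canary == secret;
}
```

Copying 4 bytes leaves the canary untouched, whereas copying 12 bytes of \texttt{A}s overwrites it with \texttt{0x41414141}, so the check fails unless the secret happens to be exactly that value. This is why the quality of the random number matters.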
+Another defence is \emph{address space randomisation}. This
+defence tries to make it harder for an attacker to guess
+addresses where code is stored. It turns out that addresses
+where code is stored are rather predictable. Randomising the
+place where programs are stored mitigates this problem
+somewhat.
+As mentioned before, modern operating systems have these
+defences enabled by default and make buffer overflow attacks
+harder, but not impossible. Indeed, I as an amateur attacker
+had to explicitly switch off these defences. I ran my examples
+under the Ubuntu version ``Maverick Meerkat'' from October
+2010 with gcc 4.4.5. I have not tried whether newer versions
+would work as well. I tested all examples inside a virtual
+box\footnote{https://www.virtualbox.org} isolating my main
+system from any harm. When compiling the programs I called
+the compiler with the following options:
+\begin{center}
+\begin{tabular}{l@{\hspace{1mm}}l}
+\pcode{/usr/bin/gcc} & \pcode{-ggdb -O0}\\
+& \pcode{-fno-stack-protector}\\
+& \pcode{-mpreferred-stack-boundary=2}\\
+& \pcode{-z execstack}
+\end{tabular}
+\end{center}
+\noindent The first two are innocent as they instruct the
+compiler to include debugging information and also produce
+non-optimised code (the latter makes the generated code a
+bit more predictable). The third is important as it switches
+off defences like the stack canaries. The fourth again makes it
+a bit easier to read the code. The final option makes the
+stack executable, thus the example in Figure~\ref{C3}
+works as intended. While this might be considered
+cheating, since I explicitly switched off all defences, I
+hope I was able to convey that this is actually not too far
+from realistic scenarios. I have shown you the classic version
+of the buffer overflow attacks. Updated variants do exist.
+Also one might argue that buffer-overflow attacks have been
+solved on computers (desktops or servers), but the computing
+landscape is nowadays wider than ever. The main problem
+today is embedded systems, against which attackers can
+equally cause a lot of harm and which are much less well
+defended. Anthony Bonkoski makes a similar argument in his
+security blog:
+\begin{center}
+\url{http://jabsoft.io/2013/09/25/are-buffer-overflows-solved-yet-a-historical-tale/}
+\end{center}
+There is one more rather effective defence against buffer
+overflow attacks: Why not use a safe language? Java at its
+inception was touted as a safe language because it hides
+all explicit memory management from the user. This definitely
+incurs a runtime penalty, but for bog-standard user-input
+processing applications, speed is not of the essence
+anymore. There are of course also many other programming
+languages that are safe, i.e.~immune to buffer overflow
+attacks.
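To give a rough idea of where the runtime penalty comes from, here is a sketch in C of the kind of check a safe language performs implicitly on every array access. The function name \texttt{checked\_store} is hypothetical; in Java the failing case would raise an \texttt{ArrayIndexOutOfBoundsException} instead of returning an error code.

```c
#include <assert.h>
#include <stddef.h>

/* Stores value at buf[index] only if index lies within the buffer.
   Returns 0 on success and -1 if the access would be out of bounds. */
int checked_store(char *buf, size_t size, size_t index, char value) {
    assert(buf != NULL);
    if (index >= size)
        return -1;                 /* refuse to write past the end */
    buf[index] = value;
    return 0;
}
```

For a 4-byte buffer, a store at index 3 succeeds while a store at index 4 is rejected, so the neighbouring memory is never touched.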
+\bigskip
+\noindent If you want to know more about buffer overflow
+attacks, the original Phrack article ``Smashing The Stack For
+Fun And Profit'' by Elias Levy (also known as Aleph One) is an
+engaging read:
 \begin{center}
 \url{http://phrack.org/issues/49/14.html}
 \end{center}
 \begin{center}
 \url{http://www.mgraziano.info/docs/stsi2010.pdf}
 \end{center}
 \noindent updates, as the name says, most information to 2010.
+There are also sources for buffer overflow attack in
-\end{document}
+\subsubsection*{A Crash-Course for GDB}
+If you want to try out the examples from KEATS it might be
+helpful to know about the following commands of the GNU
+Debugger:
+\begin{itemize}
+\item \texttt{(l)ist n} -- lists the source file from line
+\texttt{n}, the number can be omitted
+\item \texttt{disassemble fun-name} -- shows the assembly code
+of a function
+\item \texttt{run args} -- starts the program, potential
+arguments can be given
+\item \texttt{(b)reak line-number} -- sets a breakpoint
+\item \texttt{(c)ontinue} -- continues execution until the next
+breakpoint
+\item \texttt{x/nxw addr} -- prints out \texttt{n} words starting
+from address \pcode{addr}, the address could be \code{$esp}
+for looking at the content of the stack
+\item \texttt{x/nxb addr} -- prints out \texttt{n} bytes
+\end{itemize}
+\bigskip\bigskip \noindent \end{document}
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: t
 %%% End:

changeset 237	b784175a69dc
parent 236	40efc28963af
child 238	6ba55ba5b588