handouts/ho03.tex
changeset 546 3d1f65e43065
parent 516 0fbfb0a86fa8
545:0697622fb181 546:3d1f65e43065
    39 very relevant even today since there are many legacy systems
    39 very relevant even today since there are many legacy systems
    40 out there and also many modern embedded systems often do not
    40 out there and also many modern embedded systems often do not
    41 take any precautions to prevent such attacks. The plot below
    41 take any precautions to prevent such attacks. The plot below
    42 shows the percentage of buffer overflow attacks listed in the
    42 shows the percentage of buffer overflow attacks listed in the
    43 US National Vulnerability Database.\footnote{Search for
    43 US National Vulnerability Database.\footnote{Search for
    44 ``Buffer errors'' at
    44 ``Buffer errors'' in the advanced search tab at
    45 \url{http://web.nvd.nist.gov/view/vuln/statistics}.}
    45 \url{http://web.nvd.nist.gov/view/vuln/statistics}.}
    46 
    46 
    47 \begin{center}
    47 \begin{center}
    48 \begin{tikzpicture}
    48 \begin{tikzpicture}
    49 \begin{axis}[
    49 \begin{axis}[
    50     xlabel={year},
    50     xlabel={year},
    51     ylabel={\% of total attacks},
    51     ylabel={\% of total attacks},
    52     ylabel style={yshift=-1em},
    52     ylabel style={yshift=-1em},
    53     enlargelimits=false,
    53     enlargelimits=false,
    54     xtick={1997,2000,2002,...,2016},
    54     xtick={1997,1999,2001,...,2017},
    55     xmin=1996.5,
    55     xmin=1996.5,
    56     xmax=2017,
    56     xmax=2018,
    57     ymax=21,
    57     ymax=21,
    58     ytick={0,5,...,20},
    58     ytick={0,5,...,20},
    59     scaled ticks=false,
    59     scaled ticks=false,
    60     axis lines=left,
    60     axis lines=left,
    61     width=12cm,
    61     width=12cm,
    74 \noindent These statistics show that in the last seven years or so the
    74 \noindent These statistics show that in the last seven years or so the
    75 share of buffer overflow attacks is around 10\% of all attacks
    75 share of buffer overflow attacks is around 10\% of all attacks
    76 (whereby the absolute numbers of attacks grow each year). So you can
    76 (whereby the absolute numbers of attacks grow each year). So you can
    77 see buffer overflow attacks are very relevant today. For example, very
    77 see buffer overflow attacks are very relevant today. For example, very
    78 recently (February 2016) a buffer overflow attack was discovered in the glibc
    78 recently (February 2016) a buffer overflow attack was discovered in the glibc
    79 library:\footnote{See \url{goo.gl/De2mA8}}
    79 library:\footnote{See \url{http://goo.gl/De2mA8}}
    80 
    80 
    81 \begin{quote}\it
    81 \begin{quote}\it
    82 ``Since 2008, vulnerability has left apps and hardware open to remote
    82 ``Since 2008, a vulnerability has left apps and hardware open to remote
    83   hijacking: Researchers have discovered a potentially catastrophic flaw in
    83   hijacking: Researchers have discovered a potentially catastrophic flaw in
    84   one of the Internet's core building blocks that leaves hundreds or
    84   one of the Internet's core building blocks that leaves hundreds or
    85   thousands of apps and hardware devices vulnerable to attacks that can take
    85   thousands of apps and hardware devices vulnerable to attacks that can take
    86   complete control over them.  The vulnerability was introduced in 2008 in
    86   complete control over them.  The vulnerability was introduced in 2008 in
    87   GNU C Library, a collection of open source code that powers thousands of
    87   GNU C Library, a collection of open source code that powers thousands of
   214 (1,2 and 3). The auxiliary function has two local
   214 (1,2 and 3). The auxiliary function has two local
   215 buffer variables {\tt buf}$_1$ and {\tt buf}$_2$.\label{stack}} 
   215 buffer variables {\tt buf}$_1$ and {\tt buf}$_2$.\label{stack}} 
   216 \end{figure}
   216 \end{figure}
   217 
   217 
   218 On the left is the stack before \code{foo} is called; on the
   218 On the left is the stack before \code{foo} is called; on the
   219 right is the stack after \code{foo} finishes. The function
   219 right is the stack after \code{foo} finishes. The stack looks
       
   220 the same before and after the call.
       
   221 The function
   220 call to \code{foo} in Line 7 (in the C program above) pushes
   222 call to \code{foo} in Line 7 (in the C program above) pushes
   221 the arguments onto the stack in reverse order---shown in the
   223 the arguments onto the stack in reverse order---shown in the
   222 middle. Therefore first 3 then 2 and finally 1. Then it pushes
   224 middle. Therefore first 3 then 2 and finally 1. Then it pushes
   223 the return address onto the stack where execution should
   225 the return address onto the stack where execution should
   224 resume once \code{foo} has finished. The last stack pointer
   226 resume once \code{foo} has finished. The last stack pointer
   263   morekeywords={movl,movw},xleftmargin=5mm]
   265   morekeywords={movl,movw},xleftmargin=5mm]
   264   {../progs/example1b.s}}  
   266   {../progs/example1b.s}}  
   265 \end{tabular}
   267 \end{tabular}
   266 \end{center}
   268 \end{center}
   267 
   269 
   268 \noindent You can see how the function \pcode{foo} stores
   270 \noindent The code for function \pcode{foo} stores
   269 first the last stack pointer onto the stack and then
   271 first the last stack pointer onto the stack and then
   270 calculates the new stack pointer to have enough space for the
   272 calculates the new stack pointer to have enough space for the
   271 two local buffers (Lines 2 to 4). Then it puts the two local
   273 two local buffers (Lines 2 to 4). Then it puts the two local
   272 buffers onto the stack and initialises them with the given
   274 buffers onto the stack and initialises them with the given
   273 data (Lines 5 to 9). Since there is no real computation going
   275 data (Lines 5 to 9). Since there is no real computation going
   507 will be sent to the target computer. This of course requires
   509 will be sent to the target computer. This of course requires
   508 that the buffer we are trying to attack can at least contain
   510 that the buffer we are trying to attack can at least contain
   509 the shellcode we want to run. But as you can see this is only
   511 the shellcode we want to run. But as you can see this is only
   510 47 bytes, which is a very low bar to jump over. Actually there
   512 47 bytes, which is a very low bar to jump over. Actually there
   511 are optimised versions which only need 24 bytes. More
   513 are optimised versions which only need 24 bytes. More
   512 formidable is the choice of finding the right address to jump
   514 formidable is the problem of finding the right address to jump
   513 to. The string is typically of the form
   515 to. The string is typically of the form
   514 
   516 
   515 \begin{center}
   517 \begin{center}
   516   \begin{tikzpicture}[scale=0.6]
   518   \begin{tikzpicture}[scale=0.6]
   517   \draw[line width=1mm] (-2, -1) rectangle (2,3);
   519   \draw[line width=1mm] (-2, -1) rectangle (2,3);
   530 rectangle). It has to be precisely the first byte of the
   532 rectangle). It has to be precisely the first byte of the
   531 shellcode. While this is easy with the help of a debugger (as
   533 shellcode. While this is easy with the help of a debugger (as
   532 seen before), we typically cannot run anything, including a
   534 seen before), we typically cannot run anything, including a
   533 debugger, on the machine we target yet. And the address is
   535 debugger, on the machine we target yet. And the address is
   534 very specific to the setup of the target machine. One way of
   536 very specific to the setup of the target machine. One way of
   535 finding out what the right address is is to try out one by one
   537 finding out what the right address is is to try out one-by-one
   536 every possible address until we get lucky. With the large
   538 every possible address until we get lucky. With the large
   537 memories available today, however, the odds are long. And if
   539 memories available today, however, the odds are long for this. And if
   538 we try out too many possible candidates too quickly, we might
   540 we try out too many possible candidates too quickly, we might
   539 be detected by the system administrator of the target system.
   541 be detected by the system administrator of the target system.
   540 
   542 
   541 We can improve our odds considerably by making use of a very
   543 We can improve our odds considerably by making use of a very
   542 clever trick. Instead of adding the shellcode at the beginning
   544 clever trick. Instead of adding the shellcode at the beginning
   659 
   661 
   660 \subsubsection*{Caveats and Defences}
   662 \subsubsection*{Caveats and Defences}
   661 
   663 
   662 How can we defend against these attacks? Well, a reflex could
   664 How can we defend against these attacks? Well, a reflex could
   663 be to blame programmers. Precautions should be taken by them
   665 be to blame programmers. Precautions should be taken by them
   664 so that buffers cannot been overfilled and format strings
   666 so that buffers cannot be overfilled and format strings
   665 should not be forgotten. This might actually be slightly
   667 should not be forgotten. This might actually be slightly
   666 simpler to achieve by programmers nowadays since safe versions
   668 simpler to achieve by programmers nowadays since safe versions
   667 of the library functions exist, which always specify the
   669 of the library functions exist, which always specify the
   668 precise number of bytes that should be copied. Compilers also
   670 precise number of bytes that should be copied. Compilers also
   669 nowadays provide warnings when format strings are omitted. So
   671 nowadays provide warnings when format strings are omitted. So
   746 harder, but not impossible. Indeed, I---as an amateur
   748 harder, but not impossible. Indeed, I---as an amateur
   747 attacker---had to explicitly switch off these defences. 
   749 attacker---had to explicitly switch off these defences. 
   748 A real attacker would be more knowledgeable and not need this
   750 A real attacker would be more knowledgeable and not need this
   749 shortcut.
   751 shortcut.
   750 
   752 
   751 To work I run my example under an Ubuntu version ``Maverick
   753 To explain BoAs, I run my examples under an Ubuntu version ``Maverick
   752 Meerkat'' from October 2010 and gcc 4.4.5. I have not
   754 Meerkat'' from October 2010 and gcc 4.4.5. I have not
   753 tried whether newer versions would work as well. I tested all
   755 tried whether newer versions would work as well. I tested all
   754 examples inside a virtual
   756 examples inside a virtual
   755 box\footnote{\url{https://www.virtualbox.org}} insulating my
   757 box\footnote{\url{https://www.virtualbox.org}} insulating my
   756 main system from any harm. When compiling the programs I
   758 main system from any harm. When compiling the programs I
   763                      & \pcode{-mpreferred-stack-boundary=2}\\
   765                      & \pcode{-mpreferred-stack-boundary=2}\\
   764                      & \pcode{-z execstack} 
   766                      & \pcode{-z execstack} 
   765 \end{tabular}
   767 \end{tabular}
   766 \end{center}
   768 \end{center}
   767 
   769 
   768 \noindent The first two are innocent as they instruct the
   770 \noindent The first two options are innocent as they instruct the
   769 compiler to include debugging information and also produce
   771 compiler to include debugging information and also produce
   770 non-optimised code (the latter makes the output of the code a
   772 non-optimised code (the latter makes the output of the code a
   771 bit more predictable). The third is important as it switches
   773 bit more predictable). The third is important as it switches
   772 off defences like the stack canaries. The fourth again makes
   774 off defences like the stack canaries. The fourth again makes
   773 it a bit easier to read the code. The final option makes the
   775 it a bit easier to read the code. The final option makes the
   774 stack executable, thus the example in Figure~\ref{C3} works as
   776 stack executable, thus the example in Figure~\ref{C3} works as
   775 intended. While this might be considered cheating....since I
   777 intended. While this might be considered a complete cheat, since I
   776 explicitly switched off all defences, I hope I was able convey
   778 explicitly switched off all defences, I hope I was able to convey
   777 the point that this is actually not too far from realistic
   779 the point that this is actually not too far from realistic
   778 scenarios. I have shown you the classic version of the buffer
   780 scenarios. I have shown you the classic version of the buffer
   779 overflow attacks. Updated and more advanced variants do exist.
   781 overflow attacks. Updated and more advanced variants do exist.
   780 
   782 
   781 With the standard defences switched on, you might want to
   783 With the standard defences switched on, you might want to