handouts/ho03.tex
changeset 206 0105257429f3
parent 204 8fe0dc898c73
child 209 fd43a9cd9c07
205:88416b7df38c 206:0105257429f3
     1 \documentclass{article}
     1 \documentclass{article}
     2 \usepackage{../style}
     2 \usepackage{../style}
     3 
     3 \usepackage{../langs}
     4 
     4 
     5 \begin{document}
     5 \begin{document}
     6 
     6 
     7 \section*{Handout 3 (Buffer Overflow Attacks)}
     7 \section*{Handout 3 (Buffer Overflow Attacks)}
     8 
     8 
    21 computer science students, but who said that criminal hackers
    21 computer science students, but who said that criminal hackers
    22 restrict themselves to everyday fare? Not to mention the
    22 restrict themselves to everyday fare? Not to mention the
    23 free-riding script-kiddies who use this technology without
    23 free-riding script-kiddies who use this technology without
     24 knowing the underlying ideas.
     24 knowing the underlying ideas.
    25  
    25  
       
    26 For buffer overflow attacks to work, a number of innocent
       
    27 design decisions, which are really benign on their own, need
       
    28 to conspire against you. All these decisions were pretty much
       
    29 taken in a time when there was no Internet: C was introduced
       
    30 around 1973, the Internet TCP/IP protocol was standardised in
       
    31 1982 by which time there were maybe 500 servers connected
       
    32 worldwide (all users were well-behaved), Intel's first 8086
       
     33 CPUs arrived in 1978. So none of the creators can 
       
    34 really be blamed, but as mentioned above we should already 
       
    35 be way beyond the point that buffer overflow attacks are
       
     36 worth a thought. Unfortunately this is far from the truth. I 
        
     37 let you think about why.
       
    38 
       
    39 One such ``benign'' design decision is how the memory is laid
       
    40 out into different regions for each process. 
    26  
    41  
    27 \bigskip
    42 \begin{center}
    28 For buffer overflow attacks to work a number of innocent
    43   \begin{tikzpicture}[scale=0.7]
    29 design decisions, which are benign on their own, need to 
    44   %\draw[step=1cm] (-3,-3) grid (3,3);
    30 conspire against you. One such design decision is how the 
    45   \draw[line width=1mm] (-2, -3) rectangle (2,3);
    31 memory is laid out for each process. 
    46   \draw[line width=1mm] (-2,1) -- (2,1);
       
    47   \draw[line width=1mm] (-2,-1) -- (2,-1);
       
    48   \draw (0,2) node {\large\tt text};
       
    49   \draw (0,0) node {\large\tt heap};
       
    50   \draw (0,-2) node {\large\tt stack};
       
    51 
       
    52   \draw (-2.7,3) node[anchor=north east] {\tt\begin{tabular}{@{}l@{}}lower\\ address\end{tabular}};
       
    53   \draw (-2.7,-3) node[anchor=south east] {\tt\begin{tabular}{@{}l@{}}higher\\ address\end{tabular}};
       
    54   \draw[->, line width=1mm] (-2.5,3) -- (-2.5,-3);
       
    55 
       
    56   \draw (2.7,-2) node[anchor=west] {\tt grows};
       
    57   \draw (2.7,-3) node[anchor=south west] {\tt\footnotesize older};
       
    58   \draw (2.7,-1) node[anchor=north west] {\tt\footnotesize newer};
       
    59   \draw[|->, line width=1mm] (2.5,-3) -- (2.5,-1);
       
    60   \end{tikzpicture}
       
    61 \end{center}
       
    62 
       
    63 \noindent The text region contains the program code (usually
       
    64 this region is read-only). The heap stores all data the
       
    65 programmer explicitly allocates. For us the most interesting
       
    66 region is the stack, which contains data mostly associated
       
    67 with the ``control flow'' of the program. Notice that the stack
       
     68 grows from higher addresses to lower addresses. That means 
        
     69 that older items on the stack are stored at higher addresses 
        
     70 than newer items. Let's take a closer look at what happens on
        
     71 the stack. Consider the following trivial C program:
       
    72  
       
    73 \lstinputlisting[language=C]{../progs/example1.c} 
       
    74  
       
     75 \noindent The main function calls \code{foo} with three
        
     76 arguments. \code{foo} contains two (local) buffers. The interesting
        
     77 question is what the stack looks like after Line 3 has been
        
     78 executed. The answer is as follows:
       
    79  
       
    80 \begin{center} 
       
    81  \begin{tikzpicture}[scale=0.65]
       
    82   \draw[gray!20,fill=gray!20] (-5, 0) rectangle (-3,-1);
       
    83   \draw[line width=1mm] (-5,-1.2) -- (-5,0.2);
       
    84   \draw[line width=1mm] (-3,-1.2) -- (-3,0.2);
       
    85   \draw (-4,-1) node[anchor=south] {\tt main};
       
    86   \draw[line width=1mm] (-5,0) -- (-3,0);
       
    87 
       
    88   \draw[gray!20,fill=gray!20] (3, 0) rectangle (5,-1);
       
    89   \draw[line width=1mm] (3,-1.2) -- (3,0.2);
       
    90   \draw[line width=1mm] (5,-1.2) -- (5,0.2);
       
    91   \draw (4,-1) node[anchor=south] {\tt main};
       
    92   \draw[line width=1mm] (3,0) -- (5,0);
       
    93  
       
    94    %\draw[step=1cm] (-3,-1) grid (3,8);
       
    95   \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1);
       
    96   \draw[line width=1mm] (-1,-1.2) -- (-1,7.4);
       
    97   \draw[line width=1mm] ( 1,-1.2) -- ( 1,7.4);
       
    98   \draw (0,-1) node[anchor=south] {\tt main};
       
    99   \draw[line width=1mm] (-1,0) -- (1,0);
       
   100   \draw (0,0) node[anchor=south] {\tt arg$_3$=3};
       
   101   \draw[line width=1mm] (-1,1) -- (1,1);
       
   102   \draw (0,1) node[anchor=south] {\tt arg$_2$=2};
       
   103   \draw[line width=1mm] (-1,2) -- (1,2);
       
   104   \draw (0,2) node[anchor=south] {\tt arg$_1$=1};
       
   105   \draw[line width=1mm] (-1,3) -- (1,3);
       
   106   \draw (0,3.1) node[anchor=south] {\tt ret};
       
   107   \draw[line width=1mm] (-1,4) -- (1,4);
       
   108   \draw (0,4) node[anchor=south] {\small\tt last sp};
       
   109   \draw[line width=1mm] (-1,5) -- (1,5);
       
   110   \draw (0,5) node[anchor=south] {\tt buf$_1$};
       
   111   \draw[line width=1mm] (-1,6) -- (1,6);
       
   112   \draw (0,6) node[anchor=south] {\tt buf$_2$};
       
   113   \draw[line width=1mm] (-1,7) -- (1,7);
       
   114 
       
   115   \draw[->,line width=0.5mm] (1,4.5) -- (1.8,4.5) -- (1.8, 0) -- (1.1,0); 
       
   116   \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5);
       
   117   \draw (2.6,3.1) node[anchor=south west] {\tt back to main()};
       
   118 \end{tikzpicture}
       
   119 \end{center} 
       
   120 
       
   121 \noindent On the left is the stack before \code{foo} is
       
   122 called; on the right is the stack after \code{foo} finishes.
       
   123 The function call to \code{foo} in Line 7 pushes the arguments
       
   124 onto the stack in reverse order---shown in the middle.
       
    125 That is, first 3, then 2 and finally 1. Then it pushes the
        
    126 return address onto the stack; this is where execution resumes once
        
    127 \code{foo} has finished. The last stack pointer (\code{sp}) is
       
   128 needed in order to clean up the stack to the last level---in
       
   129 fact there is no cleaning involved, but just the top of the
       
   130 stack will be set back. The two buffers are also on the stack,
       
   131 because they are local data within \code{foo}.
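
\noindent The layout described above can be observed with a small
experiment. The following is a minimal sketch and not the contents of
\code{example1.c}: the buffer sizes, the global \code{probe} and the
helpers \code{call_foo} and \code{print_probe} are assumptions made for
illustration. It records where the two buffers of \code{foo} and one
local variable of the caller live. The exact addresses and their order
depend on the compiler, the optimisation level and the platform; on many
64-bit systems the arguments are passed in registers rather than on the
stack.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Addresses observed during one call; stored as integers so they can be
   inspected after foo has returned. */
struct frame_probe {
    uintptr_t caller_local; /* a local variable of the calling function */
    uintptr_t buf1;         /* first local buffer of foo */
    uintptr_t buf2;         /* second local buffer of foo */
};

struct frame_probe probe;

/* Hypothetical stand-in for foo: three arguments, two local buffers.
   The buffer sizes are assumptions. */
int foo(int a, int b, int c)
{
    char buf1[8];
    char buf2[16];

    buf1[0] = '\0'; /* use the buffers so they are not trivially dead */
    buf2[0] = '\0';
    probe.buf1 = (uintptr_t)buf1;
    probe.buf2 = (uintptr_t)buf2;
    return a + b + c + buf1[0] + buf2[0];
}

/* Plays the role of main in the text: calls foo with the arguments 1, 2, 3. */
int call_foo(void)
{
    int marker = 0;

    probe.caller_local = (uintptr_t)&marker;
    return foo(1, 2, 3) + marker;
}

/* Prints the recorded addresses; on a downward-growing stack the caller's
   local typically has the highest address of the three. */
void print_probe(void)
{
    printf("caller local: 0x%" PRIxPTR "\n", probe.caller_local);
    printf("buf1:         0x%" PRIxPTR "\n", probe.buf1);
    printf("buf2:         0x%" PRIxPTR "\n", probe.buf2);
}
```

\noindent Calling \code{call_foo()} followed by \code{print_probe()}
prints the three addresses. Compiling without optimisation makes it more
likely that the output matches the picture, since the compiler is then
less inclined to inline \code{foo} or rearrange its locals.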
       
   132 
       
   133  
       
   134 Another part of the ``conspiracy'' is that library functions
       
    135 in C typically look as follows:
       
   136  
       
   137 \begin{center}
       
   138 \lstinputlisting[language=C,numbers=none]{../progs/app5.c}
       
   139 \end{center} 
       
   140 
       
   141 \noindent This function copies data from a source \pcode{src}
       
   142 to a destination \pcode{dst}. It copies the data until it 
       
   143 reaches a zero-byte (\code{"\\0"}). 
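
\noindent A routine of this kind might look as follows. This is a
minimal sketch and not necessarily the code in \code{app5.c}; the names
\code{copy_unbounded} and \code{copy_bounded} are hypothetical. The
first function copies up to and including the zero byte without ever
asking how large the destination is. The second is shown only for
contrast and takes the size of the destination as an extra argument.

```c
#include <stddef.h>

/* Copies bytes from src to dst up to and including the terminating zero
   byte. It has no way of knowing how much room dst provides. */
char *copy_unbounded(char *dst, const char *src)
{
    char *start = dst;

    while ((*dst++ = *src++) != '\0') {
        /* empty: the copy happens in the loop condition */
    }
    return start;
}

/* Contrast: copies at most dst_size - 1 bytes and always terminates dst.
   Returns the number of bytes copied, not counting the terminator. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;

    if (dst_size == 0) {
        return 0; /* no room even for the terminator */
    }
    while (i + 1 < dst_size && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;
}
```

\noindent With a destination of four bytes and the source string
\code{hello}, \code{copy_bounded} stores only \code{hel} plus the zero
byte, whereas \code{copy_unbounded} would write six bytes and run past
the end of the destination.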
       
   144 
       
   145 \bigskip\bigskip
       
   146 \subsubsection*{A Crash-Course on GDB}
       
   147 
       
   148 \begin{itemize}
       
    149 \item \texttt{(l)ist n} -- lists the source file starting from line 
        
    150 \texttt{n}
        
    151 \item \texttt{disassemble fun-name} -- shows the assembly code of function \texttt{fun-name}
        
    152 \item \texttt{run} -- starts the program
        
    153 \item \texttt{(b)reak line-number} -- sets a breakpoint at the given line
        
    154 \item \texttt{(c)ontinue} -- continues execution until the next 
        
    155 breakpoint
        
    156 
        
    157 \item \texttt{x/nxw addr} -- prints \texttt{n} words in hexadecimal, starting 
        
    158 at address \pcode{addr}; the address can be \code{$esp} 
        
    159 to look at the contents of the stack
        
    160 \item \texttt{x/nxb addr} -- prints \texttt{n} bytes in hexadecimal, starting at \pcode{addr}
       
   161 \end{itemize}
       
   162 
    32  
   163  
    33 \bigskip\bigskip \noindent If you want to know more about
   164 \bigskip\bigskip \noindent If you want to know more about
    34 buffer overflow attacks, the original Phrack article
   165 buffer overflow attacks, the original Phrack article
    35 ``Smashing The Stack For Fun And Profit'' by Elias Levy (also
   166 ``Smashing The Stack For Fun And Profit'' by Elias Levy (also
    36 known as Aleph One) is an engaging read:
   167 known as Aleph One) is an engaging read:
    37 
   168 
    38 \begin{center}
   169 \begin{center}
    39 \url{http://phrack.org/issues/49/14.html}
   170 \url{http://phrack.org/issues/49/14.html}
    40 \end{center} 
   171 \end{center} 
       
   172 
       
    173 \noindent This is an article from 1996 and some parts are
        
    174 no longer up to date. The article called
       
   175 ``Smashing the Stack in 2010''
       
   176 
       
   177 \begin{center}
       
   178 \url{http://www.mgraziano.info/docs/stsi2010.pdf}
       
   179 \end{center}
       
   180 
       
   181 \noindent updates, as the name says, most information to 2010.
    41  
   182  
    42 \end{document}
   183 \end{document}
    43 
   184 
    44 %%% Local Variables: 
   185 %%% Local Variables: 
    45 %%% mode: latex
   186 %%% mode: latex