21 computer science students, but who said that criminal hackers |
22 restrict themselves to everyday fare? Not to mention the |
23 free-riding script-kiddies who use this technology without |
24 knowing what the underlying ideas are. |
25 |
|
26 For buffer overflow attacks to work, a number of innocent |
|
27 design decisions, which are really benign on their own, need |
|
28 to conspire against you. All these decisions were pretty much |
|
29 taken at a time when there was no Internet: C was introduced |
|
30 around 1973, the Internet TCP/IP protocol was standardised in |
|
31 1982 by which time there were maybe 500 servers connected |
|
32 worldwide (all users were well-behaved), Intel's first 8086 |
|
33 CPUs arrived in 1978. So none of the creators can |
|
34 really be blamed, but as mentioned above we should already |
|
35 be way beyond the point that buffer overflow attacks are |
|
36 worth a thought. Unfortunately this is far from the truth. I |
|
37 will let you think about why. |
|
38 |
|
39 One such ``benign'' design decision is how the memory is laid |
|
40 out into different regions for each process. |
41 |
42 \begin{center} |
43 \begin{tikzpicture}[scale=0.7] |
44 %\draw[step=1cm] (-3,-3) grid (3,3); |
45 \draw[line width=1mm] (-2, -3) rectangle (2,3); |
46 \draw[line width=1mm] (-2,1) -- (2,1); |
|
47 \draw[line width=1mm] (-2,-1) -- (2,-1); |
|
48 \draw (0,2) node {\large\tt text}; |
|
49 \draw (0,0) node {\large\tt heap}; |
|
50 \draw (0,-2) node {\large\tt stack}; |
|
51 |
|
52 \draw (-2.7,3) node[anchor=north east] {\tt\begin{tabular}{@{}l@{}}lower\\ address\end{tabular}}; |
|
53 \draw (-2.7,-3) node[anchor=south east] {\tt\begin{tabular}{@{}l@{}}higher\\ address\end{tabular}}; |
|
54 \draw[->, line width=1mm] (-2.5,3) -- (-2.5,-3); |
|
55 |
|
56 \draw (2.7,-2) node[anchor=west] {\tt grows}; |
|
57 \draw (2.7,-3) node[anchor=south west] {\tt\footnotesize older}; |
|
58 \draw (2.7,-1) node[anchor=north west] {\tt\footnotesize newer}; |
|
59 \draw[|->, line width=1mm] (2.5,-3) -- (2.5,-1); |
|
60 \end{tikzpicture} |
|
61 \end{center} |
|
62 |
|
63 \noindent The text region contains the program code (usually |
|
64 this region is read-only). The heap stores all data the |
|
65 programmer explicitly allocates. For us the most interesting |
|
66 region is the stack, which contains data mostly associated |
|
67 with the ``control flow'' of the program. Notice that the stack |
|
68 grows from higher addresses to lower addresses. That means |
|
69 that older items on the stack are stored at higher addresses than newer |
|
70 items. Let's look a bit closer at what happens with the stack. |
|
71 Consider the following trivial C program. |
|
72 |
|
73 \lstinputlisting[language=C]{../progs/example1.c} |
|
74 |
|
75 \noindent The main function calls \code{foo} with three |
|
76 arguments. The function \code{foo} contains two (local) buffers. The interesting |
|
77 question is what the stack will look like after Line 3 has been |
|
78 executed. The answer is as follows: |
|
79 |
|
80 \begin{center} |
|
81 \begin{tikzpicture}[scale=0.65] |
|
82 \draw[gray!20,fill=gray!20] (-5, 0) rectangle (-3,-1); |
|
83 \draw[line width=1mm] (-5,-1.2) -- (-5,0.2); |
|
84 \draw[line width=1mm] (-3,-1.2) -- (-3,0.2); |
|
85 \draw (-4,-1) node[anchor=south] {\tt main}; |
|
86 \draw[line width=1mm] (-5,0) -- (-3,0); |
|
87 |
|
88 \draw[gray!20,fill=gray!20] (3, 0) rectangle (5,-1); |
|
89 \draw[line width=1mm] (3,-1.2) -- (3,0.2); |
|
90 \draw[line width=1mm] (5,-1.2) -- (5,0.2); |
|
91 \draw (4,-1) node[anchor=south] {\tt main}; |
|
92 \draw[line width=1mm] (3,0) -- (5,0); |
|
93 |
|
94 %\draw[step=1cm] (-3,-1) grid (3,8); |
|
95 \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1); |
|
96 \draw[line width=1mm] (-1,-1.2) -- (-1,7.4); |
|
97 \draw[line width=1mm] ( 1,-1.2) -- ( 1,7.4); |
|
98 \draw (0,-1) node[anchor=south] {\tt main}; |
|
99 \draw[line width=1mm] (-1,0) -- (1,0); |
|
100 \draw (0,0) node[anchor=south] {\tt arg$_3$=3}; |
|
101 \draw[line width=1mm] (-1,1) -- (1,1); |
|
102 \draw (0,1) node[anchor=south] {\tt arg$_2$=2}; |
|
103 \draw[line width=1mm] (-1,2) -- (1,2); |
|
104 \draw (0,2) node[anchor=south] {\tt arg$_1$=1}; |
|
105 \draw[line width=1mm] (-1,3) -- (1,3); |
|
106 \draw (0,3.1) node[anchor=south] {\tt ret}; |
|
107 \draw[line width=1mm] (-1,4) -- (1,4); |
|
108 \draw (0,4) node[anchor=south] {\small\tt last sp}; |
|
109 \draw[line width=1mm] (-1,5) -- (1,5); |
|
110 \draw (0,5) node[anchor=south] {\tt buf$_1$}; |
|
111 \draw[line width=1mm] (-1,6) -- (1,6); |
|
112 \draw (0,6) node[anchor=south] {\tt buf$_2$}; |
|
113 \draw[line width=1mm] (-1,7) -- (1,7); |
|
114 |
|
115 \draw[->,line width=0.5mm] (1,4.5) -- (1.8,4.5) -- (1.8, 0) -- (1.1,0); |
|
116 \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5); |
|
117 \draw (2.6,3.1) node[anchor=south west] {\tt back to main()}; |
|
118 \end{tikzpicture} |
|
119 \end{center} |
|
120 |
|
121 \noindent On the left is the stack before \code{foo} is |
|
122 called; on the right is the stack after \code{foo} finishes. |
|
123 The function call to \code{foo} in Line 7 pushes the arguments |
|
124 onto the stack in reverse order---shown in the middle. |
|
125 That is, first 3, then 2 and finally 1. Then it pushes the |
|
126 return address, where execution should resume once |
|
127 \code{foo} has finished, onto the stack. The last stack pointer (\code{sp}) is |
|
128 needed in order to reset the stack to its previous level. In |
|
129 fact there is no cleaning involved; the top of the |
|
130 stack is simply set back. The two buffers are also on the stack, |
|
131 because they are local data within \code{foo}. |
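
\noindent The claim that a callee's data sits at lower addresses than
its caller's can be checked with a small experiment. The following is
only a sketch under assumptions: the helper \texttt{callee\_is\_lower}
is hypothetical and not part of the program above, the
\texttt{noinline} attribute is a GCC/Clang extension, and the result
depends on the platform and compiler (on common x86 systems one would
expect the callee's frame at the lower address).

\begin{lstlisting}[language=C,numbers=none]
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if a local variable of this function
   lies at a lower address than the caller's local variable, else 0.
   noinline (a GCC/Clang extension) is assumed so that the function
   really gets its own stack frame. */
__attribute__((noinline))
static int callee_is_lower(uintptr_t caller_addr) {
    volatile char callee_local = 0;
    return (uintptr_t)&callee_local < caller_addr;
}

int main(void) {
    volatile char caller_local = 0;
    int lower = callee_is_lower((uintptr_t)&caller_local);
    /* The outcome is platform dependent, so no fixed output is claimed. */
    printf("callee frame at %s addresses\n", lower ? "lower" : "higher");
    return 0;
}
\end{lstlisting}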
|
132 |
|
133 |
|
134 Another part of the ``conspiracy'' is that library functions |
|
135 in C typically look as follows: |
|
136 |
|
137 \begin{center} |
|
138 \lstinputlisting[language=C,numbers=none]{../progs/app5.c} |
|
139 \end{center} |
|
140 |
|
141 \noindent This function copies data from a source \pcode{src} |
|
142 to a destination \pcode{dst}. It copies the data until it |
|
143 reaches a zero-byte (\code{"\\0"}). |
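
\noindent As an illustration of the behaviour just described, a copy
routine of this kind might be sketched as follows. The name
\texttt{copy\_until\_zero} is hypothetical and the code is an
assumption, not necessarily identical to the listing above. Note that
nothing in the loop checks whether the destination is large enough to
hold the source.

\begin{lstlisting}[language=C,numbers=none]
#include <stdio.h>

/* Hypothetical sketch: copies bytes from src to dst, up to and
   including the terminating zero-byte. There is no bounds check, so
   the caller must guarantee that dst is large enough. */
char *copy_until_zero(char *dst, const char *src) {
    char *start = dst;
    while ((*dst++ = *src++) != '\0') {
        /* the copy happens in the loop condition */
    }
    return start;
}

int main(void) {
    char buf[16];                    /* large enough for "hello" */
    copy_until_zero(buf, "hello");
    printf("%s\n", buf);             /* prints "hello" */
    return 0;
}
\end{lstlisting}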
|
144 |
|
145 \bigskip\bigskip |
|
146 \subsubsection*{A Crash-Course on GDB} |
|
147 |
|
148 \begin{itemize} |
|
149 \item \texttt{(l)ist n} -- lists the source file starting from line |
|
150 \texttt{n} |
|
151 \item \texttt{disassemble fun-name} -- shows the machine instructions of the function \texttt{fun-name} |
|
152 \item \texttt{run} -- starts the program |
|
153 \item \texttt{(b)reak line-number} -- sets a breakpoint at the given line |
|
154 \item \texttt{(c)ontinue} -- continues execution until the next |
|
155 breakpoint is reached |
|
156 |
|
157 \item \texttt{x/nxw addr} -- prints \texttt{n} words in hexadecimal starting |
|
158 from address \pcode{addr}; the address can be \code{$esp} |
|
159 to look at the content of the stack |
|
160 \item \texttt{x/nxb addr} -- prints \texttt{n} bytes in hexadecimal starting from address \pcode{addr} |
|
161 \end{itemize} |
|
162 |
163 |
164 \bigskip\bigskip \noindent If you want to know more about |
165 buffer overflow attacks, the original Phrack article |
166 ``Smashing The Stack For Fun And Profit'' by Elias Levy (also |
167 known as Aleph One) is an engaging read: |
168 |
169 \begin{center} |
170 \url{http://phrack.org/issues/49/14.html} |
171 \end{center} |
|
172 |
|
173 \noindent This is an article from 1996 and some parts are |
|
174 not up-to-date anymore. The article called |
|
175 ``Smashing the Stack in 2010'' |
|
176 |
|
177 \begin{center} |
|
178 \url{http://www.mgraziano.info/docs/stsi2010.pdf} |
|
179 \end{center} |
|
180 |
|
181 \noindent brings, as the name says, most of the information up to date as of 2010. |
182 |
183 \end{document} |
184 |
185 %%% Local Variables: |
186 %%% mode: latex |