121 function \code{foo} with three arguments. \code{Foo} creates |
121 function \code{foo} with three arguments. \code{Foo} creates |
122 two (local) buffers, but does not do anything interesting with |
122 two (local) buffers, but does not do anything interesting with |
123 them. The only purpose of this program is to illustrate what |
123 them. The only purpose of this program is to illustrate what |
124 happens behind the scenes with the stack. The interesting |
124 happens behind the scenes with the stack. The interesting |
125 question is what will the stack look like after Line 3 has |
125 question is what will the stack look like after Line 3 has |
126 been executed? The answer can be illustrated as follows: |
126 been executed? The answer is illustrated in Figure~\ref{stack}. |
127 |
127 |
|
128 \begin{figure} |
128 \begin{center} |
129 \begin{center} |
129 \begin{tikzpicture}[scale=0.65] |
130 \begin{tikzpicture}[scale=0.65] |
130 \draw[gray!20,fill=gray!20] (-5, 0) rectangle (-3,-1); |
131 \draw[gray!20,fill=gray!20] (-5, 0) rectangle (-3,-1); |
131 \draw[line width=1mm] (-5,-1.2) -- (-5,0.2); |
132 \draw[line width=1mm] (-5,-1.2) -- (-5,0.2); |
132 \draw[line width=1mm] (-3,-1.2) -- (-3,0.2); |
133 \draw[line width=1mm] (-3,-1.2) -- (-3,0.2); |
162 |
163 |
163 \draw[->,line width=0.5mm] (1,4.5) -- (1.8,4.5) -- (1.8, 0) -- (1.1,0); |
164 \draw[->,line width=0.5mm] (1,4.5) -- (1.8,4.5) -- (1.8, 0) -- (1.1,0); |
164 \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5); |
165 \draw[->,line width=0.5mm] (1,3.5) -- (2.5,3.5); |
165 \draw (2.6,3.1) node[anchor=south west] {\tt back to main()}; |
166 \draw (2.6,3.1) node[anchor=south west] {\tt back to main()}; |
166 \end{tikzpicture} |
167 \end{tikzpicture} |
167 \end{center} |
168 \end{center} |
168 |
169 \caption{The stack layout for a program where the main |
169 \noindent On the left is the stack before \code{foo} is |
170 function calls an auxiliary function with three arguments |
170 called; on the right is the stack after \code{foo} finishes. |
171 (1, 2 and 3). The auxiliary function has two local |
171 The function call to \code{foo} in Line 7 pushes the arguments |
172 buffer variables {\tt buf}$_1$ and {\tt buf}$_2$.\label{stack}} |
172 onto the stack in reverse order---shown in the middle. |
173 \end{figure} |
173 Therefore first 3 then 2 and finally 1. Then it pushes the |
174 |
174 return address onto the stack where execution should resume |
175 On the left is the stack before \code{foo} is called; on the |
175 once \code{foo} has finished. The last stack pointer |
176 right is the stack after \code{foo} finishes. The function |
|
177 call to \code{foo} in Line 7 (in the C program above) pushes |
|
178 the arguments onto the stack in reverse order---shown in the |
|
179 middle. That is, first 3, then 2 and finally 1. Then it pushes |
|
180 the return address onto the stack where execution should |
|
181 resume once \code{foo} has finished. The last stack pointer |
176 (\code{sp}) is needed in order to clean up the stack to the |
182 (\code{sp}) is needed in order to clean up the stack to the |
177 last level---in fact there is no cleaning involved, but just |
183 last level---in fact there is no cleaning involved, but just |
178 the top of the stack will be set back to this address. So the |
184 the top of the stack will be set back to this address. So the |
179 last stack pointer also needs to be stored. The two buffers |
185 last stack pointer also needs to be stored. The two buffers |
180 inside \pcode{foo} are on the stack too, because they are |
186 inside \pcode{foo} are on the stack too, because they are |
181 local data within \code{foo}. Consequently the stack in the |
187 local data within \code{foo}. Consequently the stack in the |
182 middle is a snapshot after Line 3 has been executed. In case |
188 middle of Figure~\ref{stack} is a snapshot after Line 3 has |
183 you are familiar with assembly instructions you can also read |
189 been executed. |
184 off this behaviour from the machine code that the \code{gcc} |
190 |
185 compiler generates for the program above:\footnote{You can |
191 In case you are familiar with assembly instructions you can |
186 make \pcode{gcc} generate assembly instructions if you call it |
192 also read off this behaviour from the machine code that the |
187 with the \pcode{-S} option, for example \pcode{gcc -S out |
193 \code{gcc} compiler generates for the program |
188 in.c}\;. Or you can look at this code by using the debugger. |
194 above:\footnote{You can make \pcode{gcc} generate assembly |
189 How to do this will be explained in the last section.} |
195 instructions if you call it with the \pcode{-S} option, for |
|
196 example \pcode{gcc -S -o out in.c}\;. Or you can look at this |
|
197 code by using the debugger. How to do this will be explained |
|
198 in the last section.} It generates the following code for the |
|
199 \pcode{main} and \pcode{foo} functions. |
190 |
200 |
191 \begin{center}\small |
201 \begin{center}\small |
192 \begin{tabular}[t]{p{11cm}} |
202 \begin{tabular}[t]{p{11cm}} |
193 {\lstinputlisting[language={[x86masm]Assembler}, |
203 {\lstinputlisting[language={[x86masm]Assembler}, |
194 morekeywords={movl},xleftmargin=5mm] |
204 morekeywords={movl},xleftmargin=5mm] |
195 {../progs/example1a.s}} |
205 {../progs/example1a.s}} |
196 \end{tabular} |
206 \end{tabular} |
197 \end{center} |
207 \end{center} |
|
208 |
|
209 \noindent Again you can see how the function \pcode{main} |
|
210 prepares in Lines 2 to 7 the stack before calling the function |
|
211 \pcode{foo}. You can see that the numbers 3, 2, 1 are stored |
|
212 on the stack (the register \code{\%esp} refers to the top of |
|
213 the stack; \pcode{$0x1}, \pcode{$0x2} and \pcode{$0x3} are the |
|
214 hexadecimal encodings for \pcode{1} to \pcode{3}). The code |
|
215 for the \pcode{foo} function is as follows: |
|
216 |
198 \begin{center}\small |
217 \begin{center}\small |
199 \begin{tabular}[t]{p{11cm}} |
218 \begin{tabular}[t]{p{11cm}} |
200 {\lstinputlisting[language={[x86masm]Assembler}, |
219 {\lstinputlisting[language={[x86masm]Assembler}, |
201 morekeywords={movl,movw},xleftmargin=5mm] |
220 morekeywords={movl,movw},xleftmargin=5mm] |
202 {../progs/example1b.s}} |
221 {../progs/example1b.s}} |
203 \end{tabular} |
222 \end{tabular} |
204 \end{center} |
223 \end{center} |
205 |
224 |
206 \noindent On the left you can see how the function |
225 \noindent You can see how the function \pcode{foo} stores |
207 \pcode{main} prepares in Lines 2 to 7 the stack before calling |
226 first the last stack pointer onto the stack and then |
208 the function \pcode{foo}. You can see that the numbers 3, 2, 1 |
227 calculates the new stack pointer to have enough space for the |
209 are stored on the stack (the register \code{$esp} refers to |
228 two local buffers (Lines 2 to 4). Then it puts the two local |
210 the top of the stack; \pcode{$0x1}, \pcode{$0x2} \pcode{$0x3} |
|
211 are the encodings for \pcode{1} to \pcode{3}). On the right |
|
212 you can see how the function \pcode{foo} stores the two local |
|
213 buffers onto the stack and initialises them with the given |
229 buffers onto the stack and initialises them with the given |
214 data (Lines 2 to 9). Since there is no real computation going |
230 data (Lines 5 to 9). Since there is no real computation going |
215 on inside \pcode{foo}, the function then just restores the |
231 on inside \pcode{foo}, the function then just restores the |
216 stack to its old state and crucially sets the return address |
232 stack to its old state (Line 10) and crucially sets the return |
217 where the computation should resume (Line 9 in the code on the |
233 address where the computation should resume (Line 10). The |
218 right-hand side). The instruction \code{ret} then transfers |
234 instruction \code{ret} then transfers control back to the |
219 control back to the function \pcode{main} to the |
235 function \pcode{main} to the instruction just after the call |
220 instruction just after the call to \pcode{foo}, that is Line |
236 to \pcode{foo}, that is Line 10. |
221 9. |
|
222 |
237 |
223 Another part of the ``conspiracy'' of buffer overflow attacks |
238 Another part of the ``conspiracy'' of buffer overflow attacks |
224 is that library functions in C typically look as follows: |
239 is that library functions in C typically look as follows: |
225 |
240 |
226 \begin{center} |
241 \begin{center} |
249 corresponding stack of such a function will look as follows |
264 corresponding stack of such a function will look as follows |
250 |
265 |
251 \begin{center} |
266 \begin{center} |
252 \begin{tikzpicture}[scale=0.65] |
267 \begin{tikzpicture}[scale=0.65] |
253 %\draw[step=1cm] (-3,-1) grid (3,8); |
268 %\draw[step=1cm] (-3,-1) grid (3,8); |
254 \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1); |
269 \draw[line width=1mm] (-1,1.2) -- (-1,6.4); |
255 \draw[line width=1mm] (-1,-1.2) -- (-1,6.4); |
270 \draw[line width=1mm] ( 1,1.2) -- ( 1,6.4); |
256 \draw[line width=1mm] ( 1,-1.2) -- ( 1,6.4); |
271 \draw (0,2) node[anchor=south] {\ldots}; |
257 \draw (0,-1) node[anchor=south] {\tt main}; |
|
258 \draw[line width=1mm] (-1,0) -- (1,0); |
|
259 \draw (0,0) node[anchor=south] {\tt arg$_3$=3}; |
|
260 \draw[line width=1mm] (-1,1) -- (1,1); |
|
261 \draw (0,1) node[anchor=south] {\tt arg$_2$=2}; |
|
262 \draw[line width=1mm] (-1,2) -- (1,2); |
|
263 \draw (0,2) node[anchor=south] {\tt arg$_1$=1}; |
|
264 \draw[line width=1mm] (-1,3) -- (1,3); |
272 \draw[line width=1mm] (-1,3) -- (1,3); |
265 \draw (0,3.1) node[anchor=south] {\tt ret}; |
273 \draw (0,3.1) node[anchor=south] {\tt ret}; |
266 \draw[line width=1mm] (-1,4) -- (1,4); |
274 \draw[line width=1mm] (-1,4) -- (1,4); |
267 \draw (0,4) node[anchor=south] {\small\tt last sp}; |
275 \draw (0,4) node[anchor=south] {\small\tt last sp}; |
268 \draw[line width=1mm] (-1,5) -- (1,5); |
276 \draw[line width=1mm] (-1,5) -- (1,5); |
299 that the string internally will automatically be terminated by |
307 that the string internally will automatically be terminated by |
300 a zero-byte. If the programmer uses functions like |
308 a zero-byte. If the programmer uses functions like |
301 \pcode{strcpy} for filling the buffer \pcode{buf}, then we can |
309 \pcode{strcpy} for filling the buffer \pcode{buf}, then we can |
302 be sure it will overwrite the stack in this manner---since it |
310 be sure it will overwrite the stack in this manner---since it |
303 will copy everything up to the zero-byte. Notice that this |
311 will copy everything up to the zero-byte. Notice that this |
304 overwriting of the buffer only works since the newer item, the |
312 overwriting of the buffer only works since the newer |
305 buffer, is stored on the stack before the older items, like |
313 item---the buffer---is stored on the stack before the older |
306 return address and arguments. If it had be the other way |
314 items, like return address and arguments. If it had been the |
307 around, then such an overwriting by overflowing a local buffer |
315 other way around, then such an overwriting by overflowing a |
308 would just not work. Had the designers of C had just been able |
316 local buffer would just not work. Had the designers of C |
309 to foresee what headaches their way of arranging the stack |
317 been able to foresee what headaches their way of |
310 caused in the time where computers are accessible from |
318 arranging the stack would cause, how different might |
311 everywhere? |
319 the IT world be today? |
312 |
320 |
313 What the outcome of such an attack is can be illustrated with |
321 What the outcome of such an attack is can be illustrated with |
314 the code shown in Figure~\ref{C2}. Under ``normal operation'' |
322 the code shown in Figure~\ref{C2}. Under ``normal operation'' |
315 this program asks for a login-name and a password, both of |
323 this program asks for a login-name and a password, both of |
316 which are stored in \code{char} buffers of length 8. The |
324 which are stored in \code{char} buffers of length 8. The |
381 Unfortunately, much more harm can be caused by buffer overflow |
391 Unfortunately, much more harm can be caused by buffer overflow |
382 attacks. This is achieved by injecting code that will be run |
392 attacks. This is achieved by injecting code that will be run |
383 once the return address is appropriately modified. Typically |
393 once the return address is appropriately modified. Typically |
384 the code that will be injected starts a shell. This gives the |
394 the code that will be injected starts a shell. This gives the |
385 attacker the ability to run programs on the target machine and |
395 attacker the ability to run programs on the target machine and |
386 to have a good look around, provided the attacked process was not |
396 to have a good look around in order to also obtain full root |
387 already running as root.\footnote{In that case the attacker |
397 access (normally the program that is attacked would run with |
388 would already congratulate him or herself to another |
398 lesser rights and any shell injected would also only run with |
389 computer under full control.} In order to be send as part of |
399 these lesser access rights). If the attacked program was |
390 the string that is overflowing the buffer, we need the code to |
400 already running as root, then the attacker can congratulate |
391 be represented as a sequence of characters. For example |
401 himself or herself on another computer under full control\ldots |
|
402 no more work to be done. |
|
403 |
|
404 In order to be sent as part of the string that is overflowing |
|
405 the buffer, we need the code for starting the shell to be |
|
406 represented as a sequence of characters. For example |
392 |
407 |
393 \lstinputlisting[language=C,numbers=none]{../progs/o1.c} |
408 \lstinputlisting[language=C,numbers=none]{../progs/o1.c} |
394 |
409 |
395 \noindent These characters represent the machine code for |
410 \noindent These characters represent the machine code for |
396 opening a shell. It seems obtaining such a string requires |
411 opening a shell. It seems obtaining such a string requires |
397 ``higher-education'' in the architecture of the target system. But |
412 ``higher-education'' in the architecture of the target system. |
398 it is actually relatively simple: First there are many such |
413 But it is actually relatively simple: First there are many |
399 string ready-made---just a quick Google query away. Second, |
414 such strings ready-made---just a quick Google query away. |
400 tools like the debugger can help us again. We can just write |
415 Second, tools like the debugger can help us again. We can just |
401 the code we want in C, for example this would be the program |
416 write the code we want in C, for example this would be the |
402 for starting a shell: |
417 program for starting a shell: |
403 |
418 |
404 \lstinputlisting[language=C,numbers=none]{../progs/shell.c} |
419 \lstinputlisting[language=C,numbers=none]{../progs/shell.c} |
405 |
420 |
406 \noindent Once compiled, we can use the debugger to obtain |
421 \noindent Once compiled, we can use the debugger to obtain |
407 the machine code, or even the ready-made encoding as character |
422 the machine code, or even the ready-made encoding as character |
408 sequence. |
423 sequence. |
409 |
424 |
410 While easy, obtaining this string is not entirely trivial |
425 While not too difficult, obtaining this string is not entirely |
411 using \pcode{gdb}. Remember the functions in C that copy or |
426 trivial using \pcode{gdb}. Remember the functions in C that |
412 fill buffers work such that they copy everything until the |
427 copy or fill buffers work such that they copy everything until |
413 zero byte is reached. Unfortunately the ``vanilla'' output |
428 the zero byte is reached. Unfortunately the ``vanilla'' output |
414 from the debugger for the shell-program above will contain |
429 from the debugger for the shell-program above will contain |
415 such zero bytes. So a post-processing phase is needed to |
430 such zero bytes. So a post-processing phase is needed to |
416 rewrite the machine code in a way that it does not contain any |
431 rewrite the machine code in a way that it does not contain any |
417 zero bytes. This is like some works of literature that have |
432 zero bytes. This is like some works of literature that have |
418 been written so that the letter e, for example, is avoided. |
433 been written so that the letter e, for example, is avoided. |
419 The technical term for such a literary work is |
434 The technical term for such a literary work is |
420 \emph{lipogram}.\footnote{The most famous example of a |
435 \emph{lipogram}.\footnote{The most famous example of a |
421 lipogram is a 50,000-word novel titled \emph{Gadsby}, see |
436 lipogram is a 50,000-word novel titled \emph{Gadsby}, see |
422 \url{https://archive.org/details/Gadsby}, which avoids the |
437 \url{https://archive.org/details/Gadsby}, which avoids the |
423 letter `e' throughout.} For rewriting the |
438 letter `e' throughout.} For rewriting the machine code, you |
424 machine code, you might need to use clever tricks like |
439 might need to use clever tricks like |
425 |
440 |
426 \begin{lstlisting}[numbers=none,language={[x86masm]Assembler}] |
441 \begin{lstlisting}[numbers=none,language={[x86masm]Assembler}] |
427 xor %eax, %eax |
442 xor %eax, %eax |
428 \end{lstlisting} |
443 \end{lstlisting} |
429 |
444 |
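To make the zero-byte problem concrete, here is a minimal sketch in C. The helper name \pcode{first_zero_byte} and the two arrays are made up for this illustration and are not part of the programs above. The sketch assumes 32-bit x86, where moving the constant 0 into \pcode{\%eax} is encoded as the bytes b8 00 00 00 00, whereas \pcode{xor \%eax, \%eax} is encoded as 31 c0 and also sets the register to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative byte sequences (assumption: 32-bit x86 encodings). */
/* "move 0 into %eax": contains four zero bytes. */
static const unsigned char mov_zero[] = {0xb8, 0x00, 0x00, 0x00, 0x00};
/* "xor %eax, %eax": same effect on the register, no zero byte. */
static const unsigned char xor_zero[] = {0x31, 0xc0};

/* Return the index of the first zero byte in code[0..len), or len if
   there is none. A strcpy-style copy transfers exactly this many
   bytes before it stops. */
size_t first_zero_byte(const unsigned char *code, size_t len) {
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00) {
            return i;
        }
    }
    return len;
}
```

For \pcode{mov_zero} the function returns 1, so a \pcode{strcpy}-style copy would stop after a single byte; for \pcode{xor_zero} it returns the full length, so the sequence would be copied completely.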
451 \draw ( 2,-0.9) node[anchor=west] {\LARGE\color{codegreen}{''}}; |
466 \draw ( 2,-0.9) node[anchor=west] {\LARGE\color{codegreen}{''}}; |
452 \end{tikzpicture} |
467 \end{tikzpicture} |
453 \end{center} |
468 \end{center} |
454 |
469 |
455 \noindent where we need to be very precise with the address |
470 \noindent where we need to be very precise with the address |
456 with which we will overwrite the buffer. It has to be |
471 with which we will overwrite the buffer (indicated as a black |
457 precisely the first byte of the shellcode. While this is easy |
472 rectangle). It has to be precisely the first byte of the |
458 with the help of a debugger (as seen before), we typically |
473 shellcode. While this is easy with the help of a debugger (as |
459 cannot run anything, including a debugger, on the machine yet |
474 seen before), we typically cannot yet run anything, including a |
460 we target. And the address is very specific to the setup of |
475 debugger, on the machine we target. And the address is |
461 the target machine. One way of finding out what the right |
476 very specific to the setup of the target machine. One way of |
462 address is is to try out one by one every possible |
477 finding out what the right address is is to try out one by one |
463 address until we get lucky. With the large memories available |
478 every possible address until we get lucky. With the large |
464 today, however, the odds are long. And if we try out too many |
479 memories available today, however, the odds are long. And if |
465 possible candidates too quickly, we might be detected by the |
480 we try out too many possible candidates too quickly, we might |
466 system administrator of the target system. |
481 be detected by the system administrator of the target system. |
467 |
482 |
468 We can improve our odds considerably by following a clever |
483 We can improve our odds considerably by making use of a very |
469 trick. Instead of adding the shellcode at the beginning of the |
484 clever trick. Instead of adding the shellcode at the beginning |
470 string, we should add it at the end, just before we overflow |
485 of the string, we should add it at the end, just before we |
471 the buffer, for example |
486 overflow the buffer, for example |
472 |
487 |
473 \begin{center} |
488 \begin{center} |
474 \begin{tikzpicture}[scale=0.6] |
489 \begin{tikzpicture}[scale=0.6] |
475 \draw[gray!50,fill=gray!50] (-2,0.3) rectangle (2,3); |
490 \draw[gray!50,fill=gray!50] (-2,0.3) rectangle (2,3); |
476 \draw[line width=1mm] (-2, -1) rectangle (2,3); |
491 \draw[line width=1mm] (-2, -1) rectangle (2,3); |
487 \end{tikzpicture} |
502 \end{tikzpicture} |
488 \end{center} |
503 \end{center} |
489 |
504 |
490 \noindent Then we can fill up the grey part of the string with |
505 \noindent Then we can fill up the grey part of the string with |
491 \pcode{NOP} operations. The code for this operation is |
506 \pcode{NOP} operations. The code for this operation is |
492 \code{\\0x90}. It is available on every architecture and its |
507 \code{\\0x90} on Intel CPUs. A NOP is available on every |
493 purpose in a CPU is to do nothing apart from waiting a small |
508 architecture and its purpose in a CPU is to do nothing apart |
494 amount of time. If we now use an address that lets us jump to |
509 from waiting a small amount of time. If we now use an address |
495 any address in the grey area we are done. The target machine |
510 that lets us jump to any address in the grey area we are done. |
496 will execute these \pcode{NOP} operations until it reaches the |
511 The target machine will execute these \pcode{NOP} operations |
497 shellcode. That is why this NOP-part is often called |
512 until it reaches the shellcode. That is why this NOP-part is |
498 \emph{NOP-sledge}. A moment of thought should convince you |
513 often called \emph{NOP-sledge}. A moment of thought should |
499 that this trick can hugely improve our odds of finding the |
514 convince you that this trick can hugely improve our odds of |
500 right address---depending on the size of the buffer, it might |
515 finding the right address---depending on the size of the |
501 only take a few tries to get the shellcode to run. And then we |
516 buffer, it might only take a few tries to get the shellcode to |
502 are in. The code for such an attack is shown in |
517 run. And then we are in. The code for such an attack is shown |
503 Figure~\ref{C3}. It is directly taken from the original paper |
518 in Figure~\ref{C3}. It is directly taken from the original |
504 about ``Smashing the Stack for Fun and Profit'' (see pointer |
519 paper about ``Smashing the Stack for Fun and Profit'' (see |
505 given at the end). |
520 pointer given at the end). |
506 |
521 |
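The layout just described can be sketched in C as follows. This is only an illustration of how the string is arranged, not the code from Figure~\ref{C3}: the names \pcode{build_layout} and \pcode{guess_works} are hypothetical, the payload is whatever bytes the caller passes in, and a 4-byte address (32-bit Intel) is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP 0x90 /* one-byte no-op on Intel CPUs */

/* Fill out[0..total) as:  NOPs | code | address repeated.
   There must be room for at least one copy of the address.
   Returns the number of address copies written. */
size_t build_layout(unsigned char *out, size_t total, size_t sled_len,
                    const unsigned char *code, size_t code_len,
                    uint32_t addr) {
    assert(sled_len + code_len + sizeof addr <= total);
    memset(out, NOP, sled_len);               /* the grey NOP part */
    memcpy(out + sled_len, code, code_len);   /* payload at the end */
    size_t pos = sled_len + code_len;
    size_t copies = 0;
    while (pos + sizeof addr <= total) {      /* overwrite area */
        memcpy(out + pos, &addr, sizeof addr);
        pos += sizeof addr;
        copies++;
    }
    memset(out + pos, NOP, total - pos);      /* pad leftover bytes */
    return copies;
}

/* A guessed start offset leads to the payload if it hits any NOP of
   the sled (offsets 0..sled_len-1) or the first payload byte. */
int guess_works(size_t offset, size_t sled_len) {
    return offset <= sled_len;
}
```

With a sled of \pcode{sled_len} bytes there are \pcode{sled_len + 1} start offsets that lead to the payload instead of exactly one, which is where the improved odds come from.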
507 \begin{figure}[p] |
522 \begin{figure}[p] |
508 \lstinputlisting[language=C]{../progs/C3.c} |
523 \lstinputlisting[language=C]{../progs/C3.c} |
509 \caption{Overwriting a buffer with a string containing a |
524 \caption{Overwriting a buffer with a string containing a |
510 payload.\label{C3}} |
525 payload.\label{C3}} |
511 \end{figure} |
526 \end{figure} |
512 |
527 |
513 By the way you might have the question how do attackers find |
528 By the way, you might now have the question how do attackers |
514 out about vulnerable systems? Well, the automated version uses |
529 find out about vulnerable systems in the first place? Well, |
515 \emph{fuzzers}, which throw randomly generated user input at |
530 the automated version uses \emph{fuzzers}, which throw |
516 applications and observe the behaviour. If an application |
531 randomly generated user input at applications and observe the |
517 seg-faults (throws a segmentation error) then this is a good |
532 behaviour. If an application segfaults (throws a segmentation |
518 indication that a buffer overflow vulnerability can be |
533 error) then this is a good indication that a buffer overflow |
519 exploited. |
534 vulnerability can be exploited. |
520 |
535 |
521 |
536 |
522 \subsubsection*{Format String Attacks} |
537 \subsubsection*{Format String Attacks} |
523 |
538 |
524 Another question might arise, where do we get all this |
539 Another question might arise, where do we get all this |
586 |
601 |
587 How can we defend against these attacks? Well, a reflex could |
602 How can we defend against these attacks? Well, a reflex could |
588 be to blame programmers. Precautions should be taken by them |
603 be to blame programmers. Precautions should be taken by them |
589 so that buffers cannot be overfilled and format strings |
604 so that buffers cannot be overfilled and format strings |
590 should not be forgotten. This might actually be slightly |
605 should not be forgotten. This might actually be slightly |
591 simpler nowadays since safe versions of the library functions |
606 simpler to achieve by programmers nowadays since safe versions |
592 exist, which always specify the precise number of bytes that |
607 of the library functions exist, which always specify the |
593 should be copied. Compilers also nowadays provide warnings |
608 precise number of bytes that should be copied. Compilers also |
594 when format strings are omitted. So proper education of |
609 nowadays provide warnings when format strings are omitted. So |
595 programmers is definitely a part of a defence against such |
610 proper education of programmers is definitely a part of a |
596 attacks. However, if we leave it at that, then we have the |
611 defence against such attacks. However, if we leave it at that, |
597 mess we have today with new attacks discovered almost daily. |
612 then we have the mess we have today with new attacks |
|
613 discovered almost daily. |
598 |
614 |
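As a small illustration of such a safe version, the following sketch copies a string into a fixed-size buffer with \pcode{snprintf} from the C standard library, which takes the size of the destination as an argument. The wrapper name \pcode{copy_bounded} is made up for this example. Note that the format string is written out explicitly rather than passing the user input as the format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst, which has room for dst_size bytes including the
   terminating zero byte. Never writes past the end of dst.
   Returns 1 if src fitted completely and 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf writes at most dst_size bytes and returns the length
       the full string would have needed (without the zero byte). */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With a buffer of length 8, as in the login example, a name of five characters is copied in full, while a longer input is cut off after seven characters and the function reports the truncation instead of overwriting the stack.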
599 There is actually a quite long record of publications |
615 There is actually a quite long record of publications |
600 proposing defences against buffer overflow attacks. One method |
616 proposing defences against buffer overflow attacks. One method |
601 is to declare the stack data as not executable. In this way it |
617 is to declare the stack data as not executable. In this way it |
602 is impossible to inject a payload as shown above which is then |
618 is impossible to inject a payload as shown above which is then |
612 developed \emph{return-to-lib-C} attacks. The idea is to not |
628 developed \emph{return-to-lib-C} attacks. The idea is to not |
613 inject code, but already use the code that is present at the |
629 inject code, but already use the code that is present at the |
614 target computer. The lib-C library, for example, already |
630 target computer. The lib-C library, for example, already |
615 contains the code for spawning a shell. With |
631 contains the code for spawning a shell. With |
616 \emph{return-to-lib-C} one just has to find out where this |
632 \emph{return-to-lib-C} one just has to find out where this |
617 code is located. But attackers can make good guesses. In my |
633 code is located. But attackers can make good guesses. |
618 examples I took a shortcut and always made the stack |
634 |
619 executable. |
635 Another defence is called \emph{stack canaries}. The advantage |
620 |
|
621 Another defence is called \emph{stack canaries}. The advantage |
|
622 is that they can be automatically inserted into compiled code |
636 is that they can be automatically inserted into compiled code |
623 and do not need any hardware support. Though they will make |
637 and do not need any hardware support. Though they will make |
624 your program run slightly slower. The idea behind \emph{stack |
638 your program run slightly slower. The idea behind \emph{stack |
625 canaries} is to push a random number onto the stack just |
639 canaries} is to push a random number onto the stack just |
626 before local data is stored. For our very first function the |
640 before local data is stored. For our very first function |
627 stack would with a \emph{stack canary} look as follows |
641 \pcode{foo} the stack would with a \emph{stack canary} look as |
|
642 follows |
628 |
643 |
629 \begin{center} |
644 \begin{center} |
630 \begin{tikzpicture}[scale=0.65] |
645 \begin{tikzpicture}[scale=0.65] |
631 %\draw[step=1cm] (-3,-1) grid (3,8); |
646 %\draw[step=1cm] (-3,-1) grid (3,8); |
632 \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1); |
647 \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1); |
666 place where programs are stored mitigates this problem |
681 place where programs are stored mitigates this problem |
667 somewhat. |
682 somewhat. |
668 |
683 |
669 As mentioned before, modern operating systems have these |
684 As mentioned before, modern operating systems have these |
670 defences enabled by default and make buffer overflow attacks |
685 defences enabled by default and make buffer overflow attacks |
671 harder, but not impossible. Indeed, I as an amateur attacker |
686 harder, but not impossible. Indeed, I---as an amateur |
672 had to explicitly switch off these defences. I run my example |
687 attacker---had to explicitly switch off these defences. |
673 under an Ubuntu version ``Maverick Meerkat'' from October |
688 A real attacker would be more knowledgeable and not need this |
674 2010 and the gcc 4.4.5. I have not tried whether newer versions |
689 shortcut. |
675 would work as well. I tested all examples inside a virtual |
690 |
676 box\footnote{\url{https://www.virtualbox.org}} insulating my main |
691 To make my examples work, I ran them under the Ubuntu version ``Maverick |
677 system from any harm. When compiling the programs I called |
692 Meerkat'' from October 2010 with gcc 4.4.5. I have not |
678 the compiler with the following options: |
693 tried whether newer versions would work as well. I tested all |
|
694 examples inside a virtual |
|
695 box\footnote{\url{https://www.virtualbox.org}} insulating my |
|
696 main system from any harm. When compiling the programs I |
|
697 called the compiler with the following options: |
679 |
698 |
680 \begin{center} |
699 \begin{center} |
681 \begin{tabular}{l@{\hspace{1mm}}l} |
700 \begin{tabular}{l@{\hspace{1mm}}l} |
682 \pcode{/usr/bin/gcc} & \pcode{-ggdb -O0}\\ |
701 \pcode{/usr/bin/gcc} & \pcode{-ggdb -O0}\\ |
683 & \pcode{-fno-stack-protector}\\ |
702 & \pcode{-fno-stack-protector}\\ |
688 |
707 |
689 \noindent The first two are innocent as they instruct the |
708 \noindent The first two are innocent as they instruct the |
690 compiler to include debugging information and also produce |
709 compiler to include debugging information and also produce |
691 non-optimised code (the latter makes the output of the code a |
710 non-optimised code (the latter makes the output of the code a |
692 bit more predictable). The third is important as it switches |
711 bit more predictable). The third is important as it switches |
693 off defences like the stack canaries. The fourth again makes it |
712 off defences like the stack canaries. The fourth again makes |
694 a bit easier to read the code. The final option makes the |
713 it a bit easier to read the code. The final option makes the |
695 stack executable, thus the example in Figure~\ref{C3} |
714 stack executable, thus the example in Figure~\ref{C3} works as |
696 works as intended. While this might be considered |
715 intended. While this might be considered cheating, since I |
697 cheating....since I explicitly switched off all defences, I |
716 explicitly switched off all defences, I hope I was able to convey |
698 hope I was able convey the point that this is actually not too far from |
717 the point that this is actually not too far from realistic |
699 realistic scenarios. I have shown you the classic version of |
718 scenarios. I have shown you the classic version of the buffer |
700 the buffer overflow attacks. Updated variants do exist. Also |
719 overflow attacks. Updated and more advanced variants do exist. |
701 one might argue buffer-overflow attacks have been solved on |
720 |
702 computers (desktops or servers) but the computing landscape of today |
721 With the standard defences switched on, you might want to |
703 is much wider than that. The main problem today are |
722 argue buffer-overflow attacks have been solved on computers |
704 embedded systems against which attacker can equally cause a |
723 (desktops and servers) but the computing landscape of today is |
705 lot of harm and which are much less defended. Anthony Bonkoski |
724 much wider than that. The main problem today is embedded |
706 makes a similar argument in his security blog: |
725 systems against which attackers can equally cause a lot of harm |
|
726 and which are much less defended. Anthony Bonkoski makes a |
|
727 similar argument in his security blog: |
707 |
728 |
708 \begin{center} |
729 \begin{center} |
709 \url{http://jabsoft.io/2013/09/25/are-buffer-overflows-solved-yet-a-historical-tale/} |
730 \url{http://jabsoft.io/2013/09/25/are-buffer-overflows-solved-yet-a-historical-tale/} |
710 \end{center} |
731 \end{center} |
711 |
732 |