458 \lstinputlisting[language=C]{../progs/C3.c} |
459 \lstinputlisting[language=C]{../progs/C3.c} |
459 \caption{Overwriting a buffer with a string containing a |
460 \caption{Overwriting a buffer with a string containing a |
460 payload.\label{C3}} |
461 payload.\label{C3}} |
461 \end{figure} |
462 \end{figure} |
462 |
463 |
|
464 By the way, you might ask: how do attackers find |
|
465 out about vulnerable systems? Well, the automated version uses |
|
466 \emph{fuzzers}, which throw randomly generated user input at |
|
467 applications and observe the behaviour. If an application |
|
468 seg-faults (crashes with a segmentation fault), then this is a good |
|
469 indication that a buffer overflow vulnerability can be |
|
470 exploited. |
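
\noindent The fuzzing loop described above can be sketched in a few lines of C. This is only a minimal illustration, not how production fuzzers work: the functions \texttt{target}, \texttt{crashes} and \texttt{fuzz} are hypothetical names, and the target raises \texttt{SIGSEGV} itself for inputs longer than its buffer, as a stand-in for a real overflow (which would be undefined behaviour). Each input is tried in a child process so that a crash does not take the fuzzer down with it.

```c
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical target: stands in for a program with an 8-byte buffer.
   Instead of really overflowing (undefined behaviour), it raises
   SIGSEGV itself whenever the input would not fit. */
void target(const char *input) {
    char buf[8];
    if (strlen(input) >= sizeof buf) {
        raise(SIGSEGV);
        return;
    }
    strcpy(buf, input);
}

/* Runs the target on one input in a child process.
   Returns 1 if the child was killed by a signal, 0 if it exited
   normally, and -1 if fork or waitpid failed. */
int crashes(const char *input) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);   /* make sure the signal is fatal */
        target(input);
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}

/* Throws `rounds` random strings of length 0..max_len at the target.
   Returns the length of the first crashing input, or -1 if none. */
int fuzz(unsigned seed, int rounds, size_t max_len) {
    char *input = malloc(max_len + 1);
    if (input == NULL) return -1;
    srand(seed);
    for (int i = 0; i < rounds; i++) {
        size_t len = (size_t)rand() % (max_len + 1);
        for (size_t j = 0; j < len; j++)
            input[j] = (char)('A' + rand() % 26);
        input[len] = '\0';
        if (crashes(input) == 1) {
            free(input);
            return (int)len;
        }
    }
    free(input);
    return -1;
}
```

\noindent A call such as \texttt{fuzz(1, 200, 32)} reports the length of the first random input that killed the child. Real fuzzers are far more sophisticated about how they generate inputs, but the observe-the-crash idea is the same.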
|
471 |
|
472 |
463 \subsubsection*{Format String Attacks} |
473 \subsubsection*{Format String Attacks} |
464 |
474 |
465 A question might arise, where do we get all this information |
475 Another question might arise, where do we get all this |
466 about addresses necessary for mounting a buffer overflow |
476 information about addresses necessary for mounting a buffer |
467 attack without yet having access to the system? The answer is |
477 overflow attack without yet having access to the system? The |
468 \emph{format string attacks}. While technically they are |
478 answer is \emph{format string attacks}. While technically |
469 programming mistakes (and they are pointed out as warnings by |
479 they are programming mistakes (and they are pointed out as |
470 modern compilers), they are easily made and are therefore an |
480 warnings by modern compilers), they are easily made and are |
471 easy target. Let us look at the simplest version of a |
481 therefore an easy target. Let us look at the simplest version |
472 vulnerable program. |
482 of a vulnerable program. |
473 |
483 |
474 \lstinputlisting[language=C]{../progs/C4.c} |
484 \lstinputlisting[language=C]{../progs/C4.c} |
475 |
485 |
476 \noindent The intention is to print out the first argument |
486 \noindent The intention is to print out the first argument |
477 given on the command line. The ``secret string'' is never to |
487 given on the command line. The ``secret string'' is never to |
524 avoiding the obvious buffer overflow attack. |
534 avoiding the obvious buffer overflow attack. |
525 |
535 |
526 \subsubsection*{Caveats and Defences} |
536 \subsubsection*{Caveats and Defences} |
527 |
537 |
528 How can we defend against these attacks? Well, a reflex could |
538 How can we defend against these attacks? Well, a reflex could |
529 be to blame programmers. Precautions should be taken that |
539 be to blame programmers. Precautions should be taken so that |
530 buffers cannot be overfilled and format strings should not |
540 buffers cannot be overfilled and format strings should not |
531 be forgotten. |
541 be forgotten. This might actually be slightly simpler nowadays |
532 |
542 since safe versions of the library functions exist, which |
533 \bigskip\bigskip |
543 always specify the precise number of bytes that should be |
534 \subsubsection*{A Crash-Course for GDB} |
544 copied. Compilers also nowadays provide warnings when format |
535 |
545 strings are omitted. So proper education of programmers is |
536 \begin{itemize} |
546 definitely a part of a defence against such attacks. However, |
537 \item \texttt{(l)ist n} -- listing the source file from line |
547 if we leave it at that, then we have the mess we have today |
538 \texttt{n} |
548 with new attacks discovered almost daily. |
539 \item \texttt{disassemble fun-name} |
549 |
540 \item \texttt{run args} -- starts the program, potential |
550 There is actually a quite long record of publications |
541 arguments can be given |
551 proposing defences against buffer overflow attacks. One method |
542 \item \texttt{(b)reak line-number} -- set break point |
552 is to declare the stack data as not executable. In this way it |
543 \item \texttt{(c)ontinue} -- continue execution until next |
553 is impossible to inject a payload as shown above which is then |
544 breakpoint in a line number |
554 executed once the stack is smashed. But this needs hardware |
545 |
555 support which allows one to declare certain memory regions to |
546 \item \texttt{x/nxw addr} -- print out \texttt{n} words starting |
556 be not executable. Such a feature was not introduced before |
547 from address \pcode{addr}, the address could be \code{$esp} |
557 the Intel 386, for example. Also if you have a JIT |
548 for looking at the content of the stack |
558 (just-in-time) compiler it might be advantageous to have |
549 \item \texttt{x/nxb addr} -- print out \texttt{n} bytes |
559 the stack containing executable data. So it is definitely a |
550 \end{itemize} |
560 trade-off. |
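
\noindent To make the earlier remark about safe library functions concrete, here is a minimal sketch of a bounded copy. The helper name \texttt{bounded\_copy} is hypothetical. It relies on the standard \texttt{snprintf}, which never writes more than the given number of bytes, and it passes the user-supplied string as an argument to an explicit \texttt{\%s} format rather than as the format string itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Copies src into dst, which has room for cap bytes (including the
   terminating NUL). Returns 1 if src fitted completely and 0 if it
   had to be truncated or an error occurred. */
int bounded_copy(char *dst, size_t cap, const char *src) {
    /* The format string is fixed; src is only ever treated as data. */
    int n = snprintf(dst, cap, "%s", src);
    return n >= 0 && (size_t)n < cap;
}
```

\noindent With an 8-byte destination, a string like \texttt{"short"} is copied in full, while a longer one is cut off after seven characters and the function returns 0, so the caller can decide how to react instead of silently overwriting the stack.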
551 |
561 |
552 |
562 Anyway, attackers have found ways around this defence: they |
553 \bigskip\bigskip \noindent If you want to know more about |
563 developed \emph{return-to-lib-C} attacks. The idea is to not |
554 buffer overflow attacks, the original Phrack article |
564 inject code, but to use the code that is already present on the |
555 ``Smashing The Stack For Fun And Profit'' by Elias Levy (also |
565 target computer. The lib-C library, for example, already |
556 known as Aleph One) is an engaging read: |
566 contains the code for spawning a shell. With |
|
567 \emph{return-to-lib-C} one just has to find out where this |
|
568 code is located. But attackers can make good guesses. In my |
|
569 examples I took a shortcut and always made the stack |
|
570 executable. |
|
571 |
|
572 Another defence is called \emph{stack canaries}. The advantage |
|
573 is that they can be automatically inserted into compiled code |
|
574 and do not need any hardware support, though they will make |
|
575 your program run slightly slower. The idea behind \emph{stack |
|
576 canaries} is to push a random number onto the stack just |
|
577 before local data is stored. For our very first function the |
|
578 stack would, with a \emph{stack canary}, look as follows: |
|
579 |
|
580 \begin{center} |
|
581 \begin{tikzpicture}[scale=0.65] |
|
582 %\draw[step=1cm] (-3,-1) grid (3,8); |
|
583 \draw[gray!20,fill=gray!20] (-1, 0) rectangle (1,-1); |
|
584 \draw[line width=1mm] (-1,-1.2) -- (-1,7.4); |
|
585 \draw[line width=1mm] ( 1,-1.2) -- ( 1,7.4); |
|
586 \draw (0,-1) node[anchor=south] {\tt main}; |
|
587 \draw[line width=1mm] (-1,0) -- (1,0); |
|
588 \draw (0,0) node[anchor=south] {\tt arg$_3$=3}; |
|
589 \draw[line width=1mm] (-1,1) -- (1,1); |
|
590 \draw (0,1) node[anchor=south] {\tt arg$_2$=2}; |
|
591 \draw[line width=1mm] (-1,2) -- (1,2); |
|
592 \draw (0,2) node[anchor=south] {\tt arg$_1$=1}; |
|
593 \draw[line width=1mm] (-1,3) -- (1,3); |
|
594 \draw (0,3.1) node[anchor=south] {\tt ret}; |
|
595 \draw[line width=1mm] (-1,4) -- (1,4); |
|
596 \draw (0,4) node[anchor=south] {\small\tt last sp}; |
|
597 \draw[line width=1mm] (-1,5) -- (1,5); |
|
598 \draw (0,5.1) node[anchor=south] {\tt\small\textcolor{red}{\textbf{random}}}; |
|
599 \draw[line width=1mm] (-1,6) -- (1,6); |
|
600 \draw (0,6) node[anchor=south] {\tt buf}; |
|
601 \draw[line width=1mm] (-1,7) -- (1,7); |
|
602 \end{tikzpicture} |
|
603 \end{center} |
|
604 |
|
605 \noindent The idea behind this random number is that when the |
|
606 function finishes, it is checked that this random number is |
|
607 still intact on the stack. If not, then a buffer overflow has |
|
608 occurred. This is quite effective, but it requires |
|
609 suitable support for generating random numbers. This is always |
|
610 hard to get right and attackers are happy to exploit the |
|
611 resulting weaknesses. |
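
\noindent The check can be modelled by hand in C. The following is only a sketch of the idea, not the code a compiler actually inserts: the struct \texttt{frame} and the function \texttt{copy\_and\_check} are hypothetical, and the ``stack frame'' is an ordinary struct so that the overwrite stays within defined behaviour.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a stack frame: the buffer is followed
   directly by the canary, as in the picture above. */
struct frame {
    char     buf[8];
    uint32_t canary;
};

/* Copies n bytes of input into the frame, starting at buf and
   without respecting the size of buf (this models the overflow).
   Returns 1 if the canary is still intact afterwards, 0 otherwise. */
int copy_and_check(const char *input, size_t n, uint32_t secret) {
    struct frame f;
    f.canary = secret;               /* "push" the random number     */
    if (n > sizeof f)                /* keep the model inside f      */
        n = sizeof f;
    memcpy((char *)&f, input, n);    /* may run past buf into canary */
    return f.canary == secret;       /* check before "returning"     */
}
```

\noindent An input of 8 bytes leaves the canary untouched, whereas 12 bytes of \texttt{A}s overwrite it with \texttt{0x41414141} and the check fails. In a real program the secret would have to be unpredictable, which is exactly the random-number problem mentioned above.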
|
612 |
|
613 Another defence is \emph{address space randomisation}. This |
|
614 defence tries to make it harder for an attacker to guess |
|
615 addresses where code is stored. It turns out that addresses |
|
616 where code is stored are rather predictable. Randomising the |
|
617 place where programs are stored mitigates this problem |
|
618 somewhat. |
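
\noindent One way to observe this is to let a program print some of its own addresses. The sketch below assumes a POSIX-like system where function pointers can be converted to integers; \texttt{print\_addresses} and \texttt{global\_data} are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

int global_data = 0;

/* Prints the address of this function (code), of a global variable
   (data) and of a local variable (stack) to the given stream.
   Returns the number of characters written, or a negative value
   if printing failed. */
int print_addresses(FILE *out) {
    int local = 0;
    return fprintf(out, "code  %p\ndata  %p\nstack %p\n",
                   (void *)(uintptr_t)&print_addresses,
                   (void *)&global_data,
                   (void *)&local);
}
```

\noindent Calling \texttt{print\_addresses(stdout)} from \texttt{main} and running the program twice typically shows a different stack address on each run when randomisation is switched on, and the same one when it is switched off. Whether the code and data addresses change as well depends on whether the program was built as a position-independent executable.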
|
619 |
|
620 As mentioned before, modern operating systems have these |
|
621 defences enabled by default and make buffer overflow attacks |
|
622 harder, but not impossible. Indeed, I as an amateur attacker |
|
623 had to explicitly switch off these defences. I ran my examples |

624 under the Ubuntu version ``Maverick Meerkat'' from October |

625 2010 with gcc 4.4.5. I have not tried whether newer versions |
|
626 would work as well. I tested all examples inside a virtual |
|
627 box\footnote{\url{https://www.virtualbox.org}} insulating my main |
|
628 system from any harm. When compiling the programs I called |
|
629 the compiler with the following options: |
|
630 |
|
631 \begin{center} |
|
632 \begin{tabular}{l@{\hspace{1mm}}l} |
|
633 \pcode{/usr/bin/gcc} & \pcode{-ggdb -O0}\\ |
|
634 & \pcode{-fno-stack-protector}\\ |
|
635 & \pcode{-mpreferred-stack-boundary=2}\\ |
|
636 & \pcode{-z execstack} |
|
637 \end{tabular} |
|
638 \end{center} |
|
639 |
|
640 \noindent The first two are innocent as they instruct the |
|
641 compiler to include debugging information and also produce |
|
642 non-optimised code (the latter makes the output of the code a |
|
643 bit more predictable). The third is important as it switches |
|
644 off defences like the stack canaries. The fourth again makes it |
|
645 a bit easier to read the code. The final option makes the |
|
646 stack executable, thus the example in Figure~\ref{C3} |
|
647 works as intended. While this might be considered |
|
648 cheating, since I explicitly switched off all defences, I |

649 hope I was able to convey that this is actually not too far |
|
650 from realistic scenarios. I have shown you the classic version |
|
651 of the buffer overflow attacks. Updated variants do exist. |
|
652 Also one might argue buffer-overflow attacks have been |
|
653 solved on computers (desktops or servers) but the computing |
|
654 landscape nowadays is wider than ever. The main problem |

655 nowadays is embedded systems, against which attackers can |
|
656 equally cause a lot of harm and which are much less defended |
|
657 against. Anthony Bonkoski makes a similar argument in his |
|
658 security blog: |
|
659 |
|
660 \begin{center} |
|
661 \url{http://jabsoft.io/2013/09/25/are-buffer-overflows-solved-yet-a-historical-tale/} |
|
662 \end{center} |
|
663 |
|
664 |
|
665 There is one more rather effective defence against buffer |
|
666 overflow attacks: Why not use a safe language? Java at its |
|
667 inception was touted as a safe language because it hides |
|
668 all explicit memory management from the user. This definitely |
|
669 incurs a runtime penalty, but for bog-standard user-input |
|
670 processing applications, speed is not of such an essence |
|
671 anymore. There are of course also many other programming |
|
672 languages that are safe, i.e.~immune to buffer overflow |
|
673 attacks. |
|
674 \bigskip |
|
675 |
|
676 \noindent If you want to know more about buffer overflow |
|
677 attacks, the original Phrack article ``Smashing The Stack For |
|
678 Fun And Profit'' by Elias Levy (also known as Aleph One) is an |
|
679 engaging read: |
557 |
680 |
558 \begin{center} |
681 \begin{center} |
559 \url{http://phrack.org/issues/49/14.html} |
682 \url{http://phrack.org/issues/49/14.html} |
560 \end{center} |
683 \end{center} |
561 |
684 |