handouts/ho09.tex
changeset 346 5a6e8b7d20f7
parent 338 f1491e0d7be0
child 347 efad8155513f
equal deleted inserted replaced
345:7f0ac1355f0b 346:5a6e8b7d20f7
     1 \documentclass{article}
     1 \documentclass{article}
     2 \usepackage{../style}
     2 \usepackage{../style}
     3 \usepackage{../langs}
     3 \usepackage{../langs}
       
     4 \usepackage{../graphics}
       
     5 \usepackage{../grammar}
       
     6 \usepackage{multicol}
     4 
     7 
     5 \begin{document}
     8 \begin{document}
     6 
     9 
     7 \section*{Handout 9 (Static Analysis)}
    10 \section*{Handout 9 (Static Analysis)}
     8 
    11 
     9 If we want to improve the safety and security of our programs,
    12 If we want to improve the safety and security of our programs,
    10 we need a more principled approach to programming. Testing is
    13 we need a more principled approach to programming. Testing is
    11 good, but as Dijkstra famously said: ``Testing can only show
    14 good, but as Dijkstra famously said: 
    12 the presence of bugs, not their absence''. While such a more
    15 
    13 principled approach has been the subject of intense study for
    16 \begin{quote}\it 
    14 a long time, only in the past few years some impressive
    17 ``Program testing can be a very effective way to show the
    15 results have been achieved. One is the complete formalisation
    18 \underline{\smash{presence}} of bugs, but it is hopelessly
    16 and (mathematical) verification of a microkernel operating
    19 inadequate for showing their \underline{\smash{absence}}.''
    17 system called seL4.
    20 \end{quote}
       
    21 
       
    22 \noindent While such a more principled approach has been the
       
    23 subject of intense study for a long, long time, only in the
       
    24 past few years some impressive results have been achieved. One
       
    25 is the complete formalisation and (mathematical) verification
       
    26 of a microkernel operating system called seL4.
    18 
    27 
    19 \begin{center}
    28 \begin{center}
    20 \url{http://sel4.systems}
    29 \url{http://sel4.systems}
    21 \end{center}
    30 \end{center}
    22 
    31 
    23 \noindent In 2011, this work was included in the MIT
    32 \noindent This work was in 2011 included in the MIT Technology
    24 Technology Review in the annual list of the world’s ten most
    33 Review in the annual list of the world’s ten most important
    25 important emerging
    34 emerging
    26 technologies.\footnote{\url{http://www2.technologyreview.com/tr10/?year=2011}}
    35 technologies.\footnote{\url{http://www2.technologyreview.com/tr10/?year=2011}}
    27 While this work is impressive, its technical details are too
    36 While this work is impressive, its technical details are too
    28 enormous for explanation here. Therefore let us look at
    37 enormous for an explanation here. Therefore let us look at
    29 something much simpler, namely finding out properties about
    38 something much simpler, namely finding out properties about
    30 programs using \emph{static analysis}.
    39 programs using \emph{static analysis}.
    31 
    40 
    32 Static analysis is one technique that checks properties
    41 Static analysis is one technique that checks properties of a
    33 of a program without actually running the program. This
    42 program without actually running the program. This should
    34 should raise alarm bells---the reason that almost all 
    43 raise alarm bells with you---because almost all interesting
    35 interesting properties about programs are equivalent to
    44 properties about programs are equivalent to the halting
    36 the halting problem, which we know is undecidable. 
    45 problem, which we know is undecidable. For example estimating
       
    46 the memory consumption of programs is in general undecidable,
       
    47 just like the halting problem. Static analysis circumvents
       
    48 this undecidability-problem by essentially allowing answers
       
    49 \emph{yes} and \emph{no}, but also \emph{don't know}. With
       
    50 this ``trick'' even the halting problem becomes
       
    51 decidable\ldots{}for example we could always say \emph{don't
       
    52 know}. Of course this would be silly. The point is that we
       
    53 should be striving for a method that answers as often as
       
    54 possible \emph{yes} or \emph{no}---just in cases when it is
       
    55 too difficult we fall back on the \emph{don't-know}-answer.
       
    56 This might sound all like abstract nonsense. Therefore let us
       
    57 look at a concrete example.
       
    58 
       
    59 
       
    60 \subsubsection*{A Simple, Idealised Programming Language}
       
    61 
       
    62 Our starting point is a small, idealised programming language.
       
    63 This language contains variables holding integers. We want to
       
    64 find out what the sign of these integers will be when the
       
    65 program runs. This seems like a very simple problem, but it
       
    66 will turn out even such a simple analysis is in general
       
    67 undecidable, just like Turing's halting problem. Is it an
       
    68 interesting problem? Well, yes---if a compiler can find out
       
    69 that for example a variable will never be negative and this
       
    70 variable is used as an index for an array, then the compiler
       
    71 does not need to generate code for an underflow-test. Remember
       
    72 some languages are immune to buffer-overflow attacks because
       
    73 they add bound checks everywhere. This could potentially
       
    74 drastically speed up the generated code.
       
    75 
       
    76 Since we want to 
       
    77 
       
    78 \begin{multicols}{2}
       
    79 \begin{plstx}[rhs style=,one per line,left margin=9mm]
       
    80 : \meta{Exp} ::= \meta{Exp} \texttt{+} \meta{Exp}
       
    81                | \meta{Exp} \texttt{*} \meta{Exp}
       
    82                | \meta{Exp} \texttt{=} \meta{Exp} 
       
    83                | \meta{num}
       
    84                | \meta{var}\\
       
    85 \end{plstx}
       
    86 \columnbreak
       
    87 \begin{plstx}[rhs style=,one per line]
       
    88 : \meta{Stmt} ::= \meta{label} \texttt{:}
       
    89                 | \meta{var} \texttt{:=} \meta{Exp}
       
    90                 | \texttt{jmp?} \meta{Exp} \meta{label}
       
    91                 | \texttt{goto} \meta{label}\\
       
    92 : \meta{Prog} ::= \meta{Stmt} \ldots{} \meta{Stmt}\\
       
    93 \end{plstx}
       
    94 \end{multicols}
       
    95 
       
    96 \begin{lstlisting}[numbers=none,
       
    97                    language={},xleftmargin=10mm]
       
    98       a := 1
       
    99       n := 5 
       
   100 top:  jmp? n = 0 done 
       
   101       a := a * n 
       
   102       n := n + -1 
       
   103       goto top 
       
   104 done:
       
   105 \end{lstlisting}
       
   106 
       
   107 \begin{lstlisting}[numbers=none,
       
   108                    language={},xleftmargin=10mm]
       
   109       n := 6
       
   110       m1 := 0
       
   111       m2 := 1
       
   112 loop: jmp? n = 0 done
       
   113       tmp := m2
       
   114       m2 := m1 + m2
       
   115       m1 := tmp
       
   116       n := n + -1
       
   117       goto top
       
   118 done:
       
   119 \end{lstlisting}
    37 
   120 
    38 \bigskip
   121 \bigskip
    39 
   122 
    40 \noindent What would be missing in comparison with real
   123 \noindent What would be missing in comparison with real
    41 (low-level machine) code? Well, the numbers we assume to be
   124 (low-level machine) code? Well, the numbers we assume to be