1 \documentclass{article} |
1 \documentclass{article} |
2 \usepackage{../style} |
2 \usepackage{../style} |
3 \usepackage{../langs} |
3 \usepackage{../langs} |
|
4 \usepackage{../graphics} |
|
5 \usepackage{../grammar} |
|
6 \usepackage{multicol} |
4 |
7 |
5 \begin{document} |
8 \begin{document} |
6 |
9 |
7 \section*{Handout 9 (Static Analysis)} |
10 \section*{Handout 9 (Static Analysis)} |
8 |
11 |
9 If we want to improve the safety and security of our programs, |
12 If we want to improve the safety and security of our programs, |
10 we need a more principled approach to programming. Testing is |
13 we need a more principled approach to programming. Testing is |
11 good, but as Dijkstra famously said: ``Testing can only show |
14 good, but as Dijkstra famously said: |
12 the presence of bugs, not their absence''. While such a more |
15 |
13 principled approach has been the subject of intense study for |
16 \begin{quote}\it |
14 a long time, only in the past few years some impressive |
17 ``Program testing can be a very effective way to show the |
15 results have been achieved. One is the complete formalisation |
18 \underline{\smash{presence}} of bugs, but it is hopelessly |
16 and (mathematical) verification of a microkernel operating |
19 inadequate for showing their \underline{\smash{absence}}.'' |
17 system called seL4. |
20 \end{quote} |
|
21 |
|
22 \noindent While such a more principled approach has been the |
|
23 subject of intense study for a long, long time, only in the |
|
24 past few years some impressive results have been achieved. One |
|
25 is the complete formalisation and (mathematical) verification |
|
26 of a microkernel operating system called seL4. |
18 |
27 |
19 \begin{center} |
28 \begin{center} |
20 \url{http://sel4.systems} |
29 \url{http://sel4.systems} |
21 \end{center} |
30 \end{center} |
22 |
31 |
23 \noindent In 2011, this work was included in the MIT |
32 \noindent This work was in 2011 included in the MIT Technology |
24 Technology Review in the annual list of the world’s ten most |
33 Review in the annual list of the world’s ten most important |
25 important emerging |
34 emerging |
26 technologies.\footnote{\url{http://www2.technologyreview.com/tr10/?year=2011}} |
35 technologies.\footnote{\url{http://www2.technologyreview.com/tr10/?year=2011}} |
27 While this work is impressive, its technical details are too |
36 While this work is impressive, its technical details are too |
28 enormous for explanation here. Therefore let us look at |
37 enormous for an explanation here. Therefore let us look at |
29 something much simpler, namely finding out properties about |
38 something much simpler, namely finding out properties about |
30 programs using \emph{static analysis}. |
39 programs using \emph{static analysis}. |
31 |
40 |
32 Static analysis is one technique that checks properties |
41 Static analysis is one technique that checks properties of a |
33 of a program without actually running the program. This |
42 program without actually running the program. This should |
34 should raise alarm bells---the reason that almost all |
43 raise alarm bells with you---because almost all interesting |
35 interesting properties about programs are equivalent to |
44 properties about programs are equivalent to the halting |
36 the halting problem, which we know is undecidable. |
45 problem, which we know is undecidable. For example estimating |
|
46 the memory consumption of programs is in general undecidable, |
|
47 just like the halting problem. Static analysis circumvents |
|
48 this undecidability-problem by essentially allowing answers |
|
49 \emph{yes} and \emph{no}, but also \emph{don't know}. With |
|
50 this ``trick'' even the halting problem becomes |
|
51 decidable\ldots{}for example we could always say \emph{don't |
|
52 know}. Of course this would be silly. The point is that we |
|
53 should be striving for a method that answers as often as |
|
54 possible \emph{yes} or \emph{no}---just in cases when it is |
|
55 too difficult we fall back on the \emph{don't-know}-answer. |
|
56 This might sound all like abstract nonsense. Therefore let us |
|
57 look at a concrete example. |
|
58 |
|
59 |
|
60 \subsubsection*{A Simple, Idealised Programming Language} |
|
61 |
|
62 Our starting point is a small, idealised programming language. |
|
63 This language contains variables holding integers. We want to |
|
64 find out what the sign of these integers will be when the |
|
65 program runs. This seems like a very simple problem, but it |
|
66 will turn out even such a simple analysis is in general |
|
67 undecidable, just like Turing's halting problem. Is it an |
|
68 interesting problem? Well, yes---if a compiler can find out |
|
69 that for example a variable will never be negative and this |
|
70 variable is used as an index for an array, then the compiler |
|
71 does not need to generate code for an underflow-test. Remember |
|
72 some languages are immune to buffer-overflow attacks because |
|
73 they add bound checks everywhere. This could potentially |
|
74 drastically speed up the generated code. |
|
75 |
|
76 Since we want to |
|
77 |
|
78 \begin{multicols}{2} |
|
79 \begin{plstx}[rhs style=,one per line,left margin=9mm] |
|
80 : \meta{Exp} ::= \meta{Exp} \texttt{+} \meta{Exp} |
|
81 | \meta{Exp} \texttt{*} \meta{Exp} |
|
82 | \meta{Exp} \texttt{=} \meta{Exp} |
|
83 | \meta{num} |
|
84 | \meta{var}\\ |
|
85 \end{plstx} |
|
86 \columnbreak |
|
87 \begin{plstx}[rhs style=,one per line] |
|
88 : \meta{Stmt} ::= \meta{label} \texttt{:} |
|
89 | \meta{var} \texttt{:=} \meta{Exp} |
|
90 | \texttt{jmp?} \meta{Exp} \meta{label} |
|
91 | \texttt{goto} \meta{label}\\ |
|
92 : \meta{Prog} ::= \meta{Stmt} \ldots{} \meta{Stmt}\\ |
|
93 \end{plstx} |
|
94 \end{multicols} |
|
95 |
|
96 \begin{lstlisting}[numbers=none, |
|
97 language={},xleftmargin=10mm] |
|
98 a := 1 |
|
99 n := 5 |
|
100 top: jmp? n = 0 done |
|
101 a := a * n |
|
102 n := n + -1 |
|
103 goto top |
|
104 done: |
|
105 \end{lstlisting} |
|
106 |
|
107 \begin{lstlisting}[numbers=none, |
|
108 language={},xleftmargin=10mm] |
|
109 n := 6 |
|
110 m1 := 0 |
|
111 m2 := 1 |
|
112 loop: jmp? n = 0 done |
|
113 tmp := m2 |
|
114 m2 := m1 + m2 |
|
115 m1 := tmp |
|
116 n := n + -1 |
|
117 goto top |
|
118 done: |
|
119 \end{lstlisting} |
37 |
120 |
38 \bigskip |
121 \bigskip |
39 |
122 |
40 \noindent What would be missing in comparison with real |
123 \noindent What would be missing in comparison with real |
41 (low-level machine) code? Well, the numbers we assume to be |
124 (low-level machine) code? Well, the numbers we assume to be |