--- a/handouts/ho09.tex Tue Dec 09 12:09:33 2014 +0000
+++ b/handouts/ho09.tex Wed Dec 10 23:50:35 2014 +0000
@@ -29,7 +29,7 @@
\url{http://sel4.systems}
\end{center}
-\noindent This work was in 2011 included in the MIT Technology
+\noindent In 2011 this work was included in the MIT Technology
Review in the annual list of the world’s ten most important
emerging
technologies.\footnote{\url{http://www2.technologyreview.com/tr10/?year=2011}}
@@ -38,7 +38,7 @@
something much simpler, namely finding out properties about
programs using \emph{static analysis}.
-Static analysis is one technique that checks properties of a
+Static analysis is a technique that checks properties of a
program without actually running the program. This should
raise alarm bells with you---because almost all interesting
properties about programs are equivalent to the halting
@@ -60,12 +60,13 @@
\subsubsection*{A Simple, Idealised Programming Language}
Our starting point is a small, idealised programming language.
-This language contains variables holding integers. We want to
-find out what the sign of these integers will be when the
-program runs. This seems like a very simple problem, but it
-will turn out even such a simple analysis if approached
-naively is in general undecidable, just like Turing's halting
-problem. I let you think why?
+This language, amongst other things, contains variables
+holding integers. We want to find out what the sign of these
+integers will be when the program runs. This sign-analysis
+seems like a very simple problem, but it will turn out even
+such simple problems, if approached naively, are in general
+undecidable, just like Turing's halting problem. I let you
+think why?
Is sign-analysis of variables an interesting problem? Well,
@@ -73,46 +74,76 @@
will never be negative and this variable is used as an index
for an array, then the compiler does not need to generate code
for an underflow-test. Remember some languages are immune to
-buffer-overflow attacks because they add bound checks
-everywhere. This could potentially drastically speed up the
-generated code.
+buffer-overflow attacks because they add underflow and
+overflow checks everywhere. If the compiler can omit the
+underflow test, for example, then this can potentially
+drastically speed up the generated code.
-Since we want to
+What do programs in our programming language look like? The
+following grammar gives a first specification:
\begin{multicols}{2}
-\begin{plstx}[rhs style=,one per line,left margin=9mm]
: \meta{Exp} ::= \meta{Exp} \texttt{+} \meta{Exp}
- | \meta{Exp} \texttt{*} \meta{Exp}
- | \meta{Exp} \texttt{=} \meta{Exp}
- | \meta{num}
- | \meta{var}\\
-\end{plstx}
-\columnbreak
-\begin{plstx}[rhs style=,one per line]
+\begin{plstx}[rhs style=,one per line,left margin=9mm]
: \meta{Stmt} ::= \meta{label} \texttt{:}
| \meta{var} \texttt{:=} \meta{Exp}
| \texttt{jmp?} \meta{Exp} \meta{label}
| \texttt{goto} \meta{label}\\
: \meta{Prog} ::= \meta{Stmt} \ldots{} \meta{Stmt}\\
\end{plstx}
+\columnbreak
+\begin{plstx}[rhs style=,one per line]
: \meta{Exp} ::= \meta{Exp} \texttt{+} \meta{Exp}
+ | \meta{Exp} \texttt{*} \meta{Exp}
+ | \meta{Exp} \texttt{=} \meta{Exp}
+ | \meta{num}
+ | \meta{var}\\
+\end{plstx}
\end{multicols}
-\begin{lstlisting}[numbers=none,
- language={},xleftmargin=10mm]
+\noindent I assume you are familiar with such
+grammars.\footnote{\url{http://en.wikipedia.org/wiki/Backus–Naur_Form}}
+There are three main syntactic categories: \emph{statments}
+and \emph{expressions} as well as \emph{programs}, which are
+sequences of statements. Statements are either labels,
+variable assignments, conditional jumps (\pcode{jmp?}) and
+unconditional jumps (\pcode{goto}). Labels are just strings,
+which can be used as the target of a jump. The conditional
+jumps and variable assignments involve (arithmetic)
+expressions. Expressions are either numbers, variables or
+compound expressions built up from \pcode{+}, \pcode{*} and
+\emph{=} (for simplicity reasons we do not consider any other
+operations). We assume we have negative and positive numbers,
+\ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
+\pcode{2}\ldots{} An example program that calculates the
+factorial of 5 is as follows:
+
+\begin{lstlisting}[language={},xleftmargin=10mm]
a := 1
n := 5
-top: jmp? n = 0 done
+top:
+ jmp? n = 0 done
a := a * n
n := n + -1
goto top
done:
\end{lstlisting}
+\noindent Each line of the program contains a statement. In
+the first two lines we assign values to the variables
+\pcode{a} and \pcode{n}. In line 4 we test whether \pcode{n}
+is zero, in which case we jump to the end of the program
+marked with the label \pcode{done}. If \pcode{n} is not zero,
+we multiply the content of \pcode{a} by \pcode{n}, decrease
+\pcode{n} by one and jump back to the beginning of the loop,
+marked with the label \pcode{top}.
+
+\begin{figure}[t]
\begin{lstlisting}[numbers=none,
language={},xleftmargin=10mm]
n := 6
m1 := 0
m2 := 1
-loop: jmp? n = 0 done
+loop:
+ jmp? n = 0 done
tmp := m2
m2 := m1 + m2
m1 := tmp
@@ -120,10 +151,14 @@
goto top
done:
\end{lstlisting}
+\label{A mystery program in our idealised programming language.}
+\end{figure}
-\bigskip
+Even if our language is rather small, it is still Turing
+complete---so rather powerful. However, discussing this more
+would lead us to far astray.
-\noindent What would be missing in comparison with real
+What would be missing in comparison with real
(low-level machine) code? Well, the numbers we assume to be
arbitrary precision, which is not the case in real code. There
basic number formats have a rang and might over-run or