sen-material: comparison handouts/ho09.tex

equal deleted inserted replaced

-:8787c16bc26e
+:c90f803dc7ea
 \section*{Handout 9 (Static Analysis)}
 If we want to improve the safety and security of our programs,
 we need a more principled approach to programming. Testing is
-good, but as Dijkstra famously wrote:
+good, but as Edsger Dijkstra famously wrote:
 \begin{quote}\it
 ``Program testing can be a very effective way to show the
 \underline{\smash{presence}} of bugs, but it is hopelessly
 inadequate for showing their \underline{\smash{absence}}.''
 yes---if a compiler can find out that for example a variable
 will never be negative and this variable is used as an index
 for an array, then the compiler does not need to generate code
 for an underflow-check. Remember some languages are immune to
 buffer-overflow attacks, but they need to add underflow and
-overflow checks everywhere. According to Regehr, an expert in
+overflow checks everywhere. According to John Regher, an
-the field of compilers, overflow checks can cause 5-10\%
+expert in the field of compilers, overflow checks can cause
-slowdown, and in some languages even 100\% for tight
+5-10\% slowdown, in some languages even 100\% for tight
 loops.\footnote{\url{http://blog.regehr.org/archives/1154}} If
 the compiler can omit the underflow check, for example, then
 this can potentially drastically speed up the generated code.
 What do programs in our simple programming language look like?
 expressions built up from \pcode{+}, \pcode{*} and \emph{=}
 (for simplicity reasons we do not consider any other
 operations). We assume we have negative and positive numbers,
 \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
 \pcode{2}\ldots{} An example program that calculates the
-factorial of 5 is in oure programming language as follows:
+factorial of 5 is in our programming language as follows:
 \begin{lstlisting}[language={},xleftmargin=10mm]
 a := 1
 n := 5
 top:
 describe an interpreter for our programming language, which
 specifies exactly how programs are supposed to be
 run\ldots{}at least we will specify this for all \emph{good}
 programs. By good programs I mean where all variables are
 initialised, for example. Our interpreter will just crash if
-it cannot find out the value for a variable, in case it is not
+it cannot find out the value for a variable when it is not
 initialised. Also, we will assume that labels in good programs
 are unique, otherwise our programs will calculate ``garbage''.
 First we will pre-process our programs. This will simplify the
-definition of our interpreter later on. We will transform
+definition of our interpreter later on. By pre-processing our
-programs into \emph{snippets}. Their purpose is to simplify
+programs we will transform programs into \emph{snippets}.
-the definition of what the interpreter should do in case of a
+Their purpose is to simplify the definition of what the
-jump. A snippet is a label and all code that comes after the
+interpreter should do in case of a jump. A snippet is a label
-label. This essentially means a snippet is a \emph{map} from
+and all the code that comes after the label. This essentially
-labels to code. Given that programs are sequences (or lists)
+means a snippet is a \emph{map} from labels to
-of statements, we can easily calculate the snippets by just
+code.\footnote{Be sure you know what maps are. In a
-traversing this sequence and recursively generating the map.
+programming context they are often represented as association
-Suppose a program is of the general form
+list where some data is associated with a key.}
+Given that programs are sequences (or lists) of statements, we
+can easily calculate the snippets by just traversing this
+sequence and recursively generating the map. Suppose a program
+is of the general form
 \begin{center}
 $stmt_1\;stmt_2\; \ldots\; stmt_n$
 \end{center}
 \noindent I highlighted the \emph{keys} in this map. Since
 there are three labels in the factorial program (remember we
 added \pcode{""}), there are three keys. When running the
 factorial program and encountering a jump, then we only have
 to consult this snippets-map in order to find out what the
-next instruction should be.
+next statements should be.
 We should now be in the position to define how a program
 should be run. In the context of interpreters, this
 ``running'' of programs is often called \emph{evaluation}. Let
 us start with the definition of how expressions are evaluated.
 \end{tabular}
 \end{center}
 \noindent This environment $env$ also acts like a map: it
 associates variable with their current values. For example
-such an environment might look as follows
+ß†after evaluating the first two lines in our factorial
+program, such an environment might look as follows
 \begin{center}
 \begin{tabular}{ll}
 $\fbox{\texttt{a}} \mapsto \texttt{1}$ &
 $\fbox{\texttt{n}} \mapsto \texttt{5}$
 \end{tabular}
 \end{center}
-\noindent Again I hilighted the keys. In the clause for
+\noindent Again I highlighted the keys. In the clause for
 variables, we can therefore consult this environment and
 return whatever value is currently stored for this variable.
 This is written $env(x)$. If we query this map with $x$ we
 obtain the corresponding number. You might ask what happens if
 an environment does not contain any value for, say, the
-variable $x$? Well, then our interpreter just crashes, or will
+variable $x$? Well, then our interpreter just ``crashes'', or
-raise an exception. In this case we have a ``bad'' program
+more precisely will raise an exception. In this case we have a
-that tried to use a variable before it was initialised. The
+``bad'' program that tried to use a variable before it was
-programmer should not have don this. With the second version
+initialised. The programmer should not have done this. In a
-of \textit{eval\_exp} we completed our definition for
+real programming language we would of course try a bit harder
-evaluating expressions.
+and for example give an error at compile time, or design our
+language in such a way that this can never happen. With the
+second version of \textit{eval\_exp} we completed our
+definition for evaluating expressions.
 Next comes the evaluation function for statements. We define
-this function in such a way that we evaluate a whole sequence
+this function in such a way that we recursively evaluate a
-of statements. Assume a program $p$ and its preprocessed
+whole sequence of statements. Assume a program $p$ (you want
-snippets $sn$. Then we define:
+to evaluate) and its pre-processed snippets $sn$. Then we can
+define:
 \begin{center}
 \begin{tabular}{@{}l@{\hspace{1mm}}c@{\hspace{1mm}}l@{}}
 $\textit{eval\_stmts}([], env)$ & $\dn$ & $env$\\
 $\textit{eval\_stmts}(\texttt{label:}\;rest, env)$ &
 $\dn$ & $\textit{eval\_stmts}(rest, env)$ \\
 $\textit{eval\_stmts}(\texttt{x\,:=\,e}\;rest, env)$ &
 $\dn$ & $\textit{eval\_stmts}(rest,
 env[x := \textit{eval\_exp}(\texttt{e}, env)])$\\
+$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$
+& $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$\\
 $\textit{eval\_stmts}(\texttt{jmp?\,e\,lbl}\;rest, env)$
 & $\dn$ & $\begin{cases}\begin{array}{@{}l@{\hspace{-12mm}}r@{}}
 \textit{eval\_stmts}(sn(\texttt{lbl}), env)\\
 & \text{if}\;\textit{eval\_exp}(\texttt{e}, env) = 1\\
 \textit{eval\_stmts}(rest, env) & \text{otherwise}\\
-\end{array}\end{cases}$\\
+\end{array}\end{cases}$
-$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$
-& $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$
 \end{tabular}
 \end{center}
+\noindent The first clause is for the empty program, or when
+we arrived at the end of the program. In this case we just
+return the environment. The second clause is for when the next
+statement is a label. That means the program is of the form
+$\texttt{label:}\;rest$ where the label is some string and
+$rest$ stands for all following statement. This case is easy,
+because our evaluation function just discards the label and
+evaluates the rest of the statements (we already extracted all
+important information about labels when we pre-processed our
+programs and generated the snippets). The third clause is for
+variable assignments. Again we just evaluate the rest for the
+statements, but with a modified environment---since the
+variable assignment is supposed to introduce a new variable or
+change the current value of a variable. For this modification
+of the environment we first evaluate the expression
+$\texttt{e}$ using our evaluation function for expressions.
+This gives us a number. Then we assign this number to the
+variable $x$ in the environment. This modified environment
+will be used to evaluate the rest of the program. The fourth
+clause is for the unconditional jump to a label. That means we
+have to look up in our snippets map $sn$ what are the next
+statements for this label are. Therefore we will continue with
+evaluating, not with the rest of the program, but with the
+statements stored in the snippets-map under the label
+$\texttt{lbl}$. The fifth clause for conditional jumps is
+similar, but in order to make the jump we first need to
+evaluate the expression $\texttt{e}$ in order to find out
+whether it is $1$. If yes, we jump, otherwise we just continue
+with evaluating the rest of the program.
+Our interpreter works in two stages: First we pre-process our
+program generating the snippets map $sn$, say. Second we call
+the evaluation function with the default entry point and the
+empty environment:
+\begin{center}
+$\textit{eval\_stmts}(sn(\pcode{""}), \varnothing)$
+\end{center}
+\noindent It is interesting to not that our interpreter when
+it comes to the end of the program returns an environment. Our
+programming language does not contain any constructs for input
+and output. Therefore this environment is the only effect we
+can observe when running the program (apart from that our
+interpreter might need some time before finishing the
+evaluation of the program)
 \end{document}

changeset 359	c90f803dc7ea
parent 357	5b91f5ad2772
child 360	eb2004430215