diff -r 8787c16bc26e -r c90f803dc7ea handouts/ho09.tex --- a/handouts/ho09.tex Sat Dec 20 00:24:59 2014 +0000 +++ b/handouts/ho09.tex Sat Dec 20 14:48:15 2014 +0000 @@ -11,7 +11,7 @@ If we want to improve the safety and security of our programs, we need a more principled approach to programming. Testing is -good, but as Dijkstra famously wrote: +good, but as Edsger Dijkstra famously wrote: \begin{quote}\it ``Program testing can be a very effective way to show the @@ -77,9 +77,9 @@ for an array, then the compiler does not need to generate code for an underflow-check. Remember some languages are immune to buffer-overflow attacks, but they need to add underflow and -overflow checks everywhere. According to Regehr, an expert in -the field of compilers, overflow checks can cause 5-10\% -slowdown, and in some languages even 100\% for tight +overflow checks everywhere. According to John Regher, an +expert in the field of compilers, overflow checks can cause +5-10\% slowdown, in some languages even 100\% for tight loops.\footnote{\url{http://blog.regehr.org/archives/1154}} If the compiler can omit the underflow check, for example, then this can potentially drastically speed up the generated code. @@ -121,7 +121,7 @@ operations). We assume we have negative and positive numbers, \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1}, \pcode{2}\ldots{} An example program that calculates the -factorial of 5 is in oure programming language as follows: +factorial of 5 is in our programming language as follows: \begin{lstlisting}[language={},xleftmargin=10mm] a := 1 @@ -218,20 +218,25 @@ run\ldots{}at least we will specify this for all \emph{good} programs. By good programs I mean where all variables are initialised, for example. Our interpreter will just crash if -it cannot find out the value for a variable, in case it is not +it cannot find out the value for a variable when it is not initialised. Also, we will assume that labels in good programs are unique, otherwise our programs will calculate ``garbage''. First we will pre-process our programs. This will simplify the -definition of our interpreter later on. We will transform -programs into \emph{snippets}. Their purpose is to simplify -the definition of what the interpreter should do in case of a -jump. A snippet is a label and all code that comes after the -label. This essentially means a snippet is a \emph{map} from -labels to code. Given that programs are sequences (or lists) -of statements, we can easily calculate the snippets by just -traversing this sequence and recursively generating the map. -Suppose a program is of the general form +definition of our interpreter later on. By pre-processing our +programs we will transform programs into \emph{snippets}. +Their purpose is to simplify the definition of what the +interpreter should do in case of a jump. A snippet is a label +and all the code that comes after the label. This essentially +means a snippet is a \emph{map} from labels to +code.\footnote{Be sure you know what maps are. In a +programming context they are often represented as association +list where some data is associated with a key.} + +Given that programs are sequences (or lists) of statements, we +can easily calculate the snippets by just traversing this +sequence and recursively generating the map. Suppose a program +is of the general form \begin{center} $stmt_1\;stmt_2\; \ldots\; stmt_n$ @@ -329,7 +334,7 @@ added \pcode{""}), there are three keys. When running the factorial program and encountering a jump, then we only have to consult this snippets-map in order to find out what the -next instruction should be. +next statements should be. We should now be in the position to define how a program should be run. In the context of interpreters, this @@ -432,7 +437,8 @@ \noindent This environment $env$ also acts like a map: it associates variable with their current values. For example -such an environment might look as follows +߆after evaluating the first two lines in our factorial +program, such an environment might look as follows \begin{center} \begin{tabular}{ll} @@ -441,23 +447,27 @@ \end{tabular} \end{center} -\noindent Again I hilighted the keys. In the clause for +\noindent Again I highlighted the keys. In the clause for variables, we can therefore consult this environment and return whatever value is currently stored for this variable. This is written $env(x)$. If we query this map with $x$ we obtain the corresponding number. You might ask what happens if an environment does not contain any value for, say, the -variable $x$? Well, then our interpreter just crashes, or will -raise an exception. In this case we have a ``bad'' program -that tried to use a variable before it was initialised. The -programmer should not have don this. With the second version -of \textit{eval\_exp} we completed our definition for -evaluating expressions. +variable $x$? Well, then our interpreter just ``crashes'', or +more precisely will raise an exception. In this case we have a +``bad'' program that tried to use a variable before it was +initialised. The programmer should not have done this. In a +real programming language we would of course try a bit harder +and for example give an error at compile time, or design our +language in such a way that this can never happen. With the +second version of \textit{eval\_exp} we completed our +definition for evaluating expressions. Next comes the evaluation function for statements. We define -this function in such a way that we evaluate a whole sequence -of statements. Assume a program $p$ and its preprocessed -snippets $sn$. Then we define: +this function in such a way that we recursively evaluate a +whole sequence of statements. Assume a program $p$ (you want +to evaluate) and its pre-processed snippets $sn$. Then we can +define: \begin{center} \begin{tabular}{@{}l@{\hspace{1mm}}c@{\hspace{1mm}}l@{}} @@ -467,20 +477,63 @@ $\textit{eval\_stmts}(\texttt{x\,:=\,e}\;rest, env)$ & $\dn$ & $\textit{eval\_stmts}(rest, env[x := \textit{eval\_exp}(\texttt{e}, env)])$\\ +$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$ + & $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$\\ $\textit{eval\_stmts}(\texttt{jmp?\,e\,lbl}\;rest, env)$ & $\dn$ & $\begin{cases}\begin{array}{@{}l@{\hspace{-12mm}}r@{}} \textit{eval\_stmts}(sn(\texttt{lbl}), env)\\ & \text{if}\;\textit{eval\_exp}(\texttt{e}, env) = 1\\ \textit{eval\_stmts}(rest, env) & \text{otherwise}\\ - \end{array}\end{cases}$\\ -$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$ - & $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$ + \end{array}\end{cases}$ \end{tabular} \end{center} - +\noindent The first clause is for the empty program, or when +we arrived at the end of the program. In this case we just +return the environment. The second clause is for when the next +statement is a label. That means the program is of the form +$\texttt{label:}\;rest$ where the label is some string and +$rest$ stands for all following statement. This case is easy, +because our evaluation function just discards the label and +evaluates the rest of the statements (we already extracted all +important information about labels when we pre-processed our +programs and generated the snippets). The third clause is for +variable assignments. Again we just evaluate the rest for the +statements, but with a modified environment---since the +variable assignment is supposed to introduce a new variable or +change the current value of a variable. For this modification +of the environment we first evaluate the expression +$\texttt{e}$ using our evaluation function for expressions. +This gives us a number. Then we assign this number to the +variable $x$ in the environment. This modified environment +will be used to evaluate the rest of the program. The fourth +clause is for the unconditional jump to a label. That means we +have to look up in our snippets map $sn$ what are the next +statements for this label are. Therefore we will continue with +evaluating, not with the rest of the program, but with the +statements stored in the snippets-map under the label +$\texttt{lbl}$. The fifth clause for conditional jumps is +similar, but in order to make the jump we first need to +evaluate the expression $\texttt{e}$ in order to find out +whether it is $1$. If yes, we jump, otherwise we just continue +with evaluating the rest of the program. +Our interpreter works in two stages: First we pre-process our +program generating the snippets map $sn$, say. Second we call +the evaluation function with the default entry point and the +empty environment: +\begin{center} +$\textit{eval\_stmts}(sn(\pcode{""}), \varnothing)$ +\end{center} + +\noindent It is interesting to not that our interpreter when +it comes to the end of the program returns an environment. Our +programming language does not contain any constructs for input +and output. Therefore this environment is the only effect we +can observe when running the program (apart from that our +interpreter might need some time before finishing the +evaluation of the program)