handouts/ho09.tex
changeset 359 c90f803dc7ea
parent 357 5b91f5ad2772
child 360 eb2004430215
--- a/handouts/ho09.tex	Sat Dec 20 00:24:59 2014 +0000
+++ b/handouts/ho09.tex	Sat Dec 20 14:48:15 2014 +0000
@@ -11,7 +11,7 @@
 
 If we want to improve the safety and security of our programs,
 we need a more principled approach to programming. Testing is
-good, but as Dijkstra famously wrote: 
+good, but as Edsger Dijkstra famously wrote: 
 
 \begin{quote}\it 
 ``Program testing can be a very effective way to show the
@@ -77,9 +77,9 @@
 for an array, then the compiler does not need to generate code
 for an underflow-check. Remember some languages are immune to
 buffer-overflow attacks, but they need to add underflow and
-overflow checks everywhere. According to Regehr, an expert in
-the field of compilers, overflow checks can cause 5-10\%
-slowdown, and in some languages even 100\% for tight
+overflow checks everywhere. According to John Regher, an
+expert in the field of compilers, overflow checks can cause
+5-10\% slowdown, in some languages even 100\% for tight
 loops.\footnote{\url{http://blog.regehr.org/archives/1154}} If
 the compiler can omit the underflow check, for example, then
 this can potentially drastically speed up the generated code. 
@@ -121,7 +121,7 @@
 operations). We assume we have negative and positive numbers,
 \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
 \pcode{2}\ldots{} An example program that calculates the
-factorial of 5 is in oure programming language as follows:
+factorial of 5 is in our programming language as follows:
 
 \begin{lstlisting}[language={},xleftmargin=10mm]
       a := 1
@@ -218,20 +218,25 @@
 run\ldots{}at least we will specify this for all \emph{good}
 programs. By good programs I mean where all variables are
 initialised, for example. Our interpreter will just crash if
-it cannot find out the value for a variable, in case it is not
+it cannot find out the value for a variable when it is not
 initialised. Also, we will assume that labels in good programs
 are unique, otherwise our programs will calculate ``garbage''.
 
 First we will pre-process our programs. This will simplify the
-definition of our interpreter later on. We will transform
-programs into \emph{snippets}. Their purpose is to simplify
-the definition of what the interpreter should do in case of a
-jump. A snippet is a label and all code that comes after the
-label. This essentially means a snippet is a \emph{map} from
-labels to code. Given that programs are sequences (or lists)
-of statements, we can easily calculate the snippets by just
-traversing this sequence and recursively generating the map.
-Suppose a program is of the general form
+definition of our interpreter later on. By pre-processing our
+programs we will transform programs into \emph{snippets}.
+Their purpose is to simplify the definition of what the
+interpreter should do in case of a jump. A snippet is a label
+and all the code that comes after the label. This essentially
+means a snippet is a \emph{map} from labels to
+code.\footnote{Be sure you know what maps are. In a
+programming context they are often represented as association
+list where some data is associated with a key.} 
+
+Given that programs are sequences (or lists) of statements, we
+can easily calculate the snippets by just traversing this
+sequence and recursively generating the map. Suppose a program
+is of the general form
 
 \begin{center}
 $stmt_1\;stmt_2\; \ldots\; stmt_n$
@@ -329,7 +334,7 @@
 added \pcode{""}), there are three keys. When running the
 factorial program and encountering a jump, then we only have
 to consult this snippets-map in order to find out what the
-next instruction should be.
+next statements should be.
 
 We should now be in the position to define how a program
 should be run. In the context of interpreters, this
@@ -432,7 +437,8 @@
  
 \noindent This environment $env$ also acts like a map: it
 associates variable with their current values. For example
-such an environment might look as follows
+߆after evaluating the first two lines in our factorial
+program, such an environment might look as follows
 
 \begin{center}
 \begin{tabular}{ll}
@@ -441,23 +447,27 @@
 \end{tabular}
 \end{center}
 
-\noindent Again I hilighted the keys. In the clause for
+\noindent Again I highlighted the keys. In the clause for
 variables, we can therefore consult this environment and
 return whatever value is currently stored for this variable.
 This is written $env(x)$. If we query this map with $x$ we
 obtain the corresponding number. You might ask what happens if
 an environment does not contain any value for, say, the
-variable $x$? Well, then our interpreter just crashes, or will
-raise an exception. In this case we have a ``bad'' program
-that tried to use a variable before it was initialised. The
-programmer should not have don this. With the second version
-of \textit{eval\_exp} we completed our definition for
-evaluating expressions.
+variable $x$? Well, then our interpreter just ``crashes'', or
+more precisely will raise an exception. In this case we have a
+``bad'' program that tried to use a variable before it was
+initialised. The programmer should not have done this. In a
+real programming language we would of course try a bit harder
+and for example give an error at compile time, or design our
+language in such a way that this can never happen. With the
+second version of \textit{eval\_exp} we completed our
+definition for evaluating expressions.
  
 Next comes the evaluation function for statements. We define
-this function in such a way that we evaluate a whole sequence
-of statements. Assume a program $p$ and its preprocessed 
-snippets $sn$. Then we define:
+this function in such a way that we recursively evaluate a
+whole sequence of statements. Assume a program $p$ (you want
+to evaluate) and its pre-processed snippets $sn$. Then we can
+define:
 
 \begin{center}
 \begin{tabular}{@{}l@{\hspace{1mm}}c@{\hspace{1mm}}l@{}}
@@ -467,20 +477,63 @@
 $\textit{eval\_stmts}(\texttt{x\,:=\,e}\;rest, env)$ & 
   $\dn$ & $\textit{eval\_stmts}(rest, 
            env[x := \textit{eval\_exp}(\texttt{e}, env)])$\\  
+$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$ 
+ & $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$\\
 $\textit{eval\_stmts}(\texttt{jmp?\,e\,lbl}\;rest, env)$ 
  & $\dn$ & $\begin{cases}\begin{array}{@{}l@{\hspace{-12mm}}r@{}}
  \textit{eval\_stmts}(sn(\texttt{lbl}), env)\\ 
  & \text{if}\;\textit{eval\_exp}(\texttt{e}, env) = 1\\
  \textit{eval\_stmts}(rest, env) & \text{otherwise}\\
- \end{array}\end{cases}$\\
-$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$ 
- & $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$            
+ \end{array}\end{cases}$  
 \end{tabular}
 \end{center}
 
-
+\noindent The first clause is for the empty program, or when
+we arrived at the end of the program. In this case we just
+return the environment. The second clause is for when the next
+statement is a label. That means the program is of the form
+$\texttt{label:}\;rest$ where the label is some string and
+$rest$ stands for all following statement. This case is easy,
+because our evaluation function just discards the label and
+evaluates the rest of the statements (we already extracted all
+important information about labels when we pre-processed our
+programs and generated the snippets). The third clause is for
+variable assignments. Again we just evaluate the rest for the
+statements, but with a modified environment---since the
+variable assignment is supposed to introduce a new variable or
+change the current value of a variable. For this modification
+of the environment we first evaluate the expression
+$\texttt{e}$ using our evaluation function for expressions.
+This gives us a number. Then we assign this number to the
+variable $x$ in the environment. This modified environment
+will be used to evaluate the rest of the program. The fourth
+clause is for the unconditional jump to a label. That means we
+have to look up in our snippets map $sn$ what are the next
+statements for this label are. Therefore we will continue with
+evaluating, not with the rest of the program, but with the
+statements stored in the snippets-map under the label
+$\texttt{lbl}$. The fifth clause for conditional jumps is
+similar, but in order to make the jump we first need to
+evaluate the expression $\texttt{e}$ in order to find out
+whether it is $1$. If yes, we jump, otherwise we just continue
+with evaluating the rest of the program.
 
+Our interpreter works in two stages: First we pre-process our
+program generating the snippets map $sn$, say. Second we call
+the evaluation function with the default entry point and the
+empty environment:
 
+\begin{center}
+$\textit{eval\_stmts}(sn(\pcode{""}), \varnothing)$
+\end{center}
+ 
+\noindent It is interesting to not that our interpreter when
+it comes to the end of the program returns an environment. Our
+programming language does not contain any constructs for input
+and output. Therefore this environment is the only effect we
+can observe when running the program (apart from that our
+interpreter might need some time before finishing the
+evaluation of the program)