afl-material: comparison handouts/ho07.tex

equal deleted inserted replaced

-:991849dfbcb1
+:8c7ccdebcb89
 \noindent The first instruction loads the constant $1$ onto
 the stack, the next one loads $2$, the third instruction adds both
 numbers together replacing the top two elements of the stack
 with the result $3$. For simplicity, we will throughout
-consider only integer numbers and results. Therefore we can
+consider only integer numbers. Therefore we can
 use the JVM instructions \code{iadd}, \code{isub},
 \code{imul}, \code{idiv} and so on. The \code{i} stands for
 integer instructions in the JVM (alternatives are \code{d} for
 doubles, \code{l} for longs and \code{f} for floats).
 a kind of environment where variables are associated to
 numbers. This association needs to be unique: if we muddle up
 the numbers, then we essentially confuse variables and the
 consequence will usually be an erroneous result. Our extended
 \textit{compile}-function for arithmetic expressions will
-therefore take two arguments: the abstract syntax tree and the
+therefore take two arguments: the abstract syntax tree and an
 environment, $E$, that maps identifiers to index-numbers.
 \begin{center}
 \begin{tabular}{lcl}
 $\textit{compile}(n, E)$ & $\dn$ & $\pcode{ldc}\;n$\\
 the index is and return the environment unchanged (that is in
 this case $E' = E$). However, if this is the first encounter
 of the variable $x$ in the program, then we have to augment
 the environment and assign $x$ with the largest index in $E$
 plus one (that is $E' = E(x \mapsto largest\_index + 1)$).
-This means for the assignment $x := x + 1$ we generate the
+To sum up, for the assignment $x := x + 1$ we generate the
 following code
 \begin{lstlisting}[language=JVMIS,mathescape,numbers=none]
 iload $n_x$
 ldc 1
 iadd
 istore $n_x$
 \end{lstlisting}
 \noindent
-where $n_x$ is the index (or pointer to the memory) for the variable $x$ . The code for
+where $n_x$ is the index (or pointer to the memory) for the variable
-looking-up the index for the variable is as follow:
+$x$. The code for looking-up the index for the variable is as follow:
 \begin{center}
 \begin{tabular}{lcl}
 $index \;=\; E\textit{.getOrElse}(x, |E|)$
 \end{tabular}
 \noindent
 In case the environment $E$ contains an index for $x$, we return it.
 Otherwise we ``create'' a new index by returning the size $|E|$ of the
 environment (that will be an index that is guaranteed to be not used
-yet).
+yet). In all this we take advantage of the JVM which provides us with
+a potentially limitless supply of places where we can store values
-A bit more complicated is the generation of code for \pcode{if}-statements, say
+of variables.
+A bit more complicated is the generation of code for
+\pcode{if}-statements, say
 \begin{lstlisting}[mathescape,language={},numbers=none]
 if $b$ then $cs_1$ else $cs_2$
 \end{lstlisting}
-\noindent where $b$ is a boolean expression and the $cs_{1/2}$ are the
+\noindent where $b$ is a boolean expression and where both $cs_{1/2}$
-statements for each of the \pcode{if}-branches. Lets assume we already
+are the statements for each of the \pcode{if}-branches. Lets assume we
-generated code for $b$ and $cs_{1/2}$. Then in the true-case the
+already generated code for $b$ and $cs_{1/2}$. Then in the true-case the
-control-flow of the program needs to be
+control-flow of the program needs to behave as
 \begin{center}
 \begin{tikzpicture}[node distance=2mm and 4mm,
 block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm},
 point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
 $instr_1$
 $instr_2$
 $\vdots$
 \end{lstlisting}
-\noindent where a label is indicated by a colon.
+\noindent where a label is indicated by a colon. The task of the
+assmbler (in our case Jasmin or Krakatau) is to resolve the labels
+to actual addresses, for example jump 10 instructions forward,
+or 20 instructions backwards.
 Recall the ``trick'' with compiling boolean expressions: the
 \textit{compile}-function for boolean expressions takes three
 arguments: an abstract syntax tree, an environment for
 variable indices and also the label, $lab$, to where an conditional
 with the next instructions (see control-flow of ifs above).
 However if they are \emph{not} equal, then we need to
 (conditionally) jump to the label $lab$. This can be done with
 the instruction
-\begin{lstlisting}[mathescape,numbers=none]
+\begin{lstlisting}[mathescape,numbers=none,language=JVMIS]
 if_icmpne $lab$
 \end{lstlisting}
 \noindent Other jump instructions for boolean operators are
 $\le$ & $\Rightarrow$ & \pcode{if_icmpgt}\\
 \end{tabular}
 \end{center}
 \noindent and so on. I leave it to you to extend the
-\textit{compile}-function for the other boolean expressions.
+\textit{compile}-function for the other boolean expressions. Note that
-Note that we need to jump whenever the boolean is \emph{not}
+we need to jump whenever the boolean is \emph{not} true, which means we
-true, which means we have to ``negate'' the jump
+have to ``negate'' the jump condition---equals becomes not-equal, less
-condition---equals becomes not-equal, less becomes
+becomes greater-or-equal. If you do not like this design (it can be the
-greater-or-equal. If you do not like this design (it can be
+source of some nasty, hard-to-detect errors), you can also change the
-the source of some nasty, hard-to-detect errors), you can also
+layout of the code and first give the code for the else-branch and then
-change the layout of the code and first give the code for the
+for the if-branch. However in the case of while-loops this
-else-branch and then for the if-branch. However in the case
+``upside-down-inside-out'' way of generating code still seems the most
-of while-loops this way of generating code still seems
+convenient.
-the most convenient.
 We are now ready to give the compile function for
 if-statements---remember this function returns for statements a
 pair consisting of the code and an environment:
 \begin{lstlisting}[mathescape,language=While]
 x := 1 + 2;
 write x
 \end{lstlisting}
-\noindent Having this code at our disposal, we need the
+\noindent I let you read the code and make sure the code behaves as
-assembler to translate the generated code into JVM bytecode (a
+expected. Having this code at our disposal, we need the assembler to
-class file). This bytecode is understood by the JVM and can be
+translate the generated code into JVM bytecode (a class file). This
-run by just invoking the \pcode{java}-program.
+bytecode is then understood by the JVM and can be run by just invoking
+the \pcode{java}-program.
 \begin{figure}[p]
 \lstinputlisting[language=JVMIS]{../progs/test-small.j}
 \caption{Generated code for a test program. This code can be
 \noindent
 The first construct is for creating new arrays, in this instance the
 name of the array is \pcode{arr} and it can hold 15000 integers. The
 second is for referencing an array cell inside an arithmetic
 expression---we need to be able to look up the contents of an array at
-an index determined by an arithmetic expression. Similarly, we need to
+an index determined by an arithmetic expression. Similarly in the line
-be able to update the content of an array at an calculated index---the
+below, we need to be able to update the content of an array at an
-3rd feature.
+calculated index.
 For creating a new array we can generate the following three JVM
 instructions:
 \begin{lstlisting}[mathescape,language=JVMIS]
 \noindent
 First we need to put the dimension of the array onto the stack. The next
 instruction creates the array. With the last we can store the array as a
 local variable (like the ``simple'' variables from the previous
-section). The use of a local variable allows us to have multiple arrays
+section). The use of a local variable for each array allows us to have
-in a While-program. For looking up an element in an array we can use the
+multiple arrays in a While-program. For looking up an element in an
-following JVM code
+array we can use the following JVM code
 \begin{lstlisting}[mathescape,language=JVMIS]
 aload loc_var
 index_aexp
 iaload
 \end{lstlisting}
 \noindent
 Again the first instruction loads the ``pointer'' to the array onto the
 stack. Then we have some instructions corresponding to the index where
-we want to update the array. After that come the instructions for with what
+we want to update the array. After that come the instructions for with
-value we want to update the array. Finally the instruction for
+what value we want to update the array. The last line contains the
-updating the array.
+instruction for updating the array.
-Next we need to update our grammar rules: it seems best to extend the
+Next we need to modify our grammar rules for our While-language: it
-rule for factors in arithmetic expressions with a rule for looking up
+seems best to extend the rule for factors in arithmetic expressions with
-an array.
+a rule for looking up an array.
 \begin{plstx}[rhs style=, margin=3cm]
 : \meta{E} ::= \meta{T} $+$ \meta{E}
 | \meta{T} $-$ \meta{E}
 | \meta{T}\\
 | \meta{Num}\\
 \end{plstx}
 \noindent
 There is no problem with left-recursion as the \meta{E} is ``protected''
-by an identifier and brackets. There are two new rules in statements,
+by an identifier and the brackets. There are two new rules for statements,
-namely creation of an array and array assignment:
+one for creating an array and one for array assignment:
 \begin{plstx}[rhs style=, margin=2cm, one per line]
 : \meta{Stmt} ::=  \ldots
 | \texttt{new}\; \meta{Id}\,[\,\meta{Num}\,]
 | \meta{Id}\,[\,\meta{E}\,]\,:=\,\meta{E}\\
 \end{plstx}
+With this in place we can turn back to the idea of creating While
+programs by translating BF programs. This is a relatively easy task
+because BF only has eight instructions (we will actually only have seven
+because we can omit the read-in instruction from BF). But also translating
+BF-loops is going to be easy since they straightforwardly can be
+represented by While-loops. The Scala code for the translation is
+as follows:
+\begin{lstlisting}[language=Scala,numbers=left]
+def instr(c: Char) : String = c match {
+case '>' => "ptr := ptr + 1;"
+case '<' => "ptr := ptr - 1;"
+case '+' => "field[ptr] := field[ptr] + 1;"
+case '-' => "field[ptr] := field[ptr] - 1;"
+case '.' => "x := field[ptr]; write x;"
+case '['  => "while (field[ptr] != 0) do {"
+case ']'  => "skip};"
+case _ => ""
+}
+\end{lstlisting}
+\noindent
+The idea behind the translation is that BF-programs operate on an array,
+called \texttt{field}. The BP-memory pointer into this array is
+represented as the variable \texttt{ptr}. The BF-instructions \code{>}
+and \code{<} increase, respectively decrease, \texttt{ptr}. The
+instructions \code{+} and \code{-} update a cell in \texttt{field}. In
+Line 6 we need to first assign a field-cell to an auxiliary variable
+since we have not changed our write functions in order to cope with
+writing out any array-content directly. Lines 7 and 8 are for
+translating BF-loops. Line 8 is interesting in the sense that we need to
+generate a \code{skip} instruction just before finishing with the
+closing \code{"\}"}. The reason is that we are rather pedantic about
+semicolons in our While-grammar: the last command cannot have a
+semicolon---adding a \code{skip} works around this snag. Putting
+all this together is we can generate While-programs with more than
+400 instructions and then run the compiled JVM code for such programs.
 \end{document}
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: t

changeset 692	8c7ccdebcb89
parent 691	991849dfbcb1
child 705	bfc8703b1527