diff -r 991849dfbcb1 -r 8c7ccdebcb89 handouts/ho07.tex --- a/handouts/ho07.tex Sun Nov 17 09:18:47 2019 +0000 +++ b/handouts/ho07.tex Sun Nov 17 16:12:16 2019 +0000 @@ -58,7 +58,7 @@ the stack, the next one loads $2$, the third instruction adds both numbers together replacing the top two elements of the stack with the result $3$. For simplicity, we will throughout -consider only integer numbers and results. Therefore we can +consider only integer numbers. Therefore we can use the JVM instructions \code{iadd}, \code{isub}, \code{imul}, \code{idiv} and so on. The \code{i} stands for integer instructions in the JVM (alternatives are \code{d} for @@ -164,7 +164,7 @@ the numbers, then we essentially confuse variables and the consequence will usually be an erroneous result. Our extended \textit{compile}-function for arithmetic expressions will -therefore take two arguments: the abstract syntax tree and the +therefore take two arguments: the abstract syntax tree and an environment, $E$, that maps identifiers to index-numbers. \begin{center} @@ -234,7 +234,7 @@ of the variable $x$ in the program, then we have to augment the environment and assign $x$ with the largest index in $E$ plus one (that is $E' = E(x \mapsto largest\_index + 1)$). -This means for the assignment $x := x + 1$ we generate the +To sum up, for the assignment $x := x + 1$ we generate the following code \begin{lstlisting}[language=JVMIS,mathescape,numbers=none] @@ -245,8 +245,8 @@ \end{lstlisting} \noindent -where $n_x$ is the index (or pointer to the memory) for the variable $x$ . The code for -looking-up the index for the variable is as follow: +where $n_x$ is the index (or pointer to the memory) for the variable +$x$. The code for looking-up the index for the variable is as follow: \begin{center} \begin{tabular}{lcl} @@ -258,18 +258,21 @@ In case the environment $E$ contains an index for $x$, we return it. Otherwise we ``create'' a new index by returning the size $|E|$ of the environment (that will be an index that is guaranteed to be not used -yet). +yet). In all this we take advantage of the JVM which provides us with +a potentially limitless supply of places where we can store values +of variables. -A bit more complicated is the generation of code for \pcode{if}-statements, say +A bit more complicated is the generation of code for +\pcode{if}-statements, say \begin{lstlisting}[mathescape,language={},numbers=none] if $b$ then $cs_1$ else $cs_2$ \end{lstlisting} -\noindent where $b$ is a boolean expression and the $cs_{1/2}$ are the -statements for each of the \pcode{if}-branches. Lets assume we already -generated code for $b$ and $cs_{1/2}$. Then in the true-case the -control-flow of the program needs to be +\noindent where $b$ is a boolean expression and where both $cs_{1/2}$ +are the statements for each of the \pcode{if}-branches. Lets assume we +already generated code for $b$ and $cs_{1/2}$. Then in the true-case the +control-flow of the program needs to behave as \begin{center} \begin{tikzpicture}[node distance=2mm and 4mm, @@ -346,7 +349,10 @@ $\vdots$ \end{lstlisting} -\noindent where a label is indicated by a colon. +\noindent where a label is indicated by a colon. The task of the +assmbler (in our case Jasmin or Krakatau) is to resolve the labels +to actual addresses, for example jump 10 instructions forward, +or 20 instructions backwards. Recall the ``trick'' with compiling boolean expressions: the \textit{compile}-function for boolean expressions takes three @@ -372,7 +378,7 @@ (conditionally) jump to the label $lab$. This can be done with the instruction -\begin{lstlisting}[mathescape,numbers=none] +\begin{lstlisting}[mathescape,numbers=none,language=JVMIS] if_icmpne $lab$ \end{lstlisting} @@ -387,16 +393,15 @@ \end{center} \noindent and so on. I leave it to you to extend the -\textit{compile}-function for the other boolean expressions. -Note that we need to jump whenever the boolean is \emph{not} -true, which means we have to ``negate'' the jump -condition---equals becomes not-equal, less becomes -greater-or-equal. If you do not like this design (it can be -the source of some nasty, hard-to-detect errors), you can also -change the layout of the code and first give the code for the -else-branch and then for the if-branch. However in the case -of while-loops this way of generating code still seems -the most convenient. +\textit{compile}-function for the other boolean expressions. Note that +we need to jump whenever the boolean is \emph{not} true, which means we +have to ``negate'' the jump condition---equals becomes not-equal, less +becomes greater-or-equal. If you do not like this design (it can be the +source of some nasty, hard-to-detect errors), you can also change the +layout of the code and first give the code for the else-branch and then +for the if-branch. However in the case of while-loops this +``upside-down-inside-out'' way of generating code still seems the most +convenient. We are now ready to give the compile function for if-statements---remember this function returns for statements a @@ -669,10 +674,11 @@ write x \end{lstlisting} -\noindent Having this code at our disposal, we need the -assembler to translate the generated code into JVM bytecode (a -class file). This bytecode is understood by the JVM and can be -run by just invoking the \pcode{java}-program. +\noindent I let you read the code and make sure the code behaves as +expected. Having this code at our disposal, we need the assembler to +translate the generated code into JVM bytecode (a class file). This +bytecode is then understood by the JVM and can be run by just invoking +the \pcode{java}-program. \begin{figure}[p] @@ -700,9 +706,9 @@ name of the array is \pcode{arr} and it can hold 15000 integers. The second is for referencing an array cell inside an arithmetic expression---we need to be able to look up the contents of an array at -an index determined by an arithmetic expression. Similarly, we need to -be able to update the content of an array at an calculated index---the -3rd feature. +an index determined by an arithmetic expression. Similarly in the line +below, we need to be able to update the content of an array at an +calculated index. For creating a new array we can generate the following three JVM instructions: @@ -717,9 +723,9 @@ First we need to put the dimension of the array onto the stack. The next instruction creates the array. With the last we can store the array as a local variable (like the ``simple'' variables from the previous -section). The use of a local variable allows us to have multiple arrays -in a While-program. For looking up an element in an array we can use the -following JVM code +section). The use of a local variable for each array allows us to have +multiple arrays in a While-program. For looking up an element in an +array we can use the following JVM code \begin{lstlisting}[mathescape,language=JVMIS] aload loc_var @@ -746,13 +752,13 @@ \noindent Again the first instruction loads the ``pointer'' to the array onto the stack. Then we have some instructions corresponding to the index where -we want to update the array. After that come the instructions for with what -value we want to update the array. Finally the instruction for -updating the array. +we want to update the array. After that come the instructions for with +what value we want to update the array. The last line contains the +instruction for updating the array. -Next we need to update our grammar rules: it seems best to extend the -rule for factors in arithmetic expressions with a rule for looking up -an array. +Next we need to modify our grammar rules for our While-language: it +seems best to extend the rule for factors in arithmetic expressions with +a rule for looking up an array. \begin{plstx}[rhs style=, margin=3cm] : \meta{E} ::= \meta{T} $+$ \meta{E} @@ -769,8 +775,8 @@ \noindent There is no problem with left-recursion as the \meta{E} is ``protected'' -by an identifier and brackets. There are two new rules in statements, -namely creation of an array and array assignment: +by an identifier and the brackets. There are two new rules for statements, +one for creating an array and one for array assignment: \begin{plstx}[rhs style=, margin=2cm, one per line] : \meta{Stmt} ::= \ldots @@ -778,6 +784,45 @@ | \meta{Id}\,[\,\meta{E}\,]\,:=\,\meta{E}\\ \end{plstx} +With this in place we can turn back to the idea of creating While +programs by translating BF programs. This is a relatively easy task +because BF only has eight instructions (we will actually only have seven +because we can omit the read-in instruction from BF). But also translating +BF-loops is going to be easy since they straightforwardly can be +represented by While-loops. The Scala code for the translation is +as follows: + +\begin{lstlisting}[language=Scala,numbers=left] +def instr(c: Char) : String = c match { + case '>' => "ptr := ptr + 1;" + case '<' => "ptr := ptr - 1;" + case '+' => "field[ptr] := field[ptr] + 1;" + case '-' => "field[ptr] := field[ptr] - 1;" + case '.' => "x := field[ptr]; write x;" + case '[' => "while (field[ptr] != 0) do {" + case ']' => "skip};" + case _ => "" +} +\end{lstlisting} + +\noindent +The idea behind the translation is that BF-programs operate on an array, +called \texttt{field}. The BP-memory pointer into this array is +represented as the variable \texttt{ptr}. The BF-instructions \code{>} +and \code{<} increase, respectively decrease, \texttt{ptr}. The +instructions \code{+} and \code{-} update a cell in \texttt{field}. In +Line 6 we need to first assign a field-cell to an auxiliary variable +since we have not changed our write functions in order to cope with +writing out any array-content directly. Lines 7 and 8 are for +translating BF-loops. Line 8 is interesting in the sense that we need to +generate a \code{skip} instruction just before finishing with the +closing \code{"\}"}. The reason is that we are rather pedantic about +semicolons in our While-grammar: the last command cannot have a +semicolon---adding a \code{skip} works around this snag. Putting +all this together is we can generate While-programs with more than +400 instructions and then run the compiled JVM code for such programs. + + \end{document} %%% Local Variables: