updated
authorChristian Urban <urbanc@in.tum.de>
Sun, 17 Nov 2019 16:12:16 +0000
changeset 692 8c7ccdebcb89
parent 691 991849dfbcb1
child 693 605d971e98fd
updated
handouts/ho07.pdf
handouts/ho07.tex
progs/compile2.scala
Binary file handouts/ho07.pdf has changed
--- a/handouts/ho07.tex	Sun Nov 17 09:18:47 2019 +0000
+++ b/handouts/ho07.tex	Sun Nov 17 16:12:16 2019 +0000
@@ -58,7 +58,7 @@
 the stack, the next one loads $2$, the third instruction adds both
 numbers together replacing the top two elements of the stack
 with the result $3$. For simplicity, we will throughout
-consider only integer numbers and results. Therefore we can
+consider only integer numbers. Therefore we can
 use the JVM instructions \code{iadd}, \code{isub},
 \code{imul}, \code{idiv} and so on. The \code{i} stands for
 integer instructions in the JVM (alternatives are \code{d} for
@@ -164,7 +164,7 @@
 the numbers, then we essentially confuse variables and the
 consequence will usually be an erroneous result. Our extended
 \textit{compile}-function for arithmetic expressions will
-therefore take two arguments: the abstract syntax tree and the
+therefore take two arguments: the abstract syntax tree and an
 environment, $E$, that maps identifiers to index-numbers.
 
 \begin{center}
@@ -234,7 +234,7 @@
 of the variable $x$ in the program, then we have to augment 
 the environment and assign $x$ with the largest index in $E$
 plus one (that is $E' = E(x \mapsto largest\_index + 1)$). 
-This means for the assignment $x := x + 1$ we generate the
+To sum up, for the assignment $x := x + 1$ we generate the
 following code
 
 \begin{lstlisting}[language=JVMIS,mathescape,numbers=none]
@@ -245,8 +245,8 @@
 \end{lstlisting}
 
 \noindent 
-where $n_x$ is the index (or pointer to the memory) for the variable $x$ . The code for 
-looking-up the index for the variable is as follow:
+where $n_x$ is the index (or pointer to the memory) for the variable
+$x$. The code for looking-up the index for the variable is as follow:
 
 \begin{center}
 \begin{tabular}{lcl}
@@ -258,18 +258,21 @@
 In case the environment $E$ contains an index for $x$, we return it.
 Otherwise we ``create'' a new index by returning the size $|E|$ of the
 environment (that will be an index that is guaranteed to be not used
-yet).
+yet). In all this we take advantage of the JVM which provides us with 
+a potentially limitless supply of places where we can store values
+of variables.
 
-A bit more complicated is the generation of code for \pcode{if}-statements, say
+A bit more complicated is the generation of code for
+\pcode{if}-statements, say
 
 \begin{lstlisting}[mathescape,language={},numbers=none]
 if $b$ then $cs_1$ else $cs_2$
 \end{lstlisting}
 
-\noindent where $b$ is a boolean expression and the $cs_{1/2}$ are the
-statements for each of the \pcode{if}-branches. Lets assume we already
-generated code for $b$ and $cs_{1/2}$. Then in the true-case the
-control-flow of the program needs to be
+\noindent where $b$ is a boolean expression and where both $cs_{1/2}$
+are the statements for each of the \pcode{if}-branches. Lets assume we
+already generated code for $b$ and $cs_{1/2}$. Then in the true-case the
+control-flow of the program needs to behave as 
 
 \begin{center}
 \begin{tikzpicture}[node distance=2mm and 4mm,
@@ -346,7 +349,10 @@
     $\vdots$
 \end{lstlisting}
  
-\noindent where a label is indicated by a colon. 
+\noindent where a label is indicated by a colon. The task of the
+assmbler (in our case Jasmin or Krakatau) is to resolve the labels
+to actual addresses, for example jump 10 instructions forward,
+or 20 instructions backwards.
  
 Recall the ``trick'' with compiling boolean expressions: the 
 \textit{compile}-function for boolean expressions takes three
@@ -372,7 +378,7 @@
 (conditionally) jump to the label $lab$. This can be done with
 the instruction
 
-\begin{lstlisting}[mathescape,numbers=none]
+\begin{lstlisting}[mathescape,numbers=none,language=JVMIS]
 if_icmpne $lab$
 \end{lstlisting}
 
@@ -387,16 +393,15 @@
 \end{center}
 
 \noindent and so on. I leave it to you to extend the
-\textit{compile}-function for the other boolean expressions.
-Note that we need to jump whenever the boolean is \emph{not}
-true, which means we have to ``negate'' the jump
-condition---equals becomes not-equal, less becomes
-greater-or-equal. If you do not like this design (it can be
-the source of some nasty, hard-to-detect errors), you can also
-change the layout of the code and first give the code for the
-else-branch and then for the if-branch. However in the case
-of while-loops this way of generating code still seems
-the most convenient.
+\textit{compile}-function for the other boolean expressions. Note that
+we need to jump whenever the boolean is \emph{not} true, which means we
+have to ``negate'' the jump condition---equals becomes not-equal, less
+becomes greater-or-equal. If you do not like this design (it can be the
+source of some nasty, hard-to-detect errors), you can also change the
+layout of the code and first give the code for the else-branch and then
+for the if-branch. However in the case of while-loops this
+``upside-down-inside-out'' way of generating code still seems the most
+convenient.
 
 We are now ready to give the compile function for 
 if-statements---remember this function returns for statements a 
@@ -669,10 +674,11 @@
 write x
 \end{lstlisting}
 
-\noindent Having this code at our disposal, we need the
-assembler to translate the generated code into JVM bytecode (a
-class file). This bytecode is understood by the JVM and can be
-run by just invoking the \pcode{java}-program.
+\noindent I let you read the code and make sure the code behaves as
+expected. Having this code at our disposal, we need the assembler to
+translate the generated code into JVM bytecode (a class file). This
+bytecode is then understood by the JVM and can be run by just invoking
+the \pcode{java}-program.
 
 
 \begin{figure}[p]
@@ -700,9 +706,9 @@
 name of the array is \pcode{arr} and it can hold 15000 integers. The
 second is for referencing an array cell inside an arithmetic
 expression---we need to be able to look up the contents of an array at
-an index determined by an arithmetic expression. Similarly, we need to
-be able to update the content of an array at an calculated index---the
-3rd feature.  
+an index determined by an arithmetic expression. Similarly in the line
+below, we need to be able to update the content of an array at an
+calculated index.  
 
 For creating a new array we can generate the following three JVM
 instructions:
@@ -717,9 +723,9 @@
 First we need to put the dimension of the array onto the stack. The next
 instruction creates the array. With the last we can store the array as a
 local variable (like the ``simple'' variables from the previous
-section). The use of a local variable allows us to have multiple arrays
-in a While-program. For looking up an element in an array we can use the
-following JVM code
+section). The use of a local variable for each array allows us to have
+multiple arrays in a While-program. For looking up an element in an
+array we can use the following JVM code
 
 \begin{lstlisting}[mathescape,language=JVMIS]
 aload loc_var 
@@ -746,13 +752,13 @@
 \noindent
 Again the first instruction loads the ``pointer'' to the array onto the
 stack. Then we have some instructions corresponding to the index where
-we want to update the array. After that come the instructions for with what
-value we want to update the array. Finally the instruction for 
-updating the array.
+we want to update the array. After that come the instructions for with
+what value we want to update the array. The last line contains the
+instruction for updating the array.
 
-Next we need to update our grammar rules: it seems best to extend the 
-rule for factors in arithmetic expressions with a rule for looking up
-an array.
+Next we need to modify our grammar rules for our While-language: it
+seems best to extend the rule for factors in arithmetic expressions with
+a rule for looking up an array.
 
 \begin{plstx}[rhs style=, margin=3cm]
 : \meta{E} ::= \meta{T} $+$ \meta{E}
@@ -769,8 +775,8 @@
 
 \noindent
 There is no problem with left-recursion as the \meta{E} is ``protected''
-by an identifier and brackets. There are two new rules in statements,
-namely creation of an array and array assignment:
+by an identifier and the brackets. There are two new rules for statements,
+one for creating an array and one for array assignment:
 
 \begin{plstx}[rhs style=, margin=2cm, one per line]
 : \meta{Stmt} ::=  \ldots
@@ -778,6 +784,45 @@
               | \meta{Id}\,[\,\meta{E}\,]\,:=\,\meta{E}\\
 \end{plstx}
 
+With this in place we can turn back to the idea of creating While
+programs by translating BF programs. This is a relatively easy task
+because BF only has eight instructions (we will actually only have seven
+because we can omit the read-in instruction from BF). But also translating
+BF-loops is going to be easy since they straightforwardly can be 
+represented by While-loops. The Scala code for the translation is
+as follows:
+
+\begin{lstlisting}[language=Scala,numbers=left]
+def instr(c: Char) : String = c match {
+  case '>' => "ptr := ptr + 1;"
+  case '<' => "ptr := ptr - 1;"
+  case '+' => "field[ptr] := field[ptr] + 1;"
+  case '-' => "field[ptr] := field[ptr] - 1;"
+  case '.' => "x := field[ptr]; write x;"
+  case '['  => "while (field[ptr] != 0) do {"
+  case ']'  => "skip};"
+  case _ => ""
+}
+\end{lstlisting}
+
+\noindent 
+The idea behind the translation is that BF-programs operate on an array,
+called \texttt{field}. The BP-memory pointer into this array is
+represented as the variable \texttt{ptr}. The BF-instructions \code{>}
+and \code{<} increase, respectively decrease, \texttt{ptr}. The
+instructions \code{+} and \code{-} update a cell in \texttt{field}. In
+Line 6 we need to first assign a field-cell to an auxiliary variable
+since we have not changed our write functions in order to cope with
+writing out any array-content directly. Lines 7 and 8 are for
+translating BF-loops. Line 8 is interesting in the sense that we need to
+generate a \code{skip} instruction just before finishing with the 
+closing \code{"\}"}. The reason is that we are rather pedantic about
+semicolons in our While-grammar: the last command cannot have a
+semicolon---adding a \code{skip} works around this snag. Putting
+all this together is we can generate While-programs with more than
+400 instructions and then run the compiled JVM code for such programs.
+
+
 \end{document}
 
 %%% Local Variables: 
--- a/progs/compile2.scala	Sun Nov 17 09:18:47 2019 +0000
+++ b/progs/compile2.scala	Sun Nov 17 16:12:16 2019 +0000
@@ -481,26 +481,27 @@
   "\n" ++
   //"new field[30000];\n" ++
   "ptr := 15000;" ++
-  instrs2(prog) ++
+  instrs(prog) ++
   "skip"
 }
 
 def bf_run(prog: String, name: String) = {
   println("BF processing start")
   val bf_string = bf_str(prog).replaceAll("\\s", "")
+  
   println(s"BF parsing start (string length ${bf_string.length})")
   val bf_prog = Stmts.parse_all(bf_string).toList.head
-  println("BF Compile start")
+  println(s"BF Compile start ${bf_string.toList.length} characters")
   compile_run(Array("field", 30000) :: bf_prog, name)
 }
 
 // a benchmark program (counts down from 'Z' to 'A')
-val b0 = """>++[<+++++++++++++>-]<[[>+>+<<-]>[<+>-]++++++++
+val bf0 = """>++[<+++++++++++++>-]<[[>+>+<<-]>[<+>-]++++++++
             [>++++++++<-]>.[-]<<>++++++++++[>++++++++++[>++
             ++++++++[>++++++++++[>++++++++++[>++++++++++[>+
             +++++++++[-]<-]<-]<-]<-]<-]<-]<-]++++++++++."""
 
-bf_run(b0, "bench")
+bf_run(bf0, "bench")
 
 
 
@@ -513,7 +514,11 @@
 bf_run("""++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++
        ..+++.>>.<-.<.+++.------.--------.>>+.>++.""", "hello")
 
-bf_run("""+++++++++++
+println("BF SIER Prog")
+println(bf_str(bf1).replaceAll("\\s", "").split(";").toList.length)
+
+
+val bf2 = """+++++++++++
       >+>>>>++++++++++++++++++++++++++++++++++++++++++++
       >++++++++++++++++++++++++++++++++<<<<<<[>[>>>>>>+>
       +<<<<<<<-]>>>>>>>[<<<<<<<+>>>>>>>-]<[>++++++++++[-
@@ -523,7 +528,12 @@
       +++++++++++++++++++++++.[-]]++++++++++<[->-<]>++++
       ++++++++++++++++++++++++++++++++++++++++++++.[-]<<
       <<<<<<<<<<[>>>+>+<<<<-]>>>>[<<<<+>>>>-]<-[>>.>.<<<
-      [-]]<<[>>+>+<<<-]>>>[<<<+>>>-]<<[<+>-]>[<+>-]<<<-]""", "fibs")
+      [-]]<<[>>+>+<<<-]>>>[<<<+>>>-]<<[<+>-]>[<+>-]<<<-]"""
+
+bf_run(bf2, "fibs")
+
+println("BF FIB Prog")
+println(bf_str(bf2).replaceAll("\\s", "").split(";").toList.length)
 
 /*