afl-material: changeset 380:1e88390e81aa

Binary file handouts/ho08.pdf has changed

--- a/handouts/ho08.tex	Tue Nov 17 19:12:51 2015 +0000
+++ b/handouts/ho08.tex	Wed Nov 18 01:53:01 2015 +0000
@@ -13,10 +13,13 @@
 \section*{Handout 8 (A Functional Language)}
 
 
-The language we looked at in the previous lecture was rather 
-primitive and the compiler rather crude. In this handout
-we like to have a look at a slightly more comfortable
-language and a tiny-teeny bit more realistic compiler. A small
+The language we looked at in the previous lecture was rather
+primitive and the compiler rather crude---everything was
+essentially compiled into a big monolithic chunk of code
+inside the main function. In this handout we like to have a
+look at a slightly more comfortable language, which I call
+Fun-language, and a tiny-teeny bit more realistic compiler.
+The Fun-language is a functional programming language. A small
 collection of programs we want to be able to write and compile
 is as follows:
 
@@ -34,16 +37,25 @@
 def gcd(a, b) = if b == 0 then a else gcd(b, a % b);                
 \end{lstlisting}
 
-\noindent We will still restrict us to just programs about
-integers, that means for example that every function needs to
-return an integer. The grammar of the Fun-language is slightly 
-simpler than the While-language, because almost everything is 
-an expression. The grammar rules are as follows:
+\noindent Compare the code of the fib-program with the same
+program written in the While-language\ldots Fun is definitely
+more comfortable. We will still focus on programs involving
+integers only, that means for example that every function is
+expected to return an integer. The point of the Fun language
+is to compile each function to a separate method in JVM
+bytecode (not just a big monolithic code chunk). The means we
+need to adapt to some of the conventions of the JVM about
+methods.
+
+The grammar of the Fun-language is slightly simpler than the
+While-language, because the main syntactic category are
+expressions (we do not have statements). The grammar rules are
+as follows:
 
 
 \begin{plstx}[rhs style=,margin=1.5cm]
-: \meta{Exp} ::= \meta{Var} | \meta{Num} {\hspace{3cm}}
-             |   \meta{Exp} + \meta{Exp} | ... | (\meta{ExP})
+: \meta{Exp} ::= \meta{Id} | \meta{Num} {\hspace{3cm}}
+             |   \meta{Exp} + \meta{Exp} | ... | (\meta{Exp})
              |   \code{if} \meta{BExp} \code{then} \meta{Exp} \code{else} \meta{Exp}
              |   \code{write} \meta{Exp} {\hspace{5cm}}
              |   \meta{Exp} ; \meta{Exp}  {\hspace{5cm}}
@@ -51,587 +63,151 @@
 : \meta{BExp} ::= ...\\
 : \meta{Decl} ::= \meta{Def} ; \meta{Decl}
              | \meta{Exp}\\
-: \meta{Def} ::= \code{def} \textit{FunName} ($x_1$, ..., $x_2$)\\               
+: \meta{Def} ::= \code{def} \textit{FunName} ($x_1$, ..., $x_n$) = \meta{Exp}\\               
 \end{plstx}
 
-\noindent where \meta{Id} stands for variables and \meta{Num}
-for numbers. For the moment let us omit variables from
-arithmetic expressions. Our parser will take this grammar and
-given an input produce abstract syntax trees. For example for
-the expression $1 + ((2 * 3) + (4 - 3))$ it will produce the
-following tree.
-
-\begin{center}
-\begin{tikzpicture}
\Tree [.$+$ [.$1$ ] [.$+$ [.$*$ $2$ $3$ ] [.$-$ $4$ $3$ ]]]
\end{tikzpicture}
-\end{center}
-
-\noindent To generate code for this expression, we need to
-traverse this tree in post-order fashion and emit code for
-each node---this traversal in post-order fashion will produce
-code for a stack-machine (what the JVM is). Doing so for the
-tree above generates the instructions
+\noindent where, as usual, \meta{Id} stands for variables and
+\meta{Num} for numbers. We can call a function by applying the
+arguments to a function name (as shown in the last clause of
+\meta{Exp}). The arguments in such a function call can be
+again expressions, including other function calls. In
+contrast, when defining a function (see \meta{Def}-clause) the
+arguments need to be variables, say $x_1$ to $x_n$. We call
+the expression on the right of = in a function definition as
+the \emph{body of the function}. We have the restriction that
+the variables inside a function body can only be those that
+are mentioned as arguments of the function. A Fun-program is
+then a sequence of function definitions separated by
+semicolons, and a final ``main'' call of a function that
+starts the computation in the program. For example
 
 \begin{lstlisting}[numbers=none]
-ldc 1 
-ldc 2 
-ldc 3 
-imul 
-ldc 4 
-ldc 3 
-isub 
-iadd 
-iadd
-\end{lstlisting}
-
-\noindent If we ``run'' these instructions, the result $8$
-will be on top of the stack (I leave this to you to verify;
-the meaning of each instruction should be clear). The result
-being on the top of the stack will be a convention we always
-observe in our compiler, that is the results of arithmetic
-expressions will always be on top of the stack. Note, that a
-different bracketing of the expression, for example $(1 + (2 *
-3)) + (4 - 3)$, produces a different abstract syntax tree and
-thus potentially also a different list of instructions.
-Generating code in this fashion is rather easy to implement:
-it can be done with the following recursive
-\textit{compile}-function, which takes the abstract syntax
-tree as argument:
-
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{compile}(n)$ & $\dn$ & $\pcode{ldc}\; n$\\
-$\textit{compile}(a_1 + a_2)$ & $\dn$ &
-$\textit{compile}(a_1) \;@\;\textit{compile}(a_2)\;@\; \pcode{iadd}$\\
-$\textit{compile}(a_1 - a_2)$ & $\dn$ & 
-$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{isub}$\\
-$\textit{compile}(a_1 * a_2)$ & $\dn$ & 
-$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{imul}$\\
-$\textit{compile}(a_1 \backslash a_2)$ & $\dn$ & 
-$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{idiv}$\\
-\end{tabular}
-\end{center}
-
-However, our arithmetic expressions can also contain
-variables. We will represent them as \emph{local variables} in
-the JVM. Essentially, local variables are an array or pointers
-to memory cells, containing in our case only integers. Looking
-up a variable can be done with the instruction
-
-\begin{lstlisting}[mathescape,numbers=none]
-iload $index$
-\end{lstlisting}
-
-\noindent 
-which places the content of the local variable $index$ onto 
-the stack. Storing the top of the stack into a local variable 
-can be done by the instruction
-
-\begin{lstlisting}[mathescape,numbers=none]
-istore $index$
-\end{lstlisting}
-
-\noindent Note that this also pops off the top of the stack.
-One problem we have to overcome, however, is that local
-variables are addressed, not by identifiers, but by numbers
-(starting from $0$). Therefore our compiler needs to maintain
-a kind of environment where variables are associated to
-numbers. This association needs to be unique: if we muddle up
-the numbers, then we essentially confuse variables and the
-consequence will usually be an erroneous result. Our extended
-\textit{compile}-function for arithmetic expressions will
-therefore take two arguments: the abstract syntax tree and the
-environment, $E$, that maps identifiers to index-numbers.
-
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{compile}(n, E)$ & $\dn$ & $\pcode{ldc}\;n$\\
-$\textit{compile}(a_1 + a_2, E)$ & $\dn$ & 
-$\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \pcode{iadd}$\\
-$\textit{compile}(a_1 - a_2, E)$ & $\dn$ &
-$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{isub}$\\
-$\textit{compile}(a_1 * a_2, E)$ & $\dn$ &
-$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{imul}$\\
-$\textit{compile}(a_1 \backslash a_2, E)$ & $\dn$ & 
-$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{idiv}$\\
-$\textit{compile}(x, E)$ & $\dn$ & $\pcode{iload}\;E(x)$\\
-\end{tabular}
-\end{center}
-
-\noindent In the last line we generate the code for variables
-where $E(x)$ stands for looking up the environment to which
-index the variable $x$ maps to.
-
-There is a similar \textit{compile}-function for boolean
-expressions, but it includes a ``trick'' to do with
-\pcode{if}- and \pcode{while}-statements. To explain the issue
-let us first describe the compilation of statements of the
-While-language. The clause for \pcode{skip} is trivial, since
-we do not have to generate any instruction
-
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{compile}(\pcode{skip}, E)$ & $\dn$ & $([], E)$\\
-\end{tabular}
-\end{center}
-
-\noindent $[]$ is the empty list of instructions. Note that
-the \textit{compile}-function for statements returns a pair, a
-list of instructions (in this case the empty list) and an
-environment for variables. The reason for the environment is
-that assignments in the While-language might change the
-environment---clearly if a variable is used for the first
-time, we need to allocate a new index and if it has been used
-before, we need to be able to retrieve the associated index.
-This is reflected in the clause for compiling assignments:
-
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{compile}(x := a, E)$ & $\dn$ & 
-$(\textit{compile}(a, E) \;@\;\pcode{istore}\;index, E')$
-\end{tabular}
-\end{center}
-
-\noindent We first generate code for the right-hand side of
-the assignment and then add an \pcode{istore}-instruction at
-the end. By convention the result of the arithmetic expression
-$a$ will be on top of the stack. After the \pcode{istore}
-instruction, the result will be stored in the index
-corresponding to the variable $x$. If the variable $x$ has
-been used before in the program, we just need to look up what
-the index is and return the environment unchanged (that is in
-this case $E' = E$). However, if this is the first encounter 
-of the variable $x$ in the program, then we have to augment 
-the environment and assign $x$ with the largest index in $E$
-plus one (that is $E' = E(x \mapsto largest\_index + 1)$). 
-That means for the assignment $x := x + 1$ we generate the
-following code
-
-\begin{lstlisting}[mathescape,numbers=none]
-iload $n_x$
-ldc 1
-iadd
-istore $n_x$
-\end{lstlisting}
-
-\noindent 
-where $n_x$ is the index for the variable $x$.
-
-More complicated is the code for \pcode{if}-statments, say
-
-\begin{lstlisting}[mathescape,language={},numbers=none]
-if $b$ then $cs_1$ else $cs_2$
+def fact(n) = if n == 0 then 1 else n * fact(n - 1);
+write(fact(5))
 \end{lstlisting}
 
-\noindent where $b$ is a boolean expression and the $cs_{1/2}$
-are the statements for each \pcode{if}-branch. Lets assume
-we already generated code for $b$ and $cs_{1/2}$. Then in the
-true-case the control-flow of the program needs to be
-
-\begin{center}
-\begin{tikzpicture}[node distance=2mm and 4mm,
- block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm},
- point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
- skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
-\node (A1) [point] {};
-\node (b) [block, right=of A1] {code of $b$};
-\node (A2) [point, right=of b] {};
-\node (cs1) [block, right=of A2] {code of $cs_1$};
-\node (A3) [point, right=of cs1] {};
-\node (cs2) [block, right=of A3] {code of $cs_2$};
-\node (A4) [point, right=of cs2] {};
+\noindent would be a valid Fun-program. The parser of the
+Fun-language produces abstract syntax trees which in Scala
+can be represented as follows:
 
-\draw (A1) edge [->, black, line width=1mm] (b);
-\draw (b) edge [->, black, line width=1mm] (cs1);
-\draw (cs1) edge [->, black, line width=1mm] (A3);
-\draw (A3) edge [->, black, skip loop] (A4);
-\node [below=of cs2] {\raisebox{-5mm}{\small{}jump}};
-\end{tikzpicture}
-\end{center}
-
-\noindent where we start with running the code for $b$; since
-we are in the true case we continue with running the code for
-$cs_1$. After this however, we must not run the code for
-$cs_2$, but always jump after the last instruction of $cs_2$
-(the code for the \pcode{else}-branch). Note that this jump is
-unconditional, meaning we always have to jump to the end of
-$cs_2$. The corresponding instruction of the JVM is
-\pcode{goto}. In case $b$ turns out to be false we need the
-control-flow
 
-\begin{center}
-\begin{tikzpicture}[node distance=2mm and 4mm,
- block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm},
- point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
- skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
-\node (A1) [point] {};
-\node (b) [block, right=of A1] {code of $b$};
-\node (A2) [point, right=of b] {};
-\node (cs1) [block, right=of A2] {code of $cs_1$};
-\node (A3) [point, right=of cs1] {};
-\node (cs2) [block, right=of A3] {code of $cs_2$};
-\node (A4) [point, right=of cs2] {};
+{\small
+\begin{lstlisting}[numbers=none]
+abstract class Exp
+abstract class BExp 
+abstract class Decl
 
-\draw (A1) edge [->, black, line width=1mm] (b);
-\draw (b) edge [->, black, line width=1mm] (A2);
-\draw (A2) edge [skip loop] (A3);
-\draw (A3) edge [->, black, line width=1mm] (cs2);
-\draw (cs2) edge [->,black, line width=1mm] (A4);
-\node [below=of cs1] {\raisebox{-5mm}{\small{}conditional jump}};
-\end{tikzpicture}
-\end{center}
+case class Var(s: String) extends Exp
+case class Num(i: Int) extends Exp
+case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
+case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
+case class Write(e: Exp) extends Exp
+case class Sequ(e1: Exp, e2: Exp) extends Exp
+case class Call(name: String, args: List[Exp]) extends Exp
 
-\noindent where we now need a conditional jump (if the
-if-condition is false) from the end of the code for the 
-boolean to the beginning of the instructions $cs_2$. Once we 
-are finished with running $cs_2$ we can continue with whatever
-code comes after the if-statement.
-
-The \pcode{goto} and the conditional jumps need addresses to
-where the jump should go. Since we are generating assembly
-code for the JVM, we do not actually have to give (numeric)
-addresses, but can just attach (symbolic) labels to our code.
-These labels specify a target for a jump. Therefore the labels
-need to be unique, as otherwise it would be ambiguous where a
-jump should go to. A label, say \pcode{L}, is attached to code
-like
+case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
 
-\begin{lstlisting}[mathescape,numbers=none]
-L:
-  $instr_1$
-  $instr_2$
-    $\vdots$
-\end{lstlisting}
- 
-\noindent where a label is indicated by a colon. 
- 
-Recall the ``trick'' with compiling boolean expressions: the 
-\textit{compile}-function for boolean expressions takes three
-arguments: an abstract syntax tree, an environment for 
-variable indices and also the label, $lab$, to where an conditional 
-jump needs to go. The clause for the expression $a_1 = a_2$, 
-for example, is as follows:
-
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{compile}(a_1 = a_2, E, lab)$ & $\dn$\\ 
-\multicolumn{3}{l}{$\qquad\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \pcode{if_icmpne}\;lab$}
-\end{tabular}
-\end{center}
-
-\noindent where we are first generating code for the
-subexpressions $a_1$ and $a_2$. This will mean after running
-the corresponding code there will be two integers on top of
-the stack. If they are equal, we do not have to do anything
-(except for popping them off from the stack) and just continue
-with the next instructions (see control-flow of ifs above).
-However if they are \emph{not} equal, then we need to
-(conditionally) jump to the label $lab$. This can be done with
-the instruction
-
-\begin{lstlisting}[mathescape,numbers=none]
-if_icmpne $lab$
-\end{lstlisting}
-
-\noindent Other jump instructions for boolean operators are
+case class Def(name: String, 
+               args: List[String], 
+               body: Exp) extends Decl
+case class Main(e: Exp) extends Decl
+\end{lstlisting}}
 
-\begin{center}
-\begin{tabular}{l@{\hspace{10mm}}c@{\hspace{10mm}}l}
-$\not=$ & $\Rightarrow$ & \pcode{if_icmpeq}\\
-$<$ & $\Rightarrow$ & \pcode{if_icmpge}\\
-$\le$ & $\Rightarrow$ & \pcode{if_icmpgt}\\
-\end{tabular}
-\end{center}
-
-\noindent and so on. I leave it to you to extend the
-\textit{compile}-function for the other boolean expressions.
-Note that we need to jump whenever the boolean is \emph{not}
-true, which means we have to ``negate'' the jump
-condition---equals becomes not-equal, less becomes
-greater-or-equal. If you do not like this design (it can be
-the source of some nasty, hard-to-detect errors), you can also
-change the layout of the code and first give the code for the
-else-branch and then for the if-branch. However in the case
-of while-loops this way of generating code still seems
-the most convenient.
-
-We are now ready to give the compile function for 
-if-statments---remember this function returns for staments a 
-pair consisting of the code and an environment:
+Let us first look at some clauses for compiling expressions.
+The compilation of arithmetic and boolean expressions is just
+like for the While-language and do not need any modification.
+(recall that the \textit{compile}-function for boolean
+expression takes a third argument for the label where the
+contro-flow should jump when the boolean expression is
+\emph{not} true---this is needed for compiling \pcode{if}s).
+One additional feature in the Fun-language are sequences.
+Their purpose is to do one calculation after another. The
+reason why we need to be careful however is the convention
+that every expression can only produce s single result
+(including sequences). Since this result will be on the
+top of the stack, we need to generate a 
+\pcode{pop}-instruction in order to clean-up the stack. Given 
+the expression of the form \pcode{exp1 ; exp2} we need to 
+generate code where after the first code chunk
+a \pcode{pop}-instruction is needed. 
 
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{compile}(\pcode{if}\;b\;\pcode{then}\; cs_1\;\pcode{else}\; cs_2, E)$ & $\dn$\\ 
-\multicolumn{3}{l}{$\qquad L_\textit{ifelse}\;$ (fresh label)}\\
-\multicolumn{3}{l}{$\qquad L_\textit{ifend}\;$ (fresh label)}\\
-\multicolumn{3}{l}{$\qquad (is_1, E') = \textit{compile}(cs_1, E)$}\\
-\multicolumn{3}{l}{$\qquad (is_2, E'') = \textit{compile}(cs_2, E')$}\\
-\multicolumn{3}{l}{$\qquad(\textit{compile}(b, E, L_\textit{ifelse})$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\;is_1$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\; \pcode{goto}\;L_\textit{ifend}$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\;L_\textit{ifelse}:$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\;is_2$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\;L_\textit{ifend}:, E'')$}\\
-\end{tabular}
-\end{center}
-
-\noindent In the first two lines we generate two fresh labels
-for the jump addresses (just before the else-branch and just
-after). In the next two lines we generate the instructions for
-the two branches, $is_1$ and $is_2$. The final code will
-be first the code for $b$ (including the label 
-just-before-the-else-branch), then the \pcode{goto} for after
-the else-branch, the label $L_\textit{ifesle}$, followed by
-the instructions for the else-branch, followed by the 
-after-the-else-branch label. Consider for example the 
-if-statement:
-
-\begin{lstlisting}[mathescape,numbers=none,language={}]
-if 1 = 1 then x := 2 else y := 3
+\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
+$\textrm{\textit{compile}}($exp1$)$
+pop
+$\textrm{\textit{compile}}($exp2$)$
 \end{lstlisting}
 
-\noindent 
-The generated code is as follows:
-
-\begin{lstlisting}[mathescape,language={}]
-   ldc 1
-   ldc 1
-   if_icmpne L_ifelse $\quad\tikz[remember picture] \node (C) {\mbox{}};$
-   ldc 2
-   istore 0
-   goto L_ifend $\quad\tikz[remember picture] \node (A) {\mbox{}};$
-L_ifelse: $\quad\tikz[remember picture] \node[] (D) {\mbox{}};$
-   ldc 3
-   istore 1
-L_ifend: $\quad\tikz[remember picture] \node[] (B) {\mbox{}};$
-\end{lstlisting}
-
-\begin{tikzpicture}[remember picture,overlay]
  \draw[->,very thick] (A) edge [->,to path={-- ++(10mm,0mm) 
-           -- ++(0mm,-17.3mm) |- (\tikztotarget)},line width=1mm] (B.east);
-  \draw[->,very thick] (C) edge [->,to path={-- ++(10mm,0mm) 
-           -- ++(0mm,-17.3mm) |- (\tikztotarget)},line width=1mm] (D.east);
\end{tikzpicture}
+\noindent In effect we ``forget'' about the result the first
+expression calculates. I leave you to think about why this
+sequence operator is still useful in the Fun-language, even 
+if the first result is just ``discarded''. 
 
-\noindent The first three lines correspond to the the boolean
-expression $1 = 1$. The jump for when this boolean expression
-is false is in Line~3. Lines 4-6 corresponds to the if-branch;
-the else-branch is in Lines 8 and 9. Note carefully how the
-environment $E$ is threaded through the recursive calls of
-\textit{compile}. The function receives an environment $E$,
-but it might extend it when compiling the if-branch, yielding
-$E'$. This happens for example in the if-statement above
-whenever the variable \code{x} has not been used before.
-Similarly with the environment $E''$ for the second call to
-\textit{compile}. $E''$ is also the environment that needs to
-be returned as part of the answer.
-
-The compilation of the while-loops, say 
-\pcode{while} $b$ \pcode{do} $cs$, is very similar. In case
-the condition is true and we need to do another iteration, 
-and the control-flow needs to be as follows
-
-\begin{center}
-\begin{tikzpicture}[node distance=2mm and 4mm,
- block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm},
- point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
- skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
-\node (A0) [point, left=of A1] {};
-\node (A1) [point] {};
-\node (b) [block, right=of A1] {code of $b$};
-\node (A2) [point, right=of b] {};
-\node (cs1) [block, right=of A2] {code of $cs$};
-\node (A3) [point, right=of cs1] {};
-\node (A4) [point, right=of A3] {};
-
-\draw (A0) edge [->, black, line width=1mm] (b);
-\draw (b) edge [->, black, line width=1mm] (cs1);
-\draw (cs1) edge [->, black, line width=1mm] (A3);
-\draw (A3) edge [->,skip loop] (A1);
-\end{tikzpicture}
-\end{center}
-
-\noindent Whereas if the condition is \emph{not} true, we
-need to jump out of the loop, which gives the following
-control flow.
+There is also one small modification we have to perform when
+calling the write method. Remember in the Fun-language we have
+the convention that every expression needs to return an
+integer as a result (located on the top of the stack). Our
+helper function implementing write, however, ``consumes'' the
+top element of the stack and violates this convention.
+Therefore before we call, say, \pcode{write(1+2)}, we need to
+duplicate the top element of the stack like so
 
-\begin{center}
-\begin{tikzpicture}[node distance=2mm and 4mm,
- block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm},
- point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
- skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
-\node (A0) [point, left=of A1] {};
-\node (A1) [point] {};
-\node (b) [block, right=of A1] {code of $b$};
-\node (A2) [point, right=of b] {};
-\node (cs1) [block, right=of A2] {code of $cs$};
-\node (A3) [point, right=of cs1] {};
-\node (A4) [point, right=of A3] {};
-
-\draw (A0) edge [->, black, line width=1mm] (b);
-\draw (b) edge [->, black, line width=1mm] (A2);
-\draw (A2) edge [skip loop] (A3);
-\draw (A3) edge [->, black, line width=1mm] (A4);
-\end{tikzpicture}
-\end{center}
-
-\noindent Again we can use the \textit{compile}-function for
-boolean expressions to insert the appropriate jump to the
-end of the loop (label $L_{wend}$ below).
-
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{compile}(\pcode{while}\; b\; \pcode{do} \;cs, E)$ & $\dn$\\ 
-\multicolumn{3}{l}{$\qquad L_{wbegin}\;$ (fresh label)}\\
-\multicolumn{3}{l}{$\qquad L_{wend}\;$ (fresh label)}\\
-\multicolumn{3}{l}{$\qquad (is, E') = \textit{compile}(cs_1, E)$}\\
-\multicolumn{3}{l}{$\qquad(L_{wbegin}:$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\;\textit{compile}(b, E, L_{wend})$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\;is$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\; \text{goto}\;L_{wbegin}$}\\
-\multicolumn{3}{l}{$\qquad\phantom{(}@\;L_{wend}:, E')$}\\
-\end{tabular}
-\end{center}
-
-\noindent I let you go through how this clause works. As an example
-you can consider the while-loop
-
-\begin{lstlisting}[mathescape,numbers=none,language={}]
-while x <= 10 do x := x + 1
+\begin{figure}[t]
+\begin{lstlisting}[language=JVMIS, 
+                   xleftmargin=2mm, 
+                   numbers=none]
+.method public static write(I)V 
+   .limit locals 1
+   .limit stack 2
+   getstatic java/lang/System/out Ljava/io/PrintStream; 
+   iload 0 
+   invokevirtual java/io/PrintStream/println(I)V 
+   return 
+.end method
 \end{lstlisting}
-
-\noindent yielding the following code
-
-\begin{lstlisting}[mathescape,language={}]
-L_wbegin: $\quad\tikz[remember picture] \node[] (LB) {\mbox{}};$
-   iload 0
-   ldc 10
-   if_icmpgt L_wend $\quad\tikz[remember picture] \node (LC) {\mbox{}};$
-   iload 0
-   ldc 1
-   iadd
-   istore 0
-   goto L_wbegin $\quad\tikz[remember picture] \node (LA) {\mbox{}};$
-L_wend: $\quad\tikz[remember picture] \node[] (LD) {\mbox{}};$
-\end{lstlisting}
- 
-\begin{tikzpicture}[remember picture,overlay]
  \draw[->,very thick] (LA) edge [->,to path={-- ++(10mm,0mm) 
-           -- ++(0mm,17.3mm) |- (\tikztotarget)},line width=1mm] (LB.east);
-  \draw[->,very thick] (LC) edge [->,to path={-- ++(10mm,0mm) 
-           -- ++(0mm,-17.3mm) |- (\tikztotarget)},line width=1mm] (LD.east);
\end{tikzpicture}
+\caption{The helper function for printing out integers.\label{write}}
+\end{figure}
 
 
-Next we need to consider the statement \pcode{write x}, which
-can be used to print out the content of a variable. For this
-we need to use a Java library function. In order to avoid
-having to generate a lot of code for each
-\pcode{write}-command, we use a separate helper-method and
-just call this method with an argument (which needs to be
-placed onto the stack). The code of the helper-method is as
-follows.
-
-
-\begin{lstlisting}[language=JVMIS]
-.method public static write(I)V 
-    .limit locals 1 
-    .limit stack 2 
-    getstatic java/lang/System/out Ljava/io/PrintStream; 
-    iload 0
-    invokevirtual java/io/PrintStream/println(I)V 
-    return 
-.end method
-\end{lstlisting}
-
-\noindent The first line marks the beginning of the method,
-called \pcode{write}. It takes a single integer argument
-indicated by the \pcode{(I)} and returns no result, indicated
-by the \pcode{V}. Since the method has only one argument, we
-only need a single local variable (Line~2) and a stack with
-two cells will be sufficient (Line 3). Line 4 instructs the
-JVM to get the value of the field \pcode{out} of the class
-\pcode{java/lang/System}. It expects the value to be of type
-\pcode{java/io/PrintStream}. A reference to this value will be
-placed on the stack. Line~5 copies the integer we want to
-print out onto the stack. In the next line we call the method
-\pcode{println} (from the class \pcode{java/io/PrintStream}).
-We want to print out an integer and do not expect anything
-back (that is why the type annotation is \pcode{(I)V}). The
-\pcode{return}-instruction in the next line changes the
-control-flow back to the place from where \pcode{write} was
-called. This method needs to be part of a header that is
-included in any code we generate. The helper-method
-\pcode{write} can be invoked with the two instructions
-
-\begin{lstlisting}[mathescape,language=JVMIS]
-iload $E(x)$ 
+\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
+$\textrm{\textit{compile}}($1+2$)$
+dup
 invokestatic XXX/XXX/write(I)V
 \end{lstlisting}
 
-\noindent where we first place the variable to be printed on
-top of the stack and then call \pcode{write}. The \pcode{XXX}
-need to be replaced by an appropriate class name (this will be
-explained shortly).
-
-
-\begin{figure}[t]
-\begin{lstlisting}[mathescape,language=JVMIS]
-.class public XXX.XXX
-.super java/lang/Object
-
-.method public <init>()V
-    aload_0
-    invokenonvirtual java/lang/Object/<init>()V
-    return
-.end method
-
-.method public static main([Ljava/lang/String;)V
-    .limit locals 200
-    .limit stack 200
-
-      $\textit{\ldots{}here comes the compiled code\ldots}$
+\noindent We also need to first generate code for the
+argument-expression of write, which in the While-language was
+only allowed to be a single variable.
 
-    return
-.end method
-\end{lstlisting}
-\caption{Boilerplate code needed for running generated code.\label{boiler}}
-\end{figure}
-
+Most of the new code in the compiler for the Fun-language
+comes from function definitions and function calls. For this
+have a look again at the helper function in
+Figure~\ref{write}. Assuming we have a function definition
 
-By generating code for a While-program, we end up with a list
-of (JVM assembly) instructions. Unfortunately, there is a bit
-more boilerplate code needed before these instructions can be
-run. The complete code is shown in Figure~\ref{boiler}. This
-boilerplate code is very specific to the JVM. If we target any
-other virtual machine or a machine language, then we would
-need to change this code. Lines 4 to 8 in Figure~\ref{boiler}
-contain a method for object creation in the JVM; this method
-is called \emph{before} the \pcode{main}-method in Lines 10 to
-17. Interesting are the Lines 11 and 12 where we hardwire that
-the stack of our programs will never be larger than 200 and
-that the maximum number of variables is also 200. This seem to
-be conservative default values that allow is to run some
-simple While-programs. In a real compiler, we would of course
-need to work harder and find out appropriate values for the
-stack and local variables.
-
-To sum up, in Figure~\ref{test} is the complete code generated
-for the slightly non-sensical program
-
-\begin{lstlisting}[mathescape,language=While]
-x := 1 + 2;
-write x
+\begin{lstlisting}[numbers=none,mathescape]
+def fname (x1, ... , xn) = ...
 \end{lstlisting}
 
-\noindent Having this code at our disposal, we need the
-assembler to translate the generated code into JVM bytecode (a
-class file). This bytecode is understood by the JVM and can be
-run by just invoking the \pcode{java}-program.
+\noindent then we have to generate
 
+\begin{lstlisting}[numbers=none,mathescape,language=JVMIS]
+.method public static fname (I...I)I
+   .limit locals ??
+   .limit stack ?? 
+   ...
+  ireturn
+.method end  
+\end{lstlisting}
 
-\begin{figure}[p]
-\lstinputlisting{../progs/test-small.j}
-\caption{Generated code for a test program. This code can be 
-processed by an Java assembler producing a class-file, which
-can be run by the \pcode{java}-program.\label{test}}
-\end{figure}
-
+\noindent where the number of \pcode{I}s corresponds to the
+number of arguments the function has, say \pcode{x1} to
+\pcode{xn}. The final \pcode{I} is needed in order to indicate
+that the function returns an integer. Therefore we also have
+to use \pcode{ireturn} instead of \pcode{return}. However,
+more interesting are the two \pcode{.limit} lines. Locals
+refers to the local variables of the method, which can be
+queried and overwritten using \pcode{iload} and
+\pcode{istore}, respectively.
+ 
 \end{document}
 
 %%% Local Variables:

--- a/progs/fun.scala	Tue Nov 17 19:12:51 2015 +0000
+++ b/progs/fun.scala	Wed Nov 18 01:53:01 2015 +0000
@@ -354,11 +354,10 @@
 .end method
 
 .method public static write(I)V 
-        .limit locals 5 
-        .limit stack 5 
-        iload 0 
+        .limit locals 1 
+        .limit stack 2 
         getstatic java/lang/System/out Ljava/io/PrintStream; 
-        swap 
+        iload 0
         invokevirtual java/io/PrintStream/println(I)V 
         return 
 .end method

Binary file slides/slides09.pdf has changed

--- a/slides/slides09.tex	Tue Nov 17 19:12:51 2015 +0000
+++ b/slides/slides09.tex	Wed Nov 18 01:53:01 2015 +0000
@@ -91,15 +91,15 @@
 \bl{
 \begin{plstx}[rhs style=]
 : \meta{Exp} ::= \meta{Var} | \meta{Num}{\hspace{3cm}}
-             |   \meta{Exp} + \meta{Exp} | ... | (\meta{ExP})
+             |   \meta{Exp} + \meta{Exp} | ... | (\meta{Exp})
              |   \code{if} \meta{BExp} \code{then} \meta{Exp} \code{else} \meta{Exp}
              |   \code{write} \meta{Exp} {\hspace{3cm}}
              |   \meta{Exp} ; \meta{Exp}
-             |   \textit{FunName} (\meta{Exp}, ..., \meta{Exp})\\
+             |   \textit{FunName} (\meta{Exp}, ... , \meta{Exp})\\
 : \meta{BExp} ::= ...\\
 : \meta{Decl} ::= \meta{Def} ; \meta{Decl}
              | \meta{Exp}\\
-: \meta{Def} ::= \code{def} \textit{FunName} ($\hspace{0.4mm}x_1$, ..., $x_2$)\\               
+: \meta{Def} ::= \code{def} \textit{FunName} ($\hspace{0.4mm}x_1$, ... , $x_n$) = \meta{Exp}\\               
 \end{plstx}}
 
 
@@ -107,15 +107,12 @@
 \end{frame}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%   
 
-
-
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \begin{frame}[c, fragile]
-\frametitle{Abstract Syntax Tree}
+\frametitle{Abstract Syntax Trees}
 
 \footnotesize
-\begin{textblock}{13}(0.3,2)
-\begin{lstlisting}[language=Scala,basicstyle=\ttfamily, numbers=none]
+\begin{lstlisting}[language=Scala, numbers=none, xleftmargin=-1mm]
 abstract class Exp
 abstract class BExp 
 abstract class Decl
@@ -134,16 +131,15 @@
 case class Sequ(e1: Exp, e2: Exp) extends Exp
 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
 \end{lstlisting}
-\end{textblock}
 
 \end{frame}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\begin{frame}[c, fragile]
-\frametitle{Mathematical Functions}
+\begin{frame}[c]
+\frametitle{Arithmetic Functions}
 
-Compilation of some mathematical functions:
+Compilation of some aritmetic functions:
 
 \begin{center}
 \begin{tabular}{lcl}
@@ -159,10 +155,10 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\begin{frame}[c, fragile]
+\begin{frame}[c]
 \frametitle{Boolean Expressions}
 
-Compilation of boolean expressions:
+Compilation of Boolean expressions:
 
 \begin{center}
 \begin{tikzpicture}[node distance=2mm and 4mm,
@@ -200,57 +196,37 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\begin{frame}[t, fragile]
+\begin{frame}[c, fragile]
 \frametitle{Sequences}
 
-Compiling \texttt{arg1 ; arg2}:
+Compiling \texttt{arg1 ; arg2}:\bigskip
 
 
-\begin{textblock}{7}(2,5)\footnotesize
-\begin{minipage}{6cm}
-\begin{lstlisting}[language=JVMIS,basicstyle=\ttfamily, numbers=none]
+\begin{lstlisting}[language=JVMIS, numbers=none]
 ...arg1...
 pop
 ...arg1...
 \end{lstlisting}
-\end{minipage}
-\end{textblock}
-
   
 \end{frame}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\begin{frame}[t, fragile]
+\begin{frame}[c, fragile]
 \frametitle{Write}
 
-Compiling \texttt{write(arg)}:
+Compiling call to \texttt{write(arg)}:\bigskip
 
 
-\begin{textblock}{7}(2,4)\footnotesize
-\begin{minipage}{6cm}
-\begin{lstlisting}[language=JVMIS,basicstyle=\ttfamily, numbers=none]
+\begin{lstlisting}[language=JVMIS, numbers=none]
 ...arg...
 dup
 invokestatic XXX/XXX/write(I)V
-\end{lstlisting}
-\end{minipage}
-\end{textblock}
-
-
+\end{lstlisting}\bigskip
 
-\begin{textblock}{7}(2,8)\footnotesize
-\begin{minipage}{6cm}
-\begin{lstlisting}[language=Scala,basicstyle=\ttfamily, numbers=none]
-case Write(a1) => {
-    compile_exp(a1, env) ++
-    List("dup\n",
-         "invokestatic XXX/XXX/write(I)V\n")
-  }
-\end{lstlisting}
-\end{minipage}
-\end{textblock}
-  
+\small
+needs a helper function
+
 \end{frame}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
 
@@ -258,37 +234,33 @@
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \begin{frame}[c, fragile]
-\frametitle{Functions}
+\frametitle{Function Definitions}
 
-\begin{textblock}{7}(1,2)\footnotesize
-\begin{minipage}{6cm}
-\begin{lstlisting}[language=JVMIS,basicstyle=\ttfamily, numbers=none]
+\footnotesize
+\begin{lstlisting}[language=JVMIS, 
+                   xleftmargin=2mm, 
+                   numbers=none]
 .method public static write(I)V 
-   .limit locals 5 
-   .limit stack 5 
+   .limit locals 1
+   .limit stack 2
+   getstatic java/lang/System/out Ljava/io/PrintStream; 
    iload 0 
-   getstatic java/lang/System/out Ljava/io/PrintStream; 
-   swap 
    invokevirtual java/io/PrintStream/println(I)V 
    return 
 .end method
-\end{lstlisting}
-\end{minipage}
-\end{textblock}
+\end{lstlisting}\bigskip
 
+\small We will need for definitions\footnotesize\medskip
 
-\begin{textblock}{10}(1,9.8)
-\small We will need for definitions\\[-8mm]\mbox{}\footnotesize
-\begin{center}
-\begin{lstlisting}[language=JVMIS,basicstyle=\ttfamily, numbers=none]
+\begin{lstlisting}[language=JVMIS, 
+                   xleftmargin=2mm, 
+                   numbers=none]
 .method public static f (I...I)I
   .limit locals ??
   .limit stack ?? 
   ??
 .end method
 \end{lstlisting}
-\end{center}
-\end{textblock}
 
 \end{frame}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%   
@@ -302,19 +274,19 @@
 def max_stack_exp(e: Exp): Int = e match {
   case Call(_, args) => args.map(max_stack_exp).sum
   case If(a, e1, e2) => max_stack_bexp(a) + 
-  	(List(max_stack_exp(e1), max_stack_exp(e1)).max)
+    (List(max_stack_exp(e1), max_stack_exp(e1)).max)
   case Write(e) => max_stack_exp(e) + 1
   case Var(_) => 1
   case Num(_) => 1
   case Aop(_, a1, a2) => 
-  	max_stack_exp(a1) + max_stack_exp(a2)
+    max_stack_exp(a1) + max_stack_exp(a2)
   case Sequ(e1, e2) => 
-  	List(max_stack_exp(e1), max_stack_exp(e2)).max
+    List(max_stack_exp(e1), max_stack_exp(e2)).max
 }
 
 def max_stack_bexp(e: BExp): Int = e match {
   case Bop(_, a1, a2) => 
-  	max_stack_exp(a1) + max_stack_exp(a2)
+    max_stack_exp(a1) + max_stack_exp(a2)
 }
 \end{lstlisting}
 \end{center}
@@ -324,7 +296,7 @@
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \begin{frame}[fragile]
-\frametitle{Successor}
+\frametitle{Successor Function}
 
 \begin{textblock}{7}(1,2.5)\footnotesize
 \begin{minipage}{6cm}
@@ -357,9 +329,9 @@
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \begin{frame}[fragile]
-\frametitle{Addition}
+\frametitle{Addition Function}
 
-\begin{textblock}{7}(1,1.5)\footnotesize
+\begin{textblock}{7}(1,1.9)\footnotesize
 \begin{minipage}{6cm}
 \begin{lstlisting}[language=JVMIS,basicstyle=\ttfamily, numbers=none]
 .method public static add(II)I 
@@ -367,24 +339,24 @@
 .limit stack 6
   iload 0
   ldc 0
-  if_icmpne If_else_2
+  if_icmpne If_else
   iload 1
-  goto If_end_3
-If_else_2:
+  goto If_end
+If_else:
   iload 0
   ldc 1
   isub
   iload 1
-  invokestatic defs/defs/add(II)I
-  invokestatic defs/defs/suc(I)I
-If_end_3:
+  invokestatic XXX/XXX/add(II)I
+  invokestatic XXX/XXX/suc(I)I
+If_end:
   ireturn
 .end method
 \end{lstlisting}
 \end{minipage}
 \end{textblock}
 
-\begin{textblock}{7}(6,6.2)
+\begin{textblock}{7}(6,6.6)
 \begin{bubble}[7cm]\small
 \begin{lstlisting}[language=Lisp,
                    basicstyle=\ttfamily,

author	Christian Urban <christian dot urban at kcl dot ac dot uk>
	Wed, 18 Nov 2015 01:53:01 +0000
changeset 380	1e88390e81aa
parent 379	fa2589ec0fae
child 381	47eceea734c5

handouts/ho08.pdf		file \| annotate \| diff \| comparison \| revisions
handouts/ho08.tex		file \| annotate \| diff \| comparison \| revisions
progs/fun.scala		file \| annotate \| diff \| comparison \| revisions
slides/slides09.pdf		file \| annotate \| diff \| comparison \| revisions
slides/slides09.tex		file \| annotate \| diff \| comparison \| revisions