--- a/handouts/ho08.tex Wed Nov 18 01:53:01 2015 +0000
+++ b/handouts/ho08.tex Wed Nov 18 02:59:49 2015 +0000
@@ -3,7 +3,7 @@
\usepackage{../langs}
\usepackage{../grammar}
\usepackage{../graphics}
-
+\usepackage{amssymb}
\begin{document}
@@ -14,16 +14,16 @@
The language we looked at in the previous lecture was rather
-primitive and the compiler rather crude---everything was
-essentially compiled into a big monolithic chunk of code
+primitive and the estimater rather crude---everything was
+essentially estimated into a big monolithic chunk of code
inside the main function. In this handout we like to have a
look at a slightly more comfortable language, which I call
-Fun-language, and a tiny-teeny bit more realistic compiler.
+Fun-language, and a tiny-teeny bit more realistic estimater.
The Fun-language is a functional programming language. A small
-collection of programs we want to be able to write and compile
+collection of programs we want to be able to write and estimate
is as follows:
-\begin{lstlisting}[basicstyle=\ttfamily, numbers=none]
+\begin{lstlisting}[numbers=none]
def fib(n) = if n == 0 then 0
else if n == 1 then 1
else fib(n - 1) + fib(n - 2);
@@ -42,7 +42,7 @@
more comfortable. We will still focus on programs involving
integers only, that means for example that every function is
expected to return an integer. The point of the Fun language
-is to compile each function to a separate method in JVM
+is to estimate each function to a separate method in JVM
bytecode (not just a big monolithic code chunk). The means we
need to adapt to some of the conventions of the JVM about
methods.
@@ -116,7 +116,7 @@
Let us first look at some clauses for compiling expressions.
The compilation of arithmetic and boolean expressions is just
like for the While-language and do not need any modification.
-(recall that the \textit{compile}-function for boolean
+(recall that the \textit{estimate}-function for boolean
expression takes a third argument for the label where the
contro-flow should jump when the boolean expression is
\emph{not} true---this is needed for compiling \pcode{if}s).
@@ -132,9 +132,9 @@
a \pcode{pop}-instruction is needed.
\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
-$\textrm{\textit{compile}}($exp1$)$
+$\textrm{\textit{estimate}}($exp1$)$
pop
-$\textrm{\textit{compile}}($exp2$)$
+$\textrm{\textit{estimate}}($exp2$)$
\end{lstlisting}
\noindent In effect we ``forget'' about the result the first
@@ -169,7 +169,7 @@
\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
-$\textrm{\textit{compile}}($1+2$)$
+$\textrm{\textit{estimate}}($1+2$)$
dup
invokestatic XXX/XXX/write(I)V
\end{lstlisting}
@@ -178,7 +178,7 @@
argument-expression of write, which in the While-language was
only allowed to be a single variable.
-Most of the new code in the compiler for the Fun-language
+Most of the new code in the estimater for the Fun-language
comes from function definitions and function calls. For this
have a look again at the helper function in
Figure~\ref{write}. Assuming we have a function definition
@@ -203,10 +203,127 @@
\pcode{xn}. The final \pcode{I} is needed in order to indicate
that the function returns an integer. Therefore we also have
to use \pcode{ireturn} instead of \pcode{return}. However,
-more interesting are the two \pcode{.limit} lines. Locals
-refers to the local variables of the method, which can be
-queried and overwritten using \pcode{iload} and
-\pcode{istore}, respectively.
+more interesting are the two \pcode{.limit} lines.
+\pcode{Locals} refers to the local variables of the method,
+which can be queried and overwritten using the JVM
+instructions \pcode{iload} and \pcode{istore}, respectively.
+Before we call a function with, say, three arguments, we need
+to ensure that these three arguments are pushed onto the stack
+(we will come to the corresponding code shortly). Once we are
+inside the method, the arguments on the stack turn into local
+variables. So in case we have three arguments on the stack, we
+will have inside the function three local variables that can
+be referenced by the indices $0..2$. Determining the limit for
+local variables is the easy bit. Harder is the stack limit.
+
+Calculating how much stack a program needs is equivalent to
+the Halting problem, and thus undecidable in general.
+Fortunately, we are only asked how much stack a \emph{single}
+call of the function requires. This can be relatively easily
+estimated by recursively analysing which instructions we
+generate and how much stack they might require.
+
+\begin{center}
+\begin{tabular}{lcl}
+$\textit{estimate}(n)$ & $\dn$ & $1$\\
+$\textit{estimate}(x)$ & $\dn$ & $1$\\
+$\textit{estimate}(a_1\;aop\;a_2)$ & $\dn$ &
+$\textit{estimate}(a_1) + \textit{estimate}(a_2)$\\
+$\textit{estimate}(\pcode{if}\;b\;\pcode{then}\;e_1\;\pcode{else}\;e_2)$ & $\dn$ &
+$\textit{estimate}(b) +$\\
+& & $\quad max(\textit{estimate}(e_1), \textit{estimate}(e_2))$\\
+$\textit{estimate}(\pcode{write}(e))$ & $\dn$ &
+$\textit{estimate}(e) + 1$\\
+$\textit{estimate}(e_1 ; e_2)$ & $\dn$ &
+$max(\textit{estimate}(e_1), \textit{estimate}(e_2))$\\
+$\textit{estimate}(f(e_1, ..., e_n))$ & $\dn$ &
+$\sum_{i = 1..n}\;estimate(e_i)$\medskip\\
+$\textit{estimate}(a_1\;bop\;a_2)$ & $\dn$ &
+$\textit{estimate}(a_1) + \textit{estimate}(a_2)$\\
+\end{tabular}
+\end{center}
+
+\noindent This function overestimates the stack size, for
+example, in the case of \pcode{if}s. Since we cannot predict
+which branch will be run, we have to allocate the maximum
+of stack each branch might take. I leave you also to think
+about whether the estimate in case of function calls is the
+best possible estimate. Note also that in case of \pcode{write}
+we need to add one, because we duplicate the top-most element
+in the stack.
+
+With this all in place, we can start generating code, for
+example, for the two functions:
+
+\begin{lstlisting}[numbers=none]
+def suc(x) = x + 1;
+
+def add(x, y) = if x == 0 then y
+ else suc(add(x - 1, y));
+\end{lstlisting}
+
+\noindent The succesor function is a simple loading of the
+argument $x$ (index $0$) onto the stack, as well as the number
+$1$. Then we add both elements leaving the result of the
+addition on top of the stack. This value will be returned by
+the \pcode{suc}-function. See below:
+
+\begin{lstlisting}[language=JVMIS]
+.method public static suc(I)I
+.limit locals 1
+.limit stack 2
+ iload 0
+ ldc 1
+ iadd
+ ireturn
+.end method
+\end{lstlisting}
+
+\noindent The addition function is a bit more interesting
+since in the last line we have to call the function
+recursively and ``wrap around'' a call to the successor
+function. The code is as follows:
+
+\begin{lstlisting}[language=JVMIS]
+.method public static add(II)I
+.limit locals 2
+.limit stack 5
+ iload 0
+ ldc 0
+ if_icmpne If_else
+ iload 1
+ goto If_end
+If_else:
+ iload 0
+ ldc 1
+ isub
+ iload 1
+ invokestatic XXX/XXX/add(II)I
+ invokestatic XXX/XXX/suc(I)I
+If_end:
+ ireturn
+.end method
+\end{lstlisting}
+
+\noindent The local limit is because add takes two arguments.
+The stack limit is a simple calculation using the estimate
+function. We first generate code for the boolean expression
+\pcode{x == 0}, that is loading local variable 0 and the
+number 0 onto the stack (Lines 4 and 5). If the not-equality
+test fails we continue with returning $y$, which is the local
+variable 1 (followed by a jump to the return instruction). If
+the not-equality test succeeds then we jump to the label
+\pcode{If_else} (Line 9). After that label is the code for
+\pcode{suc(add(x - 1, y))}. We first have to evaluate the
+argument of the suc-function. But this means we first have to
+evaluate the two arguments of the add-function. This means
+loading $x$ and $1$ onto the stack and subtracting them.
+Then loading $y$ onto the stack. We can then make a recursive
+call to add (its two arguments are on the stack). When this
+call returns we have the result of the addition on the top of
+the stack and just need to call suc. Finally, we can return
+the result on top of the stack as the result of the
+add-function.
\end{document}