handouts/ho08.tex
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Wed, 18 Nov 2015 01:53:01 +0000
changeset 380 1e88390e81aa
parent 379 fa2589ec0fae
child 381 47eceea734c5
permissions -rw-r--r--
updated

\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../grammar}
\usepackage{../graphics}



\begin{document}
\lstset{morekeywords={def,if,then,else,write,read},keywordstyle=\color{codepurple}\bfseries}


\section*{Handout 8 (A Functional Language)}


The language we looked at in the previous lecture was rather
primitive and the compiler rather crude---everything was
essentially compiled into a big monolithic chunk of code
inside the main function. In this handout we like to have a
look at a slightly more comfortable language, which I call
Fun-language, and a tiny-teeny bit more realistic compiler.
The Fun-language is a functional programming language. A small
collection of programs we want to be able to write and compile
is as follows:

\begin{lstlisting}[basicstyle=\ttfamily, numbers=none]
def fib(n) = if n == 0 then 0 
             else if n == 1 then 1 
             else fib(n - 1) + fib(n - 2);

def fact(n) = if n == 0 then 1 else n * fact(n - 1);

def ack(m, n) = if m == 0 then n + 1
                else if n == 0 then ack(m - 1, 1)
                else ack(m - 1, ack(m, n - 1));
                
def gcd(a, b) = if b == 0 then a else gcd(b, a % b);                
\end{lstlisting}

\noindent Compare the code of the fib-program with the same
program written in the While-language\ldots Fun is definitely
more comfortable. We will still focus on programs involving
integers only, that means for example that every function is
expected to return an integer. The point of the Fun language
is to compile each function to a separate method in JVM
bytecode (not just a big monolithic code chunk). The means we
need to adapt to some of the conventions of the JVM about
methods.

The grammar of the Fun-language is slightly simpler than the
While-language, because the main syntactic category are
expressions (we do not have statements). The grammar rules are
as follows:


\begin{plstx}[rhs style=,margin=1.5cm]
: \meta{Exp} ::= \meta{Id} | \meta{Num} {\hspace{3cm}}
             |   \meta{Exp} + \meta{Exp} | ... | (\meta{Exp})
             |   \code{if} \meta{BExp} \code{then} \meta{Exp} \code{else} \meta{Exp}
             |   \code{write} \meta{Exp} {\hspace{5cm}}
             |   \meta{Exp} ; \meta{Exp}  {\hspace{5cm}}
             |   \textit{FunName} (\meta{Exp}, ..., \meta{Exp})\\
: \meta{BExp} ::= ...\\
: \meta{Decl} ::= \meta{Def} ; \meta{Decl}
             | \meta{Exp}\\
: \meta{Def} ::= \code{def} \textit{FunName} ($x_1$, ..., $x_n$) = \meta{Exp}\\               
\end{plstx}

\noindent where, as usual, \meta{Id} stands for variables and
\meta{Num} for numbers. We can call a function by applying the
arguments to a function name (as shown in the last clause of
\meta{Exp}). The arguments in such a function call can be
again expressions, including other function calls. In
contrast, when defining a function (see \meta{Def}-clause) the
arguments need to be variables, say $x_1$ to $x_n$. We call
the expression on the right of = in a function definition as
the \emph{body of the function}. We have the restriction that
the variables inside a function body can only be those that
are mentioned as arguments of the function. A Fun-program is
then a sequence of function definitions separated by
semicolons, and a final ``main'' call of a function that
starts the computation in the program. For example

\begin{lstlisting}[numbers=none]
def fact(n) = if n == 0 then 1 else n * fact(n - 1);
write(fact(5))
\end{lstlisting}

\noindent would be a valid Fun-program. The parser of the
Fun-language produces abstract syntax trees which in Scala
can be represented as follows:


{\small
\begin{lstlisting}[numbers=none]
abstract class Exp
abstract class BExp 
abstract class Decl

case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Write(e: Exp) extends Exp
case class Sequ(e1: Exp, e2: Exp) extends Exp
case class Call(name: String, args: List[Exp]) extends Exp

case class Bop(o: String, a1: Exp, a2: Exp) extends BExp

case class Def(name: String, 
               args: List[String], 
               body: Exp) extends Decl
case class Main(e: Exp) extends Decl
\end{lstlisting}}

Let us first look at some clauses for compiling expressions.
The compilation of arithmetic and boolean expressions is just
like for the While-language and do not need any modification.
(recall that the \textit{compile}-function for boolean
expression takes a third argument for the label where the
contro-flow should jump when the boolean expression is
\emph{not} true---this is needed for compiling \pcode{if}s).
One additional feature in the Fun-language are sequences.
Their purpose is to do one calculation after another. The
reason why we need to be careful however is the convention
that every expression can only produce s single result
(including sequences). Since this result will be on the
top of the stack, we need to generate a 
\pcode{pop}-instruction in order to clean-up the stack. Given 
the expression of the form \pcode{exp1 ; exp2} we need to 
generate code where after the first code chunk
a \pcode{pop}-instruction is needed. 

\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
$\textrm{\textit{compile}}($exp1$)$
pop
$\textrm{\textit{compile}}($exp2$)$
\end{lstlisting}

\noindent In effect we ``forget'' about the result the first
expression calculates. I leave you to think about why this
sequence operator is still useful in the Fun-language, even 
if the first result is just ``discarded''. 

There is also one small modification we have to perform when
calling the write method. Remember in the Fun-language we have
the convention that every expression needs to return an
integer as a result (located on the top of the stack). Our
helper function implementing write, however, ``consumes'' the
top element of the stack and violates this convention.
Therefore before we call, say, \pcode{write(1+2)}, we need to
duplicate the top element of the stack like so

\begin{figure}[t]
\begin{lstlisting}[language=JVMIS, 
                   xleftmargin=2mm, 
                   numbers=none]
.method public static write(I)V 
   .limit locals 1
   .limit stack 2
   getstatic java/lang/System/out Ljava/io/PrintStream; 
   iload 0 
   invokevirtual java/io/PrintStream/println(I)V 
   return 
.end method
\end{lstlisting}
\caption{The helper function for printing out integers.\label{write}}
\end{figure}


\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
$\textrm{\textit{compile}}($1+2$)$
dup
invokestatic XXX/XXX/write(I)V
\end{lstlisting}

\noindent We also need to first generate code for the
argument-expression of write, which in the While-language was
only allowed to be a single variable.

Most of the new code in the compiler for the Fun-language
comes from function definitions and function calls. For this
have a look again at the helper function in
Figure~\ref{write}. Assuming we have a function definition

\begin{lstlisting}[numbers=none,mathescape]
def fname (x1, ... , xn) = ...
\end{lstlisting}

\noindent then we have to generate

\begin{lstlisting}[numbers=none,mathescape,language=JVMIS]
.method public static fname (I...I)I
   .limit locals ??
   .limit stack ?? 
   ...
  ireturn
.method end  
\end{lstlisting}

\noindent where the number of \pcode{I}s corresponds to the
number of arguments the function has, say \pcode{x1} to
\pcode{xn}. The final \pcode{I} is needed in order to indicate
that the function returns an integer. Therefore we also have
to use \pcode{ireturn} instead of \pcode{return}. However,
more interesting are the two \pcode{.limit} lines. Locals
refers to the local variables of the method, which can be
queried and overwritten using \pcode{iload} and
\pcode{istore}, respectively.
 
\end{document}

%%% Local Variables: 
%%% mode: latex  
%%% TeX-master: t
%%% End: