afl-material: comparison handouts/ho09.tex

equal deleted inserted replaced

-:df8660143051
+:e32802acf952
 allow us to benefit from the modular structure of the LLVM compiler
 and we can let the compiler generate code for different CPUs, for
 example X86 or ARM.  That means we can be agnostic about where our
 code is actually going to run.\footnote{Anybody want to try to run our
 programs on Android phones?}  We can also be somewhat ignorant about
-optimising our code and about allocating memory efficiently: the LLVM
+optimising our code and about allocating memory efficiently---the LLVM
 tools will take care of all this.
 However, what we have to do in order to make LLVM to play ball is to
-generate code in \emph{Static Single-Assignment} format (short SSA),
+generate code in \emph{Static Single-Assignment} format (short SSA). A
-because that is what the LLVM-IR expects from us. A reason why LLVM
+reason why LLVM uses the SSA-format, rather than JVM-like stack
-uses the SSA-format, rather than JVM-like stack instructions, is that
+instructions, is that stack instructions are difficult to
-stack instructions are difficult to optimise---you cannot just
+optimise---you cannot just re-arrange instructions without messing
-re-arrange instructions without messing about with what is calculated
+about with what is calculated on the stack. Also it is hard to find
-on the stack. Also it is hard to find out if all the calculations on
+out if all the calculations on the stack are actually necessary and
-the stack are actually necessary and not by chance dead code. The JVM
+not by chance dead code. The JVM has for all these obstacles
-has for all these obstacles sophisticated machinery to make such
+sophisticated machinery to make such ``high-level'' code still run
-``high-level'' code still run fast, but let's say that for the sake of
+fast, but let's say that for the sake of argument we do not want to
-argument we do not want to rely on it. We want to generate fast code
+rely on it. We want to generate fast code ourselves. This means we
-ourselves. This means we have to work around the intricacies of what
+have to work around the intricacies of what instructions CPUs can
-instructions CPUs can actually process fast. This is what the SSA
+actually process fast. This is what the SSA format is designed for.
-format is designed for.
 The main idea behind the SSA-format is to have sequences of very
 simple variable assignments where every (tmp)variable is assigned only
 once. The assignments need to be simple in the sense that they can be
 let tmp5 = add tmp0 tmp4 in
 return tmp5
 \end{lstlisting}
 \noindent where every tmpX-variable is used only once (we could, for
-example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this
+example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if
-would not change the overall result). At the end we have a return-instruction
+this would not change the overall result). At the end we have a
-wich contains the final result of the expression.
+return-instruction wich contains the final result of the
+expression. As can be seen, the task we have to solve is to take apart
+compound expressions as shown above and transfrom them into a sequence
+of simple assignments. Note that this for example means we have to
+fix the order in which the expression is calculated.
 There are sophisticated algorithms for imperative languages, like C,
 that efficiently transform high-level programs into SSA-format. But
 we can ignore them here. We want to compile a functional language and
 there things get much more interesting than just sophisticated. We
 addition and multiplication, are also prefixed with the type they
 operate on. Obviously these types need to match up\ldots{} but since
 we have in our programs only integers, for the moment \texttt{i32}
 everywhere will do. We do not have to generate any other types, but
 obviously this is a limitation in our Fun-language (and which we
-lift in the final CW).
+are going to lift in the final CW).
 There are a few interesting instructions in the LLVM-IR which are quite
 different than what we have seen in the JVM. Can you remember the
 kerfuffle we had to go through with boolean expressions and negating the
 condition? In the LLVM-IR, branching  if-conditions are implemented
 \noindent
 where the arguments can only be simple variables and numbers, but not compound
 expressions.
 Conveniently, you can use the program \texttt{lli}, which comes with
-LLVM, to interpret programs written in the LLVM-IR. So you can easily
+LLVM, to interpret programs written in the LLVM-IR. Type on the command line
-check whether the code you produced actually works. To get a running
-program that does something interesting you need to add some boilerplate
+\begin{lstlisting}[language=bash,numbers=none]
-about printing out numbers and a main-function that is the entry point
+lli sqr.ll
-for the program (see Figure~\ref{lli} for a complete listing of the square function). Again
+\end{lstlisting}
+\noindent
+and you can see the output of the pragram generated.
+This means you can easily check whether the code you produced actually
+works. To get a running program that does something interesting you
+need to add some boilerplate about printing out numbers and a
+main-function that is the entry point for the program (see
+Figure~\ref{lli} for a complete listing of the square function). Again
 this is very similar to the boilerplate we needed to add in our JVM
 compiler.
 You can generate a binary for the program in Figure~\ref{lli} by using
 the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang}, whereby \texttt{llc} generates
 an object file and \texttt{gcc} (that is actually \texttt{clang} on my Mac) generates the
 executable binary:
 \begin{figure}[t]\small
 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
 \caption{An LLVM-IR program for calculating the square function. It
 calls the \texttt{sqr}-function in \texttt{@main} with the argument \texttt{5}
 (Line 20). The
-code for the \texttt{sqr}-function is in Lines 13 -- 16. It stores
+code for the \texttt{sqr}-function is in Lines 13 -- 16. The main-function
+stores
 the result of sqr in the variable called \texttt{\%1} and then
 prints out the contents of this variable in Line 21. The other
 code is boilerplate for printing out integers.\label{lli}}
 \end{figure}
 \noindent
 where a K-value can be a variable, a number or a ``trivial'' binary
 operation, such as addition or multiplication. The two arguments of a
 binary operation need to be K-values as well. Finally, we have
 function calls, but again each argument of the function call needs to
-be a K-value.  One point we need to be careful, however, is that the
+be a K-value.  One constraint we need to be careful about is that the
-arguments of the binary operations and function calls can in fact only
+arguments of the binary operations and function calls can only be
-be variables or numbers. To encode this constraint into the type-system
+variables or numbers. For example
-of Scala, however, would make things too complicated---to satisfy this
-constraint is therefore on us. For our Fun-language, we will later on
+\begin{lstlisting}[language=Scala,numbers=none]
-consider some further K-values.
+KAop("+", KAop("*", KNum(1), KNum(2)), KNum(3))
+KCall("foo", List(KAop("+", KNum(4), KNum(5)))
+\end{lstlisting}
+\noindent
+while perfect instances of K-values according to the types, are not
+allowed in the LLVM-IR. To encode this constraint into the type-system
+of Scala, however, would make things overly complicated---to satisfy
+this constraint is therefore on us. For the moment this will be all
+K-values we are considdering. Later on, we will have some more for our
+Fun-language.
 Our K-expressions will encode the \texttt{let} and the \texttt{return}
 from the SSA-format (again for the moment we only consider these two
 constructors---later on we will have if-branches as well).
 case KReturn(v: KVal)
 }
 \end{lstlisting}
 \noindent
-By having in \texttt{KLet} taking first a string (standing for an
+By having in \texttt{KLet} taking first a string (standing for a
-intermediate variable) and second a value, we can fulfil the SSA
+tmp-variable) and second a value, we can fulfil the SSA constraint in
-constraint in assignments ``by con\-struction''---there is no way we
+assignments ``by con\-struction''---there is no way we could write
-could write anything else than a K-value.  Note that the third
+anything else than a K-value.  Note that the third argument of a
-argument of a \texttt{KLet} is again a K-expression, meaning either
+\texttt{KLet} is again a K-expression, meaning either another
-another \texttt{KLet} or a \texttt{KReturn}. In this way we can
+\texttt{KLet} or a \texttt{KReturn}. In this way we can construct a
-construct a sequence of computations and return a final result.  Of
+sequence of computations and indicate what is the final result of the
-course we also have to ensure that all intermediary computations are
+computations.  According to the SSA-format, we also have to ensure
-given (fresh) names---we will use an (ugly) counter for this.
+that all intermediary computations are given (fresh) names---we will
+use an (ugly) counter for this.
 To sum up, K-values are the atomic operations that can be on the
 right-hand side of assignemnts. The K-language is restricted such that
 it is easy to generate the SSA-format for the LLVM-IR. What remains to
 be done is a translation of our Fun-language into the K-language. The
 main difficulty is that we need to order the computation---in the
 K-language we only have linear sequence of instructions. Before we
-explain this, we have a look at some CPS-translation.
+explain this, we have a look at some programs in CPS-style.
 \subsection*{CPS-Translations}
-CPS stands for \emph{Continuation-Passing-Style}. It is a kind of programming
+CPS stands for \emph{Continuation-Passing-Style}. It is a kind of
-technique often used in advanced functional programming. Before we delve
+programming technique often used in advanced functional programming
-into the CPS-translation for our Fun-language compiler, let us look at
+and in particular in compilers. Before we delve into the
+CPS-translation for our Fun-language compiler, let us look at
 CPS-versions of some well-known functions. Consider
 \begin{lstlisting}[language=Scala, numbers=none]
 def fact(n: Int) : Int =
 if (n == 0) 1 else n * fact(n - 1)
 \end{lstlisting}
 \noindent
 This is clearly the usual factorial function. But now consider the
-following version of the factorial function:
+following slight variant of the factorial function:
+\begin{lstlisting}[language=Scala, numbers=left]
+def factC(n: Int, k: Int => Int) : Int =
+if (n == 0) k(1)
+else factC(n - 1, x => k(n * x))
+factC(3, id)
+\end{lstlisting}
+\noindent
+The function is is Lines 1--3. The function call is in Line 5.  We
+call this function with a number, in this case 3, and the
+identity-function (which returns just its input). The \texttt{k} is
+the continuation in this function. The recursive calls are:
 \begin{lstlisting}[language=Scala, numbers=none]
-def factC(n: Int, ret: Int => Int) : Int =
+factC(2, x => id(3 * x))
-if (n == 0) ret(1)
+factC(1, x => id(3 * (2 * x)))
-else factC(n - 1, x => ret(n * x))
+factC(0, x => id(3 * (2 * (1 * x))))
+\end{lstlisting}
-factC(3, identity)
-\end{lstlisting}
+\noindent
+Having reached 0, we get out of the recursion and apply 1 to the
-\noindent
+continuation (see if-branch above in Line 2). This gives
-This function is called with a number, in this case 3, and the
-identity-function (which returns just its input). The recursive
-calls are:
 \begin{lstlisting}[language=Scala, numbers=none]
-factC(2, x => identity(3 * x))
+id(3 * (2 * (1 * 1)))
-factC(1, x => identity(3 * (2 * x)))
-factC(0, x => identity(3 * (2 * (1 * x))))
-\end{lstlisting}
-\noindent
-Having reached 0, we get out of the recursion and apply 1 to the
-continuation (see if-branch above). This gives
-\begin{lstlisting}[language=Scala, numbers=none]
-identity(3 * (2 * (1 * 1)))
 = 3 * (2 * (1 * 1))
 = 6
 \end{lstlisting}
 \noindent
 which is the expected result. If this looks somewhat familiar to you,
-than this is because functions with continuations can be
+than this is because functions with continuations can be seen as a
-seen as a kind of generalisation of tail-recursive functions.
+kind of generalisation of tail-recursive functions.  So far, however,
-So far we did not look at this generalisation in earnest.
+we did not look at this generalisation in earnest.  Anyway notice how
-Anyway
+the continuations is ``stacked up'' during the recursion and then
-notice how the continuations is ``stacked up'' during the recursion and
+``unrolled'' when we apply 1 to the continuation. Interestingly, we
-then ``unrolled'' when we apply 1 to the continuation. Interestingly, we
+can do something similar to the Fibonacci function where in the
-can do something similar to the Fibonacci function where in the traditional
+traditional version we have two recursive calls. Consider the
-version we have two recursive calls. Consider the following function
+following function
-\begin{lstlisting}[language=Scala, numbers=none]
+\begin{lstlisting}[language=Scala, numbers=left]
-def fibC(n: Int, ret: Int => Int) : Int =
+def fibC(n: Int, k: Int => Int) : Int =
-if (n == 0 || n == 1) ret(1)
+if (n == 0 || n == 1) k(1)
 else fibC(n - 1,
 r1 => fibC(n - 2,
-r2 => ret(r1 + r2)))
+r2 => k(r1 + r2)))
 \end{lstlisting}
 \noindent
-Here the continuation is a nested function essentially wrapping up
+Here the continuation (Lines 4--5) is a nested function essentially wrapping up
-the second recursive call. Let us check how the recursion unfolds
+the second recursive call plus the original continuation. Let us check how the recursion unfolds
 when called with 3 and the identity function:
 \begin{lstlisting}[language=Scala, numbers=none]
 fibC(3, id)
 fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
 \noindent
 The point of this section is that you should be playing around with these
 simple definitions and make sure they calculate the expected result.
 In the next step, you should understand \emph{how} these functions
-calculate the result with the continuations.
+calculate the result with the continuations. Now change the initial
+continuation which you call the function to
+\begin{lstlisting}[language=Scala, numbers=none]
+r => { println("The result plus 1 is:") ; r + 1 }
+\end{lstlisting}
+\noindent
+Does this still calculate the expected result?
 \section*{Worked Example}
 Let us now come back to the CPS-translations for the Fun-language.
 (1 + 2) * (3 + 4)
 \end{equation}
 \noindent
 which of the subexpressions should be calculated first? We just
-arbitrarily going to decide that it will be from left to right. Other
+arbitrarily going to decide that the calculation will be from left to
-languages make different choices---C famously is right to left. In our
+right. Other languages make different choices---C famously is right to
-case this means we have to tear the expression shown in \eqref{exp}
+left. In our case this means we have to tear the expression shown in
-apart as follows:
+\eqref{exp} apart as follows:
 \[
 (1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
 \]
 \noindent
-The first subexpression can be calculated and will give us a result,
+The first subexpression can be easily calculated and will give us a result,
 but the second one is not really an expression that makes sense. It
 will only make sense as the next step in the computation when we
 fill-in the result of $1+2$ into the ``hole'' indicated by
 $\Box$. Such incomplete expressions can be represented in Scala as
 functions
 \begin{lstlisting}[language=Scala, numbers=none]
 r => r * (3 + 4)
 \end{lstlisting}
 \noindent
-where \texttt{r} is any result that has been computed earlier. By the
+where \texttt{r} will represent any result that has been computed
-way, in lambda-calculus notation we would write
+earlier. By the way, in lambda-calculus notation we would write
 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
 functions (``continuations'') to represent what is coming next in a
 sequence of instructions. In our case, continuations are functions of
 type \code{KVal} to \code{KExp}. They can be seen as a sequence of
 \code{KLet}s where there is a ``hole'' that needs to be
 \begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
 let tmp0 = add 1 a in
 let tmp1 = mul (*@$\Box$@*) 5 in
 let tmp2 = add 3 tmp1 in
 let tmp3 = add tmp0 tmp2 in
-retrun tmp3
+return tmp3
 \end{lstlisting}
 \noindent
 where in the second line is a $\Box$ which still expects a \code{KVal}
 to be filled in before it becomes a ``proper'' \code{KExp}. When we
 suppose our source language consists just of three kinds of expressions
 \begin{lstlisting}[language=Scala,numbers=none]
 enum Expr {
 case Num(n: Int)
-case Bop(op: String, e1: Expr, e2: Expr)
+case Aop(op: String, e1: Expr, e2: Expr)
 case Call(fname: String, args: List[Expr])
 }
 \end{lstlisting}
 \noindent
-The code of the CPS-translation is then of the form
+This allows us to generate SSA-instructions for expressions like
+\[
+1 + foo(bar(4 * 7), 3, id(12))
+\]
+\noindent
+The code of the CPS-translation
+that generates these instructions is then of the form
 \begin{lstlisting}[language=Scala,numbers=none]
 def CPS(e: Exp)(k: KVal => KExp) : KExp =
 e match { ... }
 \end{lstlisting}
 be compiled. The result of the function is a K-expression, which later
 can be compiled into LLVM-IR code.
 In case we have numbers, then we can just pass them in CPS-translation
 to the continuations because numbers need not be further teared apart
-as they are already atomic. Passing the number to the continuation
+as they are already primitive. Passing the number to the continuation
 means we apply the continuation like
 \begin{lstlisting}[language=Scala,numbers=none]
 case Num(i) => k(KNum(i))
 \end{lstlisting}
 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
 \end{lstlisting}
-\noindent is a function from \code{KVal} to \code{KExp}. Once
+\noindent is a function from \code{KVal} to \code{KExp}, which we need
-we created the assignment with the fresh temporary variable
+as type for the continuation. Once we created the assignment with the
-\texttt{z}, we need to ``communicate'' that the result of the
+fresh temporary variable \texttt{z}, we need to ``communicate'' that
-computation of the arithmetic expression is stored in \texttt{z} to the
+the result of the computation of the arithmetic expression is stored
-computations that follow. In this way we apply the continuation \texttt{k} with this
+in \texttt{z} to the computations that follow. In this way we apply
-new variable (essentially we are plugging in a hole further down the line).
+the continuation \texttt{k} with this new variable (essentially we are
-Hope this makes sense.
+plugging in a hole further down the line).  Hope this makes sense.
 The last case we need to consider in our small expression language are
-function calls.
+function calls. For them remember each argument of the function
+call can in SSA-format only be a variable or a number.
 \begin{lstlisting}[language=Scala,numbers=left,xleftmargin=0mm]
 case Call(fname, args) => {
 def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
 case Nil => {
 aux(args, Nil)
 }
 \end{lstlisting}
 \noindent
-For them we introduce an auxiliary function that compiles first all
+For this case we introduce an auxiliary function that compiles first all
 function arguments---remember in our source language we can have calls
 such as $foo(3 + 4, g(3))$ where we first have to create temporary
 variables (and computations) for each argument. Therefore \texttt{aux}
 analyses the argument list from left to right. In case there is an
 argument \texttt{a} on the front of the list (the case \texttt{a::as}
 \hspace{10mm}...
 \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
 \caption{CPS-translation for a simple expression language.\label{cps}}
 \end{figure}
-The last question we need to answer in the code (see Figure~\ref{cps}) is how we
+The last question we need to answer in the code (see Figure~\ref{cps})
-start the CPS-translation function, or more precisely with which continuation we
+is how we start the CPS-translation function, or more precisely with
-should start CPS? It turns out that this initial continuation will be the last one that is
+which continuation we should start CPS? It turns out that this initial
-called. What should be the last step in the computation? Yes, we need to return the
+continuation will be the last one that is called. What should be the
-temporary variable where the last result is stored (if it is a simple number, then we can
+last step in the computation? Yes, we need to return the temporary
-just return this number). Therefore we call CPS with the code
+variable where the last result is stored (if it is a simple number,
+then we can just return this number). Therefore we call CPS with the
+code
 \begin{lstlisting}[language=Scala,numbers=none]
 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
 \end{lstlisting}
 \noindent
 where we give the function \code{KReturn(_)} as the continuation. With
-this we completed the translation of expressions into our
+this we completed the translation of simple expressions into our
-K-language. Figure~\ref{absfun} gives the complete datatypes for the
+K-language. Play around with some more expressions and see how the
-ASTs of the Fun-language and the K-values and K-expressions for the
+CPS-translation generates the correct code. I know this is not easy to
-K-language.
+follow code when you see it the first time.  Figure~\ref{absfun} gives
+the complete datatypes for the ASTs of the Fun-language and the
+K-values and K-expressions for the K-language. The complete
+CPS-translation you can find in the code.
+\section*{Next Steps}
 Having obtained a K-expression, it is relatively straightforward to
 generate a valid program for the LLVM-IR. We leave this to the
 attentive reader. What else can we do?  Well it should be relatively
 easy to apply some common optimisations to the K-expressions. One

changeset 912	e32802acf952
parent 909	a04efdd5e7a3
child 913	eef6a56c185a