Binary file handouts/ho09.pdf has changed
--- a/handouts/ho09.tex Mon Oct 28 13:34:03 2019 +0000
+++ b/handouts/ho09.tex Fri Nov 01 13:21:51 2019 +0000
@@ -16,30 +16,19 @@
Reflecting on our tiny compiler targetting the JVM, the code generation
part was actually not so hard, no? Pretty much just some post-traversal
-of the abstract syntax tree, yes? One of the main reason for this ease is
-that the JVM is a stack-based virtual machine and it is therefore not
+of the abstract syntax tree, yes? One of the main reason for this ease
+is that the JVM is a stack-based virtual machine and it is therefore not
hard to translate arithmetic expressions into a sequence of instructions
manipulating the stack. The problem is that ``real'' CPUs, although
supporting stack operations, are not really designed to be \emph{stack
machines}. The design of CPUs is more like, here is a chunk of
memory---compiler, or better compiler writers, do something with it.
-Consequently, modern compilers need to go the extra mile in order to generate
-code that is much easier and faster to process by CPUs.
-
-
-Another reason why it makes sense to go the extra mile is that stack
-instructions are very difficult to optimise---you cannot just re-arrange
-instructions without messing about with what is calculated on the stack.
-Also it is hard to find out if all the calculations on the stack are
-actually necessary and not by chance dead code. The JVM has for all this
-sophisticated machinery to make such ``high-level'' code still run fast,
-but let's say that for the sake of argument we do not want to rely on
-it. We want to generate fast code ourselves. This means we have to work
-around the intricacies of what instructions CPUs can actually process.
-To make this all tractable for this module, we target the LLVM
-Intermediate Language. In this way we can take advantage of the tools
-coming with LLVM. For example we do not have to worry about things like
-register allocations.\bigskip
+Consequently, modern compilers need to go the extra mile in order to
+generate code that is much easier and faster to process by CPUs. To make
+this all tractable for this module, we target the LLVM Intermediate
+Language. In this way we can take advantage of the tools coming with
+LLVM. For example we do not have to worry about things like register
+allocations.\bigskip
\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
that projects from Academia can make a difference in the world. LLVM
@@ -124,26 +113,48 @@
Obviously these types need to match up\ldots{} but since we have in our
programs only integers, \texttt{i32} everywhere will do.
-Conveniently, you can use the program \texttt{lli}, which comes with LLVM, to interprete
-programs written in the LLVM-IR. So you can easily check whether the
-code you produced actually works. To get a running program that does
-something interesting you need to add some boilerplate about printing out
-numbers and a main-function that is the entrypoint for the program (see Figure~\ref{lli}).
+Conveniently, you can use the program \texttt{lli}, which comes with
+LLVM, to interpret programs written in the LLVM-IR. So you can easily
+check whether the code you produced actually works. To get a running
+program that does something interesting you need to add some boilerplate
+about printing out numbers and a main-function that is the entrypoint
+for the program (see Figure~\ref{lli}). You can generate a binary for
+this program using \texttt{llc}-compiler in order to generate an object
+file and then use gcc (clang) for generating a binary:
-\begin{figure}[t]
-\lstinputlisting[language=LLVM]{../progs/sqr.ll}
-\caption{\label{lli}}
+\begin{lstlisting}[language=bash,numbers=none]
+llc -filetype=obj sqr.ll
+gcc sqr.o -o a.out
+./a.out
+\end{lstlisting}
+
+\begin{figure}[t]\small
+\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
+\caption{An LLVM-IR program for calculating the square function. The
+interesting function is \texttt{sqr} in Lines 13 -- 16. The main
+function calls \texttt{sqr} and prints out the result. The other
+code is boilerplate for printing out integers.\label{lli}}
\end{figure}
-\begin{figure}[t]
+
+
+\subsection*{Our Own Intermediate Language}
+
+Remember compilers have to solve the problem of bridging the gap between
+``high-level'' programs and ``low-level'' hardware. If the gap is tool
+wide then a good strategy is to lay a stepping stone somewhere in
+between. The LLVM-IR itself is such a stepping stone to make the task of
+generating code easier. Like a real compiler we will use another
+stepping stone which I call \emph{K-language}. For this remember
+expressions (and boolean expressions) in the Fun language are given by
+the code on top of Figure~\ref{absfun}
+
+
+\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
-abstract class Exp extends Serializable
-abstract class BExp extends Serializable
-abstract class Decl extends Serializable
-
-case class Main(e: Exp) extends Decl
-case class Def(name: String, args: List[String], body: Exp)
- extends Decl
+// Fun-language (expressions)
+abstract class Exp
+abstract class BExp
case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
@@ -152,16 +163,45 @@
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
-case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
+case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
+
+
+
+// K-language (K-expressions, K-values)
+abstract class KExp
+abstract class KVal
+
+case class KVar(s: String) extends KVal
+case class KNum(i: Int) extends KVal
+case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
+case class KCall(o: String, vrs: List[KVal]) extends KVal
+case class KWrite(v: KVal) extends KVal
+
+case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
+case class KLet(x: String, e1: KVal, e2: KExp) extends KExp
+case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun language.\label{absfun}}
\end{figure}
-
+
\subsection*{CPS-Translations}
+
+
+
+Another reason why it makes sense to go the extra mile is that stack
+instructions are very difficult to optimise---you cannot just re-arrange
+instructions without messing about with what is calculated on the stack.
+Also it is hard to find out if all the calculations on the stack are
+actually necessary and not by chance dead code. The JVM has for all this
+sophisticated machinery to make such ``high-level'' code still run fast,
+but let's say that for the sake of argument we do not want to rely on
+it. We want to generate fast code ourselves. This means we have to work
+around the intricacies of what instructions CPUs can actually process.
+
\end{document}
--- a/progs/fun_llvm.scala Mon Oct 28 13:34:03 2019 +0000
+++ b/progs/fun_llvm.scala Fri Nov 01 13:21:51 2019 +0000
@@ -106,13 +106,14 @@
aux(args, Nil)
}
case Sequence(e1, e2) =>
- CPS(e1)(y1 => CPS(e2)(y2 => k(y2)))
+ CPS(e1)(_ => CPS(e2)(y2 => k(y2)))
case Write(e) => {
val z = Fresh("tmp")
CPS(e)(y => KLet(z, KWrite(y), k(KVar(z))))
}
}
+//initial continuation
def CPSi(e: Exp) = CPS(e)(KReturn)
// some testcases
@@ -190,8 +191,8 @@
case KLet(x: String, v: KVal, e: KExp) =>
i"%$x = ${compile_val(v)}" ++ compile_exp(e)
case KIf(x, e1, e2) => {
- val if_br = Fresh("if_br")
- val else_br = Fresh("else_br")
+ val if_br = Fresh("if_branch")
+ val else_br = Fresh("else_branch")
i"br i1 %$x, label %$if_br, label %$else_br" ++
l"\n$if_br" ++
compile_exp(e1) ++
@@ -208,7 +209,7 @@
define i32 @printInt(i32 %x) {
%t0 = getelementptr [4 x i8], [4 x i8]* @.str, i32 0, i32 0
- call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %x)
+ call i32 (i8*, ...) @printf(i8* %t0, i32 %x)
ret i32 %x
}