% handouts/ho09.tex
\fnote{\copyright{} Christian Urban, King's College London, 2019}


\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our two tiny compilers targeting the JVM, the code
generation part was actually not so hard, no? Pretty much just some
post-order traversal of the abstract syntax tree, yes? One of the
reasons for this ease is that the JVM is a stack-based virtual machine
and it is therefore not hard to translate deeply-nested arithmetic
expressions into a sequence of instructions manipulating the stack. The
problem is that ``real'' CPUs, although supporting stack operations, are
not really designed to be \emph{stack machines}. The design of CPUs is
more like, here is a chunk of memory---compilers, or better compiler
writers, do something with it. Consequently, modern compilers need to go
the extra mile in order to generate code that is much easier and faster
for CPUs to process. To make this all tractable for this module, we
target the LLVM Intermediate Language. In this way we can take advantage
of the tools coming with LLVM. For example we do not have to worry about
things like register allocation.\bigskip 

\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
that projects from academia can make a difference in the world. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
gcc with its myriad of front-ends for other languages (C++, Fortran,
Ada, Go, Objective-C, Pascal etc.). The problem was that gcc morphed
over time into a monolithic gigantic piece of m\ldots ehm software,
which you could not mess about with in an afternoon. In contrast, LLVM
is designed to be a modular suite of tools with which you can play
around easily and try out something new. LLVM became a big player once
Apple hired one of the original developers (I cannot remember the reason
why Apple did not want to use gcc, but maybe they were also just
disgusted by its big monolithic codebase). Anyway, LLVM is now the big
player and gcc is more or less legacy. This does not mean that
programming languages like C and C++ are dying out any time soon---they
are nicely supported by LLVM.

We will target the LLVM Intermediate Language, or LLVM Intermediate
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
assembly language of Jasmin and Krakatau. It will also allow us to
benefit from the modular structure of the LLVM compiler and let, for
example, the compiler generate code for different CPUs, like X86 or ARM.
That means we can be agnostic about where our code actually runs. We can
also be ignorant about optimising code and allocating memory
efficiently. 

However, what we have to do for LLVM is to generate code in \emph{Static
Single-Assignment} format (short SSA), because that is what the LLVM-IR
expects from us. A reason why LLVM uses the SSA format, rather than
JVM-like stack instructions, is that stack instructions are difficult to
optimise---you cannot just re-arrange instructions without messing about
with what is calculated on the stack. Also it is hard to find out if all
the calculations on the stack are actually necessary and not by chance
dead code. The JVM has for all these obstacles sophisticated machinery
to make such ``high-level'' code still run fast, but let's say that for
the sake of argument we do not want to rely on it. We want to generate
fast code ourselves. This means we have to work around the intricacies
of what instructions CPUs can actually process fast. This is what the
SSA format is designed for.


The main idea behind the SSA format is to use very simple variable
assignments where every variable is assigned only once. The assignments
also need to be primitive in the sense that they can be just simple
operations like addition, multiplication, jumps, comparisons and so on.
Say, we have an expression $((1 + a) + (3 + (b * 5)))$, then the
corresponding SSA format is 
 
\begin{lstlisting}[language=LLVMIR,numbers=left]
let tmp0 = add 1 a in   
let tmp1 = mul b 5 in 
let tmp2 = add 3 tmp1 in 
let tmp3 = add tmp0 tmp2 in tmp3 
\end{lstlisting}

\noindent where every variable is assigned only once (we could not write
\texttt{tmp1 = add 3 tmp1} in Line 3 for example).  There are
sophisticated algorithms for imperative languages, like C, that
can transform programs into SSA format; we will instead take a route via the
Continuation-Passing-Style---basically black programming art or
abracadabra programming. So sit tight.

\subsection*{LLVM-IR}

Before we start, let's first have a look at the \emph{LLVM Intermediate
Representation} in more detail. The LLVM-IR sits in between the
frontends and backends of the LLVM framework. It allows compilation of
multiple source languages to multiple targets. It is also the place
where most of the target-independent optimisations are performed. 

What is good about our toy Fun language is that it basically only
contains expressions (be they arithmetic expressions, boolean
expressions or if-expressions). The exception is function definitions.
Luckily, for them we can use the mechanism of defining functions in the
LLVM-IR (this is similar to using JVM methods for functions in our
earlier compiler). For example the simple Fun program 


\begin{lstlisting}[language=Scala,numbers=none]
def sqr(x) = x * x
\end{lstlisting}

\noindent
can be compiled to the following LLVM-IR function:

\begin{lstlisting}[language=LLVM]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x
   ret i32 %tmp
}    
\end{lstlisting}

\noindent First notice that all variable names, in this case \texttt{x}
and \texttt{tmp}, are prefixed with \texttt{\%} in the LLVM-IR.
Temporary variables can be named with an identifier, such as
\texttt{tmp}, or with numbers. Function names, since they are
``global'', need to be prefixed with the @-symbol. Also, the LLVM-IR is
a fully typed language. The \texttt{i32} type stands for 32-bit
integers. There are also types for 64-bit integers (\texttt{i64}), chars
(\texttt{i8}), floats, arrays and even pointer types. In the code above,
\texttt{sqr} takes an argument of type \texttt{i32} and produces a
result of type \texttt{i32} (the result type is in front of the function
name, like in C). Each arithmetic operation, for example addition or
multiplication, is annotated with the type it operates on. Obviously
these types need to match up\ldots{} but since we have in our programs
only integers, \texttt{i32} everywhere will do. We do not have to
generate any other types, but obviously this is a limitation of our
Fun-language.
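
For example, the following small sketch shows the same operation
annotated with different types (the variables are hypothetical and
assumed to have the matching types):

\begin{lstlisting}[language=LLVM]
%r1 = add i32 %x, %y    ; 32-bit integer addition
%r2 = add i64 %a, %b    ; the same operation on 64-bit integers
%r3 = mul i32 %r1, 3    ; operand types must agree with the annotation
\end{lstlisting}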
 
There are a few interesting instructions in the LLVM-IR which are quite
different from those in the JVM. Can you remember the kerfuffle we had
to go through with boolean expressions and negating the condition? In
the LLVM-IR, branching on if-conditions is implemented differently:
there is a separate \texttt{br}-instruction as follows:

\begin{lstlisting}[language=LLVM]
br i1 %var, label %if_br, label %else_br
\end{lstlisting}

\noindent
The type \texttt{i1} stands for booleans. If the variable is true, then
this instruction jumps to the if-branch, which needs an explicit label;
otherwise it jumps to the else-branch, again with its own label. This
allows us to keep the meaning of the boolean expression as it is.  A
value of type boolean is generated in the LLVM-IR by the
\texttt{icmp}-instruction. This instruction is for integers (hence the
\texttt{i}) and takes the comparison operation as argument. For example

\begin{lstlisting}[language=LLVM]
icmp eq i32  %x, %y     ; for equal
icmp sle i32 %x, %y     ; for signed less or equal
icmp slt i32 %x, %y     ; for signed less than
icmp ult i32 %x, %y     ; for unsigned less than 
\end{lstlisting}

\noindent
In some operations, the LLVM-IR distinguishes between signed and 
unsigned representations of integers.

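Putting \texttt{icmp} and \texttt{br} together, here is a small sketch
of how a conditional can be compiled (the function \texttt{@max} and the
label names are made up for illustration):

\begin{lstlisting}[language=LLVM]
define i32 @max(i32 %x, i32 %y) {
   %cond = icmp sgt i32 %x, %y              ; signed greater than
   br i1 %cond, label %if_br, label %else_br
if_br:
   ret i32 %x
else_br:
   ret i32 %y
}
\end{lstlisting}

\noindent
Since each branch returns directly, no value needs to be merged after
the two branches in this sketch.
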
Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR. So you can easily
check whether the code you produced actually works. To get a running
program that does something interesting you need to add some boilerplate
about printing out numbers and a main-function that is the entrypoint
for the program (see Figure~\ref{lli} for a complete listing). Again
this is very similar to the boilerplate we needed to add in our JVM
compiler. 
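
For example, assuming the complete program from Figure~\ref{lli} is
stored in a file \texttt{sqr.ll}, it can be run directly with:

\begin{lstlisting}[language=bash,numbers=none]
lli sqr.ll
\end{lstlisting}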

You can generate a binary for the program in Figure~\ref{lli} by using
the \texttt{llc}-compiler and then \texttt{gcc}, whereby \texttt{llc}
generates an object file and \texttt{gcc} (that is clang) generates the
executable binary:

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
> 25
\end{lstlisting}

\begin{figure}[t]\small 
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. The
code for the \texttt{sqr} function is in Lines 13 -- 16. The main
function \texttt{@main} calls \texttt{sqr} with the argument \texttt{5}
and then prints out the result. The other code is boilerplate for
printing out integers.\label{lli}}
\end{figure}   


Compilers have to bridge the gap between
``high-level'' programs and ``low-level'' hardware. If the gap is too
wide for one step, then a good strategy is to lay a stepping stone
somewhere in between. The LLVM-IR itself is such a stepping stone to
make the task of generating and optimising code easier. Like a real
compiler we will use our own stepping stone which I call the
\emph{K-language}. For what follows recall the various kinds of
expressions in the Fun language. For convenience the Scala code of the
corresponding abstract syntax trees is shown on top of
Figure~\ref{absfun}. Below it is the code for the abstract syntax trees
of the K-language. There are two kinds of syntactic entities, namely
\emph{K-values} and \emph{K-expressions}. The central constructor of the
K-language is \texttt{KLet}. For this recall that arithmetic expressions
such as $((1 + a) + (3 + (b * 5)))$ need to be broken up into smaller
``atomic'' steps, like so
 
\begin{lstlisting}[language=LLVMIR,numbers=none]
let tmp0 = add 1 a in   
let tmp1 = mul b 5 in 
let tmp2 = add 3 tmp1 in 
let tmp3 = add tmp0 tmp2 in
  tmp3 
\end{lstlisting}

\noindent
Here \texttt{tmp3} will contain the result of what the whole expression
stands for. In each individual step we can only perform an ``atomic''
operation, like the addition or multiplication of a number and a
variable. We are not allowed to have, for example, an if-condition on
the right-hand side of an equals. Such constraints are enforced upon us
because of how the SSA format works in the LLVM-IR. By having
\texttt{KLet} take first a string (standing for an intermediate result)
and second a value, we can fulfil this constraint ``by
construction''---there is no way we could write anything other than a
value. 

To sum up, K-values are the atomic operations that can be on the
right-hand side of equal-signs. The K-language is restricted such that
it is easy to generate the SSA format for the LLVM-IR. 
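
As a rough Scala sketch (the precise datatype definitions are the ones
in Figure~\ref{absfun}; the constructors here are only meant to convey
the shape):

\begin{lstlisting}[language=Scala,numbers=none]
abstract class KVal                    // atomic K-values
case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal

abstract class KExp                    // K-expressions
case class KLet(x: String, v: KVal, e: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}

\noindent
Note how the type of \texttt{KLet} only permits a \texttt{KVal} in its
second argument---an if-expression simply cannot appear there.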



\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]



\subsection*{CPS-Translations}

The main difficulty of generating instructions in SSA format is that
large compound expressions need to be broken up into smaller pieces and
intermediate results need to be chained into later instructions. To do
this conveniently, CPS-translations have been developed. They use
functions (``continuations'') to represent what is coming next in a
sequence of instructions.

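To preview the idea with a concrete sketch: a continuation-based
translation of arithmetic expressions into the K-language could look as
follows in Scala. This is a hedged sketch, assuming the Fun-language
constructors \texttt{Var}, \texttt{Num} and \texttt{Aop} from
Figure~\ref{absfun} and a helper \texttt{Fresh} for generating fresh
variable names.

\begin{lstlisting}[language=Scala,numbers=none]
// sketch: k is the continuation, standing for the instructions
// that come after the expression currently being translated
def CPS(e: Exp)(k: KVal => KExp) : KExp = e match {
  case Var(s) => k(KVar(s))
  case Num(i) => k(KNum(i))
  case Aop(o, e1, e2) => {
    val z = Fresh("tmp")   // fresh name for the intermediate result
    CPS(e1)(v1 => 
      CPS(e2)(v2 => 
        KLet(z, Kop(o, v1, v2), k(KVar(z)))))
  }
}
\end{lstlisting}

\noindent
The \texttt{KLet} generated in the \texttt{Aop}-case makes the
intermediate result explicit, and the continuation \texttt{k} receives
the fresh variable that stands for it.
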
\end{document}


%%% Local Variables: 