--- a/handouts/ho09.tex	Sun Nov 24 01:03:38 2019 +0000
+++ b/handouts/ho09.tex	Sun Nov 24 16:30:34 2019 +0000
@@ -14,31 +14,31 @@
 
 \section*{Handout 9 (LLVM, SSA and CPS)}
 
-Reflecting on our tiny compiler targetting the JVM, the code generation
-part was actually not so hard, no? Pretty much just some post-traversal
-of the abstract syntax tree, yes? One of the reasons for this ease is
-that the JVM is a stack-based virtual machine and it is therefore not
-hard to translate deeply-nested arithmetic expressions into a sequence
-of instructions manipulating the stack. The problem is that ``real''
-CPUs, although supporting stack operations, are not really designed to
-be \emph{stack machines}.  The design of CPUs is more like, here is a
-chunk of memory---compiler, or better compiler writers, do something
-with it. Consequently, modern compilers need to go the extra mile in
-order to generate code that is much easier and faster to process by
-CPUs. To make this all tractable for this module, we target the LLVM
+Reflecting on our two tiny compilers targetting the JVM, the code
+generation part was actually not so hard, no? Pretty much just some
+post-traversal of the abstract syntax tree, yes? One of the reasons for
+this ease is that the JVM is a stack-based virtual machine and it is
+therefore not hard to translate deeply-nested arithmetic expressions
+into a sequence of instructions manipulating the stack. The problem is
+that ``real'' CPUs, although supporting stack operations, are not really
+designed to be \emph{stack machines}.  The design of CPUs is more like,
+here is a chunk of memory---compiler, or better compiler writers, do
+something with it. Consequently, modern compilers need to go the extra
+mile in order to generate code that is much easier and faster to process
+by CPUs. To make this all tractable for this module, we target the LLVM
 Intermediate Language. In this way we can take advantage of the tools
 coming with LLVM. For example we do not have to worry about things like
 register allocations.\bigskip 
 
 \noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
-that projects from Academia can make a difference in the world. LLVM
+that projects from Academia can make a difference in the World. LLVM
 started in 2000 as a project by two researchers at the  University of
 Illinois at Urbana-Champaign. At the time the behemoth of compilers was
 gcc with its myriad of front-ends for other languages (C++, Fortran,
 Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
 time into a monolithic gigantic piece of m\ldots ehm software, which you
 could not mess about in an afternoon. In contrast, LLVM is designed to
-be a modular suite of tools with which you could play around easily and
+be a modular suite of tools with which you can play around easily and
 try out something new. LLVM became a big player once Apple hired one of
 the original developers (I cannot remember the reason why Apple did not
 want to use gcc, but maybe they were also just disgusted by its big
@@ -46,28 +46,42 @@
 or less legacy. This does not mean that programming languages like C and
 C++ are dying out any time soon---they are nicely supported by LLVM.
 
-We will target the LLVM Intermediate Language, or Intermediate
-Representation (short LLVM-IR). As a result we can benefit from the
-modular structure of the LLVM compiler and let for example the compiler
-generate code for different CPUs, like X86 or ARM. That means we can be
-agnostic about where our code actually runs. We can also be ignorant
-about optimising code and allocating memory efficiently. However, what
-we have to do is to generate code in \emph{Static Single-Assignment}
-format (short SSA), because that is what the LLVM-IR expects from us. 
+We will target the LLVM Intermediate Language, or LLVM Intermediate
+Representation (short LLVM-IR). The LLVM-IR looks very similar to the
+assembly language of Jasmin and Krakatau. It will also allow us to
+benefit from the modular structure of the LLVM compiler and let for
+example the compiler generate code for different CPUs, like X86 or ARM.
+That means we can be agnostic about where our code actually runs. We can
+also be ignorant about optimising code and allocating memory
+efficiently. 
 
-The idea behind the SSA format is to use very simple variable
+However, what we have to do for LLVM is to generate code in \emph{Static
+Single-Assignment} format (short SSA), because that is what the LLVM-IR
+expects from us. A reason why LLVM uses the SSA format, rather than
+JVM-like stack instructions, is that stack instructions are difficult to
+optimise---you cannot just re-arrange instructions without messing about
+with what is calculated on the stack. Also it is hard to find out if all
+the calculations on the stack are actually necessary and not by chance
+dead code. The JVM has for all these obstacles sophisticated machinery
+to make such ``high-level'' code still run fast, but let's say that for
+the sake of argument we do not want to rely on it. We want to generate
+fast code ourselves. This means we have to work around the intricacies
+of what instructions CPUs can actually process fast. This is what the
+SSA format is designed for.
+
+
+The main idea behind the SSA format is to use very simple variable
 assignments where every variable is assigned only once. The assignments
 also need to be primitive in the sense that they can be just simple
 operations like addition, multiplication, jumps, comparisons and so on.
 Say, we have an expression $((1 + a) + (3 + (b * 5)))$, then the
-SSA is 
+corresponding SSA format is 
  
 \begin{lstlisting}[language=LLVMIR,numbers=left]
 let tmp0 = add 1 a in   
 let tmp1 = mul b 5 in 
 let tmp2 = add 3 tmp1 in 
-let tmp3 = add tmp0 tmp2 in
-  tmp3 
+let tmp3 = add tmp0 tmp2 in tmp3 
 \end{lstlisting}
 
 \noindent where every variable is used only once (we could not write
@@ -82,12 +96,18 @@
 
 \subsection*{LLVM-IR}
 
-Before we start, lets first have a look at the \emph{LLVM Intermediate
-Representation} in more detail. What is good about our toy Fun language
-is that it basically only contains expressions (be they arithmetic
-expressions or boolean expressions or if-expressions). The exception are
-function definitions. Luckily, for them we can use the mechanism of
-defining functions in LLVM-IR. For example the simple Fun program 
+Before we start, let's first have a look at the \emph{LLVM Intermediate
+Representation} in more detail. The LLVM-IR is in between the frontends
+and backends of the LLVM framework. It allows compilation of multiple
+source languages to multiple targets. It is also the place where most of
+the target independent optimisations are performed. 
+
+What is good about our toy Fun language is that it basically only
+contains expressions (be they arithmetic expressions, boolean
+expressions or if-expressions). The exception are function definitions.
+Luckily, for them we can use the mechanism of defining functions in the
+LLVM-IR (this is similar to using JVM methods for functions in our
+earlier compiler). For example the simple Fun program 
 
 
 \begin{lstlisting}[language=Scala,numbers=none]
@@ -95,7 +115,7 @@
 \end{lstlisting}
 
 \noindent
-can be compiled into the following LLVM-IR function:
+can be compiled to the following LLVM-IR function:
 
 \begin{lstlisting}[language=LLVM]
 define i32 @sqr(i32 %x) {
@@ -104,27 +124,65 @@
 }    
 \end{lstlisting}
 
-\noindent First notice that all variable names in the LLVM-IR are
-prefixed by \texttt{\%}; function names need to be prefixed with
-@-symbol. Also, the LLVM-IR is a fully typed language. The \texttt{i32}
-type stands for 32-bit integers. There are also types for 64-bit
-integers (\texttt{i64}), chars (\texttt{i8}), floats, arrays and even
-pointer types. In the code above, \texttt{sqr} takes an argument of type
-\texttt{i32} and produces a result of type \texttt{i32} (the result type
-is before the function name, like in C). Each arithmetic operation, like
-addition or multiplication, are also prefixed with the type they operate
-on. Obviously these types need to match up\ldots{} but since we have in
-our programs only integers, \texttt{i32} everywhere will do. 
+\noindent First notice that all variable names, in this case \texttt{x}
+and \texttt{tmp}, are prefixed with \texttt{\%} in the LLVM-IR.
+Temporary variables can be named with an identifier, such as
+\texttt{tmp}, or numbers. Function names, since they are ``global'',
+need to be prefixed with @-symbol. Also, the LLVM-IR is a fully typed
+language. The \texttt{i32} type stands for 32-bit integers. There are
+also types for 64-bit integers (\texttt{i64}), chars (\texttt{i8}),
+floats, arrays and even pointer types. In the code above, \texttt{sqr}
+takes an argument of type \texttt{i32} and produces a result of type
+\texttt{i32} (the result type is in front of the function name, like in
+C). Each arithmetic operation, for example addition and multiplication,
+are also prefixed with the type they operate on. Obviously these types
+need to match up\ldots{} but since we have in our programs only
+integers, \texttt{i32} everywhere will do. We do not have to generate
+any other types, but obviously this is a limitation in our Fun-language.
  
+There are a few interesting instructions in the LLVM-IR which are quite
+different than in the JVM. Can you remember the kerfuffle we had to go
+through with boolean expressions and negating the condition? In the
+LLVM-IR, branching  if-conditions is implemented differently: there
+is a separate \texttt{br}-instruction as follows:
+
+\begin{lstlisting}[language=LLVM]
+br i1 %var, label %if_br, label %else_br
+\end{lstlisting}
+
+\noindent
+The type \texttt{i1} stands for booleans. If the variable is true, then
+this instruction jumps to the if-branch, which needs an explicit label;
+otherwise to the else-branch, again with its own label. This allows us
+to keep the meaning of the boolean expression as is.  A value of type
+boolean is generated in the LLVM-IR by the \texttt{icmp}-instruction.
+This instruction is for integers (hence the \texttt{i}) and takes the
+comparison operation as argument. For example
+
+\begin{lstlisting}[language=LLVM]
+icmp eq i32  %x, %y     ; for equal
+icmp sle i32 %x, %y     ;   signed less or equal
+icmp slt i32 %x, %y     ;   signed less than
+icmp ult i32 %x, %y     ;   unsigned less than 
+\end{lstlisting}
+
+\noindent
+In some operations, the LLVM-IR distinguishes between signed and 
+unsigned representations of integers.
+
 Conveniently, you can use the program \texttt{lli}, which comes with
 LLVM, to interpret programs written in the LLVM-IR. So you can easily
 check whether the code you produced actually works. To get a running
 program that does something interesting you need to add some boilerplate
 about printing out numbers and a main-function that is the entrypoint
-for the program (see Figure~\ref{lli} for a complete listing). You can
-generate a binary for this program using \texttt{llc}-compiler and
-\texttt{gcc}---\texttt{llc} can generate an object file and then you can 
-use \texttt{gcc} (that is clang) for generating an executable  binary:
+for the program (see Figure~\ref{lli} for a complete listing). Again
+this is very similar to the boilerplate we needed to add in our JVM
+compiler. 
+
+You can generate a binary for the program in Figure~\ref{lli} by using
+the \texttt{llc}-compiler and then \texttt{gcc}, whereby \texttt{llc} generates
+an object file and \texttt{gcc} (that is clang) generates the
+executable binary:
 
 \begin{lstlisting}[language=bash,numbers=none]
 llc -filetype=obj sqr.ll
@@ -135,9 +193,10 @@
 
 \begin{figure}[t]\small 
 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
-\caption{An LLVM-IR program for calculating the square function. The 
-interesting function is \texttt{sqr} in Lines 13 -- 16. The main
-function calls \texttt{sqr} and prints out the result. The other
+\caption{An LLVM-IR program for calculating the square function. It
+calls this function in \texttt{@main} with the argument \texttt{5}. The
+code for the \texttt{sqr} function is in Lines 13 -- 16. The main
+function calls \texttt{sqr} and then prints out the result. The other
 code is boilerplate for printing out integers.\label{lli}}
 \end{figure}   
 
@@ -151,14 +210,15 @@
 somewhere in between. The LLVM-IR itself is such a stepping stone to
 make the task of generating and optimising code easier. Like a real
 compiler we will use our own stepping stone which I call the
-\emph{K-language}. For this remember expressions (and boolean
-expressions) in the Fun language. For convenience the Scala code is
-shown on top of Figure~\ref{absfun}. Below is the code for the
-K-language. There are two kinds of syntactic entities, namely
-\emph{K-values} and \emph{K-expressions}. The central point of the
-K-language is the \texttt{KLet}-constructor. For this recall that 
-arithmetic expressions such as $((1 + a) + (3 + (b * 5)))$ need
-to be broken up into smaller ``atomic'' steps, like so
+\emph{K-language}. For what follows recall the various kinds of
+expressions in the Fun language. For convenience the Scala code of the
+corresponding abstract syntax trees is shown on top of
+Figure~\ref{absfun}. Below is the code for the abstract syntax trees in
+the K-language. There are two kinds of syntactic entities, namely
+\emph{K-values} and \emph{K-expressions}. The central constructor of the
+K-language is \texttt{KLet}. For this recall that arithmetic expressions
+such as $((1 + a) + (3 + (b * 5)))$ need to be broken up into smaller
+``atomic'' steps, like so
  
 \begin{lstlisting}[language=LLVMIR,numbers=none]
 let tmp0 = add 1 a in   
@@ -169,17 +229,19 @@
 \end{lstlisting}
 
 \noindent
-Here \texttt{tmp3} will contain the result of what the expression stands
-for. In each step we can only perform an ``atomic'' operation, like
-addition or multiplication. We could not for example have an
-if-condition on the right-hand side of an equals. These constraints
-enforced upon us because how the SSA format works in the LLVM-IR. By
-having in \texttt{KLet},  first a string (standing for an intermediate
-result) and second a value, we can fulfil this constraint---there is no
-way we could write anything else than a value. To sum up, K-values are
-the atomic operations that can be on the right-hand side of equal-signs.
-The K-language is restricted such that it is easy to generate the SSA format 
-for the LLVM-IR.
+Here \texttt{tmp3} will contain the result of what the whole expression
+stands for. In each individual step we can only perform an ``atomic''
+operation, like addition or multiplication of a number and a variable.
+We are not allowed to have for example an if-condition on the right-hand
+side of an equals. Such constraints are enforced upon us because of how
+the SSA format works in the LLVM-IR. By having in \texttt{KLet} taking
+first a string (standing for an intermediate result) and second a value,
+we can fulfil this constraint ``by construction''---there is no way we
+could write anything else than a value. 
+
+To sum up, K-values are the atomic operations that can be on the
+right-hand side of equal-signs. The K-language is restricted such that
+it is easy to generate the SSA format for the LLVM-IR. 
 
 
 
@@ -221,19 +283,16 @@
 
 \subsection*{CPS-Translations}
 
+The main difficulty of generating instructions in SSA format is that
+large compound expressions need to be broken up into smaller pieces and
+intermediate results need to be chained into later instructions. To do
+this conveniently, CPS-translations have been developed. They use
+functions (``continuations'') to represent what is coming next in a
+sequence of instructions.
 
 
 
 
-Another reason why it makes sense to go the extra mile is that stack
-instructions are very difficult to optimise---you cannot just re-arrange
-instructions without messing about with what is calculated on the stack.
-Also it is hard to find out if all the calculations on the stack are
-actually necessary and not by chance dead code. The JVM has for all this
-sophisticated machinery to make such ``high-level'' code still run fast,
-but let's say that for the sake of argument we do not want to rely on
-it. We want to generate fast code ourselves. This means we have to work
-around the intricacies of what instructions CPUs can actually process.
 
 \end{document}
changeset 700	52263ffd17b9
parent 680	eecc4d5a2172
child 701	681c36b2af27