% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../graphics}
\usepackage{../grammar}
%%\usepackage{multicol}

%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}

\begin{document}
\fnote{\copyright{} Christian Urban, King's College London, 2019}


\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our tiny compiler targeting the JVM, the code generation
part was actually not so hard, no? Pretty much just some post-order
traversal of the abstract syntax tree, yes? One of the main reasons for
this ease is that the JVM is a stack-based virtual machine and it is
therefore not hard to translate arithmetic expressions into a sequence
of instructions manipulating the stack. The problem is that ``real''
CPUs, although supporting stack operations, are not really designed to
be \emph{stack machines}. The design of CPUs is more like: here is a
chunk of memory; compiler, or better compiler writers, do something
with it. Consequently, modern compilers need to go the extra mile in
order to generate code that is much easier and faster for CPUs to
process. To make this all tractable for this module, we target the LLVM
Intermediate Language. In this way we can take advantage of the tools
coming with LLVM. For example we do not have to worry about things like
register allocation.\bigskip

\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
that projects from Academia can make a difference in the world. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
gcc with its myriad of front-ends for other languages (e.g.~Fortran,
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
time into a monolithic gigantic piece of m\ldots ehm software, which you
could not mess about with in an afternoon. In contrast, LLVM is designed
to be a modular suite of tools with which you could play around easily
and try out something new. LLVM became a big player once Apple hired one
of the original developers (I cannot remember the reason why Apple did
not want to use gcc, but maybe they were also just disgusted by its big
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
or less legacy. This does not mean that programming languages like C and
C++ are dying out any time soon; they are nicely supported by LLVM.

Targeting the LLVM Intermediate Language, or Intermediate
Representation (short LLVM-IR), also means we can profit from the very
modular structure of the LLVM compiler and let, for example, the
compiler generate code for X86, or ARM etc. That means we can be
agnostic about where our code actually runs. However, what we have to do
is to generate code in \emph{Static Single-Assignment} format (short
SSA), because that is what the LLVM-IR expects from us. LLVM-IR is the
intermediate format that LLVM uses for doing cool things, like targeting
strange architectures, optimising code and allocating registers
efficiently.

The idea behind the SSA format is to use very simple variable
assignments where every variable is assigned only once. The assignments
also need to be primitive in the sense that they can be just simple
operations like addition, multiplication, jumps, comparisons and so on.
An idealised snippet of a program in SSA is

\begin{lstlisting}[language=LLVM,numbers=none]
    x := 1
    y := 2
    z := x + y
\end{lstlisting}

\noindent where every variable is assigned only once (we could not write
\texttt{x := x + y} in the last line for example). There are
sophisticated algorithms for imperative languages, like C, that
efficiently transform a high-level program into SSA format. But we can
ignore them here. We want to compile a functional language and there
things get much more interesting than just sophisticated. We will need
to have a look at CPS translations, where CPS stands for
Continuation-Passing-Style, which is basically black programming art or
abracadabra programming. So sit tight.
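
To make the single-assignment idea a bit more concrete, here is a
minimal sketch in Scala of how a nested arithmetic expression can be
broken up into primitive assignments, each with a fresh variable on the
left-hand side. The datatype \texttt{AExp} and the functions
\texttt{fresh} and \texttt{flatten} are made up for this illustration
only. This is not the CPS translation itself, just a naive flattening
that happens to work for plain arithmetic expressions.

\begin{lstlisting}[language=Scala,numbers=none]
// a tiny expression type (hypothetical, only for this sketch)
abstract class AExp
case class N(i: Int) extends AExp
case class V(s: String) extends AExp
case class Op(o: String, a1: AExp, a2: AExp) extends AExp

// counter for generating fresh names tmp0, tmp1, ...
var counter = -1
def fresh(): String = { counter += 1; s"tmp$counter" }

// returns the assignments needed for e, together with the
// variable (or number) that holds the result of e
def flatten(e: AExp): (List[String], String) = e match {
  case N(i) => (Nil, i.toString)
  case V(s) => (Nil, s)
  case Op(o, a1, a2) => {
    val (is1, r1) = flatten(a1)
    val (is2, r2) = flatten(a2)
    val x = fresh()
    (is1 ::: is2 ::: List(s"$x := $r1 $o $r2"), x)
  }
}

// (1 + a) + (3 + (b * 5))
val e = Op("+", Op("+", N(1), V("a")),
                Op("+", N(3), Op("*", V("b"), N(5))))
val (assignments, result) = flatten(e)
\end{lstlisting}

\noindent For this expression the list \texttt{assignments} contains
\texttt{tmp0 := 1 + a}, \texttt{tmp1 := b * 5}, \texttt{tmp2 := 3 + tmp1}
and \texttt{tmp3 := tmp0 + tmp2}, in that order, and \texttt{result} is
\texttt{tmp3}. No variable appears twice on a left-hand side.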

\subsection*{LLVM-IR}

Before we start, let's first have a look at the \emph{LLVM Intermediate
Representation}. What is good about our simple Fun language is that it
basically only contains expressions (be they arithmetic expressions or
boolean expressions). The exception is function definitions. Luckily,
for them we can use the mechanism of defining functions in LLVM-IR. For
example the simple Fun program

\begin{lstlisting}[language=Scala,numbers=none]
def sqr(x) = x * x
\end{lstlisting}

\noindent
can be compiled into the following LLVM-IR function:

\begin{lstlisting}[language=LLVM]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x
   ret i32 %tmp
}
\end{lstlisting}

\noindent The first thing to notice is that all variable names in the
LLVM-IR are prefixed with \texttt{\%}; function names need to be
prefixed with \texttt{@}. Also, the LLVM-IR is a fully typed language.
The \texttt{i32} type stands for a 32-bit integer. There are also types
for 64-bit integers (\texttt{i64}), chars (\texttt{i8}), floats, arrays
and even pointer types. In the code above, \texttt{sqr} takes an
argument of type \texttt{i32} and produces a result of type
\texttt{i32}. Each arithmetic operation, like addition or
multiplication, is also annotated with the type it operates on.
Obviously these types need to match up\ldots{} but since we have in our
programs only integers, \texttt{i32} everywhere will do.

Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR (for example
\texttt{lli sqr.ll}). So you can easily check whether the code you
produced actually works. To get a running program that does something
interesting you need to add some boilerplate about printing out numbers
and a main-function that is the entry point for the program (see
Figure~\ref{lli}). You can generate a binary for this program by using
the \texttt{llc} compiler to generate an object file and then gcc (or
clang) to link it into a binary:

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
\end{lstlisting}

\begin{figure}[t]\small
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. The
interesting function is \texttt{sqr} in Lines 13--16. The main
function calls \texttt{sqr} and prints out the result. The other
code is boilerplate for printing out integers.\label{lli}}
\end{figure}
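
Since LLVM-IR is just text, a compiler can produce it by gluing strings
together. The following is a minimal sketch in Scala of how the
\texttt{sqr} function from above could be generated. The helper
functions \texttt{binop} and \texttt{mkFun} are hypothetical names
chosen for this illustration, and the sketch assumes that all arguments
and results have type \texttt{i32}.

\begin{lstlisting}[language=Scala,numbers=none]
// hypothetical helper: one instruction of the form
//    %res = op i32 a1, a2
def binop(res: String, op: String,
          a1: String, a2: String): String =
  s"   %$res = $op i32 $a1, $a2"

// hypothetical helper: wraps instructions into a function
// definition; all arguments and the result have type i32
def mkFun(name: String, args: List[String],
          body: List[String], res: String): String = {
  val params = args.map(a => s"i32 %$a").mkString(", ")
  val header = s"define i32 @$name($params) {"
  val footer = List(s"   ret i32 %$res", "}")
  (header :: body ::: footer).mkString("\n")
}

val sqr =
  mkFun("sqr", List("x"),
        List(binop("tmp", "mul", "%x", "%x")), "tmp")
println(sqr)
\end{lstlisting}

\noindent The printed string is the same four-line definition of
\texttt{sqr} shown earlier. Together with the boilerplate for printing
and a main-function it can be saved into a file and given to
\texttt{lli}.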

\subsection*{Our Own Intermediate Language}

Remember compilers have to solve the problem of bridging the gap between
``high-level'' programs and ``low-level'' hardware. If the gap is too
wide then a good strategy is to lay a stepping stone somewhere in
between. The LLVM-IR itself is such a stepping stone to make the task of
generating code easier. Like a real compiler we will use another
stepping stone which I call the \emph{K-language}. For this remember
that expressions (and boolean expressions) in the Fun language are given
by the code at the top of Figure~\ref{absfun}; the K-language is given
by the code at the bottom.


\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
// Fun-language (expressions)
abstract class Exp
abstract class BExp

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Write(e: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp



// K-language (K-expressions, K-values)
abstract class KExp
abstract class KVal

case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(o: String, vrs: List[KVal]) extends KVal
case class KWrite(v: KVal) extends KVal

case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
case class KLet(x: String, e1: KVal, e2: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun language (top) and the
K-language (bottom).\label{absfun}}
\end{figure}
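
To get a feeling for the K-language, here is a small hand-written
example in Scala. It is only a sketch under the assumption that
\texttt{KLet(x, v, e)} binds the value \texttt{v} to the name
\texttt{x} in the rest \texttt{e}, and that \texttt{KReturn(v)} hands
back the final result. The functions \texttt{showVal} and
\texttt{showExp} are hypothetical helpers for printing. The
K-definitions needed are repeated from Figure~\ref{absfun} so that the
code is self-contained.

\begin{lstlisting}[language=Scala,numbers=none]
// part of the K-language, repeated from the figure
abstract class KExp
abstract class KVal
case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KLet(x: String, e1: KVal, e2: KExp) extends KExp
case class KReturn(v: KVal) extends KExp

// one possible K-expression for the Fun expression 1 + (x * 2)
val k: KExp =
  KLet("tmp0", Kop("*", KVar("x"), KNum(2)),
    KLet("tmp1", Kop("+", KNum(1), KVar("tmp0")),
      KReturn(KVar("tmp1"))))

// hypothetical printers producing idealised assignments
def showVal(v: KVal): String = v match {
  case KVar(s) => s
  case KNum(i) => i.toString
  case Kop(o, v1, v2) => s"${showVal(v1)} $o ${showVal(v2)}"
}

def showExp(e: KExp): List[String] = e match {
  case KLet(x, v, rest) => s"$x := ${showVal(v)}" :: showExp(rest)
  case KReturn(v) => List(s"return ${showVal(v)}")
}

val lines = showExp(k)
\end{lstlisting}

\noindent Here \texttt{lines} consists of \texttt{tmp0 := x * 2},
\texttt{tmp1 := 1 + tmp0} and \texttt{return tmp1}. Each
\texttt{KLet} introduces exactly one new name, which is why
K-expressions of this shape are already close to the SSA format.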


\subsection*{CPS-Translations}

Another reason why it makes sense to go the extra mile is that stack
instructions are very difficult to optimise: you cannot just re-arrange
instructions without messing about with what is calculated on the stack.
Also it is hard to find out if all the calculations on the stack are
actually necessary and not by chance dead code. The JVM has
sophisticated machinery for all this to make such ``high-level'' code
still run fast, but let's say that for the sake of argument we do not
want to rely on it. We want to generate fast code ourselves. This means
we have to work around the intricacies of what instructions CPUs can
actually process.

\end{document}


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: 