677
|
1 |
% !TEX program = xelatex
|
539
|
2 |
\documentclass{article}
|
|
3 |
\usepackage{../style}
|
|
4 |
\usepackage{../langs}
|
|
5 |
\usepackage{../graphics}
|
|
6 |
\usepackage{../grammar}
|
677
|
7 |
%%\usepackage{multicol}
|
539
|
8 |
|
677
|
9 |
%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}
|
539
|
10 |
|
|
11 |
\begin{document}
|
677
|
12 |
\fnote{\copyright{} Christian Urban, King's College London, 2019}
|
539
|
13 |
|
|
14 |
|
677
|
15 |
\section*{Handout 9 (LLVM, SSA and CPS)}
|
539
|
16 |
|
677
|
17 |
Reflecting on our tiny compiler targetting the JVM, code generation was
|
|
18 |
actually not so hard, no? One of the main reason for this ease is that
|
|
19 |
the JVM is a stack-based virtual machine and it is, for example, not
|
|
20 |
hard to translate arithmetic expressions into instructions manipulating
|
|
21 |
the stack. The problem is that ``real'' CPUs, although supporting stack
|
|
22 |
operations, are not really \emph{stack machines}. They are just not
|
|
23 |
optimised for this way of calculating things. The design of CPUs is more
|
|
24 |
like, here is a piece of memory---compiler, or compiler writer, do
|
|
25 |
something with it. Otherwise get lost. So in the name of raw speed,
|
|
26 |
modern compilers go the extra mile and generate code that is much easier
|
|
27 |
and faster to process by CPUs.
|
539
|
28 |
|
677
|
29 |
Another reason why it makes sense to go the extra mile is that stack
|
|
30 |
instructions are very difficult to optimise---you cannot just re-arrange
|
|
31 |
instructions without messing about with what is calculated on the stack.
|
|
32 |
Also it is hard to find out if all the calculations on the stack are
|
|
33 |
actually necessary and not by chance dead code. The JVM has for all this
|
|
34 |
sophisticated machinery to make such ``high-level'' code still run fast,
|
|
35 |
but let's say that for the sake of argument we want to not rely on it.
|
|
36 |
We want to generate fast code ourselves. This means we have to work around
|
|
37 |
the intricacies of what instructions CPUs can actually process. To make
|
|
38 |
this all tractable, we target the LLVM Intermediate Language. In this way
|
|
39 |
we can take advantage of the tools coming with LLVM and for example do not
|
|
40 |
have to worry about things like that CPUs have only a limited amount of
|
|
41 |
registers.
|
539
|
42 |
|
677
|
43 |
LLVM\footnote{\url{http://llvm.org}} is a beautiful example that
|
|
44 |
projects from Academia can make a difference in the world. LLVM was
|
|
45 |
started in 2000 by two researchers at the University of Illinois at
|
|
46 |
Urbana–Champaign. At the time the behemoth of compilers was gcc with its
|
|
47 |
myriad of front-ends for other languages (e.g.~gfortran, Ada, Go,
|
|
48 |
Objective-C, Pascal etc). The problem was that gcc morphed over time
|
|
49 |
into a monolithic gigantic piece of m\ldots ehm software, which you could
|
|
50 |
not mess about in an afternoon. In contrast LLVM was a modular suite of
|
|
51 |
tools with which you could play around easily and try out something new.
|
|
52 |
LLVM became a big player once Apple hired one of the original developers
|
|
53 |
(I cannot remember the reason why Apple did not want to use gcc, but
|
|
54 |
maybe they were also just disgusted by big monolithic codebase). Anyway,
|
|
55 |
LLVM is now the big player and gcc is more or less legacy. This does not
|
|
56 |
mean that programming languages like C and C++ are dying out any time
|
|
57 |
soon---they are nicely supported by LLVM.
|
539
|
58 |
|
677
|
59 |
Targetting the LLVM-IR also means we can profit from the very modular
|
|
60 |
structure of the LLVM compiler and let for example the compiler generate
|
|
61 |
code for X86, or ARM etc. We can be agnostic about where our code
|
|
62 |
actually runs. However, what we have to do is to generate code in
|
|
63 |
\emph{Static Single-Assignment} format (short SSA), because that is what
|
|
64 |
the LLVM-IR expects from us and which is the intermediate format that
|
|
65 |
LLVM can use to do cool things (like targetting strange architectures)
|
|
66 |
and allocating memory efficiently.
|
539
|
67 |
|
677
|
68 |
The idea behind SSA is to use very simple variable assignments where
|
|
69 |
every variable is assigned only once. The assignments also need to be
|
|
70 |
extremely primitive in the sense that they can be just simple operations
|
|
71 |
like addition, multiplication, jumps and so on. A typical program in SSA
|
|
72 |
is
|
539
|
73 |
|
677
|
74 |
\begin{lstlisting}[language=LLVM,numbers=none]
|
|
75 |
x := 1
|
|
76 |
y := 2
|
|
77 |
z := x + y
|
|
78 |
\end{lstlisting}
|
539
|
79 |
|
677
|
80 |
\noindent where every variable is used only once. You cannot for example have
|
|
81 |
|
|
82 |
\begin{lstlisting}[language=LLVM,numbers=none]
|
|
83 |
x := 1
|
|
84 |
y := 2
|
|
85 |
x := x + y
|
|
86 |
\end{lstlisting}
|
539
|
87 |
|
677
|
88 |
\noindent because in this snippet \texttt{x} is assigned twice. There
|
|
89 |
are sophisticated algorithm for imperative languages, like C, for
|
|
90 |
efficiently transforming a program into SSA format, but we do not have
|
|
91 |
to be concerned about those. We want to compile a functional language
|
|
92 |
and there things get much more interesting than just sophisticated. We
|
|
93 |
will need to have a look at CPS translations, which stands for
|
|
94 |
Continuation-Passing-Style---basically black art. So sit tight.
|
539
|
95 |
|
677
|
96 |
\subsection*{CPS-Translations}
|
539
|
97 |
|
677
|
98 |
What is good about our simple fun language is that it basically only
|
|
99 |
contains expressions (be they arithmetic expressions or boolean expressions).
|
|
100 |
The only exceptions are function definitions, for which we can easily
|
|
101 |
use the mechanism of defining functions in LLVM-IR. For example the simple
|
|
102 |
fun program
|
539
|
103 |
|
677
|
104 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
105 |
def dble(x) = x * x
|
|
106 |
\end{lstlisting}
|
539
|
107 |
|
677
|
108 |
\noindent
|
|
109 |
can be compiled into the following LLVM-IR function:
|
539
|
110 |
|
677
|
111 |
\begin{lstlisting}[language=LLVM]
|
|
112 |
define i32 dble(i32 %x) {
|
|
113 |
%tmp = mul i32 %x, % x
|
|
114 |
ret i32 %tmp
|
|
115 |
}
|
|
116 |
\end{lstlisting}
|
539
|
117 |
|
|
118 |
|
|
119 |
\end{document}
|
|
120 |
|
|
121 |
|
|
122 |
%%% Local Variables:
|
|
123 |
%%% mode: latex
|
|
124 |
%%% TeX-master: t
|
|
125 |
%%% End:
|