677
|
1 |
% !TEX program = xelatex
|
539
|
2 |
\documentclass{article}
|
|
3 |
\usepackage{../style}
|
|
4 |
\usepackage{../langs}
|
|
5 |
\usepackage{../graphics}
|
|
6 |
\usepackage{../grammar}
|
677
|
7 |
%%\usepackage{multicol}
|
539
|
8 |
|
677
|
9 |
%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}
|
539
|
10 |
|
|
11 |
\begin{document}
|
677
|
12 |
\fnote{\copyright{} Christian Urban, King's College London, 2019}
|
539
|
13 |
|
|
14 |
|
677
|
15 |
\section*{Handout 9 (LLVM, SSA and CPS)}
|
539
|
16 |
|
700
|
17 |
Reflecting on our two tiny compilers targetting the JVM, the code
|
|
18 |
generation part was actually not so hard, no? Pretty much just some
|
|
19 |
post-traversal of the abstract syntax tree, yes? One of the reasons for
|
|
20 |
this ease is that the JVM is a stack-based virtual machine and it is
|
|
21 |
therefore not hard to translate deeply-nested arithmetic expressions
|
|
22 |
into a sequence of instructions manipulating the stack. The problem is
|
|
23 |
that ``real'' CPUs, although supporting stack operations, are not really
|
|
24 |
designed to be \emph{stack machines}. The design of CPUs is more like,
|
|
25 |
here is a chunk of memory---compiler, or better compiler writers, do
|
|
26 |
something with it. Consequently, modern compilers need to go the extra
|
|
27 |
mile in order to generate code that is much easier and faster to process
|
|
28 |
by CPUs. To make this all tractable for this module, we target the LLVM
|
680
|
29 |
Intermediate Language. In this way we can take advantage of the tools
|
|
30 |
coming with LLVM. For example we do not have to worry about things like
|
|
31 |
register allocations.\bigskip
|
539
|
32 |
|
678
|
33 |
\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
|
700
|
34 |
that projects from Academia can make a difference in the World. LLVM
|
678
|
35 |
started in 2000 as a project by two researchers at the University of
|
|
36 |
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
|
680
|
37 |
gcc with its myriad of front-ends for other languages (C++, Fortran,
|
678
|
38 |
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
|
|
39 |
time into a monolithic gigantic piece of m\ldots ehm software, which you
|
|
40 |
could not mess about in an afternoon. In contrast, LLVM is designed to
|
700
|
41 |
be a modular suite of tools with which you can play around easily and
|
678
|
42 |
try out something new. LLVM became a big player once Apple hired one of
|
|
43 |
the original developers (I cannot remember the reason why Apple did not
|
|
44 |
want to use gcc, but maybe they were also just disgusted by its big
|
|
45 |
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
|
|
46 |
or less legacy. This does not mean that programming languages like C and
|
|
47 |
C++ are dying out any time soon---they are nicely supported by LLVM.
|
539
|
48 |
|
700
|
49 |
We will target the LLVM Intermediate Language, or LLVM Intermediate
|
|
50 |
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
|
|
51 |
assembly language of Jasmin and Krakatau. It will also allow us to
|
|
52 |
benefit from the modular structure of the LLVM compiler and let for
|
|
53 |
example the compiler generate code for different CPUs, like X86 or ARM.
|
|
54 |
That means we can be agnostic about where our code actually runs. We can
|
|
55 |
also be ignorant about optimising code and allocating memory
|
|
56 |
efficiently.
|
539
|
57 |
|
700
|
58 |
However, what we have to do for LLVM is to generate code in \emph{Static
|
|
59 |
Single-Assignment} format (short SSA), because that is what the LLVM-IR
|
|
60 |
expects from us. A reason why LLVM uses the SSA format, rather than
|
|
61 |
JVM-like stack instructions, is that stack instructions are difficult to
|
|
62 |
optimise---you cannot just re-arrange instructions without messing about
|
|
63 |
with what is calculated on the stack. Also it is hard to find out if all
|
|
64 |
the calculations on the stack are actually necessary and not by chance
|
|
65 |
dead code. The JVM has for all these obstacles sophisticated machinery
|
|
66 |
to make such ``high-level'' code still run fast, but let's say that for
|
|
67 |
the sake of argument we do not want to rely on it. We want to generate
|
|
68 |
fast code ourselves. This means we have to work around the intricacies
|
|
69 |
of what instructions CPUs can actually process fast. This is what the
|
|
70 |
SSA format is designed for.
|
|
71 |
|
|
72 |
|
|
73 |
The main idea behind the SSA format is to use very simple variable
|
678
|
74 |
assignments where every variable is assigned only once. The assignments
|
|
75 |
also need to be primitive in the sense that they can be just simple
|
|
76 |
operations like addition, multiplication, jumps, comparisons and so on.
|
680
|
77 |
Say, we have an expression $((1 + a) + (3 + (b * 5)))$, then the
|
700
|
78 |
corresponding SSA format is
|
680
|
79 |
|
|
80 |
\begin{lstlisting}[language=LLVMIR,numbers=left]
|
|
81 |
let tmp0 = add 1 a in
|
|
82 |
let tmp1 = mul b 5 in
|
|
83 |
let tmp2 = add 3 tmp1 in
|
700
|
84 |
let tmp3 = add tmp0 tmp2 in tmp3
|
677
|
85 |
\end{lstlisting}
|
539
|
86 |
|
678
|
87 |
\noindent where every variable is used only once (we could not write
|
680
|
88 |
\texttt{tmp1 = add 3 tmp1} in Line 3 for example). There are
|
678
|
89 |
sophisticated algorithms for imperative languages, like C, that
|
|
90 |
efficiently transform a high-level program into SSA format. But we can
|
|
91 |
ignore them here. We want to compile a functional language and there
|
|
92 |
things get much more interesting than just sophisticated. We will need
|
|
93 |
to have a look at CPS translations, where the CPS stands for
|
|
94 |
Continuation-Passing-Style---basically black programming art or
|
|
95 |
abracadabra programming. So sit tight.
|
539
|
96 |
|
678
|
97 |
\subsection*{LLVM-IR}
|
539
|
98 |
|
700
|
99 |
Before we start, let's first have a look at the \emph{LLVM Intermediate
|
|
100 |
Representation} in more detail. The LLVM-IR is in between the frontends
|
|
101 |
and backends of the LLVM framework. It allows compilation of multiple
|
|
102 |
source languages to multiple targets. It is also the place where most of
|
|
103 |
the target independent optimisations are performed.
|
|
104 |
|
|
105 |
What is good about our toy Fun language is that it basically only
|
|
106 |
contains expressions (be they arithmetic expressions, boolean
|
|
107 |
expressions or if-expressions). The exception are function definitions.
|
|
108 |
Luckily, for them we can use the mechanism of defining functions in the
|
|
109 |
LLVM-IR (this is similar to using JVM methods for functions in our
|
|
110 |
earlier compiler). For example the simple Fun program
|
539
|
111 |
|
|
112 |
|
677
|
113 |
\begin{lstlisting}[language=Scala,numbers=none]
|
678
|
114 |
def sqr(x) = x * x
|
677
|
115 |
\end{lstlisting}
|
539
|
116 |
|
677
|
117 |
\noindent
|
700
|
118 |
can be compiled to the following LLVM-IR function:
|
539
|
119 |
|
677
|
120 |
\begin{lstlisting}[language=LLVM]
|
678
|
121 |
define i32 @sqr(i32 %x) {
|
|
122 |
%tmp = mul i32 %x, %x
|
677
|
123 |
ret i32 %tmp
|
|
124 |
}
|
|
125 |
\end{lstlisting}
|
539
|
126 |
|
700
|
127 |
\noindent First notice that all variable names, in this case \texttt{x}
|
|
128 |
and \texttt{tmp}, are prefixed with \texttt{\%} in the LLVM-IR.
|
|
129 |
Temporary variables can be named with an identifier, such as
|
|
130 |
\texttt{tmp}, or numbers. Function names, since they are ``global'',
|
|
131 |
need to be prefixed with @-symbol. Also, the LLVM-IR is a fully typed
|
|
132 |
language. The \texttt{i32} type stands for 32-bit integers. There are
|
|
133 |
also types for 64-bit integers (\texttt{i64}), chars (\texttt{i8}),
|
|
134 |
floats, arrays and even pointer types. In the code above, \texttt{sqr}
|
|
135 |
takes an argument of type \texttt{i32} and produces a result of type
|
|
136 |
\texttt{i32} (the result type is in front of the function name, like in
|
|
137 |
C). Each arithmetic operation, for example addition and multiplication,
|
|
138 |
are also prefixed with the type they operate on. Obviously these types
|
|
139 |
need to match up\ldots{} but since we have in our programs only
|
|
140 |
integers, \texttt{i32} everywhere will do. We do not have to generate
|
|
141 |
any other types, but obviously this is a limitation in our Fun-language.
|
539
|
142 |
|
700
|
143 |
There are a few interesting instructions in the LLVM-IR which are quite
|
|
144 |
different than in the JVM. Can you remember the kerfuffle we had to go
|
|
145 |
through with boolean expressions and negating the condition? In the
|
|
146 |
LLVM-IR, branching if-conditions is implemented differently: there
|
|
147 |
is a separate \texttt{br}-instruction as follows:
|
|
148 |
|
|
149 |
\begin{lstlisting}[language=LLVM]
|
|
150 |
br i1 %var, label %if_br, label %else_br
|
|
151 |
\end{lstlisting}
|
|
152 |
|
|
153 |
\noindent
|
|
154 |
The type \texttt{i1} stands for booleans. If the variable is true, then
|
|
155 |
this instruction jumps to the if-branch, which needs an explicit label;
|
|
156 |
otherwise to the else-branch, again with its own label. This allows us
|
|
157 |
to keep the meaning of the boolean expression as is. A value of type
|
|
158 |
boolean is generated in the LLVM-IR by the \texttt{icmp}-instruction.
|
|
159 |
This instruction is for integers (hence the \texttt{i}) and takes the
|
|
160 |
comparison operation as argument. For example
|
|
161 |
|
|
162 |
\begin{lstlisting}[language=LLVM]
|
|
163 |
icmp eq i32 %x, %y ; for equal
|
|
164 |
icmp sle i32 %x, %y ; signed less or equal
|
|
165 |
icmp slt i32 %x, %y ; signed less than
|
|
166 |
icmp ult i32 %x, %y ; unsigned less than
|
|
167 |
\end{lstlisting}
|
|
168 |
|
|
169 |
\noindent
|
|
170 |
In some operations, the LLVM-IR distinguishes between signed and
|
|
171 |
unsigned representations of integers.
|
|
172 |
|
679
|
173 |
Conveniently, you can use the program \texttt{lli}, which comes with
|
|
174 |
LLVM, to interpret programs written in the LLVM-IR. So you can easily
|
|
175 |
check whether the code you produced actually works. To get a running
|
|
176 |
program that does something interesting you need to add some boilerplate
|
|
177 |
about printing out numbers and a main-function that is the entrypoint
|
700
|
178 |
for the program (see Figure~\ref{lli} for a complete listing). Again
|
|
179 |
this is very similar to the boilerplate we needed to add in our JVM
|
|
180 |
compiler.
|
|
181 |
|
|
182 |
You can generate a binary for the program in Figure~\ref{lli} by using
|
|
183 |
the \texttt{llc}-compiler and then \texttt{gcc}, whereby \texttt{llc} generates
|
|
184 |
an object file and \texttt{gcc} (that is clang) generates the
|
|
185 |
executable binary:
|
678
|
186 |
|
679
|
187 |
\begin{lstlisting}[language=bash,numbers=none]
|
|
188 |
llc -filetype=obj sqr.ll
|
|
189 |
gcc sqr.o -o a.out
|
|
190 |
./a.out
|
680
|
191 |
> 25
|
679
|
192 |
\end{lstlisting}
|
|
193 |
|
|
194 |
\begin{figure}[t]\small
|
|
195 |
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
|
700
|
196 |
\caption{An LLVM-IR program for calculating the square function. It
|
|
197 |
calls this function in \texttt{@main} with the argument \texttt{5}. The
|
|
198 |
code for the \texttt{sqr} function is in Lines 13 -- 16. The main
|
|
199 |
function calls \texttt{sqr} and then prints out the result. The other
|
679
|
200 |
code is boilerplate for printing out integers.\label{lli}}
|
678
|
201 |
\end{figure}
|
|
202 |
|
679
|
203 |
|
|
204 |
|
|
205 |
\subsection*{Our Own Intermediate Language}
|
|
206 |
|
|
207 |
Remember compilers have to solve the problem of bridging the gap between
|
680
|
208 |
``high-level'' programs and ``low-level'' hardware. If the gap is too
|
|
209 |
wide for one step, then a good strategy is to lay a stepping stone
|
|
210 |
somewhere in between. The LLVM-IR itself is such a stepping stone to
|
|
211 |
make the task of generating and optimising code easier. Like a real
|
|
212 |
compiler we will use our own stepping stone which I call the
|
700
|
213 |
\emph{K-language}. For what follows recall the various kinds of
|
|
214 |
expressions in the Fun language. For convenience the Scala code of the
|
|
215 |
corresponding abstract syntax trees is shown on top of
|
|
216 |
Figure~\ref{absfun}. Below is the code for the abstract syntax trees in
|
|
217 |
the K-language. There are two kinds of syntactic entities, namely
|
|
218 |
\emph{K-values} and \emph{K-expressions}. The central constructor of the
|
|
219 |
K-language is \texttt{KLet}. For this recall that arithmetic expressions
|
|
220 |
such as $((1 + a) + (3 + (b * 5)))$ need to be broken up into smaller
|
|
221 |
``atomic'' steps, like so
|
680
|
222 |
|
|
223 |
\begin{lstlisting}[language=LLVMIR,numbers=none]
|
|
224 |
let tmp0 = add 1 a in
|
|
225 |
let tmp1 = mul b 5 in
|
|
226 |
let tmp2 = add 3 tmp1 in
|
|
227 |
let tmp3 = add tmp0 tmp2 in
|
|
228 |
tmp3
|
|
229 |
\end{lstlisting}
|
|
230 |
|
|
231 |
\noindent
|
700
|
232 |
Here \texttt{tmp3} will contain the result of what the whole expression
|
|
233 |
stands for. In each individual step we can only perform an ``atomic''
|
|
234 |
operation, like addition or multiplication of a number and a variable.
|
|
235 |
We are not allowed to have for example an if-condition on the right-hand
|
|
236 |
side of an equals. Such constraints are enforced upon us because of how
|
|
237 |
the SSA format works in the LLVM-IR. By having in \texttt{KLet} taking
|
|
238 |
first a string (standing for an intermediate result) and second a value,
|
|
239 |
we can fulfil this constraint ``by construction''---there is no way we
|
|
240 |
could write anything else than a value.
|
|
241 |
|
|
242 |
To sum up, K-values are the atomic operations that can be on the
|
|
243 |
right-hand side of equal-signs. The K-language is restricted such that
|
|
244 |
it is easy to generate the SSA format for the LLVM-IR.
|
680
|
245 |
|
679
|
246 |
|
|
247 |
|
|
248 |
\begin{figure}[p]\small
|
678
|
249 |
\begin{lstlisting}[language=Scala,numbers=none]
|
679
|
250 |
// Fun-language (expressions)
|
|
251 |
abstract class Exp
|
|
252 |
abstract class BExp
|
678
|
253 |
|
|
254 |
case class Call(name: String, args: List[Exp]) extends Exp
|
|
255 |
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
|
|
256 |
case class Write(e: Exp) extends Exp
|
|
257 |
case class Var(s: String) extends Exp
|
|
258 |
case class Num(i: Int) extends Exp
|
|
259 |
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
|
|
260 |
case class Sequence(e1: Exp, e2: Exp) extends Exp
|
679
|
261 |
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
|
|
262 |
|
|
263 |
|
|
264 |
|
|
265 |
// K-language (K-expressions, K-values)
|
|
266 |
abstract class KExp
|
|
267 |
abstract class KVal
|
|
268 |
|
|
269 |
case class KVar(s: String) extends KVal
|
|
270 |
case class KNum(i: Int) extends KVal
|
|
271 |
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
|
|
272 |
case class KCall(o: String, vrs: List[KVal]) extends KVal
|
|
273 |
case class KWrite(v: KVal) extends KVal
|
|
274 |
|
|
275 |
case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
|
680
|
276 |
case class KLet(x: String, v: KVal, e: KExp) extends KExp
|
679
|
277 |
case class KReturn(v: KVal) extends KExp
|
678
|
278 |
\end{lstlisting}
|
|
279 |
\caption{Abstract syntax trees for the Fun language.\label{absfun}}
|
|
280 |
\end{figure}
|
679
|
281 |
|
678
|
282 |
|
|
283 |
|
|
284 |
\subsection*{CPS-Translations}
|
|
285 |
|
700
|
286 |
The main difficulty of generating instructions in SSA format is that
|
|
287 |
large compound expressions need to be broken up into smaller pieces and
|
|
288 |
intermediate results need to be chained into later instructions. To do
|
|
289 |
this conveniently, CPS-translations have been developed. They use
|
|
290 |
functions (``continuations'') to represent what is coming next in a
|
|
291 |
sequence of instructions.
|
678
|
292 |
|
679
|
293 |
|
|
294 |
|
|
295 |
|
|
296 |
|
539
|
297 |
\end{document}
|
|
298 |
|
|
299 |
|
|
300 |
%%% Local Variables:
|
|
301 |
%%% mode: latex
|
|
302 |
%%% TeX-master: t
|
|
303 |
%%% End:
|