677
|
1 |
% !TEX program = xelatex
|
539
|
2 |
\documentclass{article}
|
|
3 |
\usepackage{../style}
|
|
4 |
\usepackage{../langs}
|
|
5 |
\usepackage{../graphics}
|
|
6 |
\usepackage{../grammar}
|
677
|
7 |
%%\usepackage{multicol}
|
539
|
8 |
|
677
|
9 |
%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}
|
539
|
10 |
|
|
11 |
\begin{document}
|
677
|
12 |
\fnote{\copyright{} Christian Urban, King's College London, 2019}
|
539
|
13 |
|
|
14 |
|
677
|
15 |
\section*{Handout 9 (LLVM, SSA and CPS)}
|
539
|
16 |
|
678
|
17 |
Reflecting on our tiny compiler targetting the JVM, the code generation
|
|
18 |
part was actually not so hard, no? Pretty much just some post-traversal
|
680
|
19 |
of the abstract syntax tree, yes? One of the reasons for this ease is
|
|
20 |
that the JVM is a stack-based virtual machine and it is therefore not
|
|
21 |
hard to translate deeply-nested arithmetic expressions into a sequence
|
|
22 |
of instructions manipulating the stack. The problem is that ``real''
|
|
23 |
CPUs, although supporting stack operations, are not really designed to
|
|
24 |
be \emph{stack machines}. The design of CPUs is more like, here is a
|
|
25 |
chunk of memory---compiler, or better compiler writers, do something
|
|
26 |
with it. Consequently, modern compilers need to go the extra mile in
|
|
27 |
order to generate code that is much easier and faster to process by
|
|
28 |
CPUs. To make this all tractable for this module, we target the LLVM
|
|
29 |
Intermediate Language. In this way we can take advantage of the tools
|
|
30 |
coming with LLVM. For example we do not have to worry about things like
|
|
31 |
register allocations.\bigskip
|
539
|
32 |
|
678
|
33 |
\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
|
|
34 |
that projects from Academia can make a difference in the world. LLVM
|
|
35 |
started in 2000 as a project by two researchers at the University of
|
|
36 |
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
|
680
|
37 |
gcc with its myriad of front-ends for other languages (C++, Fortran,
|
678
|
38 |
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
|
|
39 |
time into a monolithic gigantic piece of m\ldots ehm software, which you
|
|
40 |
could not mess about in an afternoon. In contrast, LLVM is designed to
|
|
41 |
be a modular suite of tools with which you could play around easily and
|
|
42 |
try out something new. LLVM became a big player once Apple hired one of
|
|
43 |
the original developers (I cannot remember the reason why Apple did not
|
|
44 |
want to use gcc, but maybe they were also just disgusted by its big
|
|
45 |
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
|
|
46 |
or less legacy. This does not mean that programming languages like C and
|
|
47 |
C++ are dying out any time soon---they are nicely supported by LLVM.
|
539
|
48 |
|
680
|
49 |
We will target the LLVM Intermediate Language, or Intermediate
|
|
50 |
Representation (short LLVM-IR). As a result we can benefit from the
|
678
|
51 |
modular structure of the LLVM compiler and let for example the compiler
|
680
|
52 |
generate code for different CPUs, like X86 or ARM. That means we can be
|
|
53 |
agnostic about where our code actually runs. We can also be ignorant
|
|
54 |
about optimising code and allocating memory efficiently. However, what
|
|
55 |
we have to do is to generate code in \emph{Static Single-Assignment}
|
|
56 |
format (short SSA), because that is what the LLVM-IR expects from us.
|
539
|
57 |
|
678
|
58 |
The idea behind the SSA format is to use very simple variable
|
|
59 |
assignments where every variable is assigned only once. The assignments
|
|
60 |
also need to be primitive in the sense that they can be just simple
|
|
61 |
operations like addition, multiplication, jumps, comparisons and so on.
|
680
|
62 |
Say, we have an expression $((1 + a) + (3 + (b * 5)))$, then the
|
|
63 |
SSA is
|
|
64 |
|
|
65 |
\begin{lstlisting}[language=LLVMIR,numbers=left]
|
|
66 |
let tmp0 = add 1 a in
|
|
67 |
let tmp1 = mul b 5 in
|
|
68 |
let tmp2 = add 3 tmp1 in
|
|
69 |
let tmp3 = add tmp0 tmp2 in
|
|
70 |
tmp3
|
677
|
71 |
\end{lstlisting}
|
539
|
72 |
|
678
|
73 |
\noindent where every variable is used only once (we could not write
|
680
|
74 |
\texttt{tmp1 = add 3 tmp1} in Line 3 for example). There are
|
678
|
75 |
sophisticated algorithms for imperative languages, like C, that
|
|
76 |
efficiently transform a high-level program into SSA format. But we can
|
|
77 |
ignore them here. We want to compile a functional language and there
|
|
78 |
things get much more interesting than just sophisticated. We will need
|
|
79 |
to have a look at CPS translations, where the CPS stands for
|
|
80 |
Continuation-Passing-Style---basically black programming art or
|
|
81 |
abracadabra programming. So sit tight.
|
539
|
82 |
|
678
|
83 |
\subsection*{LLVM-IR}
|
539
|
84 |
|
678
|
85 |
Before we start, lets first have a look at the \emph{LLVM Intermediate
|
680
|
86 |
Representation} in more detail. What is good about our toy Fun language
|
|
87 |
is that it basically only contains expressions (be they arithmetic
|
|
88 |
expressions or boolean expressions or if-expressions). The exception are
|
|
89 |
function definitions. Luckily, for them we can use the mechanism of
|
|
90 |
defining functions in LLVM-IR. For example the simple Fun program
|
539
|
91 |
|
|
92 |
|
677
|
93 |
\begin{lstlisting}[language=Scala,numbers=none]
|
678
|
94 |
def sqr(x) = x * x
|
677
|
95 |
\end{lstlisting}
|
539
|
96 |
|
677
|
97 |
\noindent
|
|
98 |
can be compiled into the following LLVM-IR function:
|
539
|
99 |
|
677
|
100 |
\begin{lstlisting}[language=LLVM]
|
678
|
101 |
define i32 @sqr(i32 %x) {
|
|
102 |
%tmp = mul i32 %x, %x
|
677
|
103 |
ret i32 %tmp
|
|
104 |
}
|
|
105 |
\end{lstlisting}
|
539
|
106 |
|
680
|
107 |
\noindent First notice that all variable names in the LLVM-IR are
|
|
108 |
prefixed by \texttt{\%}; function names need to be prefixed with
|
|
109 |
@-symbol. Also, the LLVM-IR is a fully typed language. The \texttt{i32}
|
|
110 |
type stands for 32-bit integers. There are also types for 64-bit
|
|
111 |
integers (\texttt{i64}), chars (\texttt{i8}), floats, arrays and even
|
|
112 |
pointer types. In the code above, \texttt{sqr} takes an argument of type
|
|
113 |
\texttt{i32} and produces a result of type \texttt{i32} (the result type
|
|
114 |
is before the function name, like in C). Each arithmetic operation, like
|
|
115 |
addition or multiplication, are also prefixed with the type they operate
|
|
116 |
on. Obviously these types need to match up\ldots{} but since we have in
|
|
117 |
our programs only integers, \texttt{i32} everywhere will do.
|
539
|
118 |
|
679
|
119 |
Conveniently, you can use the program \texttt{lli}, which comes with
|
|
120 |
LLVM, to interpret programs written in the LLVM-IR. So you can easily
|
|
121 |
check whether the code you produced actually works. To get a running
|
|
122 |
program that does something interesting you need to add some boilerplate
|
|
123 |
about printing out numbers and a main-function that is the entrypoint
|
680
|
124 |
for the program (see Figure~\ref{lli} for a complete listing). You can
|
|
125 |
generate a binary for this program using \texttt{llc}-compiler and
|
|
126 |
\texttt{gcc}---\texttt{llc} can generate an object file and then you can
|
|
127 |
use \texttt{gcc} (that is clang) for generating an executable binary:
|
678
|
128 |
|
679
|
129 |
\begin{lstlisting}[language=bash,numbers=none]
|
|
130 |
llc -filetype=obj sqr.ll
|
|
131 |
gcc sqr.o -o a.out
|
|
132 |
./a.out
|
680
|
133 |
> 25
|
679
|
134 |
\end{lstlisting}
|
|
135 |
|
|
136 |
\begin{figure}[t]\small
|
|
137 |
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
|
|
138 |
\caption{An LLVM-IR program for calculating the square function. The
|
|
139 |
interesting function is \texttt{sqr} in Lines 13 -- 16. The main
|
|
140 |
function calls \texttt{sqr} and prints out the result. The other
|
|
141 |
code is boilerplate for printing out integers.\label{lli}}
|
678
|
142 |
\end{figure}
|
|
143 |
|
679
|
144 |
|
|
145 |
|
|
146 |
\subsection*{Our Own Intermediate Language}
|
|
147 |
|
|
148 |
Remember compilers have to solve the problem of bridging the gap between
|
680
|
149 |
``high-level'' programs and ``low-level'' hardware. If the gap is too
|
|
150 |
wide for one step, then a good strategy is to lay a stepping stone
|
|
151 |
somewhere in between. The LLVM-IR itself is such a stepping stone to
|
|
152 |
make the task of generating and optimising code easier. Like a real
|
|
153 |
compiler we will use our own stepping stone which I call the
|
|
154 |
\emph{K-language}. For this remember expressions (and boolean
|
|
155 |
expressions) in the Fun language. For convenience the Scala code is
|
|
156 |
shown on top of Figure~\ref{absfun}. Below is the code for the
|
|
157 |
K-language. There are two kinds of syntactic entities, namely
|
|
158 |
\emph{K-values} and \emph{K-expressions}. The central point of the
|
|
159 |
K-language is the \texttt{KLet}-constructor. For this recall that
|
|
160 |
arithmetic expressions such as $((1 + a) + (3 + (b * 5)))$ need
|
|
161 |
to be broken up into smaller ``atomic'' steps, like so
|
|
162 |
|
|
163 |
\begin{lstlisting}[language=LLVMIR,numbers=none]
|
|
164 |
let tmp0 = add 1 a in
|
|
165 |
let tmp1 = mul b 5 in
|
|
166 |
let tmp2 = add 3 tmp1 in
|
|
167 |
let tmp3 = add tmp0 tmp2 in
|
|
168 |
tmp3
|
|
169 |
\end{lstlisting}
|
|
170 |
|
|
171 |
\noindent
|
|
172 |
Here \texttt{tmp3} will contain the result of what the expression stands
|
|
173 |
for. In each step we can only perform an ``atomic'' operation, like
|
|
174 |
addition or multiplication. We could not for example have an
|
|
175 |
if-condition on the right-hand side of an equals. These constraints
|
|
176 |
enforced upon us because how the SSA format works in the LLVM-IR. By
|
|
177 |
having in \texttt{KLet}, first a string (standing for an intermediate
|
|
178 |
result) and second a value, we can fulfil this constraint---there is no
|
|
179 |
way we could write anything else than a value. To sum up, K-values are
|
|
180 |
the atomic operations that can be on the right-hand side of equal-signs.
|
|
181 |
The K-language is restricted such that it is easy to generate the SSA format
|
|
182 |
for the LLVM-IR.
|
|
183 |
|
679
|
184 |
|
|
185 |
|
|
186 |
\begin{figure}[p]\small
|
678
|
187 |
\begin{lstlisting}[language=Scala,numbers=none]
|
679
|
188 |
// Fun-language (expressions)
|
|
189 |
abstract class Exp
|
|
190 |
abstract class BExp
|
678
|
191 |
|
|
192 |
case class Call(name: String, args: List[Exp]) extends Exp
|
|
193 |
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
|
|
194 |
case class Write(e: Exp) extends Exp
|
|
195 |
case class Var(s: String) extends Exp
|
|
196 |
case class Num(i: Int) extends Exp
|
|
197 |
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
|
|
198 |
case class Sequence(e1: Exp, e2: Exp) extends Exp
|
679
|
199 |
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
|
|
200 |
|
|
201 |
|
|
202 |
|
|
203 |
// K-language (K-expressions, K-values)
|
|
204 |
abstract class KExp
|
|
205 |
abstract class KVal
|
|
206 |
|
|
207 |
case class KVar(s: String) extends KVal
|
|
208 |
case class KNum(i: Int) extends KVal
|
|
209 |
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
|
|
210 |
case class KCall(o: String, vrs: List[KVal]) extends KVal
|
|
211 |
case class KWrite(v: KVal) extends KVal
|
|
212 |
|
|
213 |
case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
|
680
|
214 |
case class KLet(x: String, v: KVal, e: KExp) extends KExp
|
679
|
215 |
case class KReturn(v: KVal) extends KExp
|
678
|
216 |
\end{lstlisting}
|
|
217 |
\caption{Abstract syntax trees for the Fun language.\label{absfun}}
|
|
218 |
\end{figure}
|
679
|
219 |
|
678
|
220 |
|
|
221 |
|
|
222 |
\subsection*{CPS-Translations}
|
|
223 |
|
|
224 |
|
679
|
225 |
|
|
226 |
|
|
227 |
|
|
228 |
Another reason why it makes sense to go the extra mile is that stack
|
|
229 |
instructions are very difficult to optimise---you cannot just re-arrange
|
|
230 |
instructions without messing about with what is calculated on the stack.
|
|
231 |
Also it is hard to find out if all the calculations on the stack are
|
|
232 |
actually necessary and not by chance dead code. The JVM has for all this
|
|
233 |
sophisticated machinery to make such ``high-level'' code still run fast,
|
|
234 |
but let's say that for the sake of argument we do not want to rely on
|
|
235 |
it. We want to generate fast code ourselves. This means we have to work
|
|
236 |
around the intricacies of what instructions CPUs can actually process.
|
|
237 |
|
539
|
238 |
\end{document}
|
|
239 |
|
|
240 |
|
|
241 |
%%% Local Variables:
|
|
242 |
%%% mode: latex
|
|
243 |
%%% TeX-master: t
|
|
244 |
%%% End:
|