677
|
1 |
% !TEX program = xelatex
|
539
|
2 |
\documentclass{article}
|
|
3 |
\usepackage{../style}
|
|
4 |
\usepackage{../langs}
|
|
5 |
\usepackage{../graphics}
|
|
6 |
\usepackage{../grammar}
|
|
7 |
|
|
8 |
\begin{document}
|
908
|
9 |
\fnote{\copyright{} Christian Urban, King's College London, 2019, 2023}
|
539
|
10 |
|
|
11 |
|
722
|
12 |
% CPS translations
|
|
13 |
% https://felleisen.org/matthias/4400-s20/lecture17.html
|
|
14 |
|
|
15 |
%% pattern matching resources
|
|
16 |
%% https://www.reddit.com/r/ProgrammingLanguages/comments/g1vno3/beginner_resources_for_compiling_pattern_matching/
|
|
17 |
|
677
|
18 |
\section*{Handout 9 (LLVM, SSA and CPS)}
|
539
|
19 |
|
700
|
20 |
Reflecting on our two tiny compilers targetting the JVM, the code
|
|
21 |
generation part was actually not so hard, no? Pretty much just some
|
908
|
22 |
post-traversal of the abstract syntax tree, yes? One of the reasons
|
|
23 |
for this ease is that the JVM is a stack-based virtual machine and it
|
|
24 |
is therefore not hard to translate deeply-nested arithmetic
|
|
25 |
expressions into a sequence of instructions manipulating the
|
|
26 |
stack. The problem is that ``real'' CPUs, although supporting stack
|
|
27 |
operations, are not really designed to be \emph{stack machines}. The
|
|
28 |
design of CPUs is more like: Here are some operations and a chunk of
|
|
29 |
memory---compiler, or better compiler writers, do something with
|
|
30 |
them. Consequently, modern compilers need to go the extra mile in
|
|
31 |
order to generate code that is much easier and faster to process by
|
|
32 |
actual CPUs, rather than running code on virtual machines that
|
|
33 |
introduce an additional layer of indirection. To make this all
|
|
34 |
tractable for this module, we target the LLVM Intermediate
|
|
35 |
Language. In this way we can take advantage of the tools coming with
|
|
36 |
LLVM.\footnote{\url{http://llvm.org}} For example we do not have to
|
|
37 |
worry about things like register allocations. By using LLVM, however,
|
|
38 |
we also have to pay price in the sense that generating code gets
|
|
39 |
harder\ldots{}unfortunately.
|
539
|
40 |
|
908
|
41 |
\subsection*{LLVM and LLVM-IR}
|
|
42 |
|
|
43 |
\noindent LLVM is a beautiful example
|
700
|
44 |
that projects from Academia can make a difference in the World. LLVM
|
678
|
45 |
started in 2000 as a project by two researchers at the University of
|
|
46 |
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
|
908
|
47 |
gcc with its myriad of front-ends for different programming languages (C++, Fortran,
|
678
|
48 |
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
|
908
|
49 |
time into a monolithic gigantic piece of m\ldots ehm complicated
|
|
50 |
software, which you
|
678
|
51 |
could not mess about in an afternoon. In contrast, LLVM is designed to
|
700
|
52 |
be a modular suite of tools with which you can play around easily and
|
678
|
53 |
try out something new. LLVM became a big player once Apple hired one of
|
|
54 |
the original developers (I cannot remember the reason why Apple did not
|
898
|
55 |
want to use gcc, but maybe they were also just disgusted by gcc's big
|
678
|
56 |
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
|
|
57 |
or less legacy. This does not mean that programming languages like C and
|
|
58 |
C++ are dying out any time soon---they are nicely supported by LLVM.
|
539
|
59 |
|
700
|
60 |
We will target the LLVM Intermediate Language, or LLVM Intermediate
|
|
61 |
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
|
908
|
62 |
assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also
|
|
63 |
allow us to benefit from the modular structure of the LLVM compiler
|
|
64 |
and we can let, for example, the compiler generate code for different
|
|
65 |
CPUs, say X86 or ARM. That means we can be agnostic about where our
|
|
66 |
code is actually going to run.\footnote{Anybody want to try to run our
|
|
67 |
programs on Android phones?} We can also be somewhat ignorant about
|
|
68 |
optimising our code and about allocating memory efficiently: the LLVM
|
|
69 |
tools will take care of all this.
|
539
|
70 |
|
908
|
71 |
However, what we have to do in order to make LLVM to play ball is to
|
|
72 |
generate code in \emph{Static Single-Assignment} format (short SSA),
|
|
73 |
because that is what the LLVM-IR expects from us. A reason why LLVM
|
|
74 |
uses the SSA format, rather than JVM-like stack instructions, is that
|
|
75 |
stack instructions are difficult to optimise---you cannot just
|
|
76 |
re-arrange instructions without messing about with what is calculated
|
|
77 |
on the stack. Also it is hard to find out if all the calculations on
|
|
78 |
the stack are actually necessary and not by chance dead code. The JVM
|
|
79 |
has for all these obstacles sophisticated machinery to make such
|
|
80 |
``high-level'' code still run fast, but let's say that for the sake of
|
|
81 |
argument we do not want to rely on it. We want to generate fast code
|
|
82 |
ourselves. This means we have to work around the intricacies of what
|
|
83 |
instructions CPUs can actually process fast. This is what the SSA
|
|
84 |
format is designed for.
|
700
|
85 |
|
|
86 |
|
|
87 |
The main idea behind the SSA format is to use very simple variable
|
908
|
88 |
assignments where every tmp-variable is assigned only once. The
|
|
89 |
assignments also need to be primitive in the sense that they can be
|
|
90 |
just simple operations like addition, multiplication, jumps,
|
|
91 |
comparisons, function calls and so on. Say, we have an expression
|
|
92 |
$((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding SSA format is
|
680
|
93 |
|
|
94 |
\begin{lstlisting}[language=LLVMIR,numbers=left]
|
908
|
95 |
let tmp0 = add 1 a in
|
|
96 |
let tmp1 = add 3 2 in
|
|
97 |
let tmp2 = call foo(tmp1)
|
|
98 |
let tmp3 = mul b 5 in
|
|
99 |
let tmp4 = add tmp2 tmp3 in
|
|
100 |
let tmp5 = add tmp0 tmp4 in
|
|
101 |
return tmp5
|
677
|
102 |
\end{lstlisting}
|
539
|
103 |
|
908
|
104 |
\noindent where every tmp-variable is used only once (we could, for
|
|
105 |
example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this
|
|
106 |
would not change the overall result).
|
898
|
107 |
|
|
108 |
There are sophisticated algorithms for imperative languages, like C,
|
|
109 |
that efficiently transform a high-level program into SSA format. But
|
|
110 |
we can ignore them here. We want to compile a functional language and
|
|
111 |
there things get much more interesting than just sophisticated. We
|
|
112 |
will need to have a look at CPS translations, where the CPS stands for
|
678
|
113 |
Continuation-Passing-Style---basically black programming art or
|
|
114 |
abracadabra programming. So sit tight.
|
539
|
115 |
|
678
|
116 |
\subsection*{LLVM-IR}
|
539
|
117 |
|
908
|
118 |
Before we start, let's however first have a look at the \emph{LLVM Intermediate
|
700
|
119 |
Representation} in more detail. The LLVM-IR is in between the frontends
|
|
120 |
and backends of the LLVM framework. It allows compilation of multiple
|
|
121 |
source languages to multiple targets. It is also the place where most of
|
|
122 |
the target independent optimisations are performed.
|
|
123 |
|
908
|
124 |
What is good about our toy Fun-language is that it basically only
|
700
|
125 |
contains expressions (be they arithmetic expressions, boolean
|
|
126 |
expressions or if-expressions). The exception are function definitions.
|
|
127 |
Luckily, for them we can use the mechanism of defining functions in the
|
|
128 |
LLVM-IR (this is similar to using JVM methods for functions in our
|
908
|
129 |
earlier compiler). For example the simple Fun-program
|
539
|
130 |
|
|
131 |
|
677
|
132 |
\begin{lstlisting}[language=Scala,numbers=none]
|
678
|
133 |
def sqr(x) = x * x
|
677
|
134 |
\end{lstlisting}
|
539
|
135 |
|
677
|
136 |
\noindent
|
700
|
137 |
can be compiled to the following LLVM-IR function:
|
539
|
138 |
|
677
|
139 |
\begin{lstlisting}[language=LLVM]
|
678
|
140 |
define i32 @sqr(i32 %x) {
|
|
141 |
%tmp = mul i32 %x, %x
|
677
|
142 |
ret i32 %tmp
|
|
143 |
}
|
|
144 |
\end{lstlisting}
|
539
|
145 |
|
908
|
146 |
\noindent First notice that all ``local'' variable names, in this case
|
|
147 |
\texttt{x} and \texttt{tmp}, are prefixed with \texttt{\%} in the
|
|
148 |
LLVM-IR. Temporary variables can be named with an identifier, such as
|
|
149 |
\texttt{tmp}, or numbers. In contrast, function names, since they are
|
|
150 |
``global'', need to be prefixed with an @-symbol. Also, the LLVM-IR is
|
|
151 |
a fully typed language. The \texttt{i32} type stands for 32-bit
|
|
152 |
integers. There are also types for 64-bit integers (\texttt{i64}),
|
|
153 |
chars (\texttt{i8}), floats, arrays and even pointer types. In the
|
|
154 |
code above, \texttt{sqr} takes an argument of type \texttt{i32} and
|
|
155 |
produces a result of type \texttt{i32} (the result type is in front of
|
|
156 |
the function name, like in C). Each arithmetic operation, for example
|
|
157 |
addition and multiplication, are also prefixed with the type they
|
|
158 |
operate on. Obviously these types need to match up\ldots{} but since
|
|
159 |
we have in our programs only integers, for the moment \texttt{i32}
|
|
160 |
everywhere will do. We do not have to generate any other types, but
|
|
161 |
obviously this is a limitation in our Fun-language (and which we
|
|
162 |
lift in the final CW).
|
539
|
163 |
|
700
|
164 |
There are a few interesting instructions in the LLVM-IR which are quite
|
701
|
165 |
different than what we have seen in the JVM. Can you remember the
|
|
166 |
kerfuffle we had to go through with boolean expressions and negating the
|
908
|
167 |
condition? In the LLVM-IR, branching if-conditions are implemented
|
701
|
168 |
differently: there is a separate \texttt{br}-instruction as follows:
|
700
|
169 |
|
|
170 |
\begin{lstlisting}[language=LLVM]
|
|
171 |
br i1 %var, label %if_br, label %else_br
|
|
172 |
\end{lstlisting}
|
|
173 |
|
|
174 |
\noindent
|
908
|
175 |
The type \texttt{i1} stands for booleans. If the variable is true,
|
|
176 |
then this instruction jumps to the if-branch, which needs an explicit
|
|
177 |
label; otherwise it jumps to the else-branch, again with its own
|
|
178 |
label. This allows us to keep the meaning of the boolean expression
|
|
179 |
``as is'' when compiling if's---thanks god no more negating of
|
|
180 |
booleans.
|
|
181 |
|
701
|
182 |
A value of type boolean is generated in the LLVM-IR by the
|
|
183 |
\texttt{icmp}-instruction. This instruction is for integers (hence the
|
|
184 |
\texttt{i}) and takes the comparison operation as argument. For example
|
700
|
185 |
|
|
186 |
\begin{lstlisting}[language=LLVM]
|
898
|
187 |
icmp eq i32 %x, %y ; for equal
|
|
188 |
icmp sle i32 %x, %y ; signed less or equal
|
|
189 |
icmp slt i32 %x, %y ; signed less than
|
|
190 |
icmp ult i32 %x, %y ; unsigned less than
|
700
|
191 |
\end{lstlisting}
|
|
192 |
|
|
193 |
\noindent
|
898
|
194 |
Note that in some operations the LLVM-IR distinguishes between signed and
|
908
|
195 |
unsigned representations of integers. I assume you know what this means,
|
|
196 |
otherwise just ask.
|
700
|
197 |
|
701
|
198 |
It is also easy to call another function in LLVM-IR: as can be
|
|
199 |
seen from Figure~\ref{lli} we can just call a function with the
|
|
200 |
instruction \texttt{call} and can also assign the result to
|
|
201 |
a variable. The syntax is as follows
|
|
202 |
|
|
203 |
\begin{lstlisting}[language=LLVM]
|
|
204 |
%var = call i32 @foo(...args...)
|
|
205 |
\end{lstlisting}
|
|
206 |
|
|
207 |
\noindent
|
908
|
208 |
where the arguments can only be simple variables and numbers, but not compound
|
701
|
209 |
expressions.
|
|
210 |
|
679
|
211 |
Conveniently, you can use the program \texttt{lli}, which comes with
|
|
212 |
LLVM, to interpret programs written in the LLVM-IR. So you can easily
|
|
213 |
check whether the code you produced actually works. To get a running
|
|
214 |
program that does something interesting you need to add some boilerplate
|
701
|
215 |
about printing out numbers and a main-function that is the entry point
|
908
|
216 |
for the program (see Figure~\ref{lli} for a complete listing of the square function). Again
|
700
|
217 |
this is very similar to the boilerplate we needed to add in our JVM
|
|
218 |
compiler.
|
|
219 |
|
|
220 |
You can generate a binary for the program in Figure~\ref{lli} by using
|
908
|
221 |
the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang}, whereby \texttt{llc} generates
|
|
222 |
an object file and \texttt{gcc} (that is actually \texttt{clang} on my Mac) generates the
|
700
|
223 |
executable binary:
|
678
|
224 |
|
679
|
225 |
\begin{lstlisting}[language=bash,numbers=none]
|
|
226 |
llc -filetype=obj sqr.ll
|
|
227 |
gcc sqr.o -o a.out
|
|
228 |
./a.out
|
680
|
229 |
> 25
|
679
|
230 |
\end{lstlisting}
|
|
231 |
|
|
232 |
\begin{figure}[t]\small
|
|
233 |
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
|
700
|
234 |
\caption{An LLVM-IR program for calculating the square function. It
|
908
|
235 |
calls the sqr-function in \texttt{@main} with the argument \texttt{5}
|
|
236 |
(Line 20). The
|
|
237 |
code for the \texttt{sqr} function is in Lines 13 -- 16. It stores
|
|
238 |
the result of sqr in the variable called \texttt{\%i} and then
|
|
239 |
prints out the contents of this variable in Line 21. The other
|
|
240 |
code is boilerplate for printing out integers.\label{lli}}
|
678
|
241 |
\end{figure}
|
|
242 |
|
679
|
243 |
|
|
244 |
|
|
245 |
\subsection*{Our Own Intermediate Language}
|
|
246 |
|
908
|
247 |
Let's get back to our compiler: Remember compilers have to solve the
|
|
248 |
problem of bridging the gap between ``high-level'' programs and
|
|
249 |
``low-level'' hardware. If the gap is too wide for one step, then a
|
|
250 |
good strategy is to lay a stepping stone somewhere in between. The
|
|
251 |
LLVM-IR itself is such a stepping stone to make the task of generating
|
|
252 |
and optimising code easier. Like a real compiler we will use our own
|
|
253 |
stepping stone which I call the \emph{K-language}.
|
|
254 |
|
|
255 |
\begin{center}
|
|
256 |
\begin{tikzpicture}[scale=0.9,font=\bf,
|
|
257 |
node/.style={
|
|
258 |
rectangle,rounded corners=3mm,
|
|
259 |
ultra thick,draw=black!50,minimum height=16mm,
|
|
260 |
minimum width=20mm,
|
|
261 |
top color=white,bottom color=black!20}]
|
|
262 |
|
|
263 |
%\node (0) at (-3,0) {};
|
|
264 |
\node (A) at (0.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}source}] {Fun-Language};
|
|
265 |
\node (B) at (3.5,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}stepping stone}] {K-Language};
|
|
266 |
\node (C) at (7.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}target}] {LLVM-IR};
|
|
267 |
|
|
268 |
\draw [->,line width=2.5mm] (A) -- node [above,pos=0.35] {} (B);
|
|
269 |
\draw [->,line width=2.5mm] (B) -- node [above,pos=0.35] {} (C);
|
|
270 |
\end{tikzpicture}
|
|
271 |
\end{center}
|
|
272 |
|
|
273 |
\noindent
|
|
274 |
To see why we need a stepping stone for generating code in SSA-format,
|
|
275 |
considder again arithmetic expressions such as
|
|
276 |
$((1 + a) + (3 + (b * 5)))$. They need to be broken up into smaller
|
|
277 |
``atomic'' steps, like so
|
680
|
278 |
|
|
279 |
\begin{lstlisting}[language=LLVMIR,numbers=none]
|
|
280 |
let tmp0 = add 1 a in
|
|
281 |
let tmp1 = mul b 5 in
|
|
282 |
let tmp2 = add 3 tmp1 in
|
|
283 |
let tmp3 = add tmp0 tmp2 in
|
908
|
284 |
return tmp3
|
|
285 |
\end{lstlisting}
|
|
286 |
|
|
287 |
\noindent
|
|
288 |
Here \texttt{tmp3} will contain the result of what the whole
|
|
289 |
expression stands for. In each individual step we can only perform an
|
|
290 |
``atomic'' or ``trival'' operation, like addition or multiplication of
|
|
291 |
a number or a variable. We are not allowed to have for example a
|
|
292 |
nested addition or an if-condition on the right-hand side of an
|
|
293 |
assignment. Such constraints are forced upon us because of how the SSA
|
|
294 |
format works in the LLVM-IR. To simplify matters we represent
|
|
295 |
assignments with two kinds of syntactic entities, namely
|
|
296 |
\emph{K-values} and \emph{K-expressions}. K-values are all ``things''
|
|
297 |
that can appear on the right-hand side of an equal. Say we have
|
|
298 |
the following definition for K-values:
|
|
299 |
|
|
300 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
301 |
enum KVal {
|
|
302 |
case KVar(s: String)
|
|
303 |
case KNum(n: Int)
|
|
304 |
case KAop(op: String, v1: KVal, v2: KVal)
|
|
305 |
case KCall(fname: String, args: List[KVal])
|
|
306 |
}
|
680
|
307 |
\end{lstlisting}
|
|
308 |
|
|
309 |
\noindent
|
908
|
310 |
where a K-value can be a variable, a number or a ``trivial'' binary
|
|
311 |
operation, such as addition or multiplication. The two arguments of a
|
|
312 |
binary operation need to be K-values as well. Finally, we have
|
|
313 |
function calls, but again each argument of the function call needs to
|
|
314 |
be a K-value. One point we need to be careful, however, is that the
|
|
315 |
arguments of the binary operations and function calls are in fact only
|
|
316 |
variables or numbers. To encode this constraint into the type-system
|
|
317 |
of Scala would make things too complicated---to satisfy this
|
|
318 |
constraint is therefore on us. For our Fun-language, we will later on
|
|
319 |
consider some further K-values.
|
679
|
320 |
|
|
321 |
|
908
|
322 |
Our K-expressions will encode the \texttt{let} and the \texttt{return}
|
|
323 |
from the SSA-format (again for the moment we only consider these two
|
|
324 |
constructors---later on we will have if-branches as well).
|
678
|
325 |
|
908
|
326 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
327 |
enum KExp {
|
|
328 |
case KLet(x: String, v: KVal, e: KExp)
|
|
329 |
case KReturn(v: KVal)
|
|
330 |
}
|
|
331 |
\end{lstlisting}
|
679
|
332 |
|
908
|
333 |
\noindent
|
|
334 |
By having in \texttt{KLet} taking first a string (standing for an
|
|
335 |
intermediate variable) and second a value, we can fulfil the SSA
|
|
336 |
constraint in assignments ``by construction''---there is no way we
|
|
337 |
could write anything else than a K-value. Note that the third
|
|
338 |
argument of a \texttt{KLet} is again a K-expression, meaning either
|
|
339 |
another \texttt{KLet} or a \texttt{KReturn}. In this way we can
|
|
340 |
construct a sequence of computations and return a final result. Of
|
|
341 |
course we also have to ensure that all intermediary computations are
|
|
342 |
given (fresh) names---we will use an (ugly) counter for this.
|
679
|
343 |
|
908
|
344 |
To sum up, K-values are the atomic operations that can be on the
|
|
345 |
right-hand side of assignemnts. The K-language is restricted such that
|
|
346 |
it is easy to generate the SSA format for the LLVM-IR. What remains to
|
|
347 |
be done is a translation of our Fun-language into the K-language. The
|
|
348 |
main difficulty is that we need to order the computationa---in teh
|
|
349 |
K-language we only have linear sequence of instructions to need to be
|
|
350 |
followed. Before we explain this, we have a look at some
|
|
351 |
CPS-translation.
|
679
|
352 |
|
|
353 |
|
678
|
354 |
|
|
355 |
|
|
356 |
\subsection*{CPS-Translations}
|
|
357 |
|
704
|
358 |
CPS stands for Continuation-Passing-Style. It is a kind of programming
|
|
359 |
technique often used in advanced functional programming. Before we delve
|
908
|
360 |
into the CPS-translation for our Fun-language, let us look at
|
704
|
361 |
CPS-versions of some well-known functions. Consider
|
|
362 |
|
|
363 |
\begin{lstlisting}[language=Scala, numbers=none]
|
|
364 |
def fact(n: Int) : Int =
|
|
365 |
if (n == 0) 1 else n * fact(n - 1)
|
|
366 |
\end{lstlisting}
|
|
367 |
|
|
368 |
\noindent
|
|
369 |
This is clearly the usual factorial function. But now consider the
|
|
370 |
following version of the factorial function:
|
|
371 |
|
|
372 |
\begin{lstlisting}[language=Scala, numbers=none]
|
|
373 |
def factC(n: Int, ret: Int => Int) : Int =
|
|
374 |
if (n == 0) ret(1)
|
|
375 |
else factC(n - 1, x => ret(n * x))
|
|
376 |
|
|
377 |
factC(3, identity)
|
|
378 |
\end{lstlisting}
|
|
379 |
|
|
380 |
\noindent
|
|
381 |
This function is called with the number, in this case 3, and the
|
|
382 |
identity-function (which returns just its input). The recursive
|
|
383 |
calls are:
|
|
384 |
|
|
385 |
\begin{lstlisting}[language=Scala, numbers=none]
|
|
386 |
factC(2, x => identity(3 * x))
|
|
387 |
factC(1, x => identity(3 * (2 * x)))
|
|
388 |
factC(0, x => identity(3 * (2 * (1 * x))))
|
|
389 |
\end{lstlisting}
|
|
390 |
|
|
391 |
\noindent
|
|
392 |
Having reached 0, we get out of the recursion and apply 1 to the
|
|
393 |
continuation (see if-branch above). This gives
|
|
394 |
|
|
395 |
\begin{lstlisting}[language=Scala, numbers=none]
|
|
396 |
identity(3 * (2 * (1 * 1)))
|
|
397 |
= 3 * (2 * (1 * 1))
|
|
398 |
= 6
|
|
399 |
\end{lstlisting}
|
|
400 |
|
|
401 |
\noindent
|
898
|
402 |
which is the expected result. If this looks somewhat familiar to you,
|
|
403 |
than this is because functions with continuations can be
|
908
|
404 |
seen as a kind of generalisation of tail-recursive functions.
|
|
405 |
So far we did not look at this generalisation in earnest.
|
|
406 |
Anyway
|
704
|
407 |
notice how the continuations is ``stacked up'' during the recursion and
|
|
408 |
then ``unrolled'' when we apply 1 to the continuation. Interestingly, we
|
|
409 |
can do something similar to the Fibonacci function where in the traditional
|
|
410 |
version we have two recursive calls. Consider the following function
|
|
411 |
|
|
412 |
\begin{lstlisting}[language=Scala, numbers=none]
|
|
413 |
def fibC(n: Int, ret: Int => Int) : Int =
|
|
414 |
if (n == 0 || n == 1) ret(1)
|
|
415 |
else fibC(n - 1,
|
|
416 |
r1 => fibC(n - 2,
|
|
417 |
r2 => ret(r1 + r2)))
|
|
418 |
\end{lstlisting}
|
|
419 |
|
|
420 |
\noindent
|
|
421 |
Here the continuation is a nested function essentially wrapping up
|
|
422 |
the second recursive call. Let us check how the recursion unfolds
|
|
423 |
when called with 3 and the identity function:
|
|
424 |
|
|
425 |
\begin{lstlisting}[language=Scala, numbers=none]
|
|
426 |
fibC(3, id)
|
|
427 |
fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
|
|
428 |
fibC(1, r1 =>
|
|
429 |
fibC(0, r2 => fibC(1, r2a => id((r1 + r2) + r2a))))
|
|
430 |
fibC(0, r2 => fibC(1, r2a => id((1 + r2) + r2a)))
|
|
431 |
fibC(1, r2a => id((1 + 1) + r2a))
|
|
432 |
id((1 + 1) + 1)
|
|
433 |
(1 + 1) + 1
|
|
434 |
3
|
|
435 |
\end{lstlisting}
|
|
436 |
|
908
|
437 |
\noindent
|
|
438 |
The point of this section is that you are playing around with these
|
|
439 |
simple definitions and make sure they calculate the expected result.
|
|
440 |
In the next step, you should understand \emph{how} these functions
|
|
441 |
calculate the result with the continuations.
|
|
442 |
|
|
443 |
|
|
444 |
\section*{Worked Example}
|
|
445 |
|
|
446 |
Let us now come back to the CPS-translations for the Fun-language. The
|
704
|
447 |
main difficulty of generating instructions in SSA format is that large
|
|
448 |
compound expressions need to be broken up into smaller pieces and
|
700
|
449 |
intermediate results need to be chained into later instructions. To do
|
908
|
450 |
this conveniently, we use the CPS-translation mechanism. What continuations
|
|
451 |
essentially solve is the following problem: Given an expressions
|
|
452 |
|
|
453 |
\begin{equation}\label{exp}
|
|
454 |
(1 + 2) * (3 + 4)
|
|
455 |
\end{equation}
|
|
456 |
|
|
457 |
\noindent
|
|
458 |
which of the subexpressions should be calculated first? We just arbitrarily
|
|
459 |
going to decide that it will be from left to right. This means we have
|
|
460 |
to tear the expression shown in \eqref{exp} apart as follows:
|
|
461 |
|
|
462 |
\[
|
|
463 |
(1 + 2) \qquad\text{and}\qquad \Box * (3 + 4)
|
|
464 |
\]
|
|
465 |
|
|
466 |
\noindent
|
|
467 |
The first one will give us a result, but the second one is not
|
|
468 |
really an expression that makes sense. It will only make sense
|
|
469 |
as the next step in the computation when we fill-in the result of
|
|
470 |
$1+2$ into the ``hole'' indicated by $\Box$. Such incomplete
|
|
471 |
expressions can be represented in Scala as functions
|
|
472 |
|
|
473 |
\begin{lstlisting}[language=Scala, numbers=none]
|
|
474 |
r => r * (3 + 4)
|
|
475 |
\end{lstlisting}
|
|
476 |
|
|
477 |
\noindent
|
|
478 |
where \texttt{r} is any result that has been computed earlier. By the
|
|
479 |
way in lambda-calculus notation we would write
|
|
480 |
$\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
|
700
|
481 |
functions (``continuations'') to represent what is coming next in a
|
908
|
482 |
sequence of instructions. In our case, continuations are functions of
|
|
483 |
type \code{KVal} to \code{KExp}. They can be seen as a sequence of
|
|
484 |
\code{KLet}s where there is a ``hole'' that needs to be
|
|
485 |
filled. Consider for example
|
678
|
486 |
|
701
|
487 |
\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
|
|
488 |
let tmp0 = add 1 a in
|
|
489 |
let tmp1 = mul (*@$\Box$@*) 5 in
|
|
490 |
let tmp2 = add 3 tmp1 in
|
|
491 |
let tmp3 = add tmp0 tmp2 in
|
908
|
492 |
retrun tmp3
|
701
|
493 |
\end{lstlisting}
|
|
494 |
|
|
495 |
\noindent
|
|
496 |
where in the second line is a $\Box$ which still expects a \code{KVal}
|
|
497 |
to be filled in before it becomes a ``proper'' \code{KExp}. When we
|
898
|
498 |
apply an argument to the continuation (remember they are functions)
|
908
|
499 |
we essentially fill something into the corresponding hole.
|
|
500 |
|
|
501 |
Lets look at concrete code. To simplify matters first,
|
|
502 |
suppose our source language consists just of three kinds of expressions
|
|
503 |
|
|
504 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
505 |
enum Expr {
|
|
506 |
case Num(n: Int)
|
|
507 |
case Bop(op: String, e1: Expr, e2: Expr)
|
|
508 |
case Call(fname: String, args: List[Expr])
|
|
509 |
}
|
|
510 |
\end{lstlisting}
|
|
511 |
|
|
512 |
\noindent
|
|
513 |
The code of the CPS-translation is then of the form
|
679
|
514 |
|
701
|
515 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
516 |
def CPS(e: Exp)(k: KVal => KExp) : KExp =
|
|
517 |
e match { ... }
|
|
518 |
\end{lstlisting}
|
679
|
519 |
|
701
|
520 |
\noindent
|
908
|
521 |
where \code{k} is the continuation and \code{e} is the expression to
|
|
522 |
be compiled. In case we have numbers, we can just pass them to the
|
|
523 |
continuations because numbers need not be further teared apart as they
|
|
524 |
are already atomic. Passing the number to the continuation means we
|
701
|
525 |
apply the continuation like
|
679
|
526 |
|
908
|
527 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
528 |
case Num(i) => k(KNum(i))
|
|
529 |
\end{lstlisting}
|
679
|
530 |
|
701
|
531 |
\noindent This would just fill in the $\Box$ in a \code{KLet}-expression.
|
908
|
532 |
There is no need to create a temporary variable for simple numbers.
|
|
533 |
More interesting is the case for arithmetic operations.
|
|
534 |
|
|
535 |
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
|
|
536 |
case Aop(op, e1, e2) => {
|
|
537 |
val z = Fresh("tmp")
|
|
538 |
CPS(e1)(r1 =>
|
|
539 |
CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))))
|
|
540 |
}
|
|
541 |
\end{lstlisting}
|
|
542 |
|
|
543 |
\noindent
|
|
544 |
What we essentially have to do in this case is the following: compile
|
|
545 |
the subexpressions \texttt{e1} and \texttt{e2}. They will produce some
|
|
546 |
result that is stored in two temporary variables (assuming they are more
|
|
547 |
complicated than just numbers). We need to use these two temporary
|
|
548 |
variables and feed them into a new assignment of the form
|
|
549 |
|
|
550 |
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
|
|
551 |
let z = op (*@$\Box_\texttt{r1}$@*) (*@$\Box_\texttt{r2}$@*) in
|
|
552 |
...
|
|
553 |
\end{lstlisting}
|
701
|
554 |
|
908
|
555 |
\noindent
|
|
556 |
Note that this assignment has two ``holes'', one for the left
|
|
557 |
subexpression and one for the right subexpression. We can fill both
|
|
558 |
holes by calling the CPS function twice---this is a bit of the black
|
|
559 |
art in CPS. We can use the second call of CPS as the continuation
|
|
560 |
of the first call: The reason is that
|
|
561 |
|
|
562 |
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
|
|
563 |
r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
|
|
564 |
\end{lstlisting}
|
|
565 |
|
|
566 |
\noindent is a function from \code{KVal} to \code{KExp}. Once
|
|
567 |
we created the assignment with the fresh temporary variable
|
|
568 |
\texttt{z}, we need to ``communicate'' that the result of the
|
|
569 |
computation of the arithmetic expression is stored in \texttt{z} to the
|
|
570 |
computations that follow. In this way we apply the continuation \texttt{k} with this
|
|
571 |
new variable (essentially we are plugging in a hole further down the line).
|
|
572 |
|
|
573 |
The last case we need to consider in our small expression language are
|
|
574 |
function calls.
|
|
575 |
|
|
576 |
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
|
|
577 |
case Call(fname, args) => {
|
|
578 |
def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
|
|
579 |
case Nil => {
|
|
580 |
val z = Fresh("tmp")
|
|
581 |
KLet(z, KCall(fname, vs), k(KVar(z)))
|
|
582 |
}
|
|
583 |
case a::as => CPS(a)(r => aux(as, vs ::: List(r)))
|
|
584 |
}
|
|
585 |
aux(args, Nil)
|
701
|
586 |
}
|
|
587 |
\end{lstlisting}
|
|
588 |
|
|
589 |
\noindent
|
908
|
590 |
For them we introduce an auxiliary function that compiles first all
|
|
591 |
function arguments---remember in our source language we can have calls
|
|
592 |
such as $foo(3 + 4, g(3))$ where we first have to create temporary
|
|
593 |
variables (and computations) for each argument. Therefore \texttt{aux}
|
|
594 |
analyses the arguments from left to right. In case there is an argument
|
|
595 |
\texttt{a} on the front of the list (the case \texttt{a::as}), we call
|
|
596 |
CPS recursively for the corresponding subexpression. The temporary variable
|
|
597 |
containing the result for this argument we add to the end of the K-values we
|
|
598 |
have already analysed before. Again very conveniently we can use the
|
|
599 |
recursive call to \texttt{aux} as the continuation for the computations
|
|
600 |
that follow. If we reach the end of the argument list (the \texttt{Nil}-case),
|
|
601 |
then we collect all the K-values \texttt{v1} to \texttt{vn} and call the
|
|
602 |
function in the K-language like so
|
|
603 |
|
|
604 |
|
|
605 |
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
|
|
606 |
let z = call foo(v1,...,vn) in
|
|
607 |
...
|
|
608 |
\end{lstlisting}
|
|
609 |
|
|
610 |
\noindent
|
|
611 |
Again we need to communicate the result of the function call, namely the
|
|
612 |
fresh temporary variable \texttt{z}, to the rest of the computation. Therefore
|
|
613 |
we apply the continuation \texttt{k} with this variable.
|
|
614 |
|
|
615 |
|
|
616 |
|
|
617 |
\begin{figure}[p]\small
|
|
618 |
\lstinputlisting[numbers=left,firstline=1,lastline=24]{../progs/fun/simple-cps.sc}
|
|
619 |
\hspace{10mm}...
|
|
620 |
\lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
|
|
621 |
\caption{CPS-translation for a simple expression language.\label{cps}}
|
|
622 |
\end{figure}
|
|
623 |
|
|
624 |
The last question we need to answer in the code (see Figure~\ref{cps}) is how we
|
|
625 |
start the CPS-translation function, or more precisely with which continuation we
|
|
626 |
should start CPS? It turns out that this initial continuation will be the last one that is
|
|
627 |
called. What should be the last step in the computation? Yes, we need to return the
|
|
628 |
temporary variable where the last result is stored (if it is a simple number, then we can
|
|
629 |
just return this number). Therefore we cal CPS with the code
|
|
630 |
|
|
631 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
632 |
def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
|
|
633 |
\end{lstlisting}
|
|
634 |
|
|
635 |
\noindent
|
|
636 |
where we give the function \code{KReturn(_)}.
|
|
637 |
|
|
638 |
|
|
639 |
\begin{figure}[p]\small
|
|
640 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
641 |
// Fun language (expressions)
|
|
642 |
abstract class Exp
|
|
643 |
abstract class BExp
|
|
644 |
|
|
645 |
case class Call(name: String, args: List[Exp]) extends Exp
|
|
646 |
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
|
|
647 |
case class Write(e: Exp) extends Exp
|
|
648 |
case class Var(s: String) extends Exp
|
|
649 |
case class Num(i: Int) extends Exp
|
|
650 |
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
|
|
651 |
case class Sequence(e1: Exp, e2: Exp) extends Exp
|
|
652 |
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
|
|
653 |
|
|
654 |
|
|
655 |
|
|
656 |
// K-language (K-expressions, K-values)
|
|
657 |
abstract class KExp
|
|
658 |
abstract class KVal
|
|
659 |
|
|
660 |
case class KVar(s: String) extends KVal
|
|
661 |
case class KNum(i: Int) extends KVal
|
|
662 |
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
|
|
663 |
case class KCall(o: String, vrs: List[KVal]) extends KVal
|
|
664 |
case class KWrite(v: KVal) extends KVal
|
|
665 |
|
|
666 |
case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
|
|
667 |
case class KLet(x: String, v: KVal, e: KExp) extends KExp
|
|
668 |
case class KReturn(v: KVal) extends KExp
|
|
669 |
\end{lstlisting}
|
|
670 |
\caption{Abstract syntax trees for the Fun-language.\label{absfun}}
|
|
671 |
\end{figure}
|
|
672 |
|
|
673 |
|
898
|
674 |
|
|
675 |
\noindent
|
539
|
676 |
\end{document}
|
|
677 |
|
|
678 |
|
|
679 |
%%% Local Variables:
|
|
680 |
%%% mode: latex
|
|
681 |
%%% TeX-master: t
|
|
682 |
%%% End:
|