% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../graphics}
\usepackage{../grammar}
%%\usepackage{multicol}

%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}

\begin{document}
\fnote{\copyright{} Christian Urban, King's College London, 2019}


% CPS translations
% https://felleisen.org/matthias/4400-s20/lecture17.html

%% pattern matching resources
%% https://www.reddit.com/r/ProgrammingLanguages/comments/g1vno3/beginner_resources_for_compiling_pattern_matching/

\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our two tiny compilers targeting the JVM, the code
generation part was actually not so hard, no? Pretty much just some
post-traversal of the abstract syntax tree, yes? One of the reasons for
this ease is that the JVM is a stack-based virtual machine and it is
therefore not hard to translate deeply-nested arithmetic expressions
into a sequence of instructions manipulating the stack. The problem is
that ``real'' CPUs, although supporting stack operations, are not really
designed to be \emph{stack machines}. The design of CPUs is more like:
here is a chunk of memory---compiler, or better compiler writers, do
something with it. Consequently, modern compilers need to go the extra
mile in order to generate code that is much easier and faster for CPUs
to process. To make this all tractable for this module, we target the
LLVM Intermediate Language. In this way we can take advantage of the
tools coming with LLVM. For example, we do not have to worry about
things like register allocation.\bigskip 

\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
of how projects from academia can make a difference in the world. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
gcc with its myriad of front-ends for other languages (C++, Fortran,
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
time into a gigantic monolithic piece of m\ldots ehm software, which you
could not easily mess about with in an afternoon. In contrast, LLVM is
designed to be a modular suite of tools with which you can play around
easily and try out something new. LLVM became a big player once Apple
hired one of the original developers (I cannot remember the reason why
Apple did not want to use gcc, but maybe they were also just disgusted
by gcc's big monolithic codebase). Anyway, LLVM is now the big player
and gcc is more or less legacy. This does not mean that programming
languages like C and C++ are dying out any time soon---they are nicely
supported by LLVM.

We will target the LLVM Intermediate Language, or LLVM Intermediate
Representation (LLVM-IR for short). The LLVM-IR looks very similar to
the assembly language of Jasmin and Krakatau. It will also allow us to
benefit from the modular structure of the LLVM compiler and, for
example, let the compiler generate code for different CPUs, like X86 or
ARM. That means we can be agnostic about where our code actually runs.
We can also be ignorant about optimising code and allocating memory
efficiently. 

However, what we have to do for LLVM is to generate code in \emph{Static
Single-Assignment} format (SSA for short), because that is what the
LLVM-IR expects from us. A reason why LLVM uses the SSA format, rather
than JVM-like stack instructions, is that stack instructions are
difficult to optimise---you cannot just re-arrange instructions without
messing about with what is calculated on the stack. Also, it is hard to
find out whether all the calculations on the stack are actually
necessary or whether some of them are just dead code. The JVM has
sophisticated machinery to make such ``high-level'' code run fast
despite all these obstacles, but let's say that for the sake of argument
we do not want to rely on it. We want to generate fast code ourselves.
This means we have to work around the intricacies of what instructions
CPUs can actually process fast. This is what the SSA format is designed
for.


The main idea behind the SSA format is to use very simple variable
assignments where every tmp-variable is assigned only once. The
assignments also need to be primitive in the sense that they can only be
simple operations like addition, multiplication, jumps, comparisons and
so on. Say we have the expression $((1 + a) + (3 + (b * 5)))$, then the
corresponding SSA format is

\begin{lstlisting}[language=LLVMIR,numbers=left]
let tmp0 = add 1 a in   
let tmp1 = mul b 5 in 
let tmp2 = add 3 tmp1 in 
let tmp3 = add tmp0 tmp2 in tmp3 
\end{lstlisting}

\noindent where every variable is assigned only once (we could not write
\texttt{tmp1 = add 3 tmp1} in Line 3, for example).

There are sophisticated algorithms for imperative languages, like C,
that efficiently transform a high-level program into SSA format. But
we can ignore them here. We want to compile a functional language and
there things get much more interesting than just sophisticated. We
will need to have a look at CPS translations, where CPS stands for
Continuation-Passing-Style---basically black programming art or
abracadabra programming. So sit tight.

\subsection*{LLVM-IR}

Before we start, let's first have a look at the \emph{LLVM Intermediate
Representation} in more detail. The LLVM-IR sits between the frontends
and backends of the LLVM framework. It allows compilation of multiple
source languages to multiple targets. It is also the place where most of
the target-independent optimisations are performed. 

What is good about our toy Fun language is that it basically only
contains expressions (be they arithmetic expressions, boolean
expressions or if-expressions). The exceptions are function definitions.
Luckily, for them we can use the mechanism of defining functions in the
LLVM-IR (this is similar to using JVM methods for functions in our
earlier compiler). For example, the simple Fun program 


\begin{lstlisting}[language=Scala,numbers=none]
def sqr(x) = x * x
\end{lstlisting}

\noindent
can be compiled to the following LLVM-IR function:

\begin{lstlisting}[language=LLVM]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x
   ret i32 %tmp
}    
\end{lstlisting}

\noindent First notice that all variable names, in this case \texttt{x}
and \texttt{tmp}, are prefixed with \texttt{\%} in the LLVM-IR.
Temporary variables can be named with an identifier, such as
\texttt{tmp}, or with numbers. In contrast, function names, since they
are ``global'', need to be prefixed with an @-symbol. Also, the LLVM-IR
is a fully typed language. The \texttt{i32} type stands for 32-bit
integers. There are also types for 64-bit integers (\texttt{i64}), chars
(\texttt{i8}), floats, arrays and even pointer types. In the code above,
\texttt{sqr} takes an argument of type \texttt{i32} and produces a
result of type \texttt{i32} (the result type is in front of the function
name, like in C). Each arithmetic operation, for example addition and
multiplication, is also annotated with the type it operates on.
Obviously these types need to match up\ldots{} but since our programs
contain only integers, for the moment \texttt{i32} everywhere will do.
We do not have to generate any other types, but obviously this is a
limitation of our Fun language.

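As a small illustration of these type annotations, an expression like $x
* x + 1$ could be broken up in the LLVM-IR as follows (the temporary
names \texttt{\%t1} and \texttt{\%t2} are just made up for this sketch):

\begin{lstlisting}[language=LLVM]
%t1 = mul i32 %x, %x     ; 32-bit multiplication
%t2 = add i32 %t1, 1     ; the operand types must match
\end{lstlisting}
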
There are a few interesting instructions in the LLVM-IR which are quite
different from what we have seen in the JVM. Can you remember the
kerfuffle we had to go through with boolean expressions and negating the
condition? In the LLVM-IR, branching on if-conditions is implemented
differently: there is a separate \texttt{br}-instruction as follows:

\begin{lstlisting}[language=LLVM]
br i1 %var, label %if_br, label %else_br
\end{lstlisting}

\noindent
The type \texttt{i1} stands for booleans. If the variable is true, then
this instruction jumps to the if-branch, which needs an explicit label;
otherwise it jumps to the else-branch, again with its own label. This
allows us to keep the meaning of the boolean expression ``as is'' when
compiling if's---thank god, no more negating the boolean.
A value of type boolean is generated in the LLVM-IR by the
\texttt{icmp}-instruction. This instruction is for integers (hence the
\texttt{i}) and takes the comparison operation as argument. For example

\begin{lstlisting}[language=LLVM]
icmp eq  i32 %x, %y     ; for equal
icmp sle i32 %x, %y     ; signed less or equal
icmp slt i32 %x, %y     ; signed less than
icmp ult i32 %x, %y     ; unsigned less than 
\end{lstlisting}

\noindent
Note that for some operations the LLVM-IR distinguishes between signed
and unsigned representations of integers.

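Putting \texttt{icmp} and \texttt{br} together, the following sketch
shows a complete function (called \texttt{@max} here; it is not part of
the handout's running example) that returns the larger of its two
arguments:

\begin{lstlisting}[language=LLVM]
define i32 @max(i32 %x, i32 %y) {
   %cond = icmp sle i32 %x, %y     ; %cond has type i1
   br i1 %cond, label %if_br, label %else_br
if_br:
   ret i32 %y
else_br:
   ret i32 %x
}
\end{lstlisting}
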
It is also easy to call another function in the LLVM-IR: as can be 
seen from Figure~\ref{lli}, we can just call a function with the 
instruction \texttt{call} and can also assign the result to 
a variable. The syntax is as follows:

\begin{lstlisting}[language=LLVM]
%var = call i32 @foo(...args...)
\end{lstlisting}

\noindent
where the arguments can only be simple variables, not compound
expressions.

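For example, a call to the \texttt{sqr} function from above could look
like this, assuming the argument already sits in a variable (here called
\texttt{\%t}):

\begin{lstlisting}[language=LLVM]
%res = call i32 @sqr(i32 %t)
\end{lstlisting}
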
Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR. So you can easily
check whether the code you produced actually works. To get a running
program that does something interesting, you need to add some
boilerplate about printing out numbers and a main-function that is the
entry point for the program (see Figure~\ref{lli} for a complete
listing). Again this is very similar to the boilerplate we needed to add
in our JVM compiler. 

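Assuming the program from Figure~\ref{lli} is saved in a file called
\texttt{sqr.ll} (the same file name as used in the compilation steps
below), it can be run directly with:

\begin{lstlisting}[language=bash,numbers=none]
lli sqr.ll
\end{lstlisting}
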
You can generate a binary for the program in Figure~\ref{lli} by using
the \texttt{llc}-compiler and then \texttt{gcc}, whereby \texttt{llc}
generates an object file and \texttt{gcc} (that is clang) generates the
executable binary:

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
> 25
\end{lstlisting}


\begin{figure}[t]\small 
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. It
calls this function in \texttt{@main} with the argument \texttt{5}. The
code for the \texttt{sqr} function is in Lines 13 -- 16. The main
function calls \texttt{sqr} and then prints out the result. The other
code is boilerplate for printing out integers.\label{lli}}
\end{figure}   


\subsection*{Our Own Intermediate Language}

Remember that compilers have to solve the problem of bridging the gap
between ``high-level'' programs and ``low-level'' hardware. If the gap
is too wide for one step, then a good strategy is to lay a stepping
stone somewhere in between. The LLVM-IR itself is such a stepping stone
to make the task of generating and optimising code easier. Like a real
compiler, we will use our own stepping stone, which I call the
\emph{K-language}. For what follows, recall the various kinds of
expressions in the Fun language. For convenience, the Scala code of the
corresponding abstract syntax trees is shown at the top of
Figure~\ref{absfun}. Below it is the code for the abstract syntax trees
of the K-language. In K, there are two kinds of syntactic entities,
namely \emph{K-values} and \emph{K-expressions}. The central constructor
of the K-language is \texttt{KLet}. For this, recall that in SSA
arithmetic expressions such as $((1 + a) + (3 + (b * 5)))$ need to be
broken up into smaller ``atomic'' steps, like so:

\begin{lstlisting}[language=LLVMIR,numbers=none]
let tmp0 = add 1 a in   
let tmp1 = mul b 5 in 
let tmp2 = add 3 tmp1 in 
let tmp3 = add tmp0 tmp2 in
  tmp3 
\end{lstlisting}

\noindent
Here \texttt{tmp3} will contain the result of what the whole expression
stands for. In each individual step we can only perform an ``atomic''
operation, like addition or multiplication of a number and a variable.
We are not allowed to have, for example, an if-condition on the
right-hand side of an equals sign. Such constraints are enforced upon us
because of how the SSA format works in the LLVM-IR. By having
\texttt{KLet} take first a string (standing for an intermediate result)
and second a value, we fulfil this constraint ``by construction''---there
is no way we could write anything other than a value. 

To sum up, K-values are the atomic operations that can appear on the
right-hand side of equal-signs. The K-language is restricted such that
it is easy to generate the SSA format for the LLVM-IR. 

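To see how this fits together, the SSA sequence above corresponds to the
following nest of \texttt{KLet}s (a sketch built from the constructors
in Figure~\ref{absfun}; how exactly the operators are represented as
strings is an assumption here):

\begin{lstlisting}[language=Scala,numbers=none]
KLet("tmp0", Kop("+", KNum(1), KVar("a")),
  KLet("tmp1", Kop("*", KVar("b"), KNum(5)),
    KLet("tmp2", Kop("+", KNum(3), KVar("tmp1")),
      KLet("tmp3", Kop("+", KVar("tmp0"), KVar("tmp2")),
        KReturn(KVar("tmp3"))))))
\end{lstlisting}
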



\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
// Fun language (expressions)
abstract class Exp 
abstract class BExp 

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Write(e: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  



// K-language (K-expressions, K-values)
abstract class KExp
abstract class KVal

case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(o: String, vrs: List[KVal]) extends KVal
case class KWrite(v: KVal) extends KVal

case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
case class KLet(x: String, v: KVal, e: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun language and the
  K-language.\label{absfun}}
\end{figure}




\subsection*{CPS-Translations}

CPS stands for Continuation-Passing-Style. It is a kind of programming
technique often used in advanced functional programming. Before we delve
into the CPS-translation for our Fun language, let us look at 
CPS-versions of some well-known functions. Consider

\begin{lstlisting}[language=Scala, numbers=none]
def fact(n: Int) : Int = 
  if (n == 0) 1 else n * fact(n - 1) 
\end{lstlisting}

\noindent 
This is clearly the usual factorial function. But now consider the
following version of the factorial function:

\begin{lstlisting}[language=Scala, numbers=none]
def factC(n: Int, ret: Int => Int) : Int = 
  if (n == 0) ret(1) 
  else factC(n - 1, x => ret(n * x)) 

factC(3, identity)
\end{lstlisting}

\noindent
This function is called with the number, in this case 3, and the
identity-function (which returns just its input). The recursive
calls are:

\begin{lstlisting}[language=Scala, numbers=none]
factC(2, x => identity(3 * x))
factC(1, x => identity(3 * (2 * x)))
factC(0, x => identity(3 * (2 * (1 * x))))
\end{lstlisting}

\noindent
Having reached 0, we get out of the recursion and apply 1 to the
continuation (see if-branch above). This gives

\begin{lstlisting}[language=Scala, numbers=none]
identity(3 * (2 * (1 * 1)))
= 3 * (2 * (1 * 1))
= 6
\end{lstlisting}

\noindent
which is the expected result. If this looks somewhat familiar to you,
then this is because functions with continuations can be seen as a kind
of generalisation of tail-recursive functions. Anyway, notice how the
continuation is ``stacked up'' during the recursion and then
``unrolled'' when we apply 1 to the continuation. Interestingly, we can
do something similar with the Fibonacci function, where the traditional
version has two recursive calls. Consider the following function:

\begin{lstlisting}[language=Scala, numbers=none]
def fibC(n: Int, ret: Int => Int) : Int = 
  if (n == 0 || n == 1) ret(1) 
  else fibC(n - 1,
             r1 => fibC(n - 2,
               r2 => ret(r1 + r2)))
\end{lstlisting}

\noindent
Here the continuation is a nested function essentially wrapping up 
the second recursive call. Let us check how the recursion unfolds
when called with 3 and the identity function:

\begin{lstlisting}[language=Scala, numbers=none]
fibC(3, id)
fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
fibC(1, r1 => 
   fibC(0, r2 => fibC(1, r2a => id((r1 + r2) + r2a))))
fibC(0, r2 => fibC(1, r2a => id((1 + r2) + r2a)))
fibC(1, r2a => id((1 + 1) + r2a))
id((1 + 1) + 1)
(1 + 1) + 1
3
\end{lstlisting}

Let us now come back to the CPS-translations for the Fun language. The
main difficulty of generating instructions in SSA format is that large
compound expressions need to be broken up into smaller pieces and
intermediate results need to be chained into later instructions. To do
this conveniently, CPS-translations have been developed. They use
functions (``continuations'') to represent what is coming next in a
sequence of instructions. In our case, continuations are functions from
\code{KVal} to \code{KExp}. They can be seen as a sequence of
\code{KLet}s where there is a ``hole'' that still needs to be filled.
Consider for example:

\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
let tmp0 = add 1 a in   
let tmp1 = mul (*@$\Box$@*) 5 in 
let tmp2 = add 3 tmp1 in 
let tmp3 = add tmp0 tmp2 in
  tmp3 
\end{lstlisting}

\noindent
where in the second line there is a $\Box$ which still expects a
\code{KVal} to be filled in before it becomes a ``proper'' \code{KExp}.
When we apply an argument to the continuation (remember, continuations
are functions) we essentially fill something into the corresponding
hole. The code of the CPS-translation is 

\begin{lstlisting}[language=Scala,numbers=none]
def CPS(e: Exp)(k: KVal => KExp) : KExp = 
  e match { ... }
\end{lstlisting}

\noindent 
where \code{k} is the continuation and \code{e} is the expression 
to be compiled. In case we have numbers or variables, we can just
apply the continuation like 

\begin{center}
\code{k(KNum(n))} \qquad \code{k(KVar(x))}
\end{center}

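Spelled out as Scala code, these two cases of the \code{CPS}-function
might look as follows (a sketch using the constructor names from
Figure~\ref{absfun}):

\begin{lstlisting}[language=Scala,numbers=none]
case Num(i) => k(KNum(i))   // a number is already a K-value
case Var(s) => k(KVar(s))   // ...and so is a variable
\end{lstlisting}
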
\noindent This would just fill in the $\Box$ in a \code{KLet}-expression.
More interesting is the case for an arithmetic expression.

\begin{lstlisting}[language=Scala,numbers=none]
case Aop(o, e1, e2) => {
    val z = Fresh("tmp")
    CPS(e1)(y1 => 
      CPS(e2)(y2 => KLet(z, Kop(o, y1, y2), k(KVar(z)))))
}
\end{lstlisting}

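Here \code{Fresh("tmp")} generates a fresh name (something like
\code{tmp0}, \code{tmp1} and so on): we first translate \code{e1}, then
\code{e2}, and only when both intermediate results \code{y1} and
\code{y2} are available do we emit a \code{KLet} for the operation and
pass the new variable on to the continuation \code{k}. The remaining
cases follow the same pattern. For instance, a sequence of two
expressions might be translated roughly like this (only a sketch of a
plausible clause, not necessarily the exact code of the fun-llvm
compiler):

\begin{lstlisting}[language=Scala,numbers=none]
case Sequence(e1, e2) => 
  // compile e1 only for its effect, then continue with e2
  CPS(e1)(_ => CPS(e2)(y2 => k(y2)))
\end{lstlisting}
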
\noindent
For more such rules, have a look at the code of the fun-llvm
compiler.

\end{document}


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: 
 |