| 
677
 | 
     1  | 
% !TEX program = xelatex
  | 
| 
539
 | 
     2  | 
\documentclass{article}
 | 
| 
 | 
     3  | 
\usepackage{../style}
 | 
| 
 | 
     4  | 
\usepackage{../langs}
 | 
| 
 | 
     5  | 
\usepackage{../graphics}
 | 
| 
 | 
     6  | 
\usepackage{../grammar}
 | 
| 
677
 | 
     7  | 
%%\usepackage{multicol}
 | 
| 
539
 | 
     8  | 
  | 
| 
677
 | 
     9  | 
%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}
 | 
| 
539
 | 
    10  | 
  | 
| 
 | 
    11  | 
\begin{document}
 | 
| 
677
 | 
    12  | 
\fnote{\copyright{} Christian Urban, King's College London, 2019}
 | 
| 
539
 | 
    13  | 
  | 
| 
 | 
    14  | 
  | 
| 
722
 | 
    15  | 
% CPS translations
  | 
| 
 | 
    16  | 
% https://felleisen.org/matthias/4400-s20/lecture17.html
  | 
| 
 | 
    17  | 
  | 
| 
 | 
    18  | 
%% pattern matching resources
  | 
| 
 | 
    19  | 
%% https://www.reddit.com/r/ProgrammingLanguages/comments/g1vno3/beginner_resources_for_compiling_pattern_matching/
  | 
| 
 | 
    20  | 
  | 
| 
677
 | 
    21  | 
\section*{Handout 9 (LLVM, SSA and CPS)}
 | 
| 
539
 | 
    22  | 
  | 
| 
700
 | 
    23  | 
Reflecting on our two tiny compilers targetting the JVM, the code
  | 
| 
 | 
    24  | 
generation part was actually not so hard, no? Pretty much just some
  | 
| 
 | 
    25  | 
post-traversal of the abstract syntax tree, yes? One of the reasons for
  | 
| 
 | 
    26  | 
this ease is that the JVM is a stack-based virtual machine and it is
  | 
| 
 | 
    27  | 
therefore not hard to translate deeply-nested arithmetic expressions
  | 
| 
 | 
    28  | 
into a sequence of instructions manipulating the stack. The problem is
  | 
| 
 | 
    29  | 
that ``real'' CPUs, although supporting stack operations, are not really
  | 
| 
 | 
    30  | 
designed to be \emph{stack machines}.  The design of CPUs is more like,
 | 
| 
 | 
    31  | 
here is a chunk of memory---compiler, or better compiler writers, do
  | 
| 
 | 
    32  | 
something with it. Consequently, modern compilers need to go the extra
  | 
| 
 | 
    33  | 
mile in order to generate code that is much easier and faster to process
  | 
| 
 | 
    34  | 
by CPUs. To make this all tractable for this module, we target the LLVM
  | 
| 
680
 | 
    35  | 
Intermediate Language. In this way we can take advantage of the tools
  | 
| 
 | 
    36  | 
coming with LLVM. For example we do not have to worry about things like
  | 
| 
 | 
    37  | 
register allocations.\bigskip 
  | 
| 
539
 | 
    38  | 
  | 
| 
678
 | 
    39  | 
\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
 | 
| 
700
 | 
    40  | 
that projects from Academia can make a difference in the World. LLVM
  | 
| 
678
 | 
    41  | 
started in 2000 as a project by two researchers at the  University of
  | 
| 
 | 
    42  | 
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
  | 
| 
680
 | 
    43  | 
gcc with its myriad of front-ends for other languages (C++, Fortran,
  | 
| 
678
 | 
    44  | 
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
  | 
| 
 | 
    45  | 
time into a monolithic gigantic piece of m\ldots ehm software, which you
  | 
| 
 | 
    46  | 
could not mess about in an afternoon. In contrast, LLVM is designed to
  | 
| 
700
 | 
    47  | 
be a modular suite of tools with which you can play around easily and
  | 
| 
678
 | 
    48  | 
try out something new. LLVM became a big player once Apple hired one of
  | 
| 
 | 
    49  | 
the original developers (I cannot remember the reason why Apple did not
  | 
| 
 | 
    50  | 
want to use gcc, but maybe they were also just disgusted by its big
  | 
| 
 | 
    51  | 
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
  | 
| 
 | 
    52  | 
or less legacy. This does not mean that programming languages like C and
  | 
| 
 | 
    53  | 
C++ are dying out any time soon---they are nicely supported by LLVM.
  | 
| 
539
 | 
    54  | 
  | 
| 
700
 | 
    55  | 
We will target the LLVM Intermediate Language, or LLVM Intermediate
  | 
| 
 | 
    56  | 
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
  | 
| 
 | 
    57  | 
assembly language of Jasmin and Krakatau. It will also allow us to
  | 
| 
 | 
    58  | 
benefit from the modular structure of the LLVM compiler and let for
  | 
| 
 | 
    59  | 
example the compiler generate code for different CPUs, like X86 or ARM.
  | 
| 
 | 
    60  | 
That means we can be agnostic about where our code actually runs. We can
  | 
| 
 | 
    61  | 
also be ignorant about optimising code and allocating memory
  | 
| 
 | 
    62  | 
efficiently. 
  | 
| 
539
 | 
    63  | 
  | 
| 
700
 | 
    64  | 
However, what we have to do for LLVM is to generate code in \emph{Static
 | 
| 
 | 
    65  | 
Single-Assignment} format (short SSA), because that is what the LLVM-IR
  | 
| 
 | 
    66  | 
expects from us. A reason why LLVM uses the SSA format, rather than
  | 
| 
 | 
    67  | 
JVM-like stack instructions, is that stack instructions are difficult to
  | 
| 
 | 
    68  | 
optimise---you cannot just re-arrange instructions without messing about
  | 
| 
 | 
    69  | 
with what is calculated on the stack. Also it is hard to find out if all
  | 
| 
 | 
    70  | 
the calculations on the stack are actually necessary and not by chance
  | 
| 
 | 
    71  | 
dead code. The JVM has for all these obstacles sophisticated machinery
  | 
| 
 | 
    72  | 
to make such ``high-level'' code still run fast, but let's say that for
  | 
| 
 | 
    73  | 
the sake of argument we do not want to rely on it. We want to generate
  | 
| 
 | 
    74  | 
fast code ourselves. This means we have to work around the intricacies
  | 
| 
 | 
    75  | 
of what instructions CPUs can actually process fast. This is what the
  | 
| 
 | 
    76  | 
SSA format is designed for.
  | 
| 
 | 
    77  | 
  | 
| 
 | 
    78  | 
  | 
| 
 | 
    79  | 
The main idea behind the SSA format is to use very simple variable
  | 
| 
678
 | 
    80  | 
assignments where every variable is assigned only once. The assignments
  | 
| 
 | 
    81  | 
also need to be primitive in the sense that they can be just simple
  | 
| 
 | 
    82  | 
operations like addition, multiplication, jumps, comparisons and so on.
  | 
| 
680
 | 
    83  | 
Say, we have an expression $((1 + a) + (3 + (b * 5)))$, then the
  | 
| 
700
 | 
    84  | 
corresponding SSA format is 
  | 
| 
680
 | 
    85  | 
 
  | 
| 
 | 
    86  | 
\begin{lstlisting}[language=LLVMIR,numbers=left]
 | 
| 
 | 
    87  | 
let tmp0 = add 1 a in   
  | 
| 
 | 
    88  | 
let tmp1 = mul b 5 in 
  | 
| 
 | 
    89  | 
let tmp2 = add 3 tmp1 in 
  | 
| 
700
 | 
    90  | 
let tmp3 = add tmp0 tmp2 in tmp3 
  | 
| 
677
 | 
    91  | 
\end{lstlisting}
 | 
| 
539
 | 
    92  | 
  | 
| 
678
 | 
    93  | 
\noindent where every variable is used only once (we could not write
  | 
| 
680
 | 
    94  | 
\texttt{tmp1 = add 3 tmp1} in Line 3 for example).  There are
 | 
| 
678
 | 
    95  | 
sophisticated algorithms for imperative languages, like C, that
  | 
| 
 | 
    96  | 
efficiently transform a high-level program into SSA format. But we can
  | 
| 
 | 
    97  | 
ignore them here. We want to compile a functional language and there
  | 
| 
 | 
    98  | 
things get much more interesting than just sophisticated. We will need
  | 
| 
 | 
    99  | 
to have a look at CPS translations, where the CPS stands for
  | 
| 
 | 
   100  | 
Continuation-Passing-Style---basically black programming art or
  | 
| 
 | 
   101  | 
abracadabra programming. So sit tight.
  | 
| 
539
 | 
   102  | 
  | 
| 
678
 | 
   103  | 
\subsection*{LLVM-IR}
 | 
| 
539
 | 
   104  | 
  | 
| 
700
 | 
   105  | 
Before we start, let's first have a look at the \emph{LLVM Intermediate
 | 
| 
 | 
   106  | 
Representation} in more detail. The LLVM-IR is in between the frontends
  | 
| 
 | 
   107  | 
and backends of the LLVM framework. It allows compilation of multiple
  | 
| 
 | 
   108  | 
source languages to multiple targets. It is also the place where most of
  | 
| 
 | 
   109  | 
the target independent optimisations are performed. 
  | 
| 
 | 
   110  | 
  | 
| 
 | 
   111  | 
What is good about our toy Fun language is that it basically only
  | 
| 
 | 
   112  | 
contains expressions (be they arithmetic expressions, boolean
  | 
| 
 | 
   113  | 
expressions or if-expressions). The exception are function definitions.
  | 
| 
 | 
   114  | 
Luckily, for them we can use the mechanism of defining functions in the
  | 
| 
 | 
   115  | 
LLVM-IR (this is similar to using JVM methods for functions in our
  | 
| 
 | 
   116  | 
earlier compiler). For example the simple Fun program 
  | 
| 
539
 | 
   117  | 
  | 
| 
 | 
   118  | 
  | 
| 
677
 | 
   119  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
678
 | 
   120  | 
def sqr(x) = x * x
  | 
| 
677
 | 
   121  | 
\end{lstlisting}
 | 
| 
539
 | 
   122  | 
  | 
| 
677
 | 
   123  | 
\noindent
  | 
| 
700
 | 
   124  | 
can be compiled to the following LLVM-IR function:
  | 
| 
539
 | 
   125  | 
  | 
| 
677
 | 
   126  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
678
 | 
   127  | 
define i32 @sqr(i32 %x) {
 | 
| 
 | 
   128  | 
   %tmp = mul i32 %x, %x
  | 
| 
677
 | 
   129  | 
   ret i32 %tmp
  | 
| 
 | 
   130  | 
}    
  | 
| 
 | 
   131  | 
\end{lstlisting}
 | 
| 
539
 | 
   132  | 
  | 
| 
700
 | 
   133  | 
\noindent First notice that all variable names, in this case \texttt{x}
 | 
| 
 | 
   134  | 
and \texttt{tmp}, are prefixed with \texttt{\%} in the LLVM-IR.
 | 
| 
 | 
   135  | 
Temporary variables can be named with an identifier, such as
  | 
| 
 | 
   136  | 
\texttt{tmp}, or numbers. Function names, since they are ``global'',
 | 
| 
 | 
   137  | 
need to be prefixed with @-symbol. Also, the LLVM-IR is a fully typed
  | 
| 
 | 
   138  | 
language. The \texttt{i32} type stands for 32-bit integers. There are
 | 
| 
 | 
   139  | 
also types for 64-bit integers (\texttt{i64}), chars (\texttt{i8}),
 | 
| 
 | 
   140  | 
floats, arrays and even pointer types. In the code above, \texttt{sqr}
 | 
| 
 | 
   141  | 
takes an argument of type \texttt{i32} and produces a result of type
 | 
| 
 | 
   142  | 
\texttt{i32} (the result type is in front of the function name, like in
 | 
| 
 | 
   143  | 
C). Each arithmetic operation, for example addition and multiplication,
  | 
| 
 | 
   144  | 
are also prefixed with the type they operate on. Obviously these types
  | 
| 
 | 
   145  | 
need to match up\ldots{} but since we have in our programs only
 | 
| 
 | 
   146  | 
integers, \texttt{i32} everywhere will do. We do not have to generate
 | 
| 
701
 | 
   147  | 
any other types, but obviously this is a limitation in our Fun language.
  | 
| 
539
 | 
   148  | 
 
  | 
| 
700
 | 
   149  | 
There are a few interesting instructions in the LLVM-IR which are quite
  | 
| 
701
 | 
   150  | 
different than what we have seen in the JVM. Can you remember the
  | 
| 
 | 
   151  | 
kerfuffle we had to go through with boolean expressions and negating the
  | 
| 
 | 
   152  | 
condition? In the LLVM-IR, branching  if-conditions is implemented
  | 
| 
 | 
   153  | 
differently: there is a separate \texttt{br}-instruction as follows:
 | 
| 
700
 | 
   154  | 
  | 
| 
 | 
   155  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
 | 
   156  | 
br i1 %var, label %if_br, label %else_br
  | 
| 
 | 
   157  | 
\end{lstlisting}
 | 
| 
 | 
   158  | 
  | 
| 
 | 
   159  | 
\noindent
  | 
| 
 | 
   160  | 
The type \texttt{i1} stands for booleans. If the variable is true, then
 | 
| 
 | 
   161  | 
this instruction jumps to the if-branch, which needs an explicit label;
  | 
| 
 | 
   162  | 
otherwise to the else-branch, again with its own label. This allows us
  | 
| 
701
 | 
   163  | 
to keep the meaning of the boolean expression as is when compiling if's.
  | 
| 
 | 
   164  | 
A value of type boolean is generated in the LLVM-IR by the
  | 
| 
 | 
   165  | 
\texttt{icmp}-instruction. This instruction is for integers (hence the
 | 
| 
 | 
   166  | 
\texttt{i}) and takes the comparison operation as argument. For example
 | 
| 
700
 | 
   167  | 
  | 
| 
 | 
   168  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
 | 
   169  | 
icmp eq i32  %x, %y     ; for equal
  | 
| 
 | 
   170  | 
icmp sle i32 %x, %y     ;   signed less or equal
  | 
| 
 | 
   171  | 
icmp slt i32 %x, %y     ;   signed less than
  | 
| 
 | 
   172  | 
icmp ult i32 %x, %y     ;   unsigned less than 
  | 
| 
 | 
   173  | 
\end{lstlisting}
 | 
| 
 | 
   174  | 
  | 
| 
 | 
   175  | 
\noindent
  | 
| 
 | 
   176  | 
In some operations, the LLVM-IR distinguishes between signed and 
  | 
| 
 | 
   177  | 
unsigned representations of integers.
  | 
| 
 | 
   178  | 
  | 
| 
701
 | 
   179  | 
It is also easy to call another function in LLVM-IR: as can be 
  | 
| 
 | 
   180  | 
seen from Figure~\ref{lli} we can just call a function with the 
 | 
| 
 | 
   181  | 
instruction \texttt{call} and can also assign the result to 
 | 
| 
 | 
   182  | 
a variable. The syntax is as follows
  | 
| 
 | 
   183  | 
  | 
| 
 | 
   184  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
 | 
   185  | 
%var = call i32 @foo(...args...)
  | 
| 
 | 
   186  | 
\end{lstlisting}
 | 
| 
 | 
   187  | 
  | 
| 
 | 
   188  | 
\noindent
  | 
| 
 | 
   189  | 
where the arguments can only be simple variables, not compound
  | 
| 
 | 
   190  | 
expressions.
  | 
| 
 | 
   191  | 
  | 
| 
679
 | 
   192  | 
Conveniently, you can use the program \texttt{lli}, which comes with
 | 
| 
 | 
   193  | 
LLVM, to interpret programs written in the LLVM-IR. So you can easily
  | 
| 
 | 
   194  | 
check whether the code you produced actually works. To get a running
  | 
| 
 | 
   195  | 
program that does something interesting you need to add some boilerplate
  | 
| 
701
 | 
   196  | 
about printing out numbers and a main-function that is the entry point
  | 
| 
700
 | 
   197  | 
for the program (see Figure~\ref{lli} for a complete listing). Again
 | 
| 
 | 
   198  | 
this is very similar to the boilerplate we needed to add in our JVM
  | 
| 
 | 
   199  | 
compiler. 
  | 
| 
 | 
   200  | 
  | 
| 
 | 
   201  | 
You can generate a binary for the program in Figure~\ref{lli} by using
 | 
| 
 | 
   202  | 
the \texttt{llc}-compiler and then \texttt{gcc}, whereby \texttt{llc} generates
 | 
| 
 | 
   203  | 
an object file and \texttt{gcc} (that is clang) generates the
 | 
| 
 | 
   204  | 
executable binary:
  | 
| 
678
 | 
   205  | 
  | 
| 
679
 | 
   206  | 
\begin{lstlisting}[language=bash,numbers=none]
 | 
| 
 | 
   207  | 
llc -filetype=obj sqr.ll
  | 
| 
 | 
   208  | 
gcc sqr.o -o a.out
  | 
| 
 | 
   209  | 
./a.out
  | 
| 
680
 | 
   210  | 
> 25
  | 
| 
679
 | 
   211  | 
\end{lstlisting}
 | 
| 
 | 
   212  | 
  | 
| 
 | 
   213  | 
\begin{figure}[t]\small 
 | 
| 
 | 
   214  | 
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
 | 
| 
700
 | 
   215  | 
\caption{An LLVM-IR program for calculating the square function. It
 | 
| 
 | 
   216  | 
calls this function in \texttt{@main} with the argument \texttt{5}. The
 | 
| 
 | 
   217  | 
code for the \texttt{sqr} function is in Lines 13 -- 16. The main
 | 
| 
 | 
   218  | 
function calls \texttt{sqr} and then prints out the result. The other
 | 
| 
679
 | 
   219  | 
code is boilerplate for printing out integers.\label{lli}}
 | 
| 
678
 | 
   220  | 
\end{figure}   
 | 
| 
 | 
   221  | 
  | 
| 
679
 | 
   222  | 
  | 
| 
 | 
   223  | 
    
  | 
| 
 | 
   224  | 
\subsection*{Our Own Intermediate Language}
 | 
| 
 | 
   225  | 
  | 
| 
 | 
   226  | 
Remember compilers have to solve the problem of bridging the gap between
  | 
| 
680
 | 
   227  | 
``high-level'' programs and ``low-level'' hardware. If the gap is too
  | 
| 
 | 
   228  | 
wide for one step, then a good strategy is to lay a stepping stone
  | 
| 
 | 
   229  | 
somewhere in between. The LLVM-IR itself is such a stepping stone to
  | 
| 
 | 
   230  | 
make the task of generating and optimising code easier. Like a real
  | 
| 
 | 
   231  | 
compiler we will use our own stepping stone which I call the
  | 
| 
700
 | 
   232  | 
\emph{K-language}. For what follows recall the various kinds of
 | 
| 
 | 
   233  | 
expressions in the Fun language. For convenience the Scala code of the
  | 
| 
 | 
   234  | 
corresponding abstract syntax trees is shown on top of
  | 
| 
 | 
   235  | 
Figure~\ref{absfun}. Below is the code for the abstract syntax trees in
 | 
| 
701
 | 
   236  | 
the K-language. In K, here are two kinds of syntactic entities, namely
  | 
| 
700
 | 
   237  | 
\emph{K-values} and \emph{K-expressions}. The central constructor of the
 | 
| 
701
 | 
   238  | 
K-language is \texttt{KLet}. For this recall in SSA that arithmetic
 | 
| 
 | 
   239  | 
expressions such as $((1 + a) + (3 + (b * 5)))$ need to be broken up
  | 
| 
 | 
   240  | 
into smaller ``atomic'' steps, like so
  | 
| 
680
 | 
   241  | 
 
  | 
| 
 | 
   242  | 
\begin{lstlisting}[language=LLVMIR,numbers=none]
 | 
| 
 | 
   243  | 
let tmp0 = add 1 a in   
  | 
| 
 | 
   244  | 
let tmp1 = mul b 5 in 
  | 
| 
 | 
   245  | 
let tmp2 = add 3 tmp1 in 
  | 
| 
 | 
   246  | 
let tmp3 = add tmp0 tmp2 in
  | 
| 
 | 
   247  | 
  tmp3 
  | 
| 
 | 
   248  | 
\end{lstlisting}
 | 
| 
 | 
   249  | 
  | 
| 
 | 
   250  | 
\noindent
  | 
| 
700
 | 
   251  | 
Here \texttt{tmp3} will contain the result of what the whole expression
 | 
| 
 | 
   252  | 
stands for. In each individual step we can only perform an ``atomic''
  | 
| 
 | 
   253  | 
operation, like addition or multiplication of a number and a variable.
  | 
| 
 | 
   254  | 
We are not allowed to have for example an if-condition on the right-hand
  | 
| 
 | 
   255  | 
side of an equals. Such constraints are enforced upon us because of how
  | 
| 
 | 
   256  | 
the SSA format works in the LLVM-IR. By having in \texttt{KLet} taking
 | 
| 
 | 
   257  | 
first a string (standing for an intermediate result) and second a value,
  | 
| 
 | 
   258  | 
we can fulfil this constraint ``by construction''---there is no way we
  | 
| 
 | 
   259  | 
could write anything else than a value. 
  | 
| 
 | 
   260  | 
  | 
| 
 | 
   261  | 
To sum up, K-values are the atomic operations that can be on the
  | 
| 
 | 
   262  | 
right-hand side of equal-signs. The K-language is restricted such that
  | 
| 
 | 
   263  | 
it is easy to generate the SSA format for the LLVM-IR. 
  | 
| 
680
 | 
   264  | 
  | 
| 
679
 | 
   265  | 
  | 
| 
 | 
   266  | 
  | 
| 
 | 
   267  | 
\begin{figure}[p]\small
 | 
| 
678
 | 
   268  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
701
 | 
   269  | 
// Fun language (expressions)
  | 
| 
679
 | 
   270  | 
abstract class Exp 
  | 
| 
 | 
   271  | 
abstract class BExp 
  | 
| 
678
 | 
   272  | 
  | 
| 
 | 
   273  | 
case class Call(name: String, args: List[Exp]) extends Exp
  | 
| 
 | 
   274  | 
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
  | 
| 
 | 
   275  | 
case class Write(e: Exp) extends Exp
  | 
| 
 | 
   276  | 
case class Var(s: String) extends Exp
  | 
| 
 | 
   277  | 
case class Num(i: Int) extends Exp
  | 
| 
 | 
   278  | 
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
  | 
| 
 | 
   279  | 
case class Sequence(e1: Exp, e2: Exp) extends Exp
  | 
| 
679
 | 
   280  | 
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  
  | 
| 
 | 
   281  | 
  | 
| 
 | 
   282  | 
  | 
| 
 | 
   283  | 
  | 
| 
 | 
   284  | 
// K-language (K-expressions, K-values)
  | 
| 
 | 
   285  | 
abstract class KExp
  | 
| 
 | 
   286  | 
abstract class KVal
  | 
| 
 | 
   287  | 
  | 
| 
 | 
   288  | 
case class KVar(s: String) extends KVal
  | 
| 
 | 
   289  | 
case class KNum(i: Int) extends KVal
  | 
| 
 | 
   290  | 
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
  | 
| 
 | 
   291  | 
case class KCall(o: String, vrs: List[KVal]) extends KVal
  | 
| 
 | 
   292  | 
case class KWrite(v: KVal) extends KVal
  | 
| 
 | 
   293  | 
  | 
| 
 | 
   294  | 
case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
  | 
| 
680
 | 
   295  | 
case class KLet(x: String, v: KVal, e: KExp) extends KExp
  | 
| 
679
 | 
   296  | 
case class KReturn(v: KVal) extends KExp
  | 
| 
678
 | 
   297  | 
\end{lstlisting}
 | 
| 
 | 
   298  | 
\caption{Abstract syntax trees for the Fun language.\label{absfun}}
 | 
| 
 | 
   299  | 
\end{figure}
 | 
| 
679
 | 
   300  | 
  | 
| 
678
 | 
   301  | 
  | 
| 
 | 
   302  | 
  | 
| 
 | 
   303  | 
\subsection*{CPS-Translations}
 | 
| 
 | 
   304  | 
  | 
| 
704
 | 
   305  | 
CPS stands for Continuation-Passing-Style. It is a kind of programming
  | 
| 
 | 
   306  | 
technique often used in advanced functional programming. Before we delve
  | 
| 
 | 
   307  | 
into the CPS-translation for our Fun language, let us look at 
  | 
| 
 | 
   308  | 
CPS-versions of some well-known functions. Consider
  | 
| 
 | 
   309  | 
  | 
| 
 | 
   310  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   311  | 
def fact(n: Int) : Int = 
  | 
| 
 | 
   312  | 
  if (n == 0) 1 else n * fact(n - 1) 
  | 
| 
 | 
   313  | 
\end{lstlisting}
 | 
| 
 | 
   314  | 
  | 
| 
 | 
   315  | 
\noindent 
  | 
| 
 | 
   316  | 
This is clearly the usual factorial function. But now consider the
  | 
| 
 | 
   317  | 
following version of the factorial function:
  | 
| 
 | 
   318  | 
  | 
| 
 | 
   319  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   320  | 
def factC(n: Int, ret: Int => Int) : Int = 
  | 
| 
 | 
   321  | 
  if (n == 0) ret(1) 
  | 
| 
 | 
   322  | 
  else factC(n - 1, x => ret(n * x)) 
  | 
| 
 | 
   323  | 
  | 
| 
 | 
   324  | 
factC(3, identity)
  | 
| 
 | 
   325  | 
\end{lstlisting}
 | 
| 
 | 
   326  | 
  | 
| 
 | 
   327  | 
\noindent
  | 
| 
 | 
   328  | 
This function is called with the number, in this case 3, and the
  | 
| 
 | 
   329  | 
identity-function (which returns just its input). The recursive
  | 
| 
 | 
   330  | 
calls are:
  | 
| 
 | 
   331  | 
  | 
| 
 | 
   332  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   333  | 
factC(2, x => identity(3 * x))
  | 
| 
 | 
   334  | 
factC(1, x => identity(3 * (2 * x)))
  | 
| 
 | 
   335  | 
factC(0, x => identity(3 * (2 * (1 * x))))
  | 
| 
 | 
   336  | 
\end{lstlisting}
 | 
| 
 | 
   337  | 
  | 
| 
 | 
   338  | 
\noindent
  | 
| 
 | 
   339  | 
Having reached 0, we get out of the recursion and apply 1 to the
  | 
| 
 | 
   340  | 
continuation (see if-branch above). This gives
  | 
| 
 | 
   341  | 
  | 
| 
 | 
   342  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   343  | 
identity(3 * (2 * (1 * 1)))
  | 
| 
 | 
   344  | 
= 3 * (2 * (1 * 1))
  | 
| 
 | 
   345  | 
= 6
  | 
| 
 | 
   346  | 
\end{lstlisting}
 | 
| 
 | 
   347  | 
  | 
| 
 | 
   348  | 
\noindent
  | 
| 
 | 
   349  | 
which is the expected result. If this looks somewhat familiar, then this
  | 
| 
 | 
   350  | 
is not a 1000 miles off, because functions with continuations can be
  | 
| 
 | 
   351  | 
seen as a kind of generalisation of tail-recursive functions. Anyway
  | 
| 
 | 
   352  | 
notice how the continuations is ``stacked up'' during the recursion and
  | 
| 
 | 
   353  | 
then ``unrolled'' when we apply 1 to the continuation. Interestingly, we
  | 
| 
 | 
   354  | 
can do something similar to the Fibonacci function where in the traditional
  | 
| 
 | 
   355  | 
version we have two recursive calls. Consider the following function
  | 
| 
 | 
   356  | 
  | 
| 
 | 
   357  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   358  | 
def fibC(n: Int, ret: Int => Int) : Int = 
  | 
| 
 | 
   359  | 
  if (n == 0 || n == 1) ret(1) 
  | 
| 
 | 
   360  | 
  else fibC(n - 1,
  | 
| 
 | 
   361  | 
             r1 => fibC(n - 2,
  | 
| 
 | 
   362  | 
               r2 => ret(r1 + r2)))
  | 
| 
 | 
   363  | 
\end{lstlisting}
 | 
| 
 | 
   364  | 
  | 
| 
 | 
   365  | 
\noindent
  | 
| 
 | 
   366  | 
Here the continuation is a nested function essentially wrapping up 
  | 
| 
 | 
   367  | 
the second recursive call. Let us check how the recursion unfolds
  | 
| 
 | 
   368  | 
when called with 3 and the identity function:
  | 
| 
 | 
   369  | 
  | 
| 
 | 
   370  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   371  | 
fibC(3, id)
  | 
| 
 | 
   372  | 
fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
  | 
| 
 | 
   373  | 
fibC(1, r1 => 
  | 
| 
 | 
   374  | 
   fibC(0, r2 => fibC(1, r2a => id((r1 + r2) + r2a))))
  | 
| 
 | 
   375  | 
fibC(0, r2 => fibC(1, r2a => id((1 + r2) + r2a)))
  | 
| 
 | 
   376  | 
fibC(1, r2a => id((1 + 1) + r2a))
  | 
| 
 | 
   377  | 
id((1 + 1) + 1)
  | 
| 
 | 
   378  | 
(1 + 1) + 1
  | 
| 
 | 
   379  | 
3
  | 
| 
 | 
   380  | 
\end{lstlisting}
 | 
| 
 | 
   381  | 
  | 
| 
 | 
   382  | 
Let us now come back to the CPS-translations for the Fun language. The
  | 
| 
 | 
   383  | 
main difficulty of generating instructions in SSA format is that large
  | 
| 
 | 
   384  | 
compound expressions need to be broken up into smaller pieces and
  | 
| 
700
 | 
   385  | 
intermediate results need to be chained into later instructions. To do
  | 
| 
 | 
   386  | 
this conveniently, CPS-translations have been developed. They use
  | 
| 
 | 
   387  | 
functions (``continuations'') to represent what is coming next in a
  | 
| 
701
 | 
   388  | 
sequence of instructions. Continuations are functions of type
  | 
| 
704
 | 
   389  | 
\code{KVal} to \code{KExp}. They can be seen as a sequence of
 | 
| 
 | 
   390  | 
\code{KLet}s where there is a ``hole'' that needs to be filled. Consider
 | 
| 
 | 
   391  | 
for example
  | 
| 
678
 | 
   392  | 
  | 
| 
701
 | 
   393  | 
\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
 | 
| 
 | 
   394  | 
let tmp0 = add 1 a in   
  | 
| 
 | 
   395  | 
let tmp1 = mul (*@$\Box$@*) 5 in 
  | 
| 
 | 
   396  | 
let tmp2 = add 3 tmp1 in 
  | 
| 
 | 
   397  | 
let tmp3 = add tmp0 tmp2 in
  | 
| 
 | 
   398  | 
  tmp3 
  | 
| 
 | 
   399  | 
\end{lstlisting}
 | 
| 
 | 
   400  | 
  | 
| 
 | 
   401  | 
\noindent
  | 
| 
 | 
   402  | 
where in the second line is a $\Box$ which still expects a \code{KVal}
 | 
| 
 | 
   403  | 
to be filled in before it becomes a ``proper'' \code{KExp}. When we 
 | 
| 
 | 
   404  | 
apply and argument to the continuation (remember they are functions)
  | 
| 
 | 
   405  | 
we essentially fill something into the corresponding hole. The code
  | 
| 
 | 
   406  | 
of the CPS-translation is 
  | 
| 
679
 | 
   407  | 
  | 
| 
701
 | 
   408  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   409  | 
def CPS(e: Exp)(k: KVal => KExp) : KExp = 
  | 
| 
 | 
   410  | 
  e match { ... }
 | 
| 
 | 
   411  | 
\end{lstlisting}
 | 
| 
679
 | 
   412  | 
  | 
| 
701
 | 
   413  | 
\noindent 
  | 
| 
 | 
   414  | 
where \code{k} is the continuation and \code{e} is the expression 
 | 
| 
 | 
   415  | 
to be compiled. In case we have numbers or variables, we can just
  | 
| 
 | 
   416  | 
apply the continuation like 
  | 
| 
679
 | 
   417  | 
  | 
| 
701
 | 
   418  | 
\begin{center}
 | 
| 
 | 
   419  | 
\code{k(KNum(n))} \qquad \code{k(KVar(x))}
 | 
| 
 | 
   420  | 
\end{center}
 | 
| 
679
 | 
   421  | 
  | 
| 
701
 | 
   422  | 
\noindent This would just fill in the $\Box$ in a \code{KLet}-expression.
 | 
| 
 | 
   423  | 
More interesting is the case for an arithmetic expression.
  | 
| 
 | 
   424  | 
  | 
| 
 | 
   425  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   426  | 
case Aop(o, e1, e2) => {
 | 
| 
 | 
   427  | 
    val z = Fresh("tmp")
 | 
| 
 | 
   428  | 
    CPS(e1)(y1 => 
  | 
| 
 | 
   429  | 
      CPS(e2)(y2 => KLet(z, Kop(o, y1, y2), k(KVar(z)))))
  | 
| 
 | 
   430  | 
}
  | 
| 
 | 
   431  | 
\end{lstlisting}
 | 
| 
 | 
   432  | 
  | 
| 
 | 
   433  | 
\noindent
  | 
| 
539
 | 
   434  | 
\end{document}
 | 
| 
 | 
   435  | 
  | 
| 
 | 
   436  | 
  | 
| 
 | 
   437  | 
%%% Local Variables: 
  | 
| 
 | 
   438  | 
%%% mode: latex
  | 
| 
 | 
   439  | 
%%% TeX-master: t
  | 
| 
 | 
   440  | 
%%% End: 
  |