| 
677
 | 
     1  | 
% !TEX program = xelatex
  | 
| 
539
 | 
     2  | 
\documentclass{article}
 | 
| 
 | 
     3  | 
\usepackage{../style}
 | 
| 
 | 
     4  | 
\usepackage{../langs}
 | 
| 
 | 
     5  | 
\usepackage{../graphics}
 | 
| 
 | 
     6  | 
\usepackage{../grammar}
 | 
| 
 | 
     7  | 
  | 
| 
 | 
     8  | 
\begin{document}
 | 
| 
908
 | 
     9  | 
\fnote{\copyright{} Christian Urban, King's College London, 2019, 2023}
 | 
| 
539
 | 
    10  | 
  | 
| 
 | 
    11  | 
  | 
| 
722
 | 
    12  | 
% CPS translations
  | 
| 
 | 
    13  | 
% https://felleisen.org/matthias/4400-s20/lecture17.html
  | 
| 
 | 
    14  | 
  | 
| 
 | 
    15  | 
%% pattern matching resources
  | 
| 
 | 
    16  | 
%% https://www.reddit.com/r/ProgrammingLanguages/comments/g1vno3/beginner_resources_for_compiling_pattern_matching/
  | 
| 
 | 
    17  | 
  | 
| 
677
 | 
    18  | 
\section*{Handout 9 (LLVM, SSA and CPS)}
 | 
| 
539
 | 
    19  | 
  | 
| 
700
 | 
    20  | 
Reflecting on our two tiny compilers targetting the JVM, the code
  | 
| 
 | 
    21  | 
generation part was actually not so hard, no? Pretty much just some
  | 
| 
913
 | 
    22  | 
post-traversal of the abstract syntax tree. Yes? One of the reasons
  | 
| 
908
 | 
    23  | 
for this ease is that the JVM is a stack-based virtual machine and it
  | 
| 
 | 
    24  | 
is therefore not hard to translate deeply-nested arithmetic
  | 
| 
 | 
    25  | 
expressions into a sequence of instructions manipulating the
  | 
| 
913
 | 
    26  | 
stack. That is pretty much the whole point of the JVM. The problem is
  | 
| 
 | 
    27  | 
that ``real'' CPUs, although supporting stack operations, are not
  | 
| 
 | 
    28  | 
really designed to be \emph{stack machines}.  The design of CPUs is
 | 
| 
 | 
    29  | 
more like: Here are some instructions and a chunk of
  | 
| 
908
 | 
    30  | 
memory---compiler, or better compiler writers, do something with
  | 
| 
 | 
    31  | 
them. Consequently, modern compilers need to go the extra mile in
  | 
| 
 | 
    32  | 
order to generate code that is much easier and faster to process by
  | 
| 
 | 
    33  | 
actual CPUs, rather than running code on virtual machines that
  | 
| 
 | 
    34  | 
introduce an additional layer of indirection. To make this all
  | 
| 
913
 | 
    35  | 
tractable for this module, we target the \emph{LLVM Intermediate
 | 
| 
 | 
    36  | 
  Language} (LLVM-IR). In this way we can take advantage of the tools
  | 
| 
 | 
    37  | 
coming with LLVM.\footnote{\url{http://llvm.org}} For example we do
 | 
| 
 | 
    38  | 
not have to worry about things like register allocations or peephole
  | 
| 
 | 
    39  | 
optimisations. By using the LLVM-IR, however, we also have to pay
  | 
| 
 | 
    40  | 
price in the sense that generating code gets
  | 
| 
 | 
    41  | 
harder\ldots{}unfor\-tunately nothing comes for free in life.
 | 
| 
539
 | 
    42  | 
  | 
| 
909
 | 
    43  | 
\subsection*{LLVM and the LLVM-IR}
 | 
| 
908
 | 
    44  | 
  | 
| 
 | 
    45  | 
\noindent LLVM is a beautiful example
  | 
| 
913
 | 
    46  | 
that projects from Academia can make a difference in the Real World. LLVM
  | 
| 
678
 | 
    47  | 
started in 2000 as a project by two researchers at the  University of
  | 
| 
 | 
    48  | 
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
  | 
| 
908
 | 
    49  | 
gcc with its myriad of front-ends for different programming languages (C++, Fortran,
  | 
| 
678
 | 
    50  | 
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
  | 
| 
908
 | 
    51  | 
time into a monolithic gigantic piece of m\ldots ehm complicated
  | 
| 
 | 
    52  | 
software, which you
  | 
| 
678
 | 
    53  | 
could not mess about in an afternoon. In contrast, LLVM is designed to
  | 
| 
700
 | 
    54  | 
be a modular suite of tools with which you can play around easily and
  | 
| 
678
 | 
    55  | 
try out something new. LLVM became a big player once Apple hired one of
  | 
| 
 | 
    56  | 
the original developers (I cannot remember the reason why Apple did not
  | 
| 
898
 | 
    57  | 
want to use gcc, but maybe they were also just disgusted by gcc's big
  | 
| 
678
 | 
    58  | 
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
  | 
| 
 | 
    59  | 
or less legacy. This does not mean that programming languages like C and
  | 
| 
 | 
    60  | 
C++ are dying out any time soon---they are nicely supported by LLVM.
  | 
| 
539
 | 
    61  | 
  | 
| 
700
 | 
    62  | 
We will target the LLVM Intermediate Language, or LLVM Intermediate
  | 
| 
 | 
    63  | 
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
  | 
| 
908
 | 
    64  | 
assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also
  | 
| 
 | 
    65  | 
allow us to benefit from the modular structure of the LLVM compiler
  | 
| 
909
 | 
    66  | 
and we can let the compiler generate code for different CPUs, for
  | 
| 
 | 
    67  | 
example X86 or ARM.  That means we can be agnostic about where our
  | 
| 
908
 | 
    68  | 
code is actually going to run.\footnote{Anybody want to try to run our
 | 
| 
 | 
    69  | 
  programs on Android phones?}  We can also be somewhat ignorant about
  | 
| 
912
 | 
    70  | 
optimising our code and about allocating memory efficiently---the LLVM
  | 
| 
908
 | 
    71  | 
tools will take care of all this.
  | 
| 
539
 | 
    72  | 
  | 
| 
908
 | 
    73  | 
However, what we have to do in order to make LLVM to play ball is to
  | 
| 
912
 | 
    74  | 
generate code in \emph{Static Single-Assignment} format (short SSA). A
 | 
| 
 | 
    75  | 
reason why LLVM uses the SSA-format, rather than JVM-like stack
  | 
| 
 | 
    76  | 
instructions, is that stack instructions are difficult to
  | 
| 
 | 
    77  | 
optimise---you cannot just re-arrange instructions without messing
  | 
| 
913
 | 
    78  | 
about with what is calculated on the stack. Have a look at the
  | 
| 
 | 
    79  | 
expression $((a + b) * 4) - (3 * (a + b))$ and the corresponding JVM
  | 
| 
 | 
    80  | 
instructions:
  | 
| 
 | 
    81  | 
  | 
| 
 | 
    82  | 
\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
 | 
| 
 | 
    83  | 
iload a
  | 
| 
 | 
    84  | 
iload b
  | 
| 
 | 
    85  | 
iadd
  | 
| 
 | 
    86  | 
ldc 4
  | 
| 
 | 
    87  | 
imul
  | 
| 
 | 
    88  | 
ldc 3
  | 
| 
 | 
    89  | 
iload a
  | 
| 
 | 
    90  | 
iload b
  | 
| 
 | 
    91  | 
iadd
  | 
| 
 | 
    92  | 
imul
  | 
| 
 | 
    93  | 
isub
  | 
| 
 | 
    94  | 
\end{lstlisting}
 | 
| 
 | 
    95  | 
  | 
| 
 | 
    96  | 
\noindent
  | 
| 
 | 
    97  | 
and try to reorganise the code such that you calculate the expression
  | 
| 
 | 
    98  | 
$(a + b)$ only once. This requires either quite a bit of
  | 
| 
 | 
    99  | 
math-understanding from the compiler or you need to add ``copy-and-fetch''
  | 
| 
 | 
   100  | 
of a result from a local variable.  Also it is hard to find out if all
  | 
| 
 | 
   101  | 
the calculations on the stack are actually necessary and not by chance
  | 
| 
 | 
   102  | 
dead code. The JVM has for all these obstacles sophisticated machinery
  | 
| 
 | 
   103  | 
to make such ``high-level'' code still run fast, but let's say that
  | 
| 
 | 
   104  | 
for the sake of argument we do not want to rely on it. We want to
  | 
| 
 | 
   105  | 
generate fast code ourselves. This means we have to work around the
  | 
| 
 | 
   106  | 
intricacies of what instructions CPUs can actually process fast. This
  | 
| 
 | 
   107  | 
is what the SSA format is designed for.
  | 
| 
700
 | 
   108  | 
  | 
| 
 | 
   109  | 
  | 
| 
909
 | 
   110  | 
The main idea behind the SSA-format is to have sequences of very
  | 
| 
 | 
   111  | 
simple variable assignments where every (tmp)variable is assigned only
  | 
| 
 | 
   112  | 
once. The assignments need to be simple in the sense that they can be
  | 
| 
 | 
   113  | 
just be primitive operations like addition, multiplication, jumps,
  | 
| 
908
 | 
   114  | 
comparisons, function calls and so on.  Say, we have an expression
  | 
| 
909
 | 
   115  | 
$((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding sequence
  | 
| 
 | 
   116  | 
of assignments in SSA-format are:
  | 
| 
680
 | 
   117  | 
 
  | 
| 
 | 
   118  | 
\begin{lstlisting}[language=LLVMIR,numbers=left]
 | 
| 
908
 | 
   119  | 
let tmp0 = add 1 a in
  | 
| 
 | 
   120  | 
let tmp1 = add 3 2 in  
  | 
| 
 | 
   121  | 
let tmp2 = call foo(tmp1)  
  | 
| 
 | 
   122  | 
let tmp3 = mul b 5 in 
  | 
| 
 | 
   123  | 
let tmp4 = add tmp2 tmp3 in 
  | 
| 
 | 
   124  | 
let tmp5 = add tmp0 tmp4 in
  | 
| 
 | 
   125  | 
  return tmp5 
  | 
| 
677
 | 
   126  | 
\end{lstlisting}
 | 
| 
539
 | 
   127  | 
  | 
| 
909
 | 
   128  | 
\noindent where every tmpX-variable is used only once (we could, for
  | 
| 
912
 | 
   129  | 
example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if
 | 
| 
 | 
   130  | 
this would not change the overall result). At the end we have a
  | 
| 
 | 
   131  | 
return-instruction wich contains the final result of the
  | 
| 
913
 | 
   132  | 
expression. As can be seen the task we have to solve for generating
  | 
| 
 | 
   133  | 
SSA-code is to take apart compound expressions into its most basic
  | 
| 
 | 
   134  | 
''particles'' and transfrom them into a sequence of simple assignments
  | 
| 
 | 
   135  | 
that calculates the desired result. Note that this means we have to
  | 
| 
 | 
   136  | 
fix the order in which the expression is calculated, like from the
  | 
| 
 | 
   137  | 
left to right.
  | 
| 
898
 | 
   138  | 
  | 
| 
 | 
   139  | 
There are sophisticated algorithms for imperative languages, like C,
  | 
| 
909
 | 
   140  | 
that efficiently transform high-level programs into SSA-format. But
  | 
| 
898
 | 
   141  | 
we can ignore them here. We want to compile a functional language and
  | 
| 
 | 
   142  | 
there things get much more interesting than just sophisticated. We
  | 
| 
 | 
   143  | 
will need to have a look at CPS translations, where the CPS stands for
  | 
| 
909
 | 
   144  | 
\emph{Continuation-Passing-Style}---basically black programming art or
 | 
| 
678
 | 
   145  | 
abracadabra programming. So sit tight.
  | 
| 
539
 | 
   146  | 
  | 
| 
678
 | 
   147  | 
\subsection*{LLVM-IR}
 | 
| 
539
 | 
   148  | 
  | 
| 
908
 | 
   149  | 
Before we start, let's however first have a look at the \emph{LLVM Intermediate
 | 
| 
700
 | 
   150  | 
Representation} in more detail. The LLVM-IR is in between the frontends
  | 
| 
 | 
   151  | 
and backends of the LLVM framework. It allows compilation of multiple
  | 
| 
 | 
   152  | 
source languages to multiple targets. It is also the place where most of
  | 
| 
 | 
   153  | 
the target independent optimisations are performed. 
  | 
| 
 | 
   154  | 
  | 
| 
908
 | 
   155  | 
What is good about our toy Fun-language is that it basically only
  | 
| 
700
 | 
   156  | 
contains expressions (be they arithmetic expressions, boolean
  | 
| 
 | 
   157  | 
expressions or if-expressions). The exception are function definitions.
  | 
| 
 | 
   158  | 
Luckily, for them we can use the mechanism of defining functions in the
  | 
| 
 | 
   159  | 
LLVM-IR (this is similar to using JVM methods for functions in our
  | 
| 
908
 | 
   160  | 
earlier compiler). For example the simple Fun-program 
  | 
| 
539
 | 
   161  | 
  | 
| 
 | 
   162  | 
  | 
| 
677
 | 
   163  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
678
 | 
   164  | 
def sqr(x) = x * x
  | 
| 
677
 | 
   165  | 
\end{lstlisting}
 | 
| 
539
 | 
   166  | 
  | 
| 
677
 | 
   167  | 
\noindent
  | 
| 
700
 | 
   168  | 
can be compiled to the following LLVM-IR function:
  | 
| 
539
 | 
   169  | 
  | 
| 
677
 | 
   170  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
678
 | 
   171  | 
define i32 @sqr(i32 %x) {
 | 
| 
 | 
   172  | 
   %tmp = mul i32 %x, %x
  | 
| 
677
 | 
   173  | 
   ret i32 %tmp
  | 
| 
 | 
   174  | 
}    
  | 
| 
 | 
   175  | 
\end{lstlisting}
 | 
| 
539
 | 
   176  | 
  | 
| 
908
 | 
   177  | 
\noindent First notice that all ``local'' variable names, in this case
  | 
| 
 | 
   178  | 
\texttt{x} and \texttt{tmp}, are prefixed with \texttt{\%} in the
 | 
| 
 | 
   179  | 
LLVM-IR.  Temporary variables can be named with an identifier, such as
  | 
| 
 | 
   180  | 
\texttt{tmp}, or numbers. In contrast, function names, since they are
 | 
| 
 | 
   181  | 
``global'', need to be prefixed with an @-symbol. Also, the LLVM-IR is
  | 
| 
 | 
   182  | 
a fully typed language. The \texttt{i32} type stands for 32-bit
 | 
| 
 | 
   183  | 
integers. There are also types for 64-bit integers (\texttt{i64}),
 | 
| 
 | 
   184  | 
chars (\texttt{i8}), floats, arrays and even pointer types. In the
 | 
| 
 | 
   185  | 
code above, \texttt{sqr} takes an argument of type \texttt{i32} and
 | 
| 
 | 
   186  | 
produces a result of type \texttt{i32} (the result type is in front of
 | 
| 
 | 
   187  | 
the function name, like in C). Each arithmetic operation, for example
  | 
| 
 | 
   188  | 
addition and multiplication, are also prefixed with the type they
  | 
| 
 | 
   189  | 
operate on. Obviously these types need to match up\ldots{} but since
 | 
| 
 | 
   190  | 
we have in our programs only integers, for the moment \texttt{i32}
 | 
| 
 | 
   191  | 
everywhere will do. We do not have to generate any other types, but
  | 
| 
 | 
   192  | 
obviously this is a limitation in our Fun-language (and which we
  | 
| 
912
 | 
   193  | 
are going to lift in the final CW).
  | 
| 
539
 | 
   194  | 
 
  | 
| 
700
 | 
   195  | 
There are a few interesting instructions in the LLVM-IR which are quite
  | 
| 
701
 | 
   196  | 
different than what we have seen in the JVM. Can you remember the
  | 
| 
 | 
   197  | 
kerfuffle we had to go through with boolean expressions and negating the
  | 
| 
908
 | 
   198  | 
condition? In the LLVM-IR, branching  if-conditions are implemented
  | 
| 
701
 | 
   199  | 
differently: there is a separate \texttt{br}-instruction as follows:
 | 
| 
700
 | 
   200  | 
  | 
| 
 | 
   201  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
 | 
   202  | 
br i1 %var, label %if_br, label %else_br
  | 
| 
 | 
   203  | 
\end{lstlisting}
 | 
| 
 | 
   204  | 
  | 
| 
 | 
   205  | 
\noindent
  | 
| 
908
 | 
   206  | 
The type \texttt{i1} stands for booleans. If the variable is true,
 | 
| 
 | 
   207  | 
then this instruction jumps to the if-branch, which needs an explicit
  | 
| 
 | 
   208  | 
label; otherwise it jumps to the else-branch, again with its own
  | 
| 
 | 
   209  | 
label. This allows us to keep the meaning of the boolean expression
  | 
| 
 | 
   210  | 
``as is'' when compiling if's---thanks god no more negating of
  | 
| 
 | 
   211  | 
booleans.
  | 
| 
 | 
   212  | 
  | 
| 
701
 | 
   213  | 
A value of type boolean is generated in the LLVM-IR by the
  | 
| 
 | 
   214  | 
\texttt{icmp}-instruction. This instruction is for integers (hence the
 | 
| 
 | 
   215  | 
\texttt{i}) and takes the comparison operation as argument. For example
 | 
| 
700
 | 
   216  | 
  | 
| 
 | 
   217  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
898
 | 
   218  | 
icmp eq  i32 %x, %y     ; for equal
  | 
| 
 | 
   219  | 
icmp sle i32 %x, %y     ; signed less or equal
  | 
| 
 | 
   220  | 
icmp slt i32 %x, %y     ; signed less than
  | 
| 
 | 
   221  | 
icmp ult i32 %x, %y     ; unsigned less than 
  | 
| 
700
 | 
   222  | 
\end{lstlisting}
 | 
| 
 | 
   223  | 
  | 
| 
 | 
   224  | 
\noindent
  | 
| 
898
 | 
   225  | 
Note that in some operations the LLVM-IR distinguishes between signed and 
  | 
| 
908
 | 
   226  | 
unsigned representations of integers. I assume you know what this means,
  | 
| 
 | 
   227  | 
otherwise just ask.
  | 
| 
700
 | 
   228  | 
  | 
| 
701
 | 
   229  | 
It is also easy to call another function in LLVM-IR: as can be 
  | 
| 
 | 
   230  | 
seen from Figure~\ref{lli} we can just call a function with the 
 | 
| 
 | 
   231  | 
instruction \texttt{call} and can also assign the result to 
 | 
| 
 | 
   232  | 
a variable. The syntax is as follows
  | 
| 
 | 
   233  | 
  | 
| 
 | 
   234  | 
\begin{lstlisting}[language=LLVM]
 | 
| 
 | 
   235  | 
%var = call i32 @foo(...args...)
  | 
| 
 | 
   236  | 
\end{lstlisting}
 | 
| 
 | 
   237  | 
  | 
| 
 | 
   238  | 
\noindent
  | 
| 
908
 | 
   239  | 
where the arguments can only be simple variables and numbers, but not compound
  | 
| 
701
 | 
   240  | 
expressions.
  | 
| 
 | 
   241  | 
  | 
| 
679
 | 
   242  | 
Conveniently, you can use the program \texttt{lli}, which comes with
 | 
| 
912
 | 
   243  | 
LLVM, to interpret programs written in the LLVM-IR. Type on the command line
  | 
| 
 | 
   244  | 
  | 
| 
 | 
   245  | 
\begin{lstlisting}[language=bash,numbers=none]
 | 
| 
 | 
   246  | 
lli sqr.ll
  | 
| 
 | 
   247  | 
\end{lstlisting}
 | 
| 
 | 
   248  | 
  | 
| 
 | 
   249  | 
\noindent
  | 
| 
 | 
   250  | 
and you can see the output of the pragram generated. 
  | 
| 
 | 
   251  | 
This means you can easily check whether the code you produced actually
  | 
| 
 | 
   252  | 
works. To get a running program that does something interesting you
  | 
| 
 | 
   253  | 
need to add some boilerplate about printing out numbers and a
  | 
| 
 | 
   254  | 
main-function that is the entry point for the program (see
  | 
| 
 | 
   255  | 
Figure~\ref{lli} for a complete listing of the square function). Again
 | 
| 
700
 | 
   256  | 
this is very similar to the boilerplate we needed to add in our JVM
  | 
| 
912
 | 
   257  | 
compiler.
  | 
| 
700
 | 
   258  | 
  | 
| 
 | 
   259  | 
You can generate a binary for the program in Figure~\ref{lli} by using
 | 
| 
908
 | 
   260  | 
the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang}, whereby \texttt{llc} generates
 | 
| 
 | 
   261  | 
an object file and \texttt{gcc} (that is actually \texttt{clang} on my Mac) generates the
 | 
| 
700
 | 
   262  | 
executable binary:
  | 
| 
678
 | 
   263  | 
  | 
| 
679
 | 
   264  | 
\begin{lstlisting}[language=bash,numbers=none]
 | 
| 
 | 
   265  | 
llc -filetype=obj sqr.ll
  | 
| 
 | 
   266  | 
gcc sqr.o -o a.out
  | 
| 
 | 
   267  | 
./a.out
  | 
| 
680
 | 
   268  | 
> 25
  | 
| 
679
 | 
   269  | 
\end{lstlisting}
 | 
| 
 | 
   270  | 
  | 
| 
 | 
   271  | 
\begin{figure}[t]\small 
 | 
| 
 | 
   272  | 
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
 | 
| 
700
 | 
   273  | 
\caption{An LLVM-IR program for calculating the square function. It
 | 
| 
909
 | 
   274  | 
  calls the \texttt{sqr}-function in \texttt{@main} with the argument \texttt{5}
 | 
| 
908
 | 
   275  | 
  (Line 20). The
  | 
| 
912
 | 
   276  | 
  code for the \texttt{sqr}-function is in Lines 13 -- 16. The main-function
 | 
| 
 | 
   277  | 
  stores
  | 
| 
909
 | 
   278  | 
  the result of sqr in the variable called \texttt{\%1} and then
 | 
| 
908
 | 
   279  | 
  prints out the contents of this variable in Line 21. The other
  | 
| 
 | 
   280  | 
  code is boilerplate for printing out integers.\label{lli}}
 | 
| 
678
 | 
   281  | 
\end{figure}   
 | 
| 
 | 
   282  | 
  | 
| 
679
 | 
   283  | 
  | 
| 
 | 
   284  | 
    
  | 
| 
 | 
   285  | 
\subsection*{Our Own Intermediate Language}
 | 
| 
 | 
   286  | 
  | 
| 
908
 | 
   287  | 
Let's get back to our compiler: Remember compilers have to solve the
  | 
| 
 | 
   288  | 
problem of bridging the gap between ``high-level'' programs and
  | 
| 
 | 
   289  | 
``low-level'' hardware. If the gap is too wide for one step, then a
  | 
| 
 | 
   290  | 
good strategy is to lay a stepping stone somewhere in between. The
  | 
| 
 | 
   291  | 
LLVM-IR itself is such a stepping stone to make the task of generating
  | 
| 
 | 
   292  | 
and optimising code easier. Like a real compiler we will use our own
  | 
| 
 | 
   293  | 
stepping stone which I call the \emph{K-language}.
 | 
| 
 | 
   294  | 
  | 
| 
 | 
   295  | 
\begin{center}
 | 
| 
 | 
   296  | 
  \begin{tikzpicture}[scale=0.9,font=\bf,
 | 
| 
 | 
   297  | 
                      node/.style={
 | 
| 
 | 
   298  | 
                      rectangle,rounded corners=3mm,
  | 
| 
 | 
   299  | 
                      ultra thick,draw=black!50,minimum height=16mm, 
  | 
| 
 | 
   300  | 
                      minimum width=20mm,
  | 
| 
 | 
   301  | 
                      top color=white,bottom color=black!20}]
  | 
| 
 | 
   302  | 
  | 
| 
 | 
   303  | 
  %\node (0) at (-3,0) {};  
 | 
| 
 | 
   304  | 
  \node (A) at (0.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}source}] {Fun-Language};
 | 
| 
 | 
   305  | 
  \node (B) at (3.5,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}stepping stone}] {K-Language};
 | 
| 
 | 
   306  | 
  \node (C) at (7.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}target}] {LLVM-IR};
 | 
| 
 | 
   307  | 
 
  | 
| 
 | 
   308  | 
  \draw [->,line width=2.5mm] (A) -- node [above,pos=0.35] {} (B); 
 | 
| 
 | 
   309  | 
  \draw [->,line width=2.5mm] (B) -- node [above,pos=0.35] {} (C); 
 | 
| 
 | 
   310  | 
  \end{tikzpicture}
 | 
| 
 | 
   311  | 
  \end{center}
 | 
| 
 | 
   312  | 
  | 
| 
 | 
   313  | 
  \noindent
  | 
| 
 | 
   314  | 
  To see why we need a stepping stone for generating code in SSA-format,
  | 
| 
 | 
   315  | 
  considder again arithmetic expressions such as
  | 
| 
 | 
   316  | 
  $((1 + a) + (3 + (b * 5)))$. They need to be broken up into smaller
  | 
| 
 | 
   317  | 
  ``atomic'' steps, like so
  | 
| 
680
 | 
   318  | 
 
  | 
| 
 | 
   319  | 
\begin{lstlisting}[language=LLVMIR,numbers=none]
 | 
| 
 | 
   320  | 
let tmp0 = add 1 a in   
  | 
| 
 | 
   321  | 
let tmp1 = mul b 5 in 
  | 
| 
 | 
   322  | 
let tmp2 = add 3 tmp1 in 
  | 
| 
 | 
   323  | 
let tmp3 = add tmp0 tmp2 in
  | 
| 
908
 | 
   324  | 
  return tmp3 
  | 
| 
 | 
   325  | 
\end{lstlisting}
 | 
| 
 | 
   326  | 
  | 
| 
 | 
   327  | 
\noindent
  | 
| 
 | 
   328  | 
Here \texttt{tmp3} will contain the result of what the whole
 | 
| 
 | 
   329  | 
expression stands for. In each individual step we can only perform an
  | 
| 
 | 
   330  | 
``atomic'' or ``trival'' operation, like addition or multiplication of
  | 
| 
 | 
   331  | 
a number or a variable.  We are not allowed to have for example a
  | 
| 
 | 
   332  | 
nested addition or an if-condition on the right-hand side of an
  | 
| 
909
 | 
   333  | 
assignment. Such constraints are forced upon us because of how the
  | 
| 
 | 
   334  | 
SSA-format works in the LLVM-IR. To simplify matters we represent
  | 
| 
908
 | 
   335  | 
assignments with two kinds of syntactic entities, namely
  | 
| 
 | 
   336  | 
\emph{K-values} and \emph{K-expressions}. K-values are all ``things''
 | 
| 
909
 | 
   337  | 
that can appear on the right-hand side of an equal. Say we have the
  | 
| 
 | 
   338  | 
following definition for K-values:
  | 
| 
908
 | 
   339  | 
  | 
| 
 | 
   340  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   341  | 
enum KVal {
 | 
| 
 | 
   342  | 
  case KVar(s: String)
  | 
| 
 | 
   343  | 
  case KNum(n: Int)
  | 
| 
 | 
   344  | 
  case KAop(op: String, v1: KVal, v2: KVal)
  | 
| 
 | 
   345  | 
  case KCall(fname: String, args: List[KVal])
  | 
| 
 | 
   346  | 
}
  | 
| 
680
 | 
   347  | 
\end{lstlisting}
 | 
| 
 | 
   348  | 
  | 
| 
 | 
   349  | 
\noindent
  | 
| 
908
 | 
   350  | 
where a K-value can be a variable, a number or a ``trivial'' binary
  | 
| 
 | 
   351  | 
operation, such as addition or multiplication. The two arguments of a
  | 
| 
 | 
   352  | 
binary operation need to be K-values as well. Finally, we have
  | 
| 
 | 
   353  | 
function calls, but again each argument of the function call needs to
  | 
| 
912
 | 
   354  | 
be a K-value.  One constraint we need to be careful about is that the
  | 
| 
 | 
   355  | 
arguments of the binary operations and function calls can only be
  | 
| 
 | 
   356  | 
variables or numbers. For example
  | 
| 
 | 
   357  | 
  | 
| 
 | 
   358  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   359  | 
KAop("+", KAop("*", KNum(1), KNum(2)), KNum(3))
 | 
| 
 | 
   360  | 
KCall("foo", List(KAop("+", KNum(4), KNum(5)))  
 | 
| 
 | 
   361  | 
\end{lstlisting}
 | 
| 
 | 
   362  | 
  | 
| 
 | 
   363  | 
\noindent
  | 
| 
 | 
   364  | 
while perfect instances of K-values according to the types, are not
  | 
| 
 | 
   365  | 
allowed in the LLVM-IR. To encode this constraint into the type-system
  | 
| 
 | 
   366  | 
of Scala, however, would make things overly complicated---to satisfy
  | 
| 
 | 
   367  | 
this constraint is therefore on us. For the moment this will be all
  | 
| 
 | 
   368  | 
K-values we are considdering. Later on, we will have some more for our
  | 
| 
 | 
   369  | 
Fun-language.
  | 
| 
679
 | 
   370  | 
  | 
| 
 | 
   371  | 
  | 
| 
908
 | 
   372  | 
Our K-expressions will encode the \texttt{let} and the \texttt{return}
 | 
| 
 | 
   373  | 
from the SSA-format (again for the moment we only consider these two
  | 
| 
 | 
   374  | 
constructors---later on we will have if-branches as well).
  | 
| 
678
 | 
   375  | 
  | 
| 
908
 | 
   376  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   377  | 
enum KExp {
 | 
| 
 | 
   378  | 
  case KLet(x: String, v: KVal, e: KExp)
  | 
| 
 | 
   379  | 
  case KReturn(v: KVal)
  | 
| 
 | 
   380  | 
}
  | 
| 
 | 
   381  | 
\end{lstlisting}
 | 
| 
679
 | 
   382  | 
  | 
| 
908
 | 
   383  | 
\noindent
  | 
| 
913
 | 
   384  | 
By having \texttt{KLet} taking first a string (standing for a
 | 
| 
912
 | 
   385  | 
tmp-variable) and second a value, we can fulfil the SSA constraint in
  | 
| 
 | 
   386  | 
assignments ``by con\-struction''---there is no way we could write
  | 
| 
 | 
   387  | 
anything else than a K-value.  Note that the third argument of a
  | 
| 
 | 
   388  | 
\texttt{KLet} is again a K-expression, meaning either another
 | 
| 
 | 
   389  | 
\texttt{KLet} or a \texttt{KReturn}. In this way we can construct a
 | 
| 
913
 | 
   390  | 
sequence of computations and indicate what the final result of the
  | 
| 
 | 
   391  | 
computations is.  According to the SSA-format, we also have to ensure
  | 
| 
912
 | 
   392  | 
that all intermediary computations are given (fresh) names---we will
  | 
| 
 | 
   393  | 
use an (ugly) counter for this.
  | 
| 
679
 | 
   394  | 
  | 
| 
908
 | 
   395  | 
To sum up, K-values are the atomic operations that can be on the
  | 
| 
 | 
   396  | 
right-hand side of assignemnts. The K-language is restricted such that
  | 
| 
909
 | 
   397  | 
it is easy to generate the SSA-format for the LLVM-IR. What remains to
  | 
| 
908
 | 
   398  | 
be done is a translation of our Fun-language into the K-language. The
  | 
| 
909
 | 
   399  | 
main difficulty is that we need to order the computation---in the
  | 
| 
913
 | 
   400  | 
K-language we only have linear sequences of instructions. Before we
  | 
| 
912
 | 
   401  | 
explain this, we have a look at some programs in CPS-style.
  | 
| 
679
 | 
   402  | 
  | 
| 
 | 
   403  | 
  | 
| 
678
 | 
   404  | 
  | 
| 
 | 
   405  | 
  | 
| 
 | 
   406  | 
\subsection*{CPS-Translations}
 | 
| 
 | 
   407  | 
  | 
| 
912
 | 
   408  | 
CPS stands for \emph{Continuation-Passing-Style}. It is a kind of
 | 
| 
 | 
   409  | 
programming technique often used in advanced functional programming
  | 
| 
 | 
   410  | 
and in particular in compilers. Before we delve into the
  | 
| 
 | 
   411  | 
CPS-translation for our Fun-language compiler, let us look at
  | 
| 
704
 | 
   412  | 
CPS-versions of some well-known functions. Consider
  | 
| 
 | 
   413  | 
  | 
| 
 | 
   414  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   415  | 
def fact(n: Int) : Int = 
  | 
| 
 | 
   416  | 
  if (n == 0) 1 else n * fact(n - 1) 
  | 
| 
 | 
   417  | 
\end{lstlisting}
 | 
| 
 | 
   418  | 
  | 
| 
 | 
   419  | 
\noindent 
  | 
| 
 | 
   420  | 
This is clearly the usual factorial function. But now consider the
  | 
| 
912
 | 
   421  | 
following slight variant of the factorial function:
  | 
| 
704
 | 
   422  | 
  | 
| 
912
 | 
   423  | 
\begin{lstlisting}[language=Scala, numbers=left]
 | 
| 
 | 
   424  | 
def factC(n: Int, k: Int => Int) : Int = 
  | 
| 
 | 
   425  | 
  if (n == 0) k(1) 
  | 
| 
 | 
   426  | 
  else factC(n - 1, x => k(n * x)) 
  | 
| 
704
 | 
   427  | 
  | 
| 
912
 | 
   428  | 
factC(3, id)
  | 
| 
704
 | 
   429  | 
\end{lstlisting}
 | 
| 
 | 
   430  | 
  | 
| 
 | 
   431  | 
\noindent
  | 
| 
912
 | 
   432  | 
The function is is Lines 1--3. The function call is in Line 5.  We
  | 
| 
 | 
   433  | 
call this function with a number, in this case 3, and the
  | 
| 
 | 
   434  | 
identity-function (which returns just its input). The \texttt{k} is
 | 
| 
 | 
   435  | 
the continuation in this function. The recursive calls are:
  | 
| 
704
 | 
   436  | 
  | 
| 
 | 
   437  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
912
 | 
   438  | 
factC(2, x => id(3 * x))
  | 
| 
 | 
   439  | 
factC(1, x => id(3 * (2 * x)))
  | 
| 
 | 
   440  | 
factC(0, x => id(3 * (2 * (1 * x))))
  | 
| 
704
 | 
   441  | 
\end{lstlisting}
 | 
| 
 | 
   442  | 
  | 
| 
 | 
   443  | 
\noindent
  | 
| 
 | 
   444  | 
Having reached 0, we get out of the recursion and apply 1 to the
  | 
| 
912
 | 
   445  | 
continuation (see if-branch above in Line 2). This gives
  | 
| 
704
 | 
   446  | 
  | 
| 
 | 
   447  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
912
 | 
   448  | 
id(3 * (2 * (1 * 1)))
  | 
| 
704
 | 
   449  | 
= 3 * (2 * (1 * 1))
  | 
| 
 | 
   450  | 
= 6
  | 
| 
 | 
   451  | 
\end{lstlisting}
 | 
| 
 | 
   452  | 
  | 
| 
 | 
   453  | 
\noindent
  | 
| 
898
 | 
   454  | 
which is the expected result. If this looks somewhat familiar to you,
  | 
| 
912
 | 
   455  | 
than this is because functions with continuations can be seen as a
  | 
| 
 | 
   456  | 
kind of generalisation of tail-recursive functions.  So far, however,
  | 
| 
 | 
   457  | 
we did not look at this generalisation in earnest.  Anyway notice how
  | 
| 
 | 
   458  | 
the continuations is ``stacked up'' during the recursion and then
  | 
| 
 | 
   459  | 
``unrolled'' when we apply 1 to the continuation. Interestingly, we
  | 
| 
 | 
   460  | 
can do something similar to the Fibonacci function where in the
  | 
| 
 | 
   461  | 
traditional version we have two recursive calls. Consider the
  | 
| 
 | 
   462  | 
following function
  | 
| 
704
 | 
   463  | 
  | 
| 
912
 | 
   464  | 
\begin{lstlisting}[language=Scala, numbers=left]
 | 
| 
 | 
   465  | 
def fibC(n: Int, k: Int => Int) : Int = 
  | 
| 
 | 
   466  | 
  if (n == 0 || n == 1) k(1) 
  | 
| 
704
 | 
   467  | 
  else fibC(n - 1,
  | 
| 
 | 
   468  | 
             r1 => fibC(n - 2,
  | 
| 
912
 | 
   469  | 
               r2 => k(r1 + r2)))
  | 
| 
704
 | 
   470  | 
\end{lstlisting}
 | 
| 
 | 
   471  | 
  | 
| 
 | 
   472  | 
\noindent
  | 
| 
912
 | 
   473  | 
Here the continuation (Lines 4--5) is a nested function essentially wrapping up 
  | 
| 
 | 
   474  | 
the second recursive call plus the original continuation. Let us check how the recursion unfolds
  | 
| 
704
 | 
   475  | 
when called with 3 and the identity function:
  | 
| 
 | 
   476  | 
  | 
| 
 | 
   477  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   478  | 
fibC(3, id)
  | 
| 
 | 
   479  | 
fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
  | 
| 
 | 
   480  | 
fibC(1, r1 => 
  | 
| 
 | 
   481  | 
   fibC(0, r2 => fibC(1, r2a => id((r1 + r2) + r2a))))
  | 
| 
 | 
   482  | 
fibC(0, r2 => fibC(1, r2a => id((1 + r2) + r2a)))
  | 
| 
 | 
   483  | 
fibC(1, r2a => id((1 + 1) + r2a))
  | 
| 
 | 
   484  | 
id((1 + 1) + 1)
  | 
| 
 | 
   485  | 
(1 + 1) + 1
  | 
| 
 | 
   486  | 
3
  | 
| 
 | 
   487  | 
\end{lstlisting}
 | 
| 
 | 
   488  | 
  | 
| 
908
 | 
   489  | 
\noindent
  | 
| 
909
 | 
   490  | 
The point of this section is that you should be playing around with these
  | 
| 
908
 | 
   491  | 
simple definitions and make sure they calculate the expected result.
  | 
| 
 | 
   492  | 
In the next step, you should understand \emph{how} these functions
 | 
| 
912
 | 
   493  | 
calculate the result with the continuations. Now change the initial
  | 
| 
 | 
   494  | 
continuation which you call the function to
  | 
| 
 | 
   495  | 
  | 
| 
 | 
   496  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   497  | 
r => { println("The result plus 1 is:") ; r + 1 }
 | 
| 
 | 
   498  | 
\end{lstlisting}
 | 
| 
 | 
   499  | 
  | 
| 
 | 
   500  | 
\noindent
  | 
| 
 | 
   501  | 
Does this still calculate the expected result?
  | 
| 
908
 | 
   502  | 
  | 
| 
 | 
   503  | 
  | 
| 
 | 
   504  | 
\section*{Worked Example}
 | 
| 
 | 
   505  | 
  | 
| 
909
 | 
   506  | 
Let us now come back to the CPS-translations for the Fun-language.
  | 
| 
 | 
   507  | 
Though we will start with a simpler subset containing only numbers,
  | 
| 
913
 | 
   508  | 
arithmetic expressions and function calls---no variables for the
  | 
| 
 | 
   509  | 
moment.  The main difficulty of generating instructions in SSA-format
  | 
| 
 | 
   510  | 
is that large compound expressions need to be broken up into smaller
  | 
| 
 | 
   511  | 
pieces and intermediate results need to be chained into later
  | 
| 
 | 
   512  | 
instructions. To do this conveniently, we use the CPS-translation
  | 
| 
 | 
   513  | 
mechanism. What the continuations essentially solve is the following
  | 
| 
 | 
   514  | 
problem: Given an expressions
  | 
| 
908
 | 
   515  | 
  | 
| 
 | 
   516  | 
\begin{equation}\label{exp}
 | 
| 
 | 
   517  | 
(1 + 2) * (3 + 4)
  | 
| 
 | 
   518  | 
\end{equation}  
 | 
| 
 | 
   519  | 
  | 
| 
 | 
   520  | 
\noindent
  | 
| 
913
 | 
   521  | 
which of the subexpressions should be calculated first? We are going 
  | 
| 
 | 
   522  | 
arbitrarily to decide that the calculation will be from left to
  | 
| 
912
 | 
   523  | 
right. Other languages make different choices---C famously is right to
  | 
| 
 | 
   524  | 
left. In our case this means we have to tear the expression shown in
  | 
| 
 | 
   525  | 
\eqref{exp} apart as follows:
 | 
| 
908
 | 
   526  | 
  | 
| 
 | 
   527  | 
\[
  | 
| 
 | 
   528  | 
(1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
 | 
| 
 | 
   529  | 
\]  
  | 
| 
 | 
   530  | 
  | 
| 
 | 
   531  | 
\noindent
  | 
| 
912
 | 
   532  | 
The first subexpression can be easily calculated and will give us a result,
  | 
| 
909
 | 
   533  | 
but the second one is not really an expression that makes sense. It
  | 
| 
 | 
   534  | 
will only make sense as the next step in the computation when we
  | 
| 
 | 
   535  | 
fill-in the result of $1+2$ into the ``hole'' indicated by
  | 
| 
 | 
   536  | 
$\Box$. Such incomplete expressions can be represented in Scala as
  | 
| 
 | 
   537  | 
functions
  | 
| 
908
 | 
   538  | 
  | 
| 
 | 
   539  | 
\begin{lstlisting}[language=Scala, numbers=none]
 | 
| 
 | 
   540  | 
r => r * (3 + 4)
  | 
| 
 | 
   541  | 
\end{lstlisting}  
 | 
| 
 | 
   542  | 
  | 
| 
 | 
   543  | 
\noindent
  | 
| 
912
 | 
   544  | 
where \texttt{r} will represent any result that has been computed
 | 
| 
 | 
   545  | 
earlier. By the way, in lambda-calculus notation we would write
  | 
| 
908
 | 
   546  | 
$\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
  | 
| 
700
 | 
   547  | 
functions (``continuations'') to represent what is coming next in a
  | 
| 
908
 | 
   548  | 
sequence of instructions. In our case, continuations are functions of
  | 
| 
 | 
   549  | 
type \code{KVal} to \code{KExp}. They can be seen as a sequence of
 | 
| 
 | 
   550  | 
\code{KLet}s where there is a ``hole'' that needs to be
 | 
| 
 | 
   551  | 
filled. Consider for example
  | 
| 
678
 | 
   552  | 
  | 
| 
701
 | 
   553  | 
\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
 | 
| 
 | 
   554  | 
let tmp0 = add 1 a in   
  | 
| 
 | 
   555  | 
let tmp1 = mul (*@$\Box$@*) 5 in 
  | 
| 
 | 
   556  | 
let tmp2 = add 3 tmp1 in 
  | 
| 
 | 
   557  | 
let tmp3 = add tmp0 tmp2 in
  | 
| 
912
 | 
   558  | 
  return tmp3 
  | 
| 
701
 | 
   559  | 
\end{lstlisting}
 | 
| 
 | 
   560  | 
  | 
| 
 | 
   561  | 
\noindent
  | 
| 
 | 
   562  | 
where in the second line is a $\Box$ which still expects a \code{KVal}
 | 
| 
 | 
   563  | 
to be filled in before it becomes a ``proper'' \code{KExp}. When we 
 | 
| 
898
 | 
   564  | 
apply an argument to the continuation (remember they are functions)
  | 
| 
908
 | 
   565  | 
we essentially fill something into the corresponding hole.
  | 
| 
 | 
   566  | 
  | 
| 
909
 | 
   567  | 
Lets look at some concrete code. To simplify matters, 
  | 
| 
908
 | 
   568  | 
suppose our source language consists just of three kinds of expressions
  | 
| 
 | 
   569  | 
  | 
| 
 | 
   570  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   571  | 
enum Expr {
 | 
| 
 | 
   572  | 
    case Num(n: Int)
  | 
| 
912
 | 
   573  | 
    case Aop(op: String, e1: Expr, e2: Expr)
  | 
| 
908
 | 
   574  | 
    case Call(fname: String, args: List[Expr])
  | 
| 
 | 
   575  | 
}
  | 
| 
 | 
   576  | 
\end{lstlisting}
 | 
| 
 | 
   577  | 
  | 
| 
 | 
   578  | 
\noindent
  | 
| 
912
 | 
   579  | 
This allows us to generate SSA-instructions for expressions like
  | 
| 
 | 
   580  | 
  | 
| 
 | 
   581  | 
\[
  | 
| 
 | 
   582  | 
1 + foo(bar(4 * 7), 3, id(12))  
  | 
| 
 | 
   583  | 
\]
  | 
| 
 | 
   584  | 
  | 
| 
 | 
   585  | 
\noindent
  | 
| 
 | 
   586  | 
The code of the CPS-translation
  | 
| 
 | 
   587  | 
that generates these instructions is then of the form
  | 
| 
679
 | 
   588  | 
  | 
| 
701
 | 
   589  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   590  | 
def CPS(e: Exp)(k: KVal => KExp) : KExp = 
  | 
| 
 | 
   591  | 
  e match { ... }
 | 
| 
 | 
   592  | 
\end{lstlisting}
 | 
| 
679
 | 
   593  | 
  | 
| 
701
 | 
   594  | 
\noindent 
  | 
| 
908
 | 
   595  | 
where \code{k} is the continuation and \code{e} is the expression to
 | 
| 
909
 | 
   596  | 
be compiled. The result of the function is a K-expression, which later
  | 
| 
 | 
   597  | 
can be compiled into LLVM-IR code.
  | 
| 
 | 
   598  | 
  | 
| 
913
 | 
   599  | 
In case we have numbers, then we can just pass them in the CPS-translation
  | 
| 
909
 | 
   600  | 
to the continuations because numbers need not be further teared apart
  | 
| 
912
 | 
   601  | 
as they are already primitive. Passing the number to the continuation
  | 
| 
909
 | 
   602  | 
means we apply the continuation like
  | 
| 
679
 | 
   603  | 
  | 
| 
908
 | 
   604  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   605  | 
  case Num(i) => k(KNum(i))
  | 
| 
 | 
   606  | 
\end{lstlisting}
 | 
| 
679
 | 
   607  | 
  | 
| 
909
 | 
   608  | 
\noindent This would fill in the $\Box$ in a \code{KLet}-expression.
 | 
| 
 | 
   609  | 
Since \texttt{k} is a function from \texttt{KVar} to \texttt{KExp} we
 | 
| 
 | 
   610  | 
also optain the correct type for CPS, namely \texttt{KExp}.  There is
 | 
| 
 | 
   611  | 
no need to create a temporary variable for simple numbers.  More
  | 
| 
 | 
   612  | 
interesting is the case for arithmetic operations.
  | 
| 
908
 | 
   613  | 
  | 
| 
 | 
   614  | 
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
 | 
| 
 | 
   615  | 
case Aop(op, e1, e2) => {
 | 
| 
 | 
   616  | 
  val z = Fresh("tmp")
 | 
| 
 | 
   617  | 
  CPS(e1)(r1 => 
  | 
| 
 | 
   618  | 
    CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))))
  | 
| 
 | 
   619  | 
}
  | 
| 
 | 
   620  | 
\end{lstlisting}
 | 
| 
 | 
   621  | 
  | 
| 
 | 
   622  | 
\noindent
  | 
| 
 | 
   623  | 
What we essentially have to do in this case is the following: compile
  | 
| 
 | 
   624  | 
the subexpressions \texttt{e1} and \texttt{e2}. They will produce some
 | 
| 
913
 | 
   625  | 
result that is stored in two temporary variables (assuming \texttt{e1} and \texttt{e2} are more
 | 
| 
908
 | 
   626  | 
complicated than just numbers). We need to use these two temporary
  | 
| 
 | 
   627  | 
variables and feed them into a new assignment of the form
  | 
| 
 | 
   628  | 
  | 
| 
 | 
   629  | 
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
 | 
| 
 | 
   630  | 
let z = op (*@$\Box_\texttt{r1}$@*) (*@$\Box_\texttt{r2}$@*) in
 | 
| 
 | 
   631  | 
...
  | 
| 
 | 
   632  | 
\end{lstlisting}
 | 
| 
701
 | 
   633  | 
  | 
| 
908
 | 
   634  | 
\noindent
  | 
| 
 | 
   635  | 
Note that this assignment has two ``holes'', one for the left
  | 
| 
 | 
   636  | 
subexpression and one for the right subexpression. We can fill both
  | 
| 
 | 
   637  | 
holes by calling the CPS function twice---this is a bit of the black
  | 
| 
 | 
   638  | 
art in CPS. We can use the second call of CPS as the continuation
  | 
| 
909
 | 
   639  | 
of the first call. The reason is that
  | 
| 
908
 | 
   640  | 
  | 
| 
 | 
   641  | 
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
 | 
| 
 | 
   642  | 
r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
  | 
| 
 | 
   643  | 
\end{lstlisting}
 | 
| 
 | 
   644  | 
  | 
| 
912
 | 
   645  | 
\noindent is a function from \code{KVal} to \code{KExp}, which we need
 | 
| 
 | 
   646  | 
as type for the continuation. Once we created the assignment with the
  | 
| 
 | 
   647  | 
fresh temporary variable \texttt{z}, we need to ``communicate'' that
 | 
| 
 | 
   648  | 
the result of the computation of the arithmetic expression is stored
  | 
| 
 | 
   649  | 
in \texttt{z} to the computations that follow. In this way we apply
 | 
| 
 | 
   650  | 
the continuation \texttt{k} with this new variable (essentially we are
 | 
| 
913
 | 
   651  | 
plugging in a hole further down the line).  Hope this makes sense!? If not,
  | 
| 
 | 
   652  | 
play with the given code yourself.
  | 
| 
908
 | 
   653  | 
  | 
| 
 | 
   654  | 
The last case we need to consider in our small expression language are
  | 
| 
912
 | 
   655  | 
function calls. For them remember each argument of the function
  | 
| 
913
 | 
   656  | 
call can in SSA-format only be a variable or a number. Here is the
  | 
| 
 | 
   657  | 
complete code for this case:
  | 
| 
908
 | 
   658  | 
  | 
| 
909
 | 
   659  | 
\begin{lstlisting}[language=Scala,numbers=left,xleftmargin=0mm]
 | 
| 
908
 | 
   660  | 
case Call(fname, args) => {
 | 
| 
 | 
   661  | 
  def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
 | 
| 
 | 
   662  | 
       case Nil => {
 | 
| 
 | 
   663  | 
         val z = Fresh("tmp")
 | 
| 
 | 
   664  | 
         KLet(z, KCall(fname, vs), k(KVar(z)))
  | 
| 
 | 
   665  | 
       }
  | 
| 
 | 
   666  | 
       case a::as => CPS(a)(r => aux(as, vs ::: List(r)))
  | 
| 
 | 
   667  | 
  }
  | 
| 
 | 
   668  | 
  aux(args, Nil)  
  | 
| 
701
 | 
   669  | 
}
  | 
| 
 | 
   670  | 
\end{lstlisting}
 | 
| 
 | 
   671  | 
  | 
| 
 | 
   672  | 
\noindent
  | 
| 
913
 | 
   673  | 
As can be seen, we introduce an auxiliary function that compiles first all
  | 
| 
908
 | 
   674  | 
function arguments---remember in our source language we can have calls
  | 
| 
 | 
   675  | 
such as $foo(3 + 4, g(3))$ where we first have to create temporary
  | 
| 
 | 
   676  | 
variables (and computations) for each argument. Therefore \texttt{aux}
 | 
| 
909
 | 
   677  | 
analyses the argument list from left to right. In case there is an
  | 
| 
 | 
   678  | 
argument \texttt{a} on the front of the list (the case \texttt{a::as}
 | 
| 
 | 
   679  | 
in Line 7), we call CPS recursively for the corresponding
  | 
| 
 | 
   680  | 
subexpression. The temporary variable containing the result for this
  | 
| 
913
 | 
   681  | 
argument, we add to the end of the K-values we have already analysed
  | 
| 
909
 | 
   682  | 
before. Again very conveniently we can use the recursive call to
  | 
| 
 | 
   683  | 
\texttt{aux} as the continuation for the computations that
 | 
| 
 | 
   684  | 
follow. When we reach the end of the argument list (the
  | 
| 
 | 
   685  | 
\texttt{Nil}-case in Lines 3--6), then we collect all the K-values
 | 
| 
 | 
   686  | 
\texttt{v1} to \texttt{vn} and call the function in the K-language
 | 
| 
 | 
   687  | 
like so
  | 
| 
908
 | 
   688  | 
  | 
| 
 | 
   689  | 
  | 
| 
 | 
   690  | 
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
 | 
| 
 | 
   691  | 
let z = call foo(v1,...,vn) in
  | 
| 
 | 
   692  | 
...
  | 
| 
 | 
   693  | 
\end{lstlisting}
 | 
| 
 | 
   694  | 
  | 
| 
 | 
   695  | 
\noindent
  | 
| 
 | 
   696  | 
Again we need to communicate the result of the function call, namely the
  | 
| 
 | 
   697  | 
fresh temporary variable \texttt{z}, to the rest of the computation. Therefore
 | 
| 
 | 
   698  | 
we apply the continuation \texttt{k} with this variable.
 | 
| 
 | 
   699  | 
  | 
| 
 | 
   700  | 
  | 
| 
 | 
   701  | 
  | 
| 
 | 
   702  | 
\begin{figure}[p]\small
 | 
| 
 | 
   703  | 
  \lstinputlisting[numbers=left,firstline=1,lastline=24]{../progs/fun/simple-cps.sc}
 | 
| 
 | 
   704  | 
  \hspace{10mm}...
 | 
| 
 | 
   705  | 
  \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
 | 
| 
 | 
   706  | 
\caption{CPS-translation for a simple expression language.\label{cps}}
 | 
| 
 | 
   707  | 
\end{figure}
 | 
| 
 | 
   708  | 
  | 
| 
912
 | 
   709  | 
The last question we need to answer in the code (see Figure~\ref{cps})
 | 
| 
 | 
   710  | 
is how we start the CPS-translation function, or more precisely with
  | 
| 
 | 
   711  | 
which continuation we should start CPS? It turns out that this initial
  | 
| 
 | 
   712  | 
continuation will be the last one that is called. What should be the
  | 
| 
 | 
   713  | 
last step in the computation? Yes, we need to return the temporary
  | 
| 
 | 
   714  | 
variable where the last result is stored (if it is a simple number,
  | 
| 
 | 
   715  | 
then we can just return this number). Therefore we call CPS with the
  | 
| 
 | 
   716  | 
code
  | 
| 
908
 | 
   717  | 
  | 
| 
 | 
   718  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   719  | 
def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
  | 
| 
 | 
   720  | 
\end{lstlisting}
 | 
| 
 | 
   721  | 
  | 
| 
 | 
   722  | 
\noindent
  | 
| 
909
 | 
   723  | 
where we give the function \code{KReturn(_)} as the continuation. With
 | 
| 
912
 | 
   724  | 
this we completed the translation of simple expressions into our
  | 
| 
 | 
   725  | 
K-language. Play around with some more expressions and see how the
  | 
| 
 | 
   726  | 
CPS-translation generates the correct code. I know this is not easy to
  | 
| 
 | 
   727  | 
follow code when you see it the first time.  Figure~\ref{absfun} gives
 | 
| 
 | 
   728  | 
the complete datatypes for the ASTs of the Fun-language and the
  | 
| 
 | 
   729  | 
K-values and K-expressions for the K-language. The complete
  | 
| 
 | 
   730  | 
CPS-translation you can find in the code.
  | 
| 
 | 
   731  | 
  | 
| 
 | 
   732  | 
\section*{Next Steps}
 | 
| 
909
 | 
   733  | 
  | 
| 
 | 
   734  | 
Having obtained a K-expression, it is relatively straightforward to
  | 
| 
913
 | 
   735  | 
generate a valid program for the LLVM-IR---remember the K-language
  | 
| 
 | 
   736  | 
already enforces the SSA convention of a linear sequence of primitive
  | 
| 
 | 
   737  | 
instructions involving only unique temporary variables. 
  | 
| 
 | 
   738  | 
We leave this step to the attentive reader.
  | 
| 
 | 
   739  | 
  | 
| 
 | 
   740  | 
What else can we do?  Well it should be relatively easy to apply some
  | 
| 
 | 
   741  | 
common optimisations to the K-expressions. One optimisations is called
  | 
| 
 | 
   742  | 
constant folding---for example if we have an expression $3 + 4$ then
  | 
| 
 | 
   743  | 
we can replace it by just $5$ instead of generating code to compute
  | 
| 
 | 
   744  | 
$5$. Now this information needs to be propagated to the next
  | 
| 
 | 
   745  | 
computation step to see whether any further constant foldings are
  | 
| 
 | 
   746  | 
possible. Another useful technique is common subexpression
  | 
| 
 | 
   747  | 
elimination, meaning if you have twice a calculation of a function
  | 
| 
 | 
   748  | 
$foo(a)$, then we want to call it only once and store the result in a
  | 
| 
 | 
   749  | 
temporary variable that can be used instead of the second, or third,
  | 
| 
 | 
   750  | 
call to $foo(a)$. Again I leave this to the attentive reader to work
  | 
| 
 | 
   751  | 
out and implement.
  | 
| 
908
 | 
   752  | 
  | 
| 
 | 
   753  | 
  | 
| 
 | 
   754  | 
\begin{figure}[p]\small
 | 
| 
 | 
   755  | 
\begin{lstlisting}[language=Scala,numbers=none]
 | 
| 
 | 
   756  | 
// Fun language (expressions)
  | 
| 
 | 
   757  | 
abstract class Exp 
  | 
| 
 | 
   758  | 
abstract class BExp 
  | 
| 
 | 
   759  | 
  | 
| 
 | 
   760  | 
case class Call(name: String, args: List[Exp]) extends Exp
  | 
| 
 | 
   761  | 
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
  | 
| 
 | 
   762  | 
case class Write(e: Exp) extends Exp
  | 
| 
 | 
   763  | 
case class Var(s: String) extends Exp
  | 
| 
 | 
   764  | 
case class Num(i: Int) extends Exp
  | 
| 
 | 
   765  | 
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
  | 
| 
 | 
   766  | 
case class Sequence(e1: Exp, e2: Exp) extends Exp
  | 
| 
 | 
   767  | 
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  
  | 
| 
 | 
   768  | 
  | 
| 
 | 
   769  | 
  | 
| 
 | 
   770  | 
// K-language (K-expressions, K-values)
  | 
| 
 | 
   771  | 
abstract class KExp
  | 
| 
 | 
   772  | 
abstract class KVal
  | 
| 
 | 
   773  | 
  | 
| 
 | 
   774  | 
case class KVar(s: String) extends KVal
  | 
| 
 | 
   775  | 
case class KNum(i: Int) extends KVal
  | 
| 
 | 
   776  | 
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
  | 
| 
 | 
   777  | 
case class KCall(o: String, vrs: List[KVal]) extends KVal
  | 
| 
 | 
   778  | 
case class KWrite(v: KVal) extends KVal
  | 
| 
 | 
   779  | 
  | 
| 
 | 
   780  | 
case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
  | 
| 
 | 
   781  | 
case class KLet(x: String, v: KVal, e: KExp) extends KExp
  | 
| 
 | 
   782  | 
case class KReturn(v: KVal) extends KExp
  | 
| 
 | 
   783  | 
\end{lstlisting}
 | 
| 
909
 | 
   784  | 
\caption{Abstract syntax trees for the Fun-language and the K-language.\label{absfun}}
 | 
| 
908
 | 
   785  | 
\end{figure}
 | 
| 
 | 
   786  | 
  | 
| 
 | 
   787  | 
  | 
| 
898
 | 
   788  | 
  | 
| 
 | 
   789  | 
\noindent
  | 
| 
539
 | 
   790  | 
\end{document}
 | 
| 
 | 
   791  | 
  | 
| 
 | 
   792  | 
  | 
| 
 | 
   793  | 
%%% Local Variables: 
  | 
| 
 | 
   794  | 
%%% mode: latex
  | 
| 
 | 
   795  | 
%%% TeX-master: t
  | 
| 
 | 
   796  | 
%%% End: 
  |