handouts/ho09.tex
author Christian Urban <christian.urban@kcl.ac.uk>
Tue, 30 May 2023 13:27:54 +0100
changeset 908 0138618eff73
parent 898 45a48c47dcca
child 909 a04efdd5e7a3
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
     1
% !TEX program = xelatex
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     2
\documentclass{article}
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     3
\usepackage{../style}
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     4
\usepackage{../langs}
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     5
\usepackage{../graphics}
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     6
\usepackage{../grammar}
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     7
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     8
\begin{document}
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
     9
\fnote{\copyright{} Christian Urban, King's College London, 2019, 2023}
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    10
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    11
722
14914b57e207 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    12
% CPS translations
14914b57e207 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    13
% https://felleisen.org/matthias/4400-s20/lecture17.html
14914b57e207 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    14
14914b57e207 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    15
%% pattern matching resources
14914b57e207 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    16
%% https://www.reddit.com/r/ProgrammingLanguages/comments/g1vno3/beginner_resources_for_compiling_pattern_matching/
14914b57e207 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    17
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
    18
\section*{Handout 9 (LLVM, SSA and CPS)}
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    19
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    20
Reflecting on our two tiny compilers targetting the JVM, the code
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    21
generation part was actually not so hard, no? Pretty much just some
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    22
post-traversal of the abstract syntax tree, yes? One of the reasons
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    23
for this ease is that the JVM is a stack-based virtual machine and it
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    24
is therefore not hard to translate deeply-nested arithmetic
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    25
expressions into a sequence of instructions manipulating the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    26
stack. The problem is that ``real'' CPUs, although supporting stack
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    27
operations, are not really designed to be \emph{stack machines}.  The
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    28
design of CPUs is more like: Here are some operations and a chunk of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    29
memory---compiler, or better compiler writers, do something with
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    30
them. Consequently, modern compilers need to go the extra mile in
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    31
order to generate code that is much easier and faster to process by
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    32
actual CPUs, rather than running code on virtual machines that
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    33
introduce an additional layer of indirection. To make this all
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    34
tractable for this module, we target the LLVM Intermediate
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    35
Language. In this way we can take advantage of the tools coming with
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    36
LLVM.\footnote{\url{http://llvm.org}} For example we do not have to
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    37
worry about things like register allocations. By using LLVM, however,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    38
we also have to pay price in the sense that generating code gets
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    39
harder\ldots{}unfortunately.
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    40
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    41
\subsection*{LLVM and LLVM-IR}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    42
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    43
\noindent LLVM is a beautiful example
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    44
that projects from Academia can make a difference in the World. LLVM
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    45
started in 2000 as a project by two researchers at the  University of
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    46
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    47
gcc with its myriad of front-ends for different programming languages (C++, Fortran,
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    48
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    49
time into a monolithic gigantic piece of m\ldots ehm complicated
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    50
software, which you
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    51
could not mess about in an afternoon. In contrast, LLVM is designed to
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    52
be a modular suite of tools with which you can play around easily and
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    53
try out something new. LLVM became a big player once Apple hired one of
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    54
the original developers (I cannot remember the reason why Apple did not
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
    55
want to use gcc, but maybe they were also just disgusted by gcc's big
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    56
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    57
or less legacy. This does not mean that programming languages like C and
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    58
C++ are dying out any time soon---they are nicely supported by LLVM.
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    59
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    60
We will target the LLVM Intermediate Language, or LLVM Intermediate
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    61
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    62
assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    63
allow us to benefit from the modular structure of the LLVM compiler
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    64
and we can let, for example, the compiler generate code for different
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    65
CPUs, say X86 or ARM.  That means we can be agnostic about where our
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    66
code is actually going to run.\footnote{Anybody want to try to run our
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    67
  programs on Android phones?}  We can also be somewhat ignorant about
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    68
optimising our code and about allocating memory efficiently: the LLVM
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    69
tools will take care of all this.
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    70
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    71
However, what we have to do in order to make LLVM to play ball is to
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    72
generate code in \emph{Static Single-Assignment} format (short SSA),
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    73
because that is what the LLVM-IR expects from us. A reason why LLVM
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    74
uses the SSA format, rather than JVM-like stack instructions, is that
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    75
stack instructions are difficult to optimise---you cannot just
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    76
re-arrange instructions without messing about with what is calculated
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    77
on the stack. Also it is hard to find out if all the calculations on
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    78
the stack are actually necessary and not by chance dead code. The JVM
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    79
has for all these obstacles sophisticated machinery to make such
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    80
``high-level'' code still run fast, but let's say that for the sake of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    81
argument we do not want to rely on it. We want to generate fast code
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    82
ourselves. This means we have to work around the intricacies of what
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    83
instructions CPUs can actually process fast. This is what the SSA
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    84
format is designed for.
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    85
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    86
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    87
The main idea behind the SSA format is to use very simple variable
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    88
assignments where every tmp-variable is assigned only once. The
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    89
assignments also need to be primitive in the sense that they can be
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    90
just simple operations like addition, multiplication, jumps,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    91
comparisons, function calls and so on.  Say, we have an expression
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    92
$((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding SSA format is
680
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
    93
 
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
    94
\begin{lstlisting}[language=LLVMIR,numbers=left]
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    95
let tmp0 = add 1 a in
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    96
let tmp1 = add 3 2 in  
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    97
let tmp2 = call foo(tmp1)  
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    98
let tmp3 = mul b 5 in 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    99
let tmp4 = add tmp2 tmp3 in 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   100
let tmp5 = add tmp0 tmp4 in
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   101
  return tmp5 
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   102
\end{lstlisting}
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   103
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   104
\noindent where every tmp-variable is used only once (we could, for
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   105
example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   106
would not change the overall result).
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   107
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   108
There are sophisticated algorithms for imperative languages, like C,
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   109
that efficiently transform a high-level program into SSA format. But
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   110
we can ignore them here. We want to compile a functional language and
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   111
there things get much more interesting than just sophisticated. We
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   112
will need to have a look at CPS translations, where the CPS stands for
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   113
Continuation-Passing-Style---basically black programming art or
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   114
abracadabra programming. So sit tight.
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   115
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   116
\subsection*{LLVM-IR}
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   117
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   118
Before we start, let's however first have a look at the \emph{LLVM Intermediate
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   119
Representation} in more detail. The LLVM-IR is in between the frontends
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   120
and backends of the LLVM framework. It allows compilation of multiple
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   121
source languages to multiple targets. It is also the place where most of
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   122
the target independent optimisations are performed. 
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   123
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   124
What is good about our toy Fun-language is that it basically only
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   125
contains expressions (be they arithmetic expressions, boolean
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   126
expressions or if-expressions). The exception are function definitions.
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   127
Luckily, for them we can use the mechanism of defining functions in the
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   128
LLVM-IR (this is similar to using JVM methods for functions in our
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   129
earlier compiler). For example the simple Fun-program 
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   130
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   131
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   132
\begin{lstlisting}[language=Scala,numbers=none]
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   133
def sqr(x) = x * x
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   134
\end{lstlisting}
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   135
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   136
\noindent
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   137
can be compiled to the following LLVM-IR function:
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   138
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   139
\begin{lstlisting}[language=LLVM]
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   140
define i32 @sqr(i32 %x) {
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   141
   %tmp = mul i32 %x, %x
677
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   142
   ret i32 %tmp
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   143
}    
decfd8cf8180 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   144
\end{lstlisting}
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   145
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   146
\noindent First notice that all ``local'' variable names, in this case
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   147
\texttt{x} and \texttt{tmp}, are prefixed with \texttt{\%} in the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   148
LLVM-IR.  Temporary variables can be named with an identifier, such as
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   149
\texttt{tmp}, or numbers. In contrast, function names, since they are
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   150
``global'', need to be prefixed with an @-symbol. Also, the LLVM-IR is
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   151
a fully typed language. The \texttt{i32} type stands for 32-bit
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   152
integers. There are also types for 64-bit integers (\texttt{i64}),
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   153
chars (\texttt{i8}), floats, arrays and even pointer types. In the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   154
code above, \texttt{sqr} takes an argument of type \texttt{i32} and
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   155
produces a result of type \texttt{i32} (the result type is in front of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   156
the function name, like in C). Each arithmetic operation, for example
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   157
addition and multiplication, are also prefixed with the type they
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   158
operate on. Obviously these types need to match up\ldots{} but since
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   159
we have in our programs only integers, for the moment \texttt{i32}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   160
everywhere will do. We do not have to generate any other types, but
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   161
obviously this is a limitation in our Fun-language (and which we
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   162
lift in the final CW).
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   163
 
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   164
There are a few interesting instructions in the LLVM-IR which are quite
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   165
different than what we have seen in the JVM. Can you remember the
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   166
kerfuffle we had to go through with boolean expressions and negating the
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   167
condition? In the LLVM-IR, branching  if-conditions are implemented
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   168
differently: there is a separate \texttt{br}-instruction as follows:
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   169
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   170
\begin{lstlisting}[language=LLVM]
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   171
br i1 %var, label %if_br, label %else_br
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   172
\end{lstlisting}
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   173
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   174
\noindent
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   175
The type \texttt{i1} stands for booleans. If the variable is true,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   176
then this instruction jumps to the if-branch, which needs an explicit
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   177
label; otherwise it jumps to the else-branch, again with its own
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   178
label. This allows us to keep the meaning of the boolean expression
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   179
``as is'' when compiling if's---thanks god no more negating of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   180
booleans.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   181
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   182
A value of type boolean is generated in the LLVM-IR by the
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   183
\texttt{icmp}-instruction. This instruction is for integers (hence the
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   184
\texttt{i}) and takes the comparison operation as argument. For example
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   185
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   186
\begin{lstlisting}[language=LLVM]
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   187
icmp eq  i32 %x, %y     ; for equal
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   188
icmp sle i32 %x, %y     ; signed less or equal
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   189
icmp slt i32 %x, %y     ; signed less than
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   190
icmp ult i32 %x, %y     ; unsigned less than 
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   191
\end{lstlisting}
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   192
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   193
\noindent
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   194
Note that in some operations the LLVM-IR distinguishes between signed and 
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   195
unsigned representations of integers. I assume you know what this means,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   196
otherwise just ask.
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   197
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   198
It is also easy to call another function in LLVM-IR: as can be 
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   199
seen from Figure~\ref{lli} we can just call a function with the 
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   200
instruction \texttt{call} and can also assign the result to 
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   201
a variable. The syntax is as follows
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   202
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   203
\begin{lstlisting}[language=LLVM]
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   204
%var = call i32 @foo(...args...)
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   205
\end{lstlisting}
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   206
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   207
\noindent
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   208
where the arguments can only be simple variables and numbers, but not compound
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   209
expressions.
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   210
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   211
Conveniently, you can use the program \texttt{lli}, which comes with
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   212
LLVM, to interpret programs written in the LLVM-IR. So you can easily
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   213
check whether the code you produced actually works. To get a running
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   214
program that does something interesting you need to add some boilerplate
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   215
about printing out numbers and a main-function that is the entry point
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   216
for the program (see Figure~\ref{lli} for a complete listing of the square function). Again
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   217
this is very similar to the boilerplate we needed to add in our JVM
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   218
compiler. 
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   219
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   220
You can generate a binary for the program in Figure~\ref{lli} by using
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   221
the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang}, whereby \texttt{llc} generates
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   222
an object file and \texttt{gcc} (that is actually \texttt{clang} on my Mac) generates the
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   223
executable binary:
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   224
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   225
\begin{lstlisting}[language=bash,numbers=none]
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   226
llc -filetype=obj sqr.ll
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   227
gcc sqr.o -o a.out
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   228
./a.out
680
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   229
> 25
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   230
\end{lstlisting}
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   231
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   232
\begin{figure}[t]\small 
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   233
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   234
\caption{An LLVM-IR program for calculating the square function. It
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   235
  calls the sqr-function in \texttt{@main} with the argument \texttt{5}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   236
  (Line 20). The
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   237
  code for the \texttt{sqr} function is in Lines 13 -- 16. It stores
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   238
  the result of sqr in the variable called \texttt{\%i} and then
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   239
  prints out the contents of this variable in Line 21. The other
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   240
  code is boilerplate for printing out integers.\label{lli}}
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   241
\end{figure}   
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   242
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   243
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   244
    
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   245
\subsection*{Our Own Intermediate Language}
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   246
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   247
Let's get back to our compiler: Remember compilers have to solve the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   248
problem of bridging the gap between ``high-level'' programs and
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   249
``low-level'' hardware. If the gap is too wide for one step, then a
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   250
good strategy is to lay a stepping stone somewhere in between. The
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   251
LLVM-IR itself is such a stepping stone to make the task of generating
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   252
and optimising code easier. Like a real compiler we will use our own
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   253
stepping stone which I call the \emph{K-language}.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   254
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   255
\begin{center}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   256
  \begin{tikzpicture}[scale=0.9,font=\bf,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   257
                      node/.style={
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   258
                      rectangle,rounded corners=3mm,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   259
                      ultra thick,draw=black!50,minimum height=16mm, 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   260
                      minimum width=20mm,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   261
                      top color=white,bottom color=black!20}]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   262
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   263
  %\node (0) at (-3,0) {};  
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   264
  \node (A) at (0.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}source}] {Fun-Language};
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   265
  \node (B) at (3.5,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}stepping stone}] {K-Language};
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   266
  \node (C) at (7.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}target}] {LLVM-IR};
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   267
 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   268
  \draw [->,line width=2.5mm] (A) -- node [above,pos=0.35] {} (B); 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   269
  \draw [->,line width=2.5mm] (B) -- node [above,pos=0.35] {} (C); 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   270
  \end{tikzpicture}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   271
  \end{center}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   272
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   273
  \noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   274
  To see why we need a stepping stone for generating code in SSA-format,
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   275
  considder again arithmetic expressions such as
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   276
  $((1 + a) + (3 + (b * 5)))$. They need to be broken up into smaller
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   277
  ``atomic'' steps, like so
680
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   278
 
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   279
\begin{lstlisting}[language=LLVMIR,numbers=none]
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   280
let tmp0 = add 1 a in   
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   281
let tmp1 = mul b 5 in 
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   282
let tmp2 = add 3 tmp1 in 
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   283
let tmp3 = add tmp0 tmp2 in
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   284
  return tmp3 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   285
\end{lstlisting}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   286
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   287
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   288
Here \texttt{tmp3} will contain the result of what the whole
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   289
expression stands for. In each individual step we can only perform an
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   290
``atomic'' or ``trival'' operation, like addition or multiplication of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   291
a number or a variable.  We are not allowed to have for example a
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   292
nested addition or an if-condition on the right-hand side of an
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   293
assignment. Such constraints are forced upon us because of how the SSA
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   294
format works in the LLVM-IR. To simplify matters we represent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   295
assignments with two kinds of syntactic entities, namely
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   296
\emph{K-values} and \emph{K-expressions}. K-values are all ``things''
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   297
that can appear on the right-hand side of an equal. Say we have
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   298
the following definition for K-values:
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   299
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   300
\begin{lstlisting}[language=Scala,numbers=none]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   301
enum KVal {
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   302
  case KVar(s: String)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   303
  case KNum(n: Int)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   304
  case KAop(op: String, v1: KVal, v2: KVal)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   305
  case KCall(fname: String, args: List[KVal])
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   306
}
680
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   307
\end{lstlisting}
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   308
eecc4d5a2172 updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   309
\noindent
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   310
where a K-value can be a variable, a number or a ``trivial'' binary
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   311
operation, such as addition or multiplication. The two arguments of a
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   312
binary operation need to be K-values as well. Finally, we have
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   313
function calls, but again each argument of the function call needs to
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   314
be a K-value.  One point we need to be careful, however, is that the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   315
arguments of the binary operations and function calls are in fact only
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   316
variables or numbers. To encode this constraint into the type-system
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   317
of Scala would make things too complicated---to satisfy this
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   318
constraint is therefore on us. For our Fun-language, we will later on
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   319
consider some further K-values.
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   320
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   321
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   322
Our K-expressions will encode the \texttt{let} and the \texttt{return}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   323
from the SSA-format (again for the moment we only consider these two
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   324
constructors---later on we will have if-branches as well).
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   325
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   326
\begin{lstlisting}[language=Scala,numbers=none]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   327
enum KExp {
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   328
  case KLet(x: String, v: KVal, e: KExp)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   329
  case KReturn(v: KVal)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   330
}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   331
\end{lstlisting}
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   332
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   333
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   334
By having in \texttt{KLet} taking first a string (standing for an
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   335
intermediate variable) and second a value, we can fulfil the SSA
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   336
constraint in assignments ``by construction''---there is no way we
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   337
could write anything else than a K-value.  Note that the third
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   338
argument of a \texttt{KLet} is again a K-expression, meaning either
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   339
another \texttt{KLet} or a \texttt{KReturn}. In this way we can
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   340
construct a sequence of computations and return a final result.  Of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   341
course we also have to ensure that all intermediary computations are
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   342
given (fresh) names---we will use an (ugly) counter for this.
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   343
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   344
To sum up, K-values are the atomic operations that can be on the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   345
right-hand side of assignemnts. The K-language is restricted such that
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   346
it is easy to generate the SSA format for the LLVM-IR. What remains to
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   347
be done is a translation of our Fun-language into the K-language. The
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   348
main difficulty is that we need to order the computationa---in teh
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   349
K-language we only have linear sequence of instructions to need to be
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   350
followed. Before we explain this, we have a look at some
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   351
CPS-translation.
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   352
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   353
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   354
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   355
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   356
\subsection*{CPS-Translations}
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   357
704
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   358
CPS stands for Continuation-Passing-Style. It is a kind of programming
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   359
technique often used in advanced functional programming. Before we delve
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   360
into the CPS-translation for our Fun-language, let us look at 
704
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   361
CPS-versions of some well-known functions. Consider
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   362
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   363
\begin{lstlisting}[language=Scala, numbers=none]
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   364
def fact(n: Int) : Int = 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   365
  if (n == 0) 1 else n * fact(n - 1) 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   366
\end{lstlisting}
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   367
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   368
\noindent 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   369
This is clearly the usual factorial function. But now consider the
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   370
following version of the factorial function:
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   371
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   372
\begin{lstlisting}[language=Scala, numbers=none]
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   373
def factC(n: Int, ret: Int => Int) : Int = 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   374
  if (n == 0) ret(1) 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   375
  else factC(n - 1, x => ret(n * x)) 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   376
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   377
factC(3, identity)
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   378
\end{lstlisting}
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   379
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   380
\noindent
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   381
This function is called with the number, in this case 3, and the
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   382
identity-function (which returns just its input). The recursive
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   383
calls are:
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   384
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   385
\begin{lstlisting}[language=Scala, numbers=none]
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   386
factC(2, x => identity(3 * x))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   387
factC(1, x => identity(3 * (2 * x)))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   388
factC(0, x => identity(3 * (2 * (1 * x))))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   389
\end{lstlisting}
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   390
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   391
\noindent
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   392
Having reached 0, we get out of the recursion and apply 1 to the
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   393
continuation (see if-branch above). This gives
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   394
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   395
\begin{lstlisting}[language=Scala, numbers=none]
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   396
identity(3 * (2 * (1 * 1)))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   397
= 3 * (2 * (1 * 1))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   398
= 6
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   399
\end{lstlisting}
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   400
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   401
\noindent
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   402
which is the expected result. If this looks somewhat familiar to you,
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   403
than this is because functions with continuations can be
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   404
seen as a kind of generalisation of tail-recursive functions.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   405
So far we did not look at this generalisation in earnest.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   406
Anyway
704
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   407
notice how the continuations is ``stacked up'' during the recursion and
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   408
then ``unrolled'' when we apply 1 to the continuation. Interestingly, we
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   409
can do something similar to the Fibonacci function where in the traditional
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   410
version we have two recursive calls. Consider the following function
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   411
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   412
\begin{lstlisting}[language=Scala, numbers=none]
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   413
def fibC(n: Int, ret: Int => Int) : Int = 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   414
  if (n == 0 || n == 1) ret(1) 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   415
  else fibC(n - 1,
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   416
             r1 => fibC(n - 2,
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   417
               r2 => ret(r1 + r2)))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   418
\end{lstlisting}
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   419
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   420
\noindent
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   421
Here the continuation is a nested function essentially wrapping up 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   422
the second recursive call. Let us check how the recursion unfolds
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   423
when called with 3 and the identity function:
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   424
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   425
\begin{lstlisting}[language=Scala, numbers=none]
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   426
fibC(3, id)
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   427
fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   428
fibC(1, r1 => 
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   429
   fibC(0, r2 => fibC(1, r2a => id((r1 + r2) + r2a))))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   430
fibC(0, r2 => fibC(1, r2a => id((1 + r2) + r2a)))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   431
fibC(1, r2a => id((1 + 1) + r2a))
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   432
id((1 + 1) + 1)
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   433
(1 + 1) + 1
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   434
3
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   435
\end{lstlisting}
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   436
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   437
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   438
The point of this section is that you are playing around with these
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   439
simple definitions and make sure they calculate the expected result.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   440
In the next step, you should understand \emph{how} these functions
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   441
calculate the result with the continuations. 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   442
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   443
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   444
\section*{Worked Example}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   445
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   446
Let us now come back to the CPS-translations for the Fun-language. The
704
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   447
main difficulty of generating instructions in SSA format is that large
6d9c960a2b26 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   448
compound expressions need to be broken up into smaller pieces and
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   449
intermediate results need to be chained into later instructions. To do
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   450
this conveniently, we use the CPS-translation mechanism. What continuations
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   451
essentially solve is the following problem: Given an expressions
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   452
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   453
\begin{equation}\label{exp}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   454
(1 + 2) * (3 + 4)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   455
\end{equation}  
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   456
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   457
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   458
which of the subexpressions should be calculated first? We just arbitrarily
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   459
going to decide that it will be from left to right. This means we have
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   460
to tear the expression shown in \eqref{exp} apart as follows:
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   461
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   462
\[
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   463
(1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   464
\]  
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   465
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   466
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   467
The first one will give us a result, but the second one is not
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   468
really an expression that makes sense. It will only make sense
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   469
as the next step in the computation when we fill-in the result of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   470
$1+2$ into the ``hole'' indicated by $\Box$. Such incomplete
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   471
expressions can be represented in Scala as functions
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   472
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   473
\begin{lstlisting}[language=Scala, numbers=none]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   474
r => r * (3 + 4)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   475
\end{lstlisting}  
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   476
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   477
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   478
where \texttt{r} is any result that has been computed earlier. By the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   479
way in lambda-calculus notation we would write
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   480
$\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
700
52263ffd17b9 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   481
functions (``continuations'') to represent what is coming next in a
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   482
sequence of instructions. In our case, continuations are functions of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   483
type \code{KVal} to \code{KExp}. They can be seen as a sequence of
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   484
\code{KLet}s where there is a ``hole'' that needs to be
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   485
filled. Consider for example
678
ff3b48da282c updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   486
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   487
\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   488
let tmp0 = add 1 a in   
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   489
let tmp1 = mul (*@$\Box$@*) 5 in 
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   490
let tmp2 = add 3 tmp1 in 
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   491
let tmp3 = add tmp0 tmp2 in
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   492
  retrun tmp3 
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   493
\end{lstlisting}
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   494
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   495
\noindent
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   496
where in the second line is a $\Box$ which still expects a \code{KVal}
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   497
to be filled in before it becomes a ``proper'' \code{KExp}. When we 
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   498
apply an argument to the continuation (remember they are functions)
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   499
we essentially fill something into the corresponding hole.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   500
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   501
Lets look at concrete code. To simplify matters first, 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   502
suppose our source language consists just of three kinds of expressions
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   503
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   504
\begin{lstlisting}[language=Scala,numbers=none]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   505
enum Expr {
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   506
    case Num(n: Int)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   507
    case Bop(op: String, e1: Expr, e2: Expr)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   508
    case Call(fname: String, args: List[Expr])
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   509
}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   510
\end{lstlisting}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   511
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   512
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   513
The code of the CPS-translation is then of the form
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   514
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   515
\begin{lstlisting}[language=Scala,numbers=none]
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   516
def CPS(e: Exp)(k: KVal => KExp) : KExp = 
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   517
  e match { ... }
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   518
\end{lstlisting}
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   519
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   520
\noindent 
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   521
where \code{k} is the continuation and \code{e} is the expression to
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   522
be compiled. In case we have numbers, we can just pass them to the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   523
continuations because numbers need not be further teared apart as they
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   524
are already atomic. Passing the number to the continuation means we 
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   525
apply the continuation like 
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   526
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   527
\begin{lstlisting}[language=Scala,numbers=none]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   528
  case Num(i) => k(KNum(i))
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   529
\end{lstlisting}
679
8fc109f36b78 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   530
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   531
\noindent This would just fill in the $\Box$ in a \code{KLet}-expression.
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   532
There is no need to create a temporary variable for simple numbers.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   533
More interesting is the case for arithmetic operations.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   534
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   535
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   536
case Aop(op, e1, e2) => {
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   537
  val z = Fresh("tmp")
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   538
  CPS(e1)(r1 => 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   539
    CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))))
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   540
}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   541
\end{lstlisting}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   542
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   543
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   544
What we essentially have to do in this case is the following: compile
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   545
the subexpressions \texttt{e1} and \texttt{e2}. They will produce some
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   546
result that is stored in two temporary variables (assuming they are more
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   547
complicated than just numbers). We need to use these two temporary
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   548
variables and feed them into a new assignment of the form
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   549
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   550
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   551
let z = op (*@$\Box_\texttt{r1}$@*) (*@$\Box_\texttt{r2}$@*) in
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   552
...
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   553
\end{lstlisting}
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   554
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   555
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   556
Note that this assignment has two ``holes'', one for the left
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   557
subexpression and one for the right subexpression. We can fill both
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   558
holes by calling the CPS function twice---this is a bit of the black
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   559
art in CPS. We can use the second call of CPS as the continuation
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   560
of the first call: The reason is that
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   561
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   562
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   563
r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   564
\end{lstlisting}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   565
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   566
\noindent is a function from \code{KVal} to \code{KExp}. Once
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   567
we created the assignment with the fresh temporary variable
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   568
\texttt{z}, we need to ``communicate'' that the result of the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   569
computation of the arithmetic expression is stored in \texttt{z} to the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   570
computations that follow. In this way we apply the continuation \texttt{k} with this
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   571
new variable (essentially we are plugging in a hole further down the line).
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   572
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   573
The last case we need to consider in our small expression language are
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   574
function calls.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   575
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   576
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   577
case Call(fname, args) => {
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   578
  def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   579
       case Nil => {
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   580
         val z = Fresh("tmp")
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   581
         KLet(z, KCall(fname, vs), k(KVar(z)))
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   582
       }
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   583
       case a::as => CPS(a)(r => aux(as, vs ::: List(r)))
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   584
  }
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   585
  aux(args, Nil)  
701
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   586
}
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   587
\end{lstlisting}
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   588
681c36b2af27 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   589
\noindent
908
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   590
For them we introduce an auxiliary function that compiles first all
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   591
function arguments---remember in our source language we can have calls
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   592
such as $foo(3 + 4, g(3))$ where we first have to create temporary
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   593
variables (and computations) for each argument. Therefore \texttt{aux}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   594
analyses the arguments from left to right. In case there is an argument
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   595
\texttt{a} on the front of the list (the case \texttt{a::as}), we call
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   596
CPS recursively for the corresponding subexpression. The temporary variable
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   597
containing the result for this argument we add to the end of the K-values we
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   598
have already analysed before. Again very conveniently we can use the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   599
recursive call to \texttt{aux} as the continuation for the computations
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   600
that follow. If we reach the end of the argument list (the \texttt{Nil}-case),
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   601
then we collect all the K-values \texttt{v1} to \texttt{vn} and call the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   602
function in the K-language like so
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   603
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   604
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   605
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   606
let z = call foo(v1,...,vn) in
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   607
...
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   608
\end{lstlisting}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   609
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   610
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   611
Again we need to communicate the result of the function call, namely the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   612
fresh temporary variable \texttt{z}, to the rest of the computation. Therefore
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   613
we apply the continuation \texttt{k} with this variable.
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   614
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   615
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   616
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   617
\begin{figure}[p]\small
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   618
  \lstinputlisting[numbers=left,firstline=1,lastline=24]{../progs/fun/simple-cps.sc}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   619
  \hspace{10mm}...
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   620
  \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   621
\caption{CPS-translation for a simple expression language.\label{cps}}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   622
\end{figure}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   623
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   624
The last question we need to answer in the code (see Figure~\ref{cps}) is how we
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   625
start the CPS-translation function, or more precisely with which continuation we
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   626
should start CPS? It turns out that this initial continuation will be the last one that is
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   627
called. What should be the last step in the computation? Yes, we need to return the
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   628
temporary variable where the last result is stored (if it is a simple number, then we can
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   629
just return this number). Therefore we cal CPS with the code
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   630
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   631
\begin{lstlisting}[language=Scala,numbers=none]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   632
def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   633
\end{lstlisting}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   634
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   635
\noindent
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   636
where we give the function \code{KReturn(_)}. 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   637
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   638
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   639
\begin{figure}[p]\small
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   640
\begin{lstlisting}[language=Scala,numbers=none]
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   641
// Fun language (expressions)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   642
abstract class Exp 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   643
abstract class BExp 
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   644
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   645
case class Call(name: String, args: List[Exp]) extends Exp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   646
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   647
case class Write(e: Exp) extends Exp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   648
case class Var(s: String) extends Exp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   649
case class Num(i: Int) extends Exp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   650
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   651
case class Sequence(e1: Exp, e2: Exp) extends Exp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   652
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   653
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   654
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   655
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   656
// K-language (K-expressions, K-values)
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   657
abstract class KExp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   658
abstract class KVal
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   659
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   660
case class KVar(s: String) extends KVal
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   661
case class KNum(i: Int) extends KVal
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   662
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   663
case class KCall(o: String, vrs: List[KVal]) extends KVal
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   664
case class KWrite(v: KVal) extends KVal
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   665
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   666
case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   667
case class KLet(x: String, v: KVal, e: KExp) extends KExp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   668
case class KReturn(v: KVal) extends KExp
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   669
\end{lstlisting}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   670
\caption{Abstract syntax trees for the Fun-language.\label{absfun}}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   671
\end{figure}
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   672
0138618eff73 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   673
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   674
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   675
\noindent
539
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   676
\end{document}
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   677
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   678
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   679
%%% Local Variables: 
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   680
%%% mode: latex
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   681
%%% TeX-master: t
ed8f014217be updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   682
%%% End: