handouts/ho09.tex
author Christian Urban <christian.urban@kcl.ac.uk>
Mon, 30 Oct 2023 18:46:27 +0000
changeset 950 285da21f44c0
parent 917 d3d371ae5fab
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
     1
% !TEX program = xelatex
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     2
\documentclass{article}
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     3
\usepackage{../style}
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     4
\usepackage{../langs}
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     5
\usepackage{../graphics}
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     6
\usepackage{../grammar}
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     7
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     8
\begin{document}
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
     9
\fnote{\copyright{} Christian Urban, King's College London, 2019, 2023}
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    10
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    11
722
7c09b7eadc6b updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    12
% CPS translations
7c09b7eadc6b updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    13
% https://felleisen.org/matthias/4400-s20/lecture17.html
7c09b7eadc6b updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    14
7c09b7eadc6b updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    15
%% pattern matching resources
7c09b7eadc6b updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    16
%% https://www.reddit.com/r/ProgrammingLanguages/comments/g1vno3/beginner_resources_for_compiling_pattern_matching/
7c09b7eadc6b updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 704
diff changeset
    17
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
    18
\section*{Handout 9 (LLVM, SSA and CPS)}
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    19
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    20
Reflecting on our two tiny compilers targetting the JVM, the code
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    21
generation part was actually not so hard, no? Pretty much just some
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    22
post-traversal of the abstract syntax tree. Yes? One of the reasons
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    23
for this ease is that the JVM is a stack-based virtual machine and it
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    24
is therefore not hard to translate deeply-nested arithmetic
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    25
expressions into a sequence of instructions manipulating the
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    26
stack. That is pretty much the whole point of the JVM. The problem is
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    27
that ``real'' CPUs, although supporting stack operations, are not
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    28
really designed to be \emph{stack machines}.  The design of CPUs is
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    29
more like: Here are some instructions and a chunk of
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    30
memory---compiler, or better compiler writers, do something with
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    31
them. Consequently, modern compilers need to go the extra mile in
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    32
order to generate code that is much easier and faster to process by
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    33
actual CPUs, rather than running code on virtual machines that
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    34
introduce an additional layer of indirection. To make this all
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    35
tractable for this module, we target the \emph{LLVM Intermediate
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    36
  Language} (LLVM-IR). In this way we can take advantage of the tools
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    37
coming with LLVM.\footnote{\url{http://llvm.org}} For example we do
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    38
not have to worry about things like register allocations or peephole
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    39
optimisations. By using the LLVM-IR, however, we also have to pay
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    40
price in the sense that generating code gets
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    41
harder\ldots{}unfor\-tunately nothing comes for free in life.
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    42
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
    43
\subsection*{LLVM and the LLVM-IR}
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    44
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    45
\noindent LLVM is a beautiful example
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    46
that projects from Academia can make a difference in the Real World. LLVM
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    47
started in 2000 as a project by two researchers at the  University of
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    48
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    49
gcc with its myriad of front-ends for different programming languages (C++, Fortran,
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    50
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    51
time into a monolithic gigantic piece of m\ldots ehm complicated
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    52
software, which you
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    53
could not mess about in an afternoon. In contrast, LLVM is designed to
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    54
be a modular suite of tools with which you can play around easily and
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    55
try out something new. LLVM became a big player once Apple hired one of
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    56
the original developers (I cannot remember the reason why Apple did not
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
    57
want to use gcc, but maybe they were also just disgusted by gcc's big
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    58
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    59
or less legacy. This does not mean that programming languages like C and
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
    60
C++ are dying out any time soon---they are nicely supported by LLVM.
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    61
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    62
We will target the LLVM Intermediate Language, or LLVM Intermediate
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
    63
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    64
assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    65
allow us to benefit from the modular structure of the LLVM compiler
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
    66
and we can let the compiler generate code for different CPUs, for
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
    67
example X86 or ARM.  That means we can be agnostic about where our
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    68
code is actually going to run.\footnote{Anybody want to try to run our
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    69
  programs on Android phones?}  We can also be somewhat ignorant about
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
    70
optimising our code and about allocating memory efficiently---the LLVM
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    71
tools will take care of all this.
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    72
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
    73
However, what we have to do in order to make LLVM to play ball is to
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
    74
generate code in \emph{Static Single-Assignment} format (short SSA). A
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
    75
reason why LLVM uses the SSA-format, rather than JVM-like stack
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
    76
instructions, is that stack instructions are difficult to
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
    77
optimise---you cannot just re-arrange instructions without messing
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    78
about with what is calculated on the stack. Have a look at the
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    79
expression $((a + b) * 4) - (3 * (a + b))$ and the corresponding JVM
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    80
instructions:
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    81
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    82
\begin{lstlisting}[language=JVMIS, numbers=none,mathescape]
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    83
iload a
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    84
iload b
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    85
iadd
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    86
ldc 4
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    87
imul
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    88
ldc 3
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    89
iload a
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    90
iload b
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    91
iadd
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    92
imul
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    93
isub
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    94
\end{lstlisting}
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    95
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    96
\noindent
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    97
and try to reorganise the code such that you calculate the expression
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    98
$(a + b)$ only once. This requires either quite a bit of
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
    99
math-understanding from the compiler or you need to add ``copy-and-fetch''
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   100
of a result from a local variable.  Also it is hard to find out if all
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   101
the calculations on the stack are actually necessary and not by chance
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   102
dead code. The JVM has for all these obstacles sophisticated machinery
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   103
to make such ``high-level'' code still run fast, but let's say that
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   104
for the sake of argument we do not want to rely on it. We want to
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   105
generate fast code ourselves. This means we have to work around the
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   106
intricacies of what instructions CPUs can actually process fast. This
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   107
is what the SSA format is designed for.
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   108
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   109
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   110
The main idea behind the SSA-format is to have sequences of very
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   111
simple variable assignments where every (tmp)variable is assigned only
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   112
once. The assignments need to be simple in the sense that they can be
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   113
just be primitive operations like addition, multiplication, jumps,
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   114
comparisons, function calls and so on.  Say, we have an expression
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   115
$((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding sequence
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   116
of assignments in SSA-format are:
680
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   117
 
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   118
\begin{lstlisting}[language=LLVMIR,numbers=left]
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   119
let tmp0 = add 1 a in
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   120
let tmp1 = add 3 2 in  
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   121
let tmp2 = call foo(tmp1)  
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   122
let tmp3 = mul b 5 in 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   123
let tmp4 = add tmp2 tmp3 in 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   124
let tmp5 = add tmp0 tmp4 in
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   125
  return tmp5 
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   126
\end{lstlisting}
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   127
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   128
\noindent where every tmpX-variable is used only once (we could, for
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   129
example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   130
this would not change the overall result). At the end we have a
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   131
return-instruction wich contains the final result of the
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   132
expression. As can be seen the task we have to solve for generating
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   133
SSA-code is to take apart compound expressions into its most basic
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   134
''particles'' and transfrom them into a sequence of simple assignments
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   135
that calculates the desired result. Note that this means we have to
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   136
fix the order in which the expression is calculated, like from the
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   137
left to right.
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   138
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   139
There are sophisticated algorithms for imperative languages, like C,
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   140
that efficiently transform high-level programs into SSA-format. But
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   141
we can ignore them here. We want to compile a functional language and
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   142
there things get much more interesting than just sophisticated. We
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   143
will need to have a look at CPS translations, where the CPS stands for
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   144
\emph{Continuation-Passing-Style}---basically black programming art or
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   145
abracadabra programming. So sit tight.
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   146
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   147
\subsection*{LLVM-IR}
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   148
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   149
Before we start, let's however first have a look at the \emph{LLVM Intermediate
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   150
Representation} in more detail. The LLVM-IR is in between the frontends
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   151
and backends of the LLVM framework. It allows compilation of multiple
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   152
source languages to multiple targets. It is also the place where most of
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   153
the target independent optimisations are performed. 
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   154
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   155
What is good about our toy Fun-language is that it basically only
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   156
contains expressions (be they arithmetic expressions, boolean
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   157
expressions or if-expressions). The exception are function definitions.
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   158
Luckily, for them we can use the mechanism of defining functions in the
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   159
LLVM-IR (this is similar to using JVM methods for functions in our
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   160
earlier compiler). For example the simple Fun-program 
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   161
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   162
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   163
\begin{lstlisting}[language=Scala,numbers=none]
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   164
def sqr(x) = x * x
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   165
\end{lstlisting}
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   166
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   167
\noindent
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   168
can be compiled to the following LLVM-IR function:
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   169
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   170
\begin{lstlisting}[language=LLVM]
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   171
define i32 @sqr(i32 %x) {
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   172
   %tmp = mul i32 %x, %x
677
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   173
   ret i32 %tmp
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   174
}    
3787d4fae375 updated
Christian Urban <urbanc@in.tum.de>
parents: 539
diff changeset
   175
\end{lstlisting}
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   176
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   177
\noindent First notice that all ``local'' variable names, in this case
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   178
\texttt{x} and \texttt{tmp}, are prefixed with \texttt{\%} in the
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   179
LLVM-IR.  Temporary variables can be named with an identifier, such as
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   180
\texttt{tmp}, or numbers. In contrast, function names, since they are
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   181
``global'', need to be prefixed with an @-symbol. Also, the LLVM-IR is
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   182
a fully typed language. The \texttt{i32} type stands for 32-bit
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   183
integers. There are also types for 64-bit integers (\texttt{i64}),
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   184
chars (\texttt{i8}), floats, arrays and even pointer types. In the
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   185
code above, \texttt{sqr} takes an argument of type \texttt{i32} and
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   186
produces a result of type \texttt{i32} (the result type is in front of
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   187
the function name, like in C). Each arithmetic operation, for example
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   188
addition and multiplication, are also prefixed with the type they
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   189
operate on. Obviously these types need to match up\ldots{} but since
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   190
we have in our programs only integers, for the moment \texttt{i32}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   191
everywhere will do. We do not have to generate any other types, but
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   192
obviously this is a limitation in our Fun-language (and which we
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   193
are going to lift in the final CW).
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   194
 
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   195
There are a few interesting instructions in the LLVM-IR which are quite
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   196
different than what we have seen in the JVM. Can you remember the
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   197
kerfuffle we had to go through with boolean expressions and negating the
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   198
condition? In the LLVM-IR, branching  if-conditions are implemented
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   199
differently: there is a separate \texttt{br}-instruction as follows:
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   200
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   201
\begin{lstlisting}[language=LLVM]
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   202
br i1 %var, label %if_br, label %else_br
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   203
\end{lstlisting}
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   204
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   205
\noindent
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   206
The type \texttt{i1} stands for booleans. If the variable is true,
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   207
then this instruction jumps to the if-branch, which needs an explicit
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   208
label; otherwise it jumps to the else-branch, again with its own
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   209
label. This allows us to keep the meaning of the boolean expression
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   210
``as is'' when compiling if's---thanks god no more negating of
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   211
booleans.
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   212
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   213
A value of type boolean is generated in the LLVM-IR by the
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   214
\texttt{icmp}-instruction. This instruction is for integers (hence the
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   215
\texttt{i}) and takes the comparison operation as argument. For example
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   216
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   217
\begin{lstlisting}[language=LLVM]
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   218
icmp eq  i32 %x, %y     ; for equal
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   219
icmp sle i32 %x, %y     ; signed less or equal
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   220
icmp slt i32 %x, %y     ; signed less than
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   221
icmp ult i32 %x, %y     ; unsigned less than 
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   222
\end{lstlisting}
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   223
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   224
\noindent
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   225
Note that in some operations the LLVM-IR distinguishes between signed and 
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   226
unsigned representations of integers. I assume you know what this means,
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   227
otherwise just ask.
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   228
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   229
It is also easy to call another function in LLVM-IR: as can be 
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   230
seen from Figure~\ref{lli} we can just call a function with the 
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   231
instruction \texttt{call} and can also assign the result to 
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   232
a variable. The syntax is as follows
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   233
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   234
\begin{lstlisting}[language=LLVM]
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   235
%var = call i32 @foo(...args...)
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   236
\end{lstlisting}
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   237
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   238
\noindent
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   239
where the arguments can only be simple variables and numbers, but not compound
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   240
expressions.
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   241
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   242
Conveniently, you can use the program \texttt{lli}, which comes with
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   243
LLVM, to interpret programs written in the LLVM-IR. Type on the command line
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   244
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   245
\begin{lstlisting}[language=bash,numbers=none]
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   246
lli sqr.ll
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   247
\end{lstlisting}
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   248
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   249
\noindent
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   250
and you can see the output of the pragram generated. 
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   251
This means you can easily check whether the code you produced actually
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   252
works. To get a running program that does something interesting you
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   253
need to add some boilerplate about printing out numbers and a
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   254
main-function that is the entry point for the program (see
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   255
Figure~\ref{lli} for a complete listing of the square function). Again
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   256
this is very similar to the boilerplate we needed to add in our JVM
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   257
compiler.
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   258
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   259
You can generate a binary for the program in Figure~\ref{lli} by using
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   260
the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang}, whereby \texttt{llc} generates
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   261
an object file and \texttt{gcc} (that is actually \texttt{clang} on my Mac) generates the
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   262
executable binary:
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   263
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   264
\begin{lstlisting}[language=bash,numbers=none]
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   265
llc -filetype=obj sqr.ll
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   266
gcc sqr.o -o a.out
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   267
./a.out
680
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   268
> 25
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   269
\end{lstlisting}
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   270
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   271
\begin{figure}[t]\small 
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   272
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   273
\caption{An LLVM-IR program for calculating the square function. It
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   274
  calls the \texttt{sqr}-function in \texttt{@main} with the argument \texttt{5}
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   275
  (Line 20). The
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   276
  code for the \texttt{sqr}-function is in Lines 13 -- 16. The main-function
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   277
  stores
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   278
  the result of sqr in the variable called \texttt{\%1} and then
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   279
  prints out the contents of this variable in Line 21. The other
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   280
  code is boilerplate for printing out integers.\label{lli}}
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   281
\end{figure}   
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   282
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   283
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   284
    
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   285
\subsection*{Our Own Intermediate Language}
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   286
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   287
Let's get back to our compiler: Remember compilers have to solve the
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   288
problem of bridging the gap between ``high-level'' programs and
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   289
``low-level'' hardware. If the gap is too wide for one step, then a
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   290
good strategy is to lay a stepping stone somewhere in between. The
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   291
LLVM-IR itself is such a stepping stone to make the task of generating
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   292
and optimising code easier. Like a real compiler we will use our own
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   293
stepping stone which I call the \emph{K-language}.
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   294
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   295
\begin{center}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   296
  \begin{tikzpicture}[scale=0.9,font=\bf,
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   297
                      node/.style={
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   298
                      rectangle,rounded corners=3mm,
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   299
                      ultra thick,draw=black!50,minimum height=16mm, 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   300
                      minimum width=20mm,
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   301
                      top color=white,bottom color=black!20}]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   302
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   303
  %\node (0) at (-3,0) {};  
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   304
  \node (A) at (0.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}source}] {Fun-Language};
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   305
  \node (B) at (3.5,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}stepping stone}] {K-Language};
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   306
  \node (C) at (7.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}target}] {LLVM-IR};
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   307
 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   308
  \draw [->,line width=2.5mm] (A) -- node [above,pos=0.35] {} (B); 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   309
  \draw [->,line width=2.5mm] (B) -- node [above,pos=0.35] {} (C); 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   310
  \end{tikzpicture}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   311
  \end{center}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   312
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   313
  \noindent
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   314
  To see why we need a stepping stone for generating code in SSA-format,
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   315
  considder again arithmetic expressions such as
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   316
  $((1 + a) + (3 + (b * 5)))$. They need to be broken up into smaller
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   317
  ``atomic'' steps, like so
680
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   318
 
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   319
\begin{lstlisting}[language=LLVMIR,numbers=none]
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   320
let tmp0 = add 1 a in   
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   321
let tmp1 = mul b 5 in 
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   322
let tmp2 = add 3 tmp1 in 
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   323
let tmp3 = add tmp0 tmp2 in
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   324
  return tmp3 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   325
\end{lstlisting}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   326
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   327
\noindent
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   328
Here \texttt{tmp3} will contain the result of what the whole
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   329
expression stands for. In each individual step we can only perform an
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   330
``atomic'' or ``trival'' operation, like addition or multiplication of
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   331
a number or a variable.  We are not allowed to have for example a
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   332
nested addition or an if-condition on the right-hand side of an
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   333
assignment. Such constraints are forced upon us because of how the
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   334
SSA-format works in the LLVM-IR. To simplify matters we represent
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   335
assignments with two kinds of syntactic entities, namely
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   336
\emph{K-values} and \emph{K-expressions}. K-values are all ``things''
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   337
that can appear on the right-hand side of an equal. Say we have the
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   338
following definition for K-values:
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   339
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   340
\begin{lstlisting}[language=Scala,numbers=none]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   341
enum KVal {
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   342
  case KVar(s: String)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   343
  case KNum(n: Int)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   344
  case KAop(op: String, v1: KVal, v2: KVal)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   345
  case KCall(fname: String, args: List[KVal])
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   346
}
680
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   347
\end{lstlisting}
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   348
242c1f0e60df updated
Christian Urban <urbanc@in.tum.de>
parents: 679
diff changeset
   349
\noindent
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   350
where a K-value can be a variable, a number or a ``trivial'' binary
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   351
operation, such as addition or multiplication. The two arguments of a
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   352
binary operation need to be K-values as well. Finally, we have
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   353
function calls, but again each argument of the function call needs to
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   354
be a K-value.  One constraint we need to be careful about is that the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   355
arguments of the binary operations and function calls can only be
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   356
variables or numbers. For example
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   357
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   358
\begin{lstlisting}[language=Scala,numbers=none]
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   359
KAop("+", KAop("*", KNum(1), KNum(2)), KNum(3))
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   360
KCall("foo", List(KAop("+", KNum(4), KNum(5)))  
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   361
\end{lstlisting}
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   362
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   363
\noindent
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   364
while perfect instances of K-values according to the types, are not
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   365
allowed in the LLVM-IR. To encode this constraint into the type-system
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   366
of Scala, however, would make things overly complicated---to satisfy
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   367
this constraint is therefore on us. For the moment this will be all
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   368
K-values we are considdering. Later on, we will have some more for our
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   369
Fun-language.
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   370
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   371
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   372
Our K-expressions will encode the \texttt{let} and the \texttt{return}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   373
from the SSA-format (again for the moment we only consider these two
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   374
constructors---later on we will have if-branches as well).
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   375
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   376
\begin{lstlisting}[language=Scala,numbers=none]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   377
enum KExp {
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   378
  case KLet(x: String, v: KVal, e: KExp)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   379
  case KReturn(v: KVal)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   380
}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   381
\end{lstlisting}
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   382
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   383
\noindent
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   384
By having \texttt{KLet} taking first a string (standing for a
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   385
tmp-variable) and second a value, we can fulfil the SSA constraint in
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   386
assignments ``by con\-struction''---there is no way we could write
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   387
anything else than a K-value.  Note that the third argument of a
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   388
\texttt{KLet} is again a K-expression, meaning either another
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   389
\texttt{KLet} or a \texttt{KReturn}. In this way we can construct a
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   390
sequence of computations and indicate what the final result of the
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   391
computations is.  According to the SSA-format, we also have to ensure
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   392
that all intermediary computations are given (fresh) names---we will
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   393
use an (ugly) counter for this.
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   394
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   395
To sum up, K-values are the atomic operations that can be on the
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   396
right-hand side of assignemnts. The K-language is restricted such that
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   397
it is easy to generate the SSA-format for the LLVM-IR. What remains to
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   398
be done is a translation of our Fun-language into the K-language. The
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   399
main difficulty is that we need to order the computation---in the
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   400
K-language we only have linear sequences of instructions. Before we
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   401
explain this, we have a look at some programs in CPS-style.
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   402
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   403
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   404
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   405
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   406
\subsection*{CPS-Translations}
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   407
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   408
CPS stands for \emph{Continuation-Passing-Style}. It is a kind of
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   409
programming technique often used in advanced functional programming
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   410
and in particular in compilers. Before we delve into the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   411
CPS-translation for our Fun-language compiler, let us look at
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   412
CPS-versions of some well-known functions. Consider
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   413
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   414
\begin{lstlisting}[language=Scala, numbers=none]
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   415
def fact(n: Int) : Int = 
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   416
  if (n == 0) 1 else n * fact(n - 1) 
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   417
\end{lstlisting}
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   418
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   419
\noindent 
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   420
This is clearly the usual factorial function. But now consider the
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   421
following slight variant of the factorial function:
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   422
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   423
\begin{lstlisting}[language=Scala, numbers=left]
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   424
def factC(n: Int, k: Int => Int) : Int = 
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   425
  if (n == 0) k(1) 
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   426
  else factC(n - 1, x => k(n * x)) 
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   427
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   428
factC(3, id)
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   429
\end{lstlisting}
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   430
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   431
\noindent
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   432
The function is is Lines 1--3. The function call is in Line 5.  We
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   433
call this function with a number, in this case 3, and the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   434
identity-function (which returns just its input). The \texttt{k} is
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   435
the continuation in this function. The recursive calls are:
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   436
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   437
\begin{lstlisting}[language=Scala, numbers=none]
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   438
factC(2, x => id(3 * x))
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   439
factC(1, x => id(3 * (2 * x)))
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   440
factC(0, x => id(3 * (2 * (1 * x))))
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   441
\end{lstlisting}
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   442
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   443
\noindent
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   444
Having reached 0, we get out of the recursion and apply 1 to the
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   445
continuation (see if-branch above in Line 2). This gives
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   446
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   447
\begin{lstlisting}[language=Scala, numbers=none]
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   448
id(3 * (2 * (1 * 1)))
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   449
= 3 * (2 * (1 * 1))
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   450
= 6
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   451
\end{lstlisting}
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   452
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   453
\noindent
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   454
which is the expected result. If this looks somewhat familiar to you,
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   455
than this is because functions with continuations can be seen as a
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   456
kind of generalisation of tail-recursive functions.  So far, however,
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   457
we did not look at this generalisation in earnest.  Anyway notice how
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   458
the continuations is ``stacked up'' during the recursion and then
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   459
``unrolled'' when we apply 1 to the continuation. Interestingly, we
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   460
can do something similar to the Fibonacci function where in the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   461
traditional version we have two recursive calls. Consider the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   462
following function
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   463
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   464
\begin{lstlisting}[language=Scala, numbers=left]
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   465
def fibC(n: Int, k: Int => Int) : Int = 
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   466
  if (n == 0 || n == 1) k(1) 
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   467
  else fibC(n - 1,
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   468
             r1 => fibC(n - 2,
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   469
               r2 => k(r1 + r2)))
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   470
\end{lstlisting}
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   471
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   472
\noindent
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   473
Here the continuation (Lines 4--5) is a nested function essentially wrapping up 
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   474
the second recursive call plus the original continuation. Let us check how the recursion unfolds
704
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   475
when called with 3 and the identity function:
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   476
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   477
\begin{lstlisting}[language=Scala, numbers=none]
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   478
fibC(3, id)
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   479
fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   480
fibC(1, r1 => 
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   481
   fibC(0, r2 => fibC(1, r2a => id((r1 + r2) + r2a))))
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   482
fibC(0, r2 => fibC(1, r2a => id((1 + r2) + r2a)))
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   483
fibC(1, r2a => id((1 + 1) + r2a))
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   484
id((1 + 1) + 1)
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   485
(1 + 1) + 1
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   486
3
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   487
\end{lstlisting}
27959a711959 updated
Christian Urban <urbanc@in.tum.de>
parents: 701
diff changeset
   488
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   489
\noindent
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   490
The point of this section is that you should be playing around with these
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   491
simple definitions and make sure they calculate the expected result.
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   492
In the next step, you should understand \emph{how} these functions
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   493
calculate the result with the continuations. Now change the initial
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   494
continuation which you call the function to
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   495
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   496
\begin{lstlisting}[language=Scala, numbers=none]
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   497
r => { println("The result plus 1 is:") ; r + 1 }
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   498
\end{lstlisting}
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   499
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   500
\noindent
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   501
Does this still calculate the expected result?
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   502
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   503
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   504
\section*{Worked Example}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   505
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   506
Let us now come back to the CPS-translations for the Fun-language.
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   507
Though we will start with a simpler subset containing only numbers,
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   508
arithmetic expressions and function calls---no variables for the
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   509
moment.  The main difficulty of generating instructions in SSA-format
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   510
is that large compound expressions need to be broken up into smaller
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   511
pieces and intermediate results need to be chained into later
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   512
instructions. To do this conveniently, we use the CPS-translation
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   513
mechanism. What the continuations essentially solve is the following
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   514
problem: Given an expressions
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   515
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   516
\begin{equation}\label{exp}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   517
(1 + 2) * (3 + 4)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   518
\end{equation}  
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   519
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   520
\noindent
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   521
which of the subexpressions should be calculated first? We are going 
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   522
arbitrarily to decide that the calculation will be from left to
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   523
right. Other languages make different choices---C famously is right to
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   524
left. In our case this means we have to tear the expression shown in
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   525
\eqref{exp} apart as follows:
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   526
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   527
\[
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   528
(1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   529
\]  
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   530
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   531
\noindent
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   532
The first subexpression can be easily calculated and will give us a result,
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   533
but the second one is not really an expression that makes sense. It
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   534
will only make sense as the next step in the computation when we
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   535
fill-in the result of $1+2$ into the ``hole'' indicated by
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   536
$\Box$. Such incomplete expressions can be represented in Scala as
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   537
functions
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   538
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   539
\begin{lstlisting}[language=Scala, numbers=none]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   540
r => r * (3 + 4)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   541
\end{lstlisting}  
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   542
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   543
\noindent
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   544
where \texttt{r} will represent any result that has been computed
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   545
earlier. By the way, in lambda-calculus notation we would write
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   546
$\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
700
f1d4d582ac29 updated
Christian Urban <urbanc@in.tum.de>
parents: 680
diff changeset
   547
functions (``continuations'') to represent what is coming next in a
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   548
sequence of instructions. In our case, continuations are functions of
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   549
type \code{KVal} to \code{KExp}. They can be seen as a sequence of
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   550
\code{KLet}s where there is a ``hole'' that needs to be
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   551
filled. Consider for example
678
6601ff1d9e0a updated
Christian Urban <urbanc@in.tum.de>
parents: 677
diff changeset
   552
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   553
\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   554
let tmp0 = add 1 a in   
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   555
let tmp1 = mul (*@$\Box$@*) 5 in 
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   556
let tmp2 = add 3 tmp1 in 
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   557
let tmp3 = add tmp0 tmp2 in
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   558
  return tmp3 
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   559
\end{lstlisting}
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   560
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   561
\noindent
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   562
where in the second line is a $\Box$ which still expects a \code{KVal}
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   563
to be filled in before it becomes a ``proper'' \code{KExp}. When we 
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   564
apply an argument to the continuation (remember they are functions)
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   565
we essentially fill something into the corresponding hole.
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   566
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   567
Lets look at some concrete code. To simplify matters, 
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   568
suppose our source language consists just of three kinds of expressions
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   569
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   570
\begin{lstlisting}[language=Scala,numbers=none]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   571
enum Expr {
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   572
    case Num(n: Int)
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   573
    case Aop(op: String, e1: Expr, e2: Expr)
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   574
    case Call(fname: String, args: List[Expr])
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   575
}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   576
\end{lstlisting}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   577
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   578
\noindent
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   579
This allows us to generate SSA-instructions for expressions like
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   580
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   581
\[
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   582
1 + foo(bar(4 * 7), 3, id(12))  
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   583
\]
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   584
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   585
\noindent
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   586
The code of the CPS-translation
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   587
that generates these instructions is then of the form
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   588
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   589
\begin{lstlisting}[language=Scala,numbers=none]
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   590
def CPS(e: Exp)(k: KVal => KExp) : KExp = 
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   591
  e match { ... }
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   592
\end{lstlisting}
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   593
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   594
\noindent 
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   595
where \code{k} is the continuation and \code{e} is the expression to
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   596
be compiled. The result of the function is a K-expression, which later
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   597
can be compiled into LLVM-IR code.
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   598
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   599
In case we have numbers, then we can just pass them in the CPS-translation
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   600
to the continuations because numbers need not be further teared apart
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   601
as they are already primitive. Passing the number to the continuation
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   602
means we apply the continuation like
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   603
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   604
\begin{lstlisting}[language=Scala,numbers=none]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   605
  case Num(i) => k(KNum(i))
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   606
\end{lstlisting}
679
9a4404f65b63 updated
Christian Urban <urbanc@in.tum.de>
parents: 678
diff changeset
   607
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   608
\noindent This would fill in the $\Box$ in a \code{KLet}-expression.
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   609
Since \texttt{k} is a function from \texttt{KVar} to \texttt{KExp} we
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   610
also optain the correct type for CPS, namely \texttt{KExp}.  There is
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   611
no need to create a temporary variable for simple numbers.  More
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   612
interesting is the case for arithmetic operations.
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   613
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   614
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   615
case Aop(op, e1, e2) => {
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   616
  val z = Fresh("tmp")
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   617
  CPS(e1)(r1 => 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   618
    CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))))
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   619
}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   620
\end{lstlisting}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   621
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   622
\noindent
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   623
What we essentially have to do in this case is the following: compile
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   624
the subexpressions \texttt{e1} and \texttt{e2}. They will produce some
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   625
result that is stored in two temporary variables (assuming \texttt{e1} and \texttt{e2} are more
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   626
complicated than just numbers). We need to use these two temporary
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   627
variables and feed them into a new assignment of the form
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   628
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   629
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   630
let z = op (*@$\Box_\texttt{r1}$@*) (*@$\Box_\texttt{r2}$@*) in
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   631
...
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   632
\end{lstlisting}
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   633
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   634
\noindent
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   635
Note that this assignment has two ``holes'', one for the left
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   636
subexpression and one for the right subexpression. We can fill both
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   637
holes by calling the CPS function twice---this is a bit of the black
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   638
art in CPS. We can use the second call of CPS as the continuation
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   639
of the first call. The reason is that
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   640
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   641
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   642
r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   643
\end{lstlisting}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   644
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   645
\noindent is a function from \code{KVal} to \code{KExp}, which we need
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   646
as type for the continuation. Once we created the assignment with the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   647
fresh temporary variable \texttt{z}, we need to ``communicate'' that
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   648
the result of the computation of the arithmetic expression is stored
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   649
in \texttt{z} to the computations that follow. In this way we apply
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   650
the continuation \texttt{k} with this new variable (essentially we are
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   651
plugging in a hole further down the line).  Hope this makes sense!? If not,
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   652
play with the given code yourself.
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   653
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   654
The last case we need to consider in our small expression language are
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   655
function calls. For them remember each argument of the function
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   656
call can in SSA-format only be a variable or a number. Here is the
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   657
complete code for this case:
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   658
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   659
\begin{lstlisting}[language=Scala,numbers=left,xleftmargin=0mm]
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   660
case Call(fname, args) => {
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   661
  def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   662
       case Nil => {
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   663
         val z = Fresh("tmp")
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   664
         KLet(z, KCall(fname, vs), k(KVar(z)))
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   665
       }
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   666
       case a::as => CPS(a)(r => aux(as, vs ::: List(r)))
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   667
  }
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   668
  aux(args, Nil)  
701
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   669
}
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   670
\end{lstlisting}
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   671
81377a3eb717 updated
Christian Urban <urbanc@in.tum.de>
parents: 700
diff changeset
   672
\noindent
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   673
As can be seen, we introduce an auxiliary function that compiles first all
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   674
function arguments---remember in our source language we can have calls
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   675
such as $foo(3 + 4, g(3))$ where we first have to create temporary
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   676
variables (and computations) for each argument. Therefore \texttt{aux}
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   677
analyses the argument list from left to right. In case there is an
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   678
argument \texttt{a} on the front of the list (the case \texttt{a::as}
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   679
in Line 7), we call CPS recursively for the corresponding
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   680
subexpression. The temporary variable containing the result for this
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   681
argument, we add to the end of the K-values we have already analysed
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   682
before. Again very conveniently we can use the recursive call to
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   683
\texttt{aux} as the continuation for the computations that
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   684
follow. When we reach the end of the argument list (the
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   685
\texttt{Nil}-case in Lines 3--6), then we collect all the K-values
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   686
\texttt{v1} to \texttt{vn} and call the function in the K-language
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   687
like so
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   688
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   689
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   690
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   691
let z = call foo(v1,...,vn) in
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   692
...
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   693
\end{lstlisting}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   694
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   695
\noindent
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   696
Again we need to communicate the result of the function call, namely the
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   697
fresh temporary variable \texttt{z}, to the rest of the computation. Therefore
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   698
we apply the continuation \texttt{k} with this variable.
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   699
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   700
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   701
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   702
\begin{figure}[p]\small
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   703
  \lstinputlisting[numbers=left,firstline=1,lastline=24]{../progs/fun/simple-cps.sc}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   704
  \hspace{10mm}...
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   705
  \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   706
\caption{CPS-translation for a simple expression language.\label{cps}}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   707
\end{figure}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   708
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   709
The last question we need to answer in the code (see Figure~\ref{cps})
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   710
is how we start the CPS-translation function, or more precisely with
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   711
which continuation we should start CPS? It turns out that this initial
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   712
continuation will be the last one that is called. What should be the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   713
last step in the computation? Yes, we need to return the temporary
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   714
variable where the last result is stored (if it is a simple number,
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   715
then we can just return this number). Therefore we call CPS with the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   716
code
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   717
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   718
\begin{lstlisting}[language=Scala,numbers=none]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   719
def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   720
\end{lstlisting}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   721
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   722
\noindent
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   723
where we give the function \code{KReturn(_)} as the continuation. With
912
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   724
this we completed the translation of simple expressions into our
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   725
K-language. Play around with some more expressions and see how the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   726
CPS-translation generates the correct code. I know this is not easy to
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   727
follow code when you see it the first time.  Figure~\ref{absfun} gives
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   728
the complete datatypes for the ASTs of the Fun-language and the
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   729
K-values and K-expressions for the K-language. The complete
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   730
CPS-translation you can find in the code.
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   731
81f014160a48 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 909
diff changeset
   732
\section*{Next Steps}
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   733
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   734
Having obtained a K-expression, it is relatively straightforward to
913
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   735
generate a valid program for the LLVM-IR---remember the K-language
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   736
already enforces the SSA convention of a linear sequence of primitive
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   737
instructions involving only unique temporary variables. 
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   738
We leave this step to the attentive reader.
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   739
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   740
What else can we do?  Well it should be relatively easy to apply some
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   741
common optimisations to the K-expressions. One optimisations is called
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   742
constant folding---for example if we have an expression $3 + 4$ then
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   743
we can replace it by just $5$ instead of generating code to compute
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   744
$5$. Now this information needs to be propagated to the next
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   745
computation step to see whether any further constant foldings are
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   746
possible. Another useful technique is common subexpression
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   747
elimination, meaning if you have twice a calculation of a function
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   748
$foo(a)$, then we want to call it only once and store the result in a
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   749
temporary variable that can be used instead of the second, or third,
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   750
call to $foo(a)$. Again I leave this to the attentive reader to work
8f07908dec8d updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 912
diff changeset
   751
out and implement.
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   752
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   753
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   754
\begin{figure}[p]\small
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   755
\begin{lstlisting}[language=Scala,numbers=none]
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   756
// Fun language (expressions)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   757
abstract class Exp 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   758
abstract class BExp 
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   759
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   760
case class Call(name: String, args: List[Exp]) extends Exp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   761
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   762
case class Write(e: Exp) extends Exp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   763
case class Var(s: String) extends Exp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   764
case class Num(i: Int) extends Exp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   765
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   766
case class Sequence(e1: Exp, e2: Exp) extends Exp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   767
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   768
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   769
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   770
// K-language (K-expressions, K-values)
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   771
abstract class KExp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   772
abstract class KVal
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   773
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   774
case class KVar(s: String) extends KVal
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   775
case class KNum(i: Int) extends KVal
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   776
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   777
case class KCall(o: String, vrs: List[KVal]) extends KVal
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   778
case class KWrite(v: KVal) extends KVal
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   779
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   780
case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   781
case class KLet(x: String, v: KVal, e: KExp) extends KExp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   782
case class KReturn(v: KVal) extends KExp
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   783
\end{lstlisting}
909
a29d74fb8697 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 908
diff changeset
   784
\caption{Abstract syntax trees for the Fun-language and the K-language.\label{absfun}}
908
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   785
\end{figure}
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   786
68df565a6134 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
   787
917
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   788
\section*{Alternatives to CPS}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   789
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   790
While I appreciate that this handout is already pretty long, this
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   791
section is for students who think the CPS-translation is too much of
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   792
voodoo programming---there is a perhaps simpler alternative. This
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   793
alternative is along the lines: if you cannot bridge the gap in
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   794
a single step, do it in two simpler steps. Let's look at the
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   795
simple expression $1 + (2 + 3)$. The CPS-translation correctly
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   796
generates the expression
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   797
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   798
\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   799
let tmp0 = add 2 3 in   
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   800
let tmp1 = add 1 tmp0 in 
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   801
  return tmp1 
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   802
\end{lstlisting}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   803
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   804
\noindent
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   805
where $(2 + 3)$ is pulled out and calculated first. The problem is that it
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   806
requires a bit of magic. But with the ability to give a separate variable
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   807
to each indivifual computation, we could do the following: The expression
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   808
$1 + (2 + 3)$ is a tree like this
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   809
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   810
\begin{center}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   811
\begin{tikzpicture}[line width=1mm]
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   812
  \node {root} [grow'=up]
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   813
    child {node {1}}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   814
    child {node[circle,draw] {+}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   815
      child {node {2}}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   816
      child {node {3}}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   817
    };
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   818
\end{tikzpicture}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   819
\end{center}
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   820
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   821
\noindent
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   822
and we could perform a completely standard recursive traversal of the
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   823
tree: each inner node gets a new variable and assignment.  
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   824
d3d371ae5fab updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 913
diff changeset
   825
898
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   826
5075b69142d9 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 722
diff changeset
   827
\noindent
539
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   828
\end{document}
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   829
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   830
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   831
%%% Local Variables: 
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   832
%%% mode: latex
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   833
%%% TeX-master: t
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   834
%%% End: