% handouts/ho09.tex, Christian Urban <urbanc@in.tum.de>, Fri, 01 Nov 2019 13:21:51 +0000, changeset 679 8fc109f36b78
% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../graphics}
\usepackage{../grammar}
%%\usepackage{multicol}

%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}

\begin{document}
\fnote{\copyright{} Christian Urban, King's College London, 2019}

\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our tiny compiler targeting the JVM, the code generation
part was actually not so hard, no? Pretty much just some post-order
traversal of the abstract syntax tree, yes? One of the main reasons for
this ease is that the JVM is a stack-based virtual machine and it is
therefore not hard to translate arithmetic expressions into a sequence
of instructions manipulating the stack. The problem is that ``real''
CPUs, although supporting stack operations, are not really designed to
be \emph{stack machines}. The design of CPUs is more like: here is a
chunk of memory; compiler, or better compiler writers, do something with
it. Consequently, modern compilers need to go the extra mile in order to
generate code that is much easier and faster for CPUs to process. To
make this all tractable for this module, we target the LLVM Intermediate
Language. In this way we can take advantage of the tools coming with
LLVM. For example, we do not have to worry about things like register
allocation.\bigskip

\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
that projects from Academia can make a difference in the world. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
gcc with its myriad of front-ends for other languages (e.g.~Fortran,
Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over
time into a monolithic gigantic piece of m\ldots ehm software, which you
could not mess about with in an afternoon. In contrast, LLVM is designed
to be a modular suite of tools with which you can play around easily and
try out something new. LLVM became a big player once Apple hired one of
the original developers (I cannot remember the reason why Apple did not
want to use gcc, but maybe they were also just disgusted by its big
monolithic codebase). Anyway, LLVM is now the big player and gcc is more
or less legacy. This does not mean that programming languages like C and
C++ are dying out any time soon; they are nicely supported by LLVM.

Targeting the LLVM Intermediate Language, or Intermediate
Representation (LLVM-IR for short), also means we can profit from the
very modular structure of the LLVM compiler and let, for example, the
compiler generate code for X86, or ARM etc. That means we can be
agnostic about where our code actually runs. However, what we have to do
is to generate code in \emph{Static Single-Assignment} format (SSA for
short), because that is what the LLVM-IR expects from us. LLVM-IR is the
intermediate format that LLVM uses for doing cool things, like targeting
strange architectures, optimising code and allocating memory
efficiently.

The idea behind the SSA format is to use very simple variable
assignments where every variable is assigned only once. The assignments
also need to be primitive in the sense that they can be just simple
operations like addition, multiplication, jumps, comparisons and so on.
An idealised snippet of a program in SSA is

\begin{lstlisting}[language=LLVM,numbers=none]
    x := 1
    y := 2
    z := x + y
\end{lstlisting}

\noindent where every variable is assigned only once (we could not write
\texttt{x := x + y} in the last line for example). There are
sophisticated algorithms for imperative languages, like C, that
efficiently transform a high-level program into SSA format. But we can
ignore them here. We want to compile a functional language and there
things get much more interesting than just sophisticated. We will need
to have a look at CPS translations, where CPS stands for
Continuation-Passing-Style, which is basically black programming art or
abracadabra programming. So sit tight.
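
To make the single-assignment condition a bit more concrete, here is a
minimal Scala sketch that checks it for straight-line code. The names
\texttt{Assign} and \texttt{isSSA} are made up for this illustration
(they are neither part of LLVM nor of our compiler), and the sketch
deliberately ignores jumps and branches, which is where the real SSA
algorithms earn their keep.

\begin{lstlisting}[language=Scala,numbers=none]
// Hypothetical representation of an assignment  lhs := rhs;
// for the check only the left-hand side matters.
case class Assign(lhs: String, rhs: String)

// A straight-line program satisfies the single-assignment
// condition if no variable occurs twice as a left-hand side.
def isSSA(prog: List[Assign]): Boolean = {
  val targets = prog.map(_.lhs)
  targets.distinct.length == targets.length
}

// the idealised snippet from above
val good = List(Assign("x", "1"), Assign("y", "2"),
                Assign("z", "x + y"))

// the same snippet with  x := x + y  as the last line
val bad = List(Assign("x", "1"), Assign("y", "2"),
               Assign("x", "x + y"))
\end{lstlisting}

\noindent With these definitions \texttt{isSSA(good)} holds, while
\texttt{isSSA(bad)} does not, because \texttt{x} is assigned twice.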

\subsection*{LLVM-IR}

Before we start, let's first have a look at the \emph{LLVM Intermediate
Representation}. What is good about our simple Fun language is that it
basically only contains expressions (be they arithmetic expressions or
boolean expressions). The exception is function definitions. Luckily,
for them we can use the mechanism of defining functions in LLVM-IR. For
example the simple Fun program


\begin{lstlisting}[language=Scala,numbers=none]
def sqr(x) = x * x
\end{lstlisting}

\noindent
can be compiled into the following LLVM-IR function:

\begin{lstlisting}[language=LLVM]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x
   ret i32 %tmp
}
\end{lstlisting}

\noindent The first thing to notice is that all variable names in the
LLVM-IR are prefixed by \texttt{\%}; function names need to be prefixed
with \texttt{@}. Also, the LLVM-IR is a fully typed language. The
\texttt{i32} type stands for a 32-bit integer. There are also types for
64-bit integers (\texttt{i64}), chars (\texttt{i8}), floats, arrays and
even pointer types. In the code above, \texttt{sqr} takes an argument of
type \texttt{i32} and produces a result of type \texttt{i32}. Each
arithmetic operation, like addition or multiplication, is also annotated
with the type it operates on. Obviously these types need to match
up\ldots{} but since we have in our programs only integers,
\texttt{i32} everywhere will do.
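
Since the LLVM-IR is just text, a compiler can produce it by gluing
strings together. As a rough sketch (the helper \texttt{unaryFun} and
its parameters are assumptions made for this illustration, not the code
generator we will end up with), the following Scala function emits the
definition of a one-argument \texttt{i32} function whose body is a
single arithmetic instruction applied to the argument:

\begin{lstlisting}[language=Scala,numbers=none]
// Hypothetical helper: name is the function name, instr the
// LLVM-IR instruction (for example "mul" or "add") that is
// applied to the argument %x on both sides.
def unaryFun(name: String, instr: String): String =
  s"""define i32 @$name(i32 %x) {
     |   %tmp = $instr i32 %x, %x
     |   ret i32 %tmp
     |}""".stripMargin

val sqrIR = unaryFun("sqr", "mul")
\end{lstlisting}

\noindent The string \texttt{sqrIR} then contains the text of the
\texttt{sqr} function given earlier in this section; note how
\texttt{i32} has to be written out at every place where a type is
expected.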
 
Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR. So you can easily
check whether the code you produced actually works. To get a running
program that does something interesting you need to add some boilerplate
about printing out numbers and a main-function that is the entry point
for the program (see Figure~\ref{lli}). You can generate a binary for
this program by using the \texttt{llc} compiler to generate an object
file and then gcc (or clang) to link it into a binary:

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
\end{lstlisting}

\begin{figure}[t]\small
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. The
interesting function is \texttt{sqr} in Lines 13--16. The main
function calls \texttt{sqr} and prints out the result. The other
code is boilerplate for printing out integers.\label{lli}}
\end{figure}


\subsection*{Our Own Intermediate Language}

Remember that compilers have to solve the problem of bridging the gap
between ``high-level'' programs and ``low-level'' hardware. If the gap
is too wide then a good strategy is to lay a stepping stone somewhere in
between. The LLVM-IR itself is such a stepping stone to make the task of
generating code easier. Like a real compiler, we will use another
stepping stone, which I call the \emph{K-language}. For this, remember
that expressions (and boolean expressions) in the Fun language are given
by the code at the top of Figure~\ref{absfun}.


\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
// Fun-language (expressions)
abstract class Exp
abstract class BExp

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Write(e: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp


// K-language (K-expressions, K-values)
abstract class KExp
abstract class KVal

case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(o: String, vrs: List[KVal]) extends KVal
case class KWrite(v: KVal) extends KVal

case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
case class KLet(x: String, e1: KVal, e2: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun language and the
K-language.\label{absfun}}
\end{figure}
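
To get a feeling for how the K-language lines up with the LLVM-IR, the
body of \texttt{sqr} can be written as a K-expression in which the
intermediate result is named by a \texttt{KLet}. The sketch below
repeats the K-language classes it needs from Figure~\ref{absfun} so that
it is self-contained. The two printing functions \texttt{showVal} and
\texttt{showExp} are assumptions made for this illustration only: they
cover just variables, numbers, addition and multiplication, assume that
operands are never nested operations, and are not meant to be the
translation we will develop.

\begin{lstlisting}[language=Scala,numbers=none]
abstract class KExp
abstract class KVal

case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KLet(x: String, e1: KVal, e2: KExp) extends KExp
case class KReturn(v: KVal) extends KExp

// body of sqr:  let tmp = x * x in return tmp
val sqrBody: KExp =
  KLet("tmp", Kop("*", KVar("x"), KVar("x")),
       KReturn(KVar("tmp")))

// hypothetical printer for K-values
def showVal(v: KVal): String = v match {
  case KVar(s) => "%" + s
  case KNum(i) => i.toString
  case Kop("+", v1, v2) =>
    "add i32 " + showVal(v1) + ", " + showVal(v2)
  case Kop("*", v1, v2) =>
    "mul i32 " + showVal(v1) + ", " + showVal(v2)
  case _ => sys.error("not covered in this sketch")
}

// hypothetical printer for K-expressions: every KLet
// becomes one LLVM-IR assignment, KReturn becomes ret
def showExp(e: KExp): String = e match {
  case KLet(x, v, rest) =>
    "   %" + x + " = " + showVal(v) + "\n" + showExp(rest)
  case KReturn(v) => "   ret i32 " + showVal(v)
  case _ => sys.error("not covered in this sketch")
}
\end{lstlisting}

\noindent For \texttt{sqrBody} the function \texttt{showExp} produces
the two instructions that make up the body of the \texttt{sqr} function
in LLVM-IR. Since every \texttt{KLet} introduces a fresh name for
exactly one primitive operation, K-expressions of this shape are already
close to the SSA format.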



\subsection*{CPS-Translations}

Another reason why it makes sense to go the extra mile is that stack
instructions are very difficult to optimise: you cannot just re-arrange
instructions without messing about with what is calculated on the stack.
Also it is hard to find out if all the calculations on the stack are
actually necessary and not by chance dead code. The JVM has
sophisticated machinery for all this, to make such ``high-level'' code
still run fast, but let's say that for the sake of argument we do not
want to rely on it. We want to generate fast code ourselves. This means
we have to work around the intricacies of what instructions CPUs can
actually process.

\end{document}


%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: