% handouts/ho09.tex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../graphics}
\usepackage{../grammar}

\begin{document}
\fnote{\copyright{} Christian Urban, King's College London, 2019, 2023}

% CPS translations
% https://felleisen.org/matthias/4400-s20/lecture17.html

\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our two tiny compilers targeting the JVM, the code
generation part was actually not so hard, no? Pretty much just some
post-traversal of the abstract syntax tree, yes? One of the reasons
for this ease is that the JVM is a stack-based virtual machine and it
is therefore not hard to translate deeply-nested arithmetic
expressions into a sequence of instructions manipulating the
stack. The problem is that ``real'' CPUs, although supporting stack
operations, are not really designed to be \emph{stack machines}.  The
design of CPUs is more like: here are some operations and a chunk of
memory---compilers, or better compiler writers, do something with
them. Consequently, modern compilers need to go the extra mile in
order to generate code that is much easier and faster to process by
actual CPUs, rather than running code on virtual machines that
introduce an additional layer of indirection. To make this all
tractable for this module, we target the LLVM Intermediate
Language. In this way we can take advantage of the tools coming with
LLVM.\footnote{\url{http://llvm.org}} For example we do not have to
worry about things like register allocations. By using LLVM, however,
we also have to pay a price in the sense that generating code gets
harder\ldots{}unfortunately.

\subsection*{LLVM and LLVM-IR}

\noindent LLVM is a beautiful example
that projects from Academia can make a difference in the World. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers
was gcc with its myriad of front-ends for different programming
languages (C++, Fortran, Ada, Go, Objective-C, Pascal etc). The
problem was that gcc morphed over time into a monolithic gigantic
piece of m\ldots ehm complicated software, which you could not mess
about in an afternoon. In contrast, LLVM is designed to be a modular
suite of tools with which you can play around easily and try out
something new. LLVM became a big player once Apple hired one of the
original developers (I cannot remember the reason why Apple did not
want to use gcc, but maybe they were also just disgusted by gcc's big
monolithic codebase). Anyway, gcc is nowadays more or less
legacy. This does not mean that programming languages like C and C++
are dying out any time soon---they are nicely supported by LLVM.

We will target the LLVM Intermediate Language, or LLVM Intermediate
Representation (short LLVM-IR). The LLVM-IR looks very similar to the
assembly language of Jasmin and Krakatau. Targeting LLVM-IR will also
allow us to benefit from the modular structure of the LLVM compiler
and we can let, for example, the compiler generate code for different
CPUs, say X86 or ARM.  That means we can be agnostic about where our
code is actually going to run.\footnote{Anybody want to try to run our
  programs on Android phones?}  We can also be somewhat ignorant about
optimising our code and about allocating memory efficiently: the LLVM
tools will take care of all this.

However, what we have to do in order to make LLVM play ball is to
generate code in \emph{Static Single-Assignment} format (short SSA),
because that is what the LLVM-IR expects from us. A reason why LLVM
uses the SSA format, rather than JVM-like stack instructions, is that
stack instructions are difficult to optimise---you cannot just
re-arrange instructions without messing about with what is calculated
on the stack. Also it is hard to find out if all the calculations on
the stack are actually necessary and not by chance dead code. The JVM
has for all these obstacles sophisticated machinery to make such
``high-level'' code still run fast, but let's say that for the sake of
argument we do not want to rely on it. We want to generate fast code
ourselves. This means we have to work around the intricacies of what
instructions CPUs can actually process fast. This is what the SSA
format is designed for.

The main idea behind the SSA format is to use very simple variable
assignments where every tmp-variable is assigned only once. The
assignments also need to be primitive in the sense that they can be
just simple operations like addition, multiplication, jumps,
comparisons, function calls and so on.  Say, we have an expression
$((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding SSA format is

\begin{lstlisting}[language=LLVMIR,numbers=left]
let tmp0 = add 1 a in
let tmp1 = add 3 2 in
let tmp2 = call foo(tmp1) in
let tmp3 = mul b 5 in
let tmp4 = add tmp2 tmp3 in
let tmp5 = add tmp0 tmp4 in
  return tmp5
\end{lstlisting}

\noindent where every tmp-variable is assigned only once (we could,
for example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if
this would not change the overall result).

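The single-assignment property is easy to state as a predicate. Below is a minimal sketch (in Scala, with made-up names) that represents the let-sequence above as pairs of a variable name and its right-hand side, and checks that no tmp-variable is assigned twice:

```scala
// Sketch: the let-sequence above as (variable, right-hand side) pairs.
// The string representation of the right-hand sides is just for
// illustration.
val ssa = List(
  ("tmp0", "add 1 a"),
  ("tmp1", "add 3 2"),
  ("tmp2", "call foo(tmp1)"),
  ("tmp3", "mul b 5"),
  ("tmp4", "add tmp2 tmp3"),
  ("tmp5", "add tmp0 tmp4"))

// SSA property: every left-hand side variable occurs exactly once
def singleAssignment(ls: List[(String, String)]): Boolean =
  ls.map(_._1).distinct.length == ls.length
```

\noindent For the sequence above \texttt{singleAssignment} returns \texttt{true}; adding a second assignment to, say, \texttt{tmp1} would make it fail.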
There are sophisticated algorithms for imperative languages, like C,
that efficiently transform a high-level program into SSA format. But
we can ignore them here. We want to compile a functional language and
there things get much more interesting than just sophisticated. We
will have to use a technique called
Continuation-Passing-Style---basically black programming art or
abracadabra programming. So sit tight.

\subsection*{LLVM-IR}

Before we start, let's first have a look at the \emph{LLVM
Intermediate Representation} in more detail. The LLVM-IR is in between
the frontends and backends of the LLVM framework. It allows
compilation of multiple source languages to multiple targets. It is
also the place where most of the target independent optimisations are
performed.

What is good about our toy Fun-language is that it basically only
contains expressions (be they arithmetic expressions, boolean
expressions or if-expressions). The exceptions are function
definitions. Luckily, for them we can use the mechanism of defining
functions in the LLVM-IR (this is similar to using JVM methods for
functions in our earlier compiler). For example the simple Fun-program

\begin{lstlisting}[language=Scala,numbers=none]
def sqr(x) = x * x
\end{lstlisting}

\noindent can be compiled to the following LLVM-IR code:

\begin{lstlisting}[language=LLVM]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x
   ret i32 %tmp
}
\end{lstlisting}

\noindent First notice that all ``local'' variable names, in this case
\texttt{x} and \texttt{tmp}, are prefixed with \texttt{\%} in the
LLVM-IR.  Temporary variables can be named with an identifier, such as
\texttt{tmp}, or numbers. In contrast, function names, since they are
``global'', need to be prefixed with an @-symbol. Also, the LLVM-IR is
a fully typed language. The \texttt{i32} type stands for 32-bit
integers. There are also types for 64-bit integers (\texttt{i64}),
chars (\texttt{i8}), floats, arrays and even pointer types. In the
code above, \texttt{sqr} takes an argument of type \texttt{i32} and
produces a result of type \texttt{i32} (the result type is in front of
the function name, like in C). Each arithmetic operation, for example
addition and multiplication, is also prefixed with the type it
operates on. Obviously these types need to match up\ldots{} but since
we have in our programs only integers, for the moment \texttt{i32}
everywhere will do. We do not have to generate any other types, but
obviously this is a limitation in our Fun-language (which we lift in
the final CW).

There are a few interesting instructions in the LLVM-IR which are
quite different from what we have seen in the JVM. Can you remember
the kerfuffle we had to go through with boolean expressions and
negating the condition? In the LLVM-IR, branching on if-conditions is
implemented differently: there is a separate \texttt{br}-instruction
as follows:

\begin{lstlisting}[language=LLVM]
br i1 %var, label %if_br, label %else_br
\end{lstlisting}

\noindent
The type \texttt{i1} stands for booleans. If the variable is true,
then this instruction jumps to the if-branch, which needs an explicit
label; otherwise it jumps to the else-branch, again with its own
label. This allows us to keep the meaning of the boolean expression
``as is'' when compiling if's---thank god no more negating of
booleans.

A value of type boolean is generated in the LLVM-IR by the
\texttt{icmp}-instruction. This instruction is for integers (hence the
\texttt{i}) and takes the comparison operation as argument. For example

\begin{lstlisting}[language=LLVM]
icmp eq  i32 %x, %y     ; for equal
icmp sle i32 %x, %y     ; signed less or equal
icmp slt i32 %x, %y     ; signed less than
icmp ult i32 %x, %y     ; unsigned less than
\end{lstlisting}

\noindent
Note that in some operations the LLVM-IR distinguishes between signed
and unsigned representations of integers. I assume you know what this
means, otherwise just ask.

It is also easy to call another function in LLVM-IR: as can be
seen from Figure~\ref{lli} we can just call a function with the
instruction \texttt{call} and can also assign the result to
a variable. The syntax is as follows

\begin{lstlisting}[language=LLVM]
%var = call i32 @foo(...args...)
\end{lstlisting}

\noindent
where the arguments can only be simple variables and numbers, but not
compound expressions.

Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR. So you can easily
check whether the code you produced actually works. To get a running
program that does something interesting you need to add some
boilerplate about printing out numbers and a main-function that is the
entry point for the program (see Figure~\ref{lli} for a complete
listing of the square function). Again this is very similar to the
boilerplate we needed to add in our JVM compiler.

You can generate a binary for the program in Figure~\ref{lli} by using
the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang},
whereby \texttt{llc} generates an object file and \texttt{gcc} (that
is actually \texttt{clang} on my Mac) generates the executable binary:

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
\end{lstlisting}

\begin{figure}[t]\small
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. It
  calls the sqr-function in \texttt{@main} with the argument \texttt{5}
  (Line 20). The code for the \texttt{sqr} function is in Lines 13 --
  16. It stores the result of sqr in the variable called \texttt{\%i}
  and then prints out the contents of this variable in Line 21. The
  other code is boilerplate for printing out integers.\label{lli}}
\end{figure}

\subsection*{Our Own Intermediate Language}

Let's get back to our compiler: Remember compilers have to solve the
problem of bridging the gap between ``high-level'' programs and
``low-level'' hardware. If the gap is too wide for one step, then a
good strategy is to lay a stepping stone somewhere in between. The
LLVM-IR itself is such a stepping stone to make the task of generating
and optimising code easier. Like a real compiler we will use our own
stepping stone which I call the \emph{K-language}.

\begin{center}
  \begin{tikzpicture}[scale=0.9,font=\bf,
                      node/.style={
                      rectangle,rounded corners=3mm,
                      ultra thick,draw=black!50,minimum height=16mm,
                      minimum width=20mm,
                      top color=white,bottom color=black!20}]

  %\node (0) at (-3,0) {};
  \node (A) at (0.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}source}] {Fun-Language};
  \node (B) at (3.5,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}stepping stone}] {K-Language};
  \node (C) at (7.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}target}] {LLVM-IR};

  \draw [->,line width=2.5mm] (A) -- node [above,pos=0.35] {} (B);
  \draw [->,line width=2.5mm] (B) -- node [above,pos=0.35] {} (C);
  \end{tikzpicture}
\end{center}

\noindent
To see why we need a stepping stone for generating code in SSA-format,
consider again arithmetic expressions such as
$((1 + a) + (3 + (b * 5)))$. They need to be broken up into smaller
``atomic'' steps, like so

\begin{lstlisting}[language=LLVMIR,numbers=none]
let tmp0 = add 1 a in
let tmp1 = mul b 5 in
let tmp2 = add 3 tmp1 in
let tmp3 = add tmp0 tmp2 in
  return tmp3
\end{lstlisting}

\noindent
Here \texttt{tmp3} will contain the result of what the whole
expression stands for. In each individual step we can only perform an
``atomic'' or ``trivial'' operation, like addition or multiplication
of a number or a variable.  We are not allowed to have for example a
nested addition or an if-condition on the right-hand side of an
assignment. Such constraints are forced upon us because of how the SSA
format works in the LLVM-IR. To simplify matters we represent
assignments with two kinds of syntactic entities, namely
\emph{K-values} and \emph{K-expressions}. K-values are all ``things''
that can appear on the right-hand side of an equal. Say we have
the following definition for K-values:

\begin{lstlisting}[language=Scala,numbers=none]
enum KVal {
  case KVar(s: String)
  case KNum(n: Int)
  case KAop(op: String, v1: KVal, v2: KVal)
  case KCall(fname: String, args: List[KVal])
}
\end{lstlisting}

\noindent
where a K-value can be a variable, a number or a ``trivial'' binary
operation, such as addition or multiplication. The two arguments of a
binary operation need to be K-values as well. Finally, we have
function calls, but again each argument of the function call needs to
be a K-value.  One point where we need to be careful, however, is that
the arguments of the binary operations and function calls are in fact
only variables or numbers. To encode this constraint into the
type-system of Scala would make things too complicated---satisfying
this constraint is therefore on us. For our Fun-language, we will
later on consider some further K-values.

Our K-expressions will encode the \texttt{let} and the \texttt{return}
from the SSA-format (again for the moment we only consider these two
constructors---later on we will have if-branches as well).

\begin{lstlisting}[language=Scala,numbers=none]
enum KExp {
  case KLet(x: String, v: KVal, e: KExp)
  case KReturn(v: KVal)
}
\end{lstlisting}

\noindent
By having \texttt{KLet} take first a string (standing for an
intermediate variable) and second a K-value, we can fulfil the SSA
constraint in assignments ``by construction''---there is no way we
could write anything else than a K-value.  Note that the third
argument of a \texttt{KLet} is again a K-expression, meaning either
another \texttt{KLet} or a \texttt{KReturn}. In this way we can
construct a sequence of computations and return a final result.  Of
course we also have to ensure that all intermediary computations are
given (fresh) names---we will use an (ugly) counter for this.

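\noindent
To make the encoding concrete, here is a small sketch that builds the let-sequence for $((1 + a) + (3 + (b * 5)))$ as nested \texttt{KLet}s, together with one possible fresh-name counter (the enum definitions from above are repeated so the snippet is self-contained; the names are only for illustration):

```scala
// K-values and K-expressions as defined above
enum KVal {
  case KVar(s: String)
  case KNum(n: Int)
  case KAop(op: String, v1: KVal, v2: KVal)
  case KCall(fname: String, args: List[KVal])
}
enum KExp {
  case KLet(x: String, v: KVal, e: KExp)
  case KReturn(v: KVal)
}
import KVal.*, KExp.*

// the let-sequence for ((1 + a) + (3 + (b * 5))),
// written out as nested KLets ending in a KReturn
val kexp =
  KLet("tmp0", KAop("add", KNum(1), KVar("a")),
    KLet("tmp1", KAop("mul", KVar("b"), KNum(5)),
      KLet("tmp2", KAop("add", KNum(3), KVar("tmp1")),
        KLet("tmp3", KAop("add", KVar("tmp0"), KVar("tmp2")),
          KReturn(KVar("tmp3"))))))

// an (ugly) counter for generating fresh intermediate names
var counter = -1
def Fresh(pre: String): String = {
  counter += 1
  s"${pre}_$counter"
}
```

Note that by construction the second argument of each \texttt{KLet} can only be a K-value: a nested addition or an if-expression simply cannot be written there.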
   262 To sum up, K-values are the atomic operations that can be on the
   344 To sum up, K-values are the atomic operations that can be on the
   263 right-hand side of equal-signs. The K-language is restricted such that
   345 right-hand side of assignemnts. The K-language is restricted such that
   264 it is easy to generate the SSA format for the LLVM-IR. 
   346 it is easy to generate the SSA format for the LLVM-IR. What remains to
   265 
   347 be done is a translation of our Fun-language into the K-language. The
   266 
   348 main difficulty is that we need to order the computationa---in teh
   267 
   349 K-language we only have linear sequence of instructions to need to be
   268 \begin{figure}[p]\small
   350 followed. Before we explain this, we have a look at some
   269 \begin{lstlisting}[language=Scala,numbers=none]
   351 CPS-translation.
   270 // Fun language (expressions)
   352 
   271 abstract class Exp 
       
   272 abstract class BExp 
       
   273 
       
   274 case class Call(name: String, args: List[Exp]) extends Exp
       
   275 case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
       
   276 case class Write(e: Exp) extends Exp
       
   277 case class Var(s: String) extends Exp
       
   278 case class Num(i: Int) extends Exp
       
   279 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
       
   280 case class Sequence(e1: Exp, e2: Exp) extends Exp
       
   281 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  
       
   282 
       
   283 
       
   284 
       
   285 // K-language (K-expressions, K-values)
       
   286 abstract class KExp
       
   287 abstract class KVal
       
   288 
       
   289 case class KVar(s: String) extends KVal
       
   290 case class KNum(i: Int) extends KVal
       
   291 case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
       
   292 case class KCall(o: String, vrs: List[KVal]) extends KVal
       
   293 case class KWrite(v: KVal) extends KVal
       
   294 
       
   295 case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
       
   296 case class KLet(x: String, v: KVal, e: KExp) extends KExp
       
   297 case class KReturn(v: KVal) extends KExp
       
   298 \end{lstlisting}
       
   299 \caption{Abstract syntax trees for the Fun language.\label{absfun}}
       
   300 \end{figure}
       
   301 


\subsection*{CPS-Translations}

CPS stands for Continuation-Passing-Style. It is a kind of programming
technique often used in advanced functional programming. Before we
delve into the CPS-translation for our Fun-language, let us look at
CPS-versions of some well-known functions. Consider

\begin{lstlisting}[language=Scala, numbers=none]
def fact(n: Int) : Int = 
  if (n == 0) 1 else n * fact(n - 1) 
\end{lstlisting}

\noindent
which is the expected result. If this looks somewhat familiar to you,
then this is because functions with continuations can be seen as a
kind of generalisation of tail-recursive functions. So far we have
not looked at this generalisation in earnest. Anyway, notice how the
continuation is ``stacked up'' during the recursion and then
``unrolled'' when we apply 1 to the continuation. Interestingly, we
can do something similar with the Fibonacci function, where the
traditional version has two recursive calls. Consider the following
function

id((1 + 1) + 1)
(1 + 1) + 1
3
\end{lstlisting}

\noindent
The point of this section is that you play around with these simple
definitions and make sure they calculate the expected result. In the
next step, you should understand \emph{how} these functions calculate
the result with the continuations.

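For reference, a CPS-version of \texttt{fact} along the lines
discussed above can be sketched as follows (the name \texttt{factC} is
our choice here):

\begin{lstlisting}[language=Scala,numbers=none]
// fact in CPS-style: the continuation k receives the result
def factC(n: Int, k: Int => Int) : Int =
  if (n == 0) k(1) else factC(n - 1, r => k(n * r))

factC(5, identity)   // gives 120
\end{lstlisting}
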
       
\section*{Worked Example}

Let us now come back to the CPS-translations for the Fun-language. The
main difficulty of generating instructions in SSA format is that large
compound expressions need to be broken up into smaller pieces and
intermediate results need to be chained into later instructions. To do
this conveniently, we use the CPS-translation mechanism. What
continuations essentially solve is the following problem: Given an
expression

\begin{equation}\label{exp}
(1 + 2) * (3 + 4)
\end{equation}

\noindent
which of the subexpressions should be calculated first? We just
arbitrarily decide that it will be from left to right. This means we
have to tear the expression shown in \eqref{exp} apart as follows:

\[
(1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
\]

\noindent
The first one will give us a result, but the second one is not really
an expression that makes sense. It will only make sense as the next
step in the computation when we fill in the result of $1+2$ into the
``hole'' indicated by $\Box$. Such incomplete expressions can be
represented in Scala as functions

\begin{lstlisting}[language=Scala, numbers=none]
r => r * (3 + 4)
\end{lstlisting}

\noindent
where \texttt{r} is any result that has been computed earlier. By the
way, in lambda-calculus notation we would write
$\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
functions (``continuations'') to represent what is coming next in a
sequence of instructions. In our case, continuations are functions of
type \code{KVal} to \code{KExp}. They can be seen as a sequence of
\code{KLet}s where there is a ``hole'' that needs to be filled.
Consider for example

\begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
let tmp0 = add 1 a in   
let tmp1 = mul (*@$\Box$@*) 5 in 
let tmp2 = add 3 tmp1 in 
let tmp3 = add tmp0 tmp2 in
  return tmp3 
\end{lstlisting}

\noindent
where in the second line there is a $\Box$ which still expects a
\code{KVal} to be filled in before it becomes a ``proper''
\code{KExp}. When we apply an argument to the continuation (remember
they are functions) we essentially fill something into the
corresponding hole.
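In Scala, the continuation corresponding to the incomplete sequence
above (the one with the hole in the second line) can be sketched as a
function from \code{KVal} to \code{KExp} (assuming an arithmetic
K-value constructor \texttt{KAop}):

\begin{lstlisting}[language=Scala,numbers=none]
(v: KVal) =>
  KLet("tmp1", KAop("*", v, KNum(5)),
    KLet("tmp2", KAop("+", KNum(3), KVar("tmp1")),
      KLet("tmp3", KAop("+", KVar("tmp0"), KVar("tmp2")),
        KReturn(KVar("tmp3")))))
\end{lstlisting}

\noindent
Applying this function to a K-value fills the hole and yields a
complete \code{KExp}.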
       
Let's look at some concrete code. To simplify matters first, suppose
our source language consists of just three kinds of expressions

\begin{lstlisting}[language=Scala,numbers=none]
enum Expr {
    case Num(n: Int)
    case Bop(op: String, e1: Expr, e2: Expr)
    case Call(fname: String, args: List[Expr])
}
\end{lstlisting}

\noindent
The code of the CPS-translation is then of the form

\begin{lstlisting}[language=Scala,numbers=none]
def CPS(e: Expr)(k: KVal => KExp) : KExp = 
  e match { ... }
\end{lstlisting}

\noindent 
where \code{k} is the continuation and \code{e} is the expression to
be compiled. In case we have numbers, we can just pass them to the
continuation because numbers need not be torn apart any further as
they are already atomic. Passing the number to the continuation means
we apply the continuation like

\begin{lstlisting}[language=Scala,numbers=none]
  case Num(i) => k(KNum(i))
\end{lstlisting}

\noindent This would just fill in the $\Box$ in a
\code{KLet}-expression. There is no need to create a temporary
variable for simple numbers. More interesting is the case for
arithmetic operations.

\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
case Bop(op, e1, e2) => {
  val z = Fresh("tmp")
  CPS(e1)(r1 => 
    CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))))
}
\end{lstlisting}

\noindent
What we essentially have to do in this case is the following: compile
the subexpressions \texttt{e1} and \texttt{e2}. They will produce
results that are stored in two temporary variables (assuming they are
more complicated than just numbers). We need to use these two
temporary variables and feed them into a new assignment of the form
       
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
let z = op (*@$\Box_\texttt{r1}$@*) (*@$\Box_\texttt{r2}$@*) in
...
\end{lstlisting}

\noindent
Note that this assignment has two ``holes'', one for the left
subexpression and one for the right subexpression. We can fill both
holes by calling the CPS function twice---this is a bit of the black
art in CPS. We can use the second call of CPS as the continuation
of the first call: the reason is that

\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
\end{lstlisting}

\noindent is a function from \code{KVal} to \code{KExp}. Once we have
created the assignment with the fresh temporary variable \texttt{z},
we need to ``communicate'' to the computations that follow that the
result of the arithmetic expression is stored in \texttt{z}. For this
we apply the continuation \texttt{k} to this new variable (essentially
we are plugging in a hole further down the line).

The last case we need to consider in our small expression language is
function calls.
       
\begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
case Call(fname, args) => {
  def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
    case Nil => {
      val z = Fresh("tmp")
      KLet(z, KCall(fname, vs), k(KVar(z)))
    }
    case a::as => CPS(a)(r => aux(as, vs ::: List(r)))
  }
  aux(args, Nil)  
}
\end{lstlisting}
       

\noindent
For function calls we introduce an auxiliary function that first
compiles all function arguments---remember in our source language we
can have calls such as $foo(3 + 4, g(3))$ where we first have to
create temporary variables (and computations) for each argument.
Therefore \texttt{aux} analyses the arguments from left to right. In
case there is an argument \texttt{a} at the front of the list (the
case \texttt{a::as}), we call CPS recursively for the corresponding
subexpression. The temporary variable containing the result for this
argument is added to the end of the K-values we have already analysed.
Again, very conveniently, we can use the recursive call to
\texttt{aux} as the continuation for the computations that follow. If
we reach the end of the argument list (the \texttt{Nil}-case), then we
collect all the K-values \texttt{v1} to \texttt{vn} and call the
function in the K-language like so
       
\begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
let z = call foo(v1,...,vn) in
...
\end{lstlisting}

\noindent
Again we need to communicate the result of the function call, namely
the fresh temporary variable \texttt{z}, to the rest of the
computation. Therefore we apply the continuation \texttt{k} to this
variable.
       

\begin{figure}[p]\small
  \lstinputlisting[numbers=left,firstline=1,lastline=24]{../progs/fun/simple-cps.sc}
  \hspace{10mm}...
  \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
\caption{CPS-translation for a simple expression language.\label{cps}}
\end{figure}

The last question we need to answer in the code (see
Figure~\ref{cps}) is how we start the CPS-translation function, or
more precisely with which continuation we should start CPS. It turns
out that this initial continuation will be the last one that is
called. What should be the last step in the computation? Yes, we need
to return the temporary variable where the last result is stored (if
it is a simple number, then we can just return this number). Therefore
we call CPS with the code

\begin{lstlisting}[language=Scala,numbers=none]
def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
\end{lstlisting}

\noindent
where we give it the function \code{KReturn(_)} as the initial
continuation.
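To see the translation in action: applying \code{CPSi} to the
expression from \eqref{exp}, that is

\begin{lstlisting}[language=Scala,numbers=none]
CPSi(Bop("*", Bop("+", Num(1), Num(2)),
              Bop("+", Num(3), Num(4))))
\end{lstlisting}

\noindent
produces a K-expression of the following shape (the concrete names of
the temporary variables depend on the counter and are only
indicative):

\begin{lstlisting}[language=LLVMIR,numbers=none]
let tmp1 = add 1 2 in
let tmp2 = add 3 4 in
let tmp0 = mul tmp1 tmp2 in
  return tmp0
\end{lstlisting}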
       

\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
// Fun language (expressions)
abstract class Exp 
abstract class BExp 

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Write(e: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  


// K-language (K-expressions, K-values)
abstract class KExp
abstract class KVal

case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(o: String, vrs: List[KVal]) extends KVal
case class KWrite(v: KVal) extends KVal

case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
case class KLet(x: String, v: KVal, e: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun-language.\label{absfun}}
\end{figure}


\end{document}