handouts/ho09.tex
changeset 909 a04efdd5e7a3
parent 908 0138618eff73
child 912 e32802acf952
equal deleted inserted replaced
908:0138618eff73 909:a04efdd5e7a3
    23 for this ease is that the JVM is a stack-based virtual machine and it
    23 for this ease is that the JVM is a stack-based virtual machine and it
    24 is therefore not hard to translate deeply-nested arithmetic
    24 is therefore not hard to translate deeply-nested arithmetic
    25 expressions into a sequence of instructions manipulating the
    25 expressions into a sequence of instructions manipulating the
    26 stack. The problem is that ``real'' CPUs, although supporting stack
    26 stack. The problem is that ``real'' CPUs, although supporting stack
    27 operations, are not really designed to be \emph{stack machines}.  The
    27 operations, are not really designed to be \emph{stack machines}.  The
    28 design of CPUs is more like: Here are some operations and a chunk of
    28 design of CPUs is more like: Here are some instructions and a chunk of
    29 memory---compiler, or better compiler writers, do something with
    29 memory---compiler, or better compiler writers, do something with
    30 them. Consequently, modern compilers need to go the extra mile in
    30 them. Consequently, modern compilers need to go the extra mile in
    31 order to generate code that is much easier and faster to process by
    31 order to generate code that is much easier and faster to process by
    32 actual CPUs, rather than running code on virtual machines that
    32 actual CPUs, rather than running code on virtual machines that
    33 introduce an additional layer of indirection. To make this all
    33 introduce an additional layer of indirection. To make this all
    34 tractable for this module, we target the LLVM Intermediate
    34 tractable for this module, we target the LLVM Intermediate
    35 Language. In this way we can take advantage of the tools coming with
    35 Language. In this way we can take advantage of the tools coming with
    36 LLVM.\footnote{\url{http://llvm.org}} For example we do not have to
    36 LLVM.\footnote{\url{http://llvm.org}} For example we do not have to
    37 worry about things like register allocations. By using LLVM, however,
    37 worry about things like register allocations. By using the LLVM-IR,
    38 we also have to pay price in the sense that generating code gets
    38 however, we also have to pay price in the sense that generating code
    39 harder\ldots{}unfortunately.
    39 gets harder\ldots{}unfor\-tunately.
    40 
    40 
    41 \subsection*{LLVM and LLVM-IR}
    41 \subsection*{LLVM and the LLVM-IR}
    42 
    42 
    43 \noindent LLVM is a beautiful example
    43 \noindent LLVM is a beautiful example
    44 that projects from Academia can make a difference in the World. LLVM
    44 that projects from Academia can make a difference in the World. LLVM
    45 started in 2000 as a project by two researchers at the  University of
    45 started in 2000 as a project by two researchers at the  University of
    46 Illinois at Urbana-Champaign. At the time the behemoth of compilers was
    46 Illinois at Urbana-Champaign. At the time the behemoth of compilers was
    59 
    59 
    60 We will target the LLVM Intermediate Language, or LLVM Intermediate
    60 We will target the LLVM Intermediate Language, or LLVM Intermediate
    61 Representation (short LLVM-IR). The LLVM-IR looks very similar to the
    61 Representation (short LLVM-IR). The LLVM-IR looks very similar to the
    62 assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also
    62 assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also
    63 allow us to benefit from the modular structure of the LLVM compiler
    63 allow us to benefit from the modular structure of the LLVM compiler
    64 and we can let, for example, the compiler generate code for different
    64 and we can let the compiler generate code for different CPUs, for
    65 CPUs, say X86 or ARM.  That means we can be agnostic about where our
    65 example X86 or ARM.  That means we can be agnostic about where our
    66 code is actually going to run.\footnote{Anybody want to try to run our
    66 code is actually going to run.\footnote{Anybody want to try to run our
    67   programs on Android phones?}  We can also be somewhat ignorant about
    67   programs on Android phones?}  We can also be somewhat ignorant about
    68 optimising our code and about allocating memory efficiently: the LLVM
    68 optimising our code and about allocating memory efficiently: the LLVM
    69 tools will take care of all this.
    69 tools will take care of all this.
    70 
    70 
    71 However, what we have to do in order to make LLVM to play ball is to
    71 However, what we have to do in order to make LLVM to play ball is to
    72 generate code in \emph{Static Single-Assignment} format (short SSA),
    72 generate code in \emph{Static Single-Assignment} format (short SSA),
    73 because that is what the LLVM-IR expects from us. A reason why LLVM
    73 because that is what the LLVM-IR expects from us. A reason why LLVM
    74 uses the SSA format, rather than JVM-like stack instructions, is that
    74 uses the SSA-format, rather than JVM-like stack instructions, is that
    75 stack instructions are difficult to optimise---you cannot just
    75 stack instructions are difficult to optimise---you cannot just
    76 re-arrange instructions without messing about with what is calculated
    76 re-arrange instructions without messing about with what is calculated
    77 on the stack. Also it is hard to find out if all the calculations on
    77 on the stack. Also it is hard to find out if all the calculations on
    78 the stack are actually necessary and not by chance dead code. The JVM
    78 the stack are actually necessary and not by chance dead code. The JVM
    79 has for all these obstacles sophisticated machinery to make such
    79 has for all these obstacles sophisticated machinery to make such
    82 ourselves. This means we have to work around the intricacies of what
    82 ourselves. This means we have to work around the intricacies of what
    83 instructions CPUs can actually process fast. This is what the SSA
    83 instructions CPUs can actually process fast. This is what the SSA
    84 format is designed for.
    84 format is designed for.
    85 
    85 
    86 
    86 
    87 The main idea behind the SSA format is to use very simple variable
    87 The main idea behind the SSA-format is to have sequences of very
    88 assignments where every tmp-variable is assigned only once. The
    88 simple variable assignments where every (tmp)variable is assigned only
    89 assignments also need to be primitive in the sense that they can be
    89 once. The assignments need to be simple in the sense that they can be
    90 just simple operations like addition, multiplication, jumps,
    90 just be primitive operations like addition, multiplication, jumps,
    91 comparisons, function calls and so on.  Say, we have an expression
    91 comparisons, function calls and so on.  Say, we have an expression
    92 $((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding SSA format is
    92 $((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding sequence
       
    93 of assignments in SSA-format are:
    93  
    94  
    94 \begin{lstlisting}[language=LLVMIR,numbers=left]
    95 \begin{lstlisting}[language=LLVMIR,numbers=left]
    95 let tmp0 = add 1 a in
    96 let tmp0 = add 1 a in
    96 let tmp1 = add 3 2 in  
    97 let tmp1 = add 3 2 in  
    97 let tmp2 = call foo(tmp1)  
    98 let tmp2 = call foo(tmp1)  
    99 let tmp4 = add tmp2 tmp3 in 
   100 let tmp4 = add tmp2 tmp3 in 
   100 let tmp5 = add tmp0 tmp4 in
   101 let tmp5 = add tmp0 tmp4 in
   101   return tmp5 
   102   return tmp5 
   102 \end{lstlisting}
   103 \end{lstlisting}
   103 
   104 
   104 \noindent where every tmp-variable is used only once (we could, for
   105 \noindent where every tmpX-variable is used only once (we could, for
   105 example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this
   106 example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this
   106 would not change the overall result).
   107 would not change the overall result). At the end we have a return-instruction
       
   108 wich contains the final result of the expression.
   107 
   109 
   108 There are sophisticated algorithms for imperative languages, like C,
   110 There are sophisticated algorithms for imperative languages, like C,
   109 that efficiently transform a high-level program into SSA format. But
   111 that efficiently transform high-level programs into SSA-format. But
   110 we can ignore them here. We want to compile a functional language and
   112 we can ignore them here. We want to compile a functional language and
   111 there things get much more interesting than just sophisticated. We
   113 there things get much more interesting than just sophisticated. We
   112 will need to have a look at CPS translations, where the CPS stands for
   114 will need to have a look at CPS translations, where the CPS stands for
   113 Continuation-Passing-Style---basically black programming art or
   115 \emph{Continuation-Passing-Style}---basically black programming art or
   114 abracadabra programming. So sit tight.
   116 abracadabra programming. So sit tight.
   115 
   117 
   116 \subsection*{LLVM-IR}
   118 \subsection*{LLVM-IR}
   117 
   119 
   118 Before we start, let's however first have a look at the \emph{LLVM Intermediate
   120 Before we start, let's however first have a look at the \emph{LLVM Intermediate
   230 \end{lstlisting}
   232 \end{lstlisting}
   231 
   233 
   232 \begin{figure}[t]\small 
   234 \begin{figure}[t]\small 
   233 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
   235 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
   234 \caption{An LLVM-IR program for calculating the square function. It
   236 \caption{An LLVM-IR program for calculating the square function. It
   235   calls the sqr-function in \texttt{@main} with the argument \texttt{5}
   237   calls the \texttt{sqr}-function in \texttt{@main} with the argument \texttt{5}
   236   (Line 20). The
   238   (Line 20). The
   237   code for the \texttt{sqr} function is in Lines 13 -- 16. It stores
   239   code for the \texttt{sqr}-function is in Lines 13 -- 16. It stores
   238   the result of sqr in the variable called \texttt{\%i} and then
   240   the result of sqr in the variable called \texttt{\%1} and then
   239   prints out the contents of this variable in Line 21. The other
   241   prints out the contents of this variable in Line 21. The other
   240   code is boilerplate for printing out integers.\label{lli}}
   242   code is boilerplate for printing out integers.\label{lli}}
   241 \end{figure}   
   243 \end{figure}   
   242 
   244 
   243 
   245 
   288 Here \texttt{tmp3} will contain the result of what the whole
   290 Here \texttt{tmp3} will contain the result of what the whole
   289 expression stands for. In each individual step we can only perform an
   291 expression stands for. In each individual step we can only perform an
   290 ``atomic'' or ``trival'' operation, like addition or multiplication of
   292 ``atomic'' or ``trival'' operation, like addition or multiplication of
   291 a number or a variable.  We are not allowed to have for example a
   293 a number or a variable.  We are not allowed to have for example a
   292 nested addition or an if-condition on the right-hand side of an
   294 nested addition or an if-condition on the right-hand side of an
   293 assignment. Such constraints are forced upon us because of how the SSA
   295 assignment. Such constraints are forced upon us because of how the
   294 format works in the LLVM-IR. To simplify matters we represent
   296 SSA-format works in the LLVM-IR. To simplify matters we represent
   295 assignments with two kinds of syntactic entities, namely
   297 assignments with two kinds of syntactic entities, namely
   296 \emph{K-values} and \emph{K-expressions}. K-values are all ``things''
   298 \emph{K-values} and \emph{K-expressions}. K-values are all ``things''
   297 that can appear on the right-hand side of an equal. Say we have
   299 that can appear on the right-hand side of an equal. Say we have the
   298 the following definition for K-values:
   300 following definition for K-values:
   299 
   301 
   300 \begin{lstlisting}[language=Scala,numbers=none]
   302 \begin{lstlisting}[language=Scala,numbers=none]
   301 enum KVal {
   303 enum KVal {
   302   case KVar(s: String)
   304   case KVar(s: String)
   303   case KNum(n: Int)
   305   case KNum(n: Int)
   310 where a K-value can be a variable, a number or a ``trivial'' binary
   312 where a K-value can be a variable, a number or a ``trivial'' binary
   311 operation, such as addition or multiplication. The two arguments of a
   313 operation, such as addition or multiplication. The two arguments of a
   312 binary operation need to be K-values as well. Finally, we have
   314 binary operation need to be K-values as well. Finally, we have
   313 function calls, but again each argument of the function call needs to
   315 function calls, but again each argument of the function call needs to
   314 be a K-value.  One point we need to be careful, however, is that the
   316 be a K-value.  One point we need to be careful, however, is that the
   315 arguments of the binary operations and function calls are in fact only
   317 arguments of the binary operations and function calls can in fact only
   316 variables or numbers. To encode this constraint into the type-system
   318 be variables or numbers. To encode this constraint into the type-system
   317 of Scala would make things too complicated---to satisfy this
   319 of Scala, however, would make things too complicated---to satisfy this
   318 constraint is therefore on us. For our Fun-language, we will later on
   320 constraint is therefore on us. For our Fun-language, we will later on
   319 consider some further K-values.
   321 consider some further K-values.
   320 
   322 
   321 
   323 
   322 Our K-expressions will encode the \texttt{let} and the \texttt{return}
   324 Our K-expressions will encode the \texttt{let} and the \texttt{return}
   331 \end{lstlisting}
   333 \end{lstlisting}
   332 
   334 
   333 \noindent
   335 \noindent
   334 By having in \texttt{KLet} taking first a string (standing for an
   336 By having in \texttt{KLet} taking first a string (standing for an
   335 intermediate variable) and second a value, we can fulfil the SSA
   337 intermediate variable) and second a value, we can fulfil the SSA
   336 constraint in assignments ``by construction''---there is no way we
   338 constraint in assignments ``by con\-struction''---there is no way we
   337 could write anything else than a K-value.  Note that the third
   339 could write anything else than a K-value.  Note that the third
   338 argument of a \texttt{KLet} is again a K-expression, meaning either
   340 argument of a \texttt{KLet} is again a K-expression, meaning either
   339 another \texttt{KLet} or a \texttt{KReturn}. In this way we can
   341 another \texttt{KLet} or a \texttt{KReturn}. In this way we can
   340 construct a sequence of computations and return a final result.  Of
   342 construct a sequence of computations and return a final result.  Of
   341 course we also have to ensure that all intermediary computations are
   343 course we also have to ensure that all intermediary computations are
   342 given (fresh) names---we will use an (ugly) counter for this.
   344 given (fresh) names---we will use an (ugly) counter for this.
   343 
   345 
   344 To sum up, K-values are the atomic operations that can be on the
   346 To sum up, K-values are the atomic operations that can be on the
   345 right-hand side of assignemnts. The K-language is restricted such that
   347 right-hand side of assignemnts. The K-language is restricted such that
   346 it is easy to generate the SSA format for the LLVM-IR. What remains to
   348 it is easy to generate the SSA-format for the LLVM-IR. What remains to
   347 be done is a translation of our Fun-language into the K-language. The
   349 be done is a translation of our Fun-language into the K-language. The
   348 main difficulty is that we need to order the computationa---in teh
   350 main difficulty is that we need to order the computation---in the
   349 K-language we only have linear sequence of instructions to need to be
   351 K-language we only have linear sequence of instructions. Before we
   350 followed. Before we explain this, we have a look at some
   352 explain this, we have a look at some CPS-translation.
   351 CPS-translation.
       
   352 
   353 
   353 
   354 
   354 
   355 
   355 
   356 
   356 \subsection*{CPS-Translations}
   357 \subsection*{CPS-Translations}
   357 
   358 
   358 CPS stands for Continuation-Passing-Style. It is a kind of programming
   359 CPS stands for \emph{Continuation-Passing-Style}. It is a kind of programming
   359 technique often used in advanced functional programming. Before we delve
   360 technique often used in advanced functional programming. Before we delve
   360 into the CPS-translation for our Fun-language, let us look at 
   361 into the CPS-translation for our Fun-language compiler, let us look at 
   361 CPS-versions of some well-known functions. Consider
   362 CPS-versions of some well-known functions. Consider
   362 
   363 
   363 \begin{lstlisting}[language=Scala, numbers=none]
   364 \begin{lstlisting}[language=Scala, numbers=none]
   364 def fact(n: Int) : Int = 
   365 def fact(n: Int) : Int = 
   365   if (n == 0) 1 else n * fact(n - 1) 
   366   if (n == 0) 1 else n * fact(n - 1) 
   376 
   377 
   377 factC(3, identity)
   378 factC(3, identity)
   378 \end{lstlisting}
   379 \end{lstlisting}
   379 
   380 
   380 \noindent
   381 \noindent
   381 This function is called with the number, in this case 3, and the
   382 This function is called with a number, in this case 3, and the
   382 identity-function (which returns just its input). The recursive
   383 identity-function (which returns just its input). The recursive
   383 calls are:
   384 calls are:
   384 
   385 
   385 \begin{lstlisting}[language=Scala, numbers=none]
   386 \begin{lstlisting}[language=Scala, numbers=none]
   386 factC(2, x => identity(3 * x))
   387 factC(2, x => identity(3 * x))
   433 (1 + 1) + 1
   434 (1 + 1) + 1
   434 3
   435 3
   435 \end{lstlisting}
   436 \end{lstlisting}
   436 
   437 
   437 \noindent
   438 \noindent
   438 The point of this section is that you are playing around with these
   439 The point of this section is that you should be playing around with these
   439 simple definitions and make sure they calculate the expected result.
   440 simple definitions and make sure they calculate the expected result.
   440 In the next step, you should understand \emph{how} these functions
   441 In the next step, you should understand \emph{how} these functions
   441 calculate the result with the continuations. 
   442 calculate the result with the continuations. 
   442 
   443 
   443 
   444 
   444 \section*{Worked Example}
   445 \section*{Worked Example}
   445 
   446 
   446 Let us now come back to the CPS-translations for the Fun-language. The
   447 Let us now come back to the CPS-translations for the Fun-language.
   447 main difficulty of generating instructions in SSA format is that large
   448 Though we will start with a simpler subset containing only numbers,
   448 compound expressions need to be broken up into smaller pieces and
   449 arithmetic expressions and function calls.  The main difficulty of
   449 intermediate results need to be chained into later instructions. To do
   450 generating instructions in SSA-format is that large compound
   450 this conveniently, we use the CPS-translation mechanism. What continuations
   451 expressions need to be broken up into smaller pieces and intermediate
       
   452 results need to be chained into later instructions. To do this
       
   453 conveniently, we use the CPS-translation mechanism. What continuations
   451 essentially solve is the following problem: Given an expressions
   454 essentially solve is the following problem: Given an expressions
   452 
   455 
   453 \begin{equation}\label{exp}
   456 \begin{equation}\label{exp}
   454 (1 + 2) * (3 + 4)
   457 (1 + 2) * (3 + 4)
   455 \end{equation}  
   458 \end{equation}  
   456 
   459 
   457 \noindent
   460 \noindent
   458 which of the subexpressions should be calculated first? We just arbitrarily
   461 which of the subexpressions should be calculated first? We just
   459 going to decide that it will be from left to right. This means we have
   462 arbitrarily going to decide that it will be from left to right. Other
   460 to tear the expression shown in \eqref{exp} apart as follows:
   463 languages make different choices---C famously is right to left. In our
       
   464 case this means we have to tear the expression shown in \eqref{exp}
       
   465 apart as follows:
   461 
   466 
   462 \[
   467 \[
   463 (1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
   468 (1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
   464 \]  
   469 \]  
   465 
   470 
   466 \noindent
   471 \noindent
   467 The first one will give us a result, but the second one is not
   472 The first subexpression can be calculated and will give us a result,
   468 really an expression that makes sense. It will only make sense
   473 but the second one is not really an expression that makes sense. It
   469 as the next step in the computation when we fill-in the result of
   474 will only make sense as the next step in the computation when we
   470 $1+2$ into the ``hole'' indicated by $\Box$. Such incomplete
   475 fill-in the result of $1+2$ into the ``hole'' indicated by
   471 expressions can be represented in Scala as functions
   476 $\Box$. Such incomplete expressions can be represented in Scala as
       
   477 functions
   472 
   478 
   473 \begin{lstlisting}[language=Scala, numbers=none]
   479 \begin{lstlisting}[language=Scala, numbers=none]
   474 r => r * (3 + 4)
   480 r => r * (3 + 4)
   475 \end{lstlisting}  
   481 \end{lstlisting}  
   476 
   482 
   477 \noindent
   483 \noindent
   478 where \texttt{r} is any result that has been computed earlier. By the
   484 where \texttt{r} is any result that has been computed earlier. By the
   479 way in lambda-calculus notation we would write
   485 way, in lambda-calculus notation we would write
   480 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
   486 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
   481 functions (``continuations'') to represent what is coming next in a
   487 functions (``continuations'') to represent what is coming next in a
   482 sequence of instructions. In our case, continuations are functions of
   488 sequence of instructions. In our case, continuations are functions of
   483 type \code{KVal} to \code{KExp}. They can be seen as a sequence of
   489 type \code{KVal} to \code{KExp}. They can be seen as a sequence of
   484 \code{KLet}s where there is a ``hole'' that needs to be
   490 \code{KLet}s where there is a ``hole'' that needs to be
   496 where in the second line is a $\Box$ which still expects a \code{KVal}
   502 where in the second line is a $\Box$ which still expects a \code{KVal}
   497 to be filled in before it becomes a ``proper'' \code{KExp}. When we 
   503 to be filled in before it becomes a ``proper'' \code{KExp}. When we 
   498 apply an argument to the continuation (remember they are functions)
   504 apply an argument to the continuation (remember they are functions)
   499 we essentially fill something into the corresponding hole.
   505 we essentially fill something into the corresponding hole.
   500 
   506 
   501 Lets look at concrete code. To simplify matters first, 
   507 Lets look at some concrete code. To simplify matters, 
   502 suppose our source language consists just of three kinds of expressions
   508 suppose our source language consists just of three kinds of expressions
   503 
   509 
   504 \begin{lstlisting}[language=Scala,numbers=none]
   510 \begin{lstlisting}[language=Scala,numbers=none]
   505 enum Expr {
   511 enum Expr {
   506     case Num(n: Int)
   512     case Num(n: Int)
   517   e match { ... }
   523   e match { ... }
   518 \end{lstlisting}
   524 \end{lstlisting}
   519 
   525 
   520 \noindent 
   526 \noindent 
   521 where \code{k} is the continuation and \code{e} is the expression to
   527 where \code{k} is the continuation and \code{e} is the expression to
   522 be compiled. In case we have numbers, we can just pass them to the
   528 be compiled. The result of the function is a K-expression, which later
   523 continuations because numbers need not be further teared apart as they
   529 can be compiled into LLVM-IR code.
   524 are already atomic. Passing the number to the continuation means we 
   530 
   525 apply the continuation like 
   531 In case we have numbers, then we can just pass them in CPS-translation
       
   532 to the continuations because numbers need not be further teared apart
       
   533 as they are already atomic. Passing the number to the continuation
       
   534 means we apply the continuation like
   526 
   535 
   527 \begin{lstlisting}[language=Scala,numbers=none]
   536 \begin{lstlisting}[language=Scala,numbers=none]
   528   case Num(i) => k(KNum(i))
   537   case Num(i) => k(KNum(i))
   529 \end{lstlisting}
   538 \end{lstlisting}
   530 
   539 
   531 \noindent This would just fill in the $\Box$ in a \code{KLet}-expression.
   540 \noindent This would fill in the $\Box$ in a \code{KLet}-expression.
   532 There is no need to create a temporary variable for simple numbers.
   541 Since \texttt{k} is a function from \texttt{KVar} to \texttt{KExp} we
   533 More interesting is the case for arithmetic operations.
   542 also optain the correct type for CPS, namely \texttt{KExp}.  There is
       
   543 no need to create a temporary variable for simple numbers.  More
       
   544 interesting is the case for arithmetic operations.
   534 
   545 
   535 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
   546 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
   536 case Aop(op, e1, e2) => {
   547 case Aop(op, e1, e2) => {
   537   val z = Fresh("tmp")
   548   val z = Fresh("tmp")
   538   CPS(e1)(r1 => 
   549   CPS(e1)(r1 => 
   555 \noindent
   566 \noindent
   556 Note that this assignment has two ``holes'', one for the left
   567 Note that this assignment has two ``holes'', one for the left
   557 subexpression and one for the right subexpression. We can fill both
   568 subexpression and one for the right subexpression. We can fill both
   558 holes by calling the CPS function twice---this is a bit of the black
   569 holes by calling the CPS function twice---this is a bit of the black
   559 art in CPS. We can use the second call of CPS as the continuation
   570 art in CPS. We can use the second call of CPS as the continuation
   560 of the first call: The reason is that
   571 of the first call. The reason is that
   561 
   572 
   562 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
   573 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
   563 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
   574 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
   564 \end{lstlisting}
   575 \end{lstlisting}
   565 
   576 
   567 we created the assignment with the fresh temporary variable
   578 we created the assignment with the fresh temporary variable
   568 \texttt{z}, we need to ``communicate'' that the result of the
   579 \texttt{z}, we need to ``communicate'' that the result of the
   569 computation of the arithmetic expression is stored in \texttt{z} to the
   580 computation of the arithmetic expression is stored in \texttt{z} to the
   570 computations that follow. In this way we apply the continuation \texttt{k} with this
   581 computations that follow. In this way we apply the continuation \texttt{k} with this
   571 new variable (essentially we are plugging in a hole further down the line).
   582 new variable (essentially we are plugging in a hole further down the line).
       
   583 Hope this makes sense.
   572 
   584 
   573 The last case we need to consider in our small expression language are
   585 The last case we need to consider in our small expression language are
   574 function calls.
   586 function calls.
   575 
   587 
   576 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
   588 \begin{lstlisting}[language=Scala,numbers=left,xleftmargin=0mm]
   577 case Call(fname, args) => {
   589 case Call(fname, args) => {
   578   def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
   590   def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
   579        case Nil => {
   591        case Nil => {
   580          val z = Fresh("tmp")
   592          val z = Fresh("tmp")
   581          KLet(z, KCall(fname, vs), k(KVar(z)))
   593          KLet(z, KCall(fname, vs), k(KVar(z)))
   589 \noindent
   601 \noindent
   590 For them we introduce an auxiliary function that compiles first all
   602 For them we introduce an auxiliary function that compiles first all
   591 function arguments---remember in our source language we can have calls
   603 function arguments---remember in our source language we can have calls
   592 such as $foo(3 + 4, g(3))$ where we first have to create temporary
   604 such as $foo(3 + 4, g(3))$ where we first have to create temporary
   593 variables (and computations) for each argument. Therefore \texttt{aux}
   605 variables (and computations) for each argument. Therefore \texttt{aux}
   594 analyses the arguments from left to right. In case there is an argument
   606 analyses the argument list from left to right. In case there is an
   595 \texttt{a} on the front of the list (the case \texttt{a::as}), we call
   607 argument \texttt{a} on the front of the list (the case \texttt{a::as}
   596 CPS recursively for the corresponding subexpression. The temporary variable
   608 in Line 7), we call CPS recursively for the corresponding
   597 containing the result for this argument we add to the end of the K-values we
   609 subexpression. The temporary variable containing the result for this
   598 have already analysed before. Again very conveniently we can use the
   610 argument we add to the end of the K-values we have already analysed
   599 recursive call to \texttt{aux} as the continuation for the computations
   611 before. Again very conveniently we can use the recursive call to
   600 that follow. If we reach the end of the argument list (the \texttt{Nil}-case),
   612 \texttt{aux} as the continuation for the computations that
   601 then we collect all the K-values \texttt{v1} to \texttt{vn} and call the
   613 follow. When we reach the end of the argument list (the
   602 function in the K-language like so
   614 \texttt{Nil}-case in Lines 3--6), then we collect all the K-values
       
   615 \texttt{v1} to \texttt{vn} and call the function in the K-language
       
   616 like so
   603 
   617 
   604 
   618 
   605 \begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
   619 \begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}]
   606 let z = call foo(v1,...,vn) in
   620 let z = call foo(v1,...,vn) in
   607 ...
   621 ...
   624 The last question we need to answer in the code (see Figure~\ref{cps}) is how we
   638 The last question we need to answer in the code (see Figure~\ref{cps}) is how we
   625 start the CPS-translation function, or more precisely with which continuation we
   639 start the CPS-translation function, or more precisely with which continuation we
   626 should start CPS? It turns out that this initial continuation will be the last one that is
   640 should start CPS? It turns out that this initial continuation will be the last one that is
   627 called. What should be the last step in the computation? Yes, we need to return the
   641 called. What should be the last step in the computation? Yes, we need to return the
   628 temporary variable where the last result is stored (if it is a simple number, then we can
   642 temporary variable where the last result is stored (if it is a simple number, then we can
   629 just return this number). Therefore we cal CPS with the code
   643 just return this number). Therefore we call CPS with the code
   630 
   644 
   631 \begin{lstlisting}[language=Scala,numbers=none]
   645 \begin{lstlisting}[language=Scala,numbers=none]
   632 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
   646 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
   633 \end{lstlisting}
   647 \end{lstlisting}
   634 
   648 
   635 \noindent
   649 \noindent
   636 where we give the function \code{KReturn(_)}. 
   650 where we give the function \code{KReturn(_)} as the continuation. With
       
   651 this we completed the translation of expressions into our
       
   652 K-language. Figure~\ref{absfun} gives the complete datatypes for the
       
   653 ASTs of the Fun-language and the K-values and K-expressions for the
       
   654 K-language.
       
   655 
       
   656 Having obtained a K-expression, it is relatively straightforward to
       
   657 generate a valid program for the LLVM-IR. We leave this to the
       
   658 attentive reader. What else can we do?  Well it should be relatively
       
   659 easy to apply some common optimisations to the K-expressions. One
       
   660 optimisations is called constant folding---for example if we have an
       
   661 expression $3 + 4$ then we can replace it by just $5$ instead of
       
   662 generating code to compute $5$. Now this information needs to be
       
   663 propagated to the next computation step to see whether any further
       
   664 constant foldings are possible. Another useful technique is common
       
   665 subexpression elimination, meaning if you have twice a calculation of
       
   666 a function $foo(a)$, then we want to call it only once and store the
       
   667 result in a temporary variable that can be used instead of the second,
       
   668 or third, call to $foo(a)$. Again I leave this to the attentive reader
       
   669 to work out and implement.
   637 
   670 
   638 
   671 
   639 \begin{figure}[p]\small
   672 \begin{figure}[p]\small
   640 \begin{lstlisting}[language=Scala,numbers=none]
   673 \begin{lstlisting}[language=Scala,numbers=none]
   641 // Fun language (expressions)
   674 // Fun language (expressions)
   650 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
   683 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
   651 case class Sequence(e1: Exp, e2: Exp) extends Exp
   684 case class Sequence(e1: Exp, e2: Exp) extends Exp
   652 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  
   685 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp  
   653 
   686 
   654 
   687 
   655 
       
   656 // K-language (K-expressions, K-values)
   688 // K-language (K-expressions, K-values)
   657 abstract class KExp
   689 abstract class KExp
   658 abstract class KVal
   690 abstract class KVal
   659 
   691 
   660 case class KVar(s: String) extends KVal
   692 case class KVar(s: String) extends KVal
   665 
   697 
   666 case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
   698 case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
   667 case class KLet(x: String, v: KVal, e: KExp) extends KExp
   699 case class KLet(x: String, v: KVal, e: KExp) extends KExp
   668 case class KReturn(v: KVal) extends KExp
   700 case class KReturn(v: KVal) extends KExp
   669 \end{lstlisting}
   701 \end{lstlisting}
   670 \caption{Abstract syntax trees for the Fun-language.\label{absfun}}
   702 \caption{Abstract syntax trees for the Fun-language and the K-language.\label{absfun}}
   671 \end{figure}
   703 \end{figure}
   672 
   704 
   673 
   705 
   674 
   706 
   675 \noindent
   707 \noindent