|     23 for this ease is that the JVM is a stack-based virtual machine and it |     23 for this ease is that the JVM is a stack-based virtual machine and it | 
|     24 is therefore not hard to translate deeply-nested arithmetic |     24 is therefore not hard to translate deeply-nested arithmetic | 
|     25 expressions into a sequence of instructions manipulating the |     25 expressions into a sequence of instructions manipulating the | 
|     26 stack. The problem is that ``real'' CPUs, although supporting stack |     26 stack. The problem is that ``real'' CPUs, although supporting stack | 
|     27 operations, are not really designed to be \emph{stack machines}.  The |     27 operations, are not really designed to be \emph{stack machines}.  The | 
|     28 design of CPUs is more like: Here are some operations and a chunk of |     28 design of CPUs is more like: Here are some instructions and a chunk of | 
|     29 memory---compiler, or better compiler writers, do something with |     29 memory---compiler, or better compiler writers, do something with | 
|     30 them. Consequently, modern compilers need to go the extra mile in |     30 them. Consequently, modern compilers need to go the extra mile in | 
|     31 order to generate code that is much easier and faster to process by |     31 order to generate code that is much easier and faster to process by | 
|     32 actual CPUs, rather than running code on virtual machines that |     32 actual CPUs, rather than running code on virtual machines that | 
|     33 introduce an additional layer of indirection. To make this all |     33 introduce an additional layer of indirection. To make this all | 
|     34 tractable for this module, we target the LLVM Intermediate |     34 tractable for this module, we target the LLVM Intermediate | 
|     35 Language. In this way we can take advantage of the tools coming with |     35 Language. In this way we can take advantage of the tools coming with | 
|     36 LLVM.\footnote{\url{http://llvm.org}} For example we do not have to |     36 LLVM.\footnote{\url{http://llvm.org}} For example we do not have to | 
|     37 worry about things like register allocations. By using LLVM, however, |     37 worry about things like register allocations. By using the LLVM-IR, | 
|     38 we also have to pay price in the sense that generating code gets |     38 however, we also have to pay price in the sense that generating code | 
|     39 harder\ldots{}unfortunately. |     39 gets harder\ldots{}unfor\-tunately. | 
|     40  |     40  | 
|     41 \subsection*{LLVM and LLVM-IR} |     41 \subsection*{LLVM and the LLVM-IR} | 
|     42  |     42  | 
|     43 \noindent LLVM is a beautiful example |     43 \noindent LLVM is a beautiful example | 
|     44 that projects from Academia can make a difference in the World. LLVM |     44 that projects from Academia can make a difference in the World. LLVM | 
|     45 started in 2000 as a project by two researchers at the  University of |     45 started in 2000 as a project by two researchers at the  University of | 
|     46 Illinois at Urbana-Champaign. At the time the behemoth of compilers was |     46 Illinois at Urbana-Champaign. At the time the behemoth of compilers was | 
|     59  |     59  | 
|     60 We will target the LLVM Intermediate Language, or LLVM Intermediate |     60 We will target the LLVM Intermediate Language, or LLVM Intermediate | 
|     61 Representation (short LLVM-IR). The LLVM-IR looks very similar to the |     61 Representation (short LLVM-IR). The LLVM-IR looks very similar to the | 
|     62 assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also |     62 assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also | 
|     63 allow us to benefit from the modular structure of the LLVM compiler |     63 allow us to benefit from the modular structure of the LLVM compiler | 
|     64 and we can let, for example, the compiler generate code for different |     64 and we can let the compiler generate code for different CPUs, for | 
|     65 CPUs, say X86 or ARM.  That means we can be agnostic about where our |     65 example X86 or ARM.  That means we can be agnostic about where our | 
|     66 code is actually going to run.\footnote{Anybody want to try to run our |     66 code is actually going to run.\footnote{Anybody want to try to run our | 
|     67   programs on Android phones?}  We can also be somewhat ignorant about |     67   programs on Android phones?}  We can also be somewhat ignorant about | 
|     68 optimising our code and about allocating memory efficiently: the LLVM |     68 optimising our code and about allocating memory efficiently: the LLVM | 
|     69 tools will take care of all this. |     69 tools will take care of all this. | 
|     70  |     70  | 
|     71 However, what we have to do in order to make LLVM to play ball is to |     71 However, what we have to do in order to make LLVM to play ball is to | 
|     72 generate code in \emph{Static Single-Assignment} format (short SSA), |     72 generate code in \emph{Static Single-Assignment} format (short SSA), | 
|     73 because that is what the LLVM-IR expects from us. A reason why LLVM |     73 because that is what the LLVM-IR expects from us. A reason why LLVM | 
|     74 uses the SSA format, rather than JVM-like stack instructions, is that |     74 uses the SSA-format, rather than JVM-like stack instructions, is that | 
|     75 stack instructions are difficult to optimise---you cannot just |     75 stack instructions are difficult to optimise---you cannot just | 
|     76 re-arrange instructions without messing about with what is calculated |     76 re-arrange instructions without messing about with what is calculated | 
|     77 on the stack. Also it is hard to find out if all the calculations on |     77 on the stack. Also it is hard to find out if all the calculations on | 
|     78 the stack are actually necessary and not by chance dead code. The JVM |     78 the stack are actually necessary and not by chance dead code. The JVM | 
|     79 has for all these obstacles sophisticated machinery to make such |     79 has for all these obstacles sophisticated machinery to make such | 
|     82 ourselves. This means we have to work around the intricacies of what |     82 ourselves. This means we have to work around the intricacies of what | 
|     83 instructions CPUs can actually process fast. This is what the SSA |     83 instructions CPUs can actually process fast. This is what the SSA | 
|     84 format is designed for. |     84 format is designed for. | 
|     85  |     85  | 
|     86  |     86  | 
|     87 The main idea behind the SSA format is to use very simple variable |     87 The main idea behind the SSA-format is to have sequences of very | 
|     88 assignments where every tmp-variable is assigned only once. The |     88 simple variable assignments where every (tmp)variable is assigned only | 
|     89 assignments also need to be primitive in the sense that they can be |     89 once. The assignments need to be simple in the sense that they can be | 
|     90 just simple operations like addition, multiplication, jumps, |     90 just be primitive operations like addition, multiplication, jumps, | 
|     91 comparisons, function calls and so on.  Say, we have an expression |     91 comparisons, function calls and so on.  Say, we have an expression | 
|     92 $((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding SSA format is |     92 $((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding sequence | 
|         |     93 of assignments in SSA-format are: | 
|     93   |     94   | 
|     94 \begin{lstlisting}[language=LLVMIR,numbers=left] |     95 \begin{lstlisting}[language=LLVMIR,numbers=left] | 
|     95 let tmp0 = add 1 a in |     96 let tmp0 = add 1 a in | 
|     96 let tmp1 = add 3 2 in   |     97 let tmp1 = add 3 2 in   | 
|     97 let tmp2 = call foo(tmp1)   |     98 let tmp2 = call foo(tmp1)   | 
|     99 let tmp4 = add tmp2 tmp3 in  |    100 let tmp4 = add tmp2 tmp3 in  | 
|    100 let tmp5 = add tmp0 tmp4 in |    101 let tmp5 = add tmp0 tmp4 in | 
|    101   return tmp5  |    102   return tmp5  | 
|    102 \end{lstlisting} |    103 \end{lstlisting} | 
|    103  |    104  | 
|    104 \noindent where every tmp-variable is used only once (we could, for |    105 \noindent where every tmpX-variable is used only once (we could, for | 
|    105 example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this |    106 example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this | 
|    106 would not change the overall result). |    107 would not change the overall result). At the end we have a return-instruction | 
|         |    108 wich contains the final result of the expression. | 
|    107  |    109  | 
|    108 There are sophisticated algorithms for imperative languages, like C, |    110 There are sophisticated algorithms for imperative languages, like C, | 
|    109 that efficiently transform a high-level program into SSA format. But |    111 that efficiently transform high-level programs into SSA-format. But | 
|    110 we can ignore them here. We want to compile a functional language and |    112 we can ignore them here. We want to compile a functional language and | 
|    111 there things get much more interesting than just sophisticated. We |    113 there things get much more interesting than just sophisticated. We | 
|    112 will need to have a look at CPS translations, where the CPS stands for |    114 will need to have a look at CPS translations, where the CPS stands for | 
|    113 Continuation-Passing-Style---basically black programming art or |    115 \emph{Continuation-Passing-Style}---basically black programming art or | 
|    114 abracadabra programming. So sit tight. |    116 abracadabra programming. So sit tight. | 
|    115  |    117  | 
|    116 \subsection*{LLVM-IR} |    118 \subsection*{LLVM-IR} | 
|    117  |    119  | 
|    118 Before we start, let's however first have a look at the \emph{LLVM Intermediate |    120 Before we start, let's however first have a look at the \emph{LLVM Intermediate | 
|    230 \end{lstlisting} |    232 \end{lstlisting} | 
|    231  |    233  | 
|    232 \begin{figure}[t]\small  |    234 \begin{figure}[t]\small  | 
|    233 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll} |    235 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll} | 
|    234 \caption{An LLVM-IR program for calculating the square function. It |    236 \caption{An LLVM-IR program for calculating the square function. It | 
|    235   calls the sqr-function in \texttt{@main} with the argument \texttt{5} |    237   calls the \texttt{sqr}-function in \texttt{@main} with the argument \texttt{5} | 
|    236   (Line 20). The |    238   (Line 20). The | 
|    237   code for the \texttt{sqr} function is in Lines 13 -- 16. It stores |    239   code for the \texttt{sqr}-function is in Lines 13 -- 16. It stores | 
|    238   the result of sqr in the variable called \texttt{\%i} and then |    240   the result of sqr in the variable called \texttt{\%1} and then | 
|    239   prints out the contents of this variable in Line 21. The other |    241   prints out the contents of this variable in Line 21. The other | 
|    240   code is boilerplate for printing out integers.\label{lli}} |    242   code is boilerplate for printing out integers.\label{lli}} | 
|    241 \end{figure}    |    243 \end{figure}    | 
|    242  |    244  | 
|    243  |    245  | 
|    288 Here \texttt{tmp3} will contain the result of what the whole |    290 Here \texttt{tmp3} will contain the result of what the whole | 
|    289 expression stands for. In each individual step we can only perform an |    291 expression stands for. In each individual step we can only perform an | 
|    290 ``atomic'' or ``trival'' operation, like addition or multiplication of |    292 ``atomic'' or ``trival'' operation, like addition or multiplication of | 
|    291 a number or a variable.  We are not allowed to have for example a |    293 a number or a variable.  We are not allowed to have for example a | 
|    292 nested addition or an if-condition on the right-hand side of an |    294 nested addition or an if-condition on the right-hand side of an | 
|    293 assignment. Such constraints are forced upon us because of how the SSA |    295 assignment. Such constraints are forced upon us because of how the | 
|    294 format works in the LLVM-IR. To simplify matters we represent |    296 SSA-format works in the LLVM-IR. To simplify matters we represent | 
|    295 assignments with two kinds of syntactic entities, namely |    297 assignments with two kinds of syntactic entities, namely | 
|    296 \emph{K-values} and \emph{K-expressions}. K-values are all ``things'' |    298 \emph{K-values} and \emph{K-expressions}. K-values are all ``things'' | 
|    297 that can appear on the right-hand side of an equal. Say we have |    299 that can appear on the right-hand side of an equal. Say we have the | 
|    298 the following definition for K-values: |    300 following definition for K-values: | 
|    299  |    301  | 
|    300 \begin{lstlisting}[language=Scala,numbers=none] |    302 \begin{lstlisting}[language=Scala,numbers=none] | 
|    301 enum KVal { |    303 enum KVal { | 
|    302   case KVar(s: String) |    304   case KVar(s: String) | 
|    303   case KNum(n: Int) |    305   case KNum(n: Int) | 
|    310 where a K-value can be a variable, a number or a ``trivial'' binary |    312 where a K-value can be a variable, a number or a ``trivial'' binary | 
|    311 operation, such as addition or multiplication. The two arguments of a |    313 operation, such as addition or multiplication. The two arguments of a | 
|    312 binary operation need to be K-values as well. Finally, we have |    314 binary operation need to be K-values as well. Finally, we have | 
|    313 function calls, but again each argument of the function call needs to |    315 function calls, but again each argument of the function call needs to | 
|    314 be a K-value.  One point we need to be careful, however, is that the |    316 be a K-value.  One point we need to be careful, however, is that the | 
|    315 arguments of the binary operations and function calls are in fact only |    317 arguments of the binary operations and function calls can in fact only | 
|    316 variables or numbers. To encode this constraint into the type-system |    318 be variables or numbers. To encode this constraint into the type-system | 
|    317 of Scala would make things too complicated---to satisfy this |    319 of Scala, however, would make things too complicated---to satisfy this | 
|    318 constraint is therefore on us. For our Fun-language, we will later on |    320 constraint is therefore on us. For our Fun-language, we will later on | 
|    319 consider some further K-values. |    321 consider some further K-values. | 
|    320  |    322  | 
|    321  |    323  | 
|    322 Our K-expressions will encode the \texttt{let} and the \texttt{return} |    324 Our K-expressions will encode the \texttt{let} and the \texttt{return} | 
|    331 \end{lstlisting} |    333 \end{lstlisting} | 
|    332  |    334  | 
|    333 \noindent |    335 \noindent | 
|    334 By having in \texttt{KLet} taking first a string (standing for an |    336 By having in \texttt{KLet} taking first a string (standing for an | 
|    335 intermediate variable) and second a value, we can fulfil the SSA |    337 intermediate variable) and second a value, we can fulfil the SSA | 
|    336 constraint in assignments ``by construction''---there is no way we |    338 constraint in assignments ``by con\-struction''---there is no way we | 
|    337 could write anything else than a K-value.  Note that the third |    339 could write anything else than a K-value.  Note that the third | 
|    338 argument of a \texttt{KLet} is again a K-expression, meaning either |    340 argument of a \texttt{KLet} is again a K-expression, meaning either | 
|    339 another \texttt{KLet} or a \texttt{KReturn}. In this way we can |    341 another \texttt{KLet} or a \texttt{KReturn}. In this way we can | 
|    340 construct a sequence of computations and return a final result.  Of |    342 construct a sequence of computations and return a final result.  Of | 
|    341 course we also have to ensure that all intermediary computations are |    343 course we also have to ensure that all intermediary computations are | 
|    342 given (fresh) names---we will use an (ugly) counter for this. |    344 given (fresh) names---we will use an (ugly) counter for this. | 
|    343  |    345  | 
|    344 To sum up, K-values are the atomic operations that can be on the |    346 To sum up, K-values are the atomic operations that can be on the | 
|    345 right-hand side of assignemnts. The K-language is restricted such that |    347 right-hand side of assignemnts. The K-language is restricted such that | 
|    346 it is easy to generate the SSA format for the LLVM-IR. What remains to |    348 it is easy to generate the SSA-format for the LLVM-IR. What remains to | 
|    347 be done is a translation of our Fun-language into the K-language. The |    349 be done is a translation of our Fun-language into the K-language. The | 
|    348 main difficulty is that we need to order the computationa---in teh |    350 main difficulty is that we need to order the computation---in the | 
|    349 K-language we only have linear sequence of instructions to need to be |    351 K-language we only have linear sequence of instructions. Before we | 
|    350 followed. Before we explain this, we have a look at some |    352 explain this, we have a look at some CPS-translation. | 
|    351 CPS-translation. |         | 
|    352  |    353  | 
|    353  |    354  | 
|    354  |    355  | 
|    355  |    356  | 
|    356 \subsection*{CPS-Translations} |    357 \subsection*{CPS-Translations} | 
|    357  |    358  | 
|    358 CPS stands for Continuation-Passing-Style. It is a kind of programming |    359 CPS stands for \emph{Continuation-Passing-Style}. It is a kind of programming | 
|    359 technique often used in advanced functional programming. Before we delve |    360 technique often used in advanced functional programming. Before we delve | 
|    360 into the CPS-translation for our Fun-language, let us look at  |    361 into the CPS-translation for our Fun-language compiler, let us look at  | 
|    361 CPS-versions of some well-known functions. Consider |    362 CPS-versions of some well-known functions. Consider | 
|    362  |    363  | 
|    363 \begin{lstlisting}[language=Scala, numbers=none] |    364 \begin{lstlisting}[language=Scala, numbers=none] | 
|    364 def fact(n: Int) : Int =  |    365 def fact(n: Int) : Int =  | 
|    365   if (n == 0) 1 else n * fact(n - 1)  |    366   if (n == 0) 1 else n * fact(n - 1)  | 
|    433 (1 + 1) + 1 |    434 (1 + 1) + 1 | 
|    434 3 |    435 3 | 
|    435 \end{lstlisting} |    436 \end{lstlisting} | 
|    436  |    437  | 
|    437 \noindent |    438 \noindent | 
|    438 The point of this section is that you are playing around with these |    439 The point of this section is that you should be playing around with these | 
|    439 simple definitions and make sure they calculate the expected result. |    440 simple definitions and make sure they calculate the expected result. | 
|    440 In the next step, you should understand \emph{how} these functions |    441 In the next step, you should understand \emph{how} these functions | 
|    441 calculate the result with the continuations.  |    442 calculate the result with the continuations.  | 
|    442  |    443  | 
|    443  |    444  | 
|    444 \section*{Worked Example} |    445 \section*{Worked Example} | 
|    445  |    446  | 
|    446 Let us now come back to the CPS-translations for the Fun-language. The |    447 Let us now come back to the CPS-translations for the Fun-language. | 
|    447 main difficulty of generating instructions in SSA format is that large |    448 Though we will start with a simpler subset containing only numbers, | 
|    448 compound expressions need to be broken up into smaller pieces and |    449 arithmetic expressions and function calls.  The main difficulty of | 
|    449 intermediate results need to be chained into later instructions. To do |    450 generating instructions in SSA-format is that large compound | 
|    450 this conveniently, we use the CPS-translation mechanism. What continuations |    451 expressions need to be broken up into smaller pieces and intermediate | 
|         |    452 results need to be chained into later instructions. To do this | 
|         |    453 conveniently, we use the CPS-translation mechanism. What continuations | 
|    451 essentially solve is the following problem: Given an expressions |    454 essentially solve is the following problem: Given an expressions | 
|    452  |    455  | 
|    453 \begin{equation}\label{exp} |    456 \begin{equation}\label{exp} | 
|    454 (1 + 2) * (3 + 4) |    457 (1 + 2) * (3 + 4) | 
|    455 \end{equation}   |    458 \end{equation}   | 
|    456  |    459  | 
|    457 \noindent |    460 \noindent | 
|    458 which of the subexpressions should be calculated first? We just arbitrarily |    461 which of the subexpressions should be calculated first? We just | 
|    459 going to decide that it will be from left to right. This means we have |    462 arbitrarily going to decide that it will be from left to right. Other | 
|    460 to tear the expression shown in \eqref{exp} apart as follows: |    463 languages make different choices---C famously is right to left. In our | 
|         |    464 case this means we have to tear the expression shown in \eqref{exp} | 
|         |    465 apart as follows: | 
|    461  |    466  | 
|    462 \[ |    467 \[ | 
|    463 (1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4) |    468 (1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4) | 
|    464 \]   |    469 \]   | 
|    465  |    470  | 
|    466 \noindent |    471 \noindent | 
|    467 The first one will give us a result, but the second one is not |    472 The first subexpression can be calculated and will give us a result, | 
|    468 really an expression that makes sense. It will only make sense |    473 but the second one is not really an expression that makes sense. It | 
|    469 as the next step in the computation when we fill-in the result of |    474 will only make sense as the next step in the computation when we | 
|    470 $1+2$ into the ``hole'' indicated by $\Box$. Such incomplete |    475 fill-in the result of $1+2$ into the ``hole'' indicated by | 
|    471 expressions can be represented in Scala as functions |    476 $\Box$. Such incomplete expressions can be represented in Scala as | 
|         |    477 functions | 
|    472  |    478  | 
|    473 \begin{lstlisting}[language=Scala, numbers=none] |    479 \begin{lstlisting}[language=Scala, numbers=none] | 
|    474 r => r * (3 + 4) |    480 r => r * (3 + 4) | 
|    475 \end{lstlisting}   |    481 \end{lstlisting}   | 
|    476  |    482  | 
|    477 \noindent |    483 \noindent | 
|    478 where \texttt{r} is any result that has been computed earlier. By the |    484 where \texttt{r} is any result that has been computed earlier. By the | 
|    479 way in lambda-calculus notation we would write |    485 way, in lambda-calculus notation we would write | 
|    480 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use |    486 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use | 
|    481 functions (``continuations'') to represent what is coming next in a |    487 functions (``continuations'') to represent what is coming next in a | 
|    482 sequence of instructions. In our case, continuations are functions of |    488 sequence of instructions. In our case, continuations are functions of | 
|    483 type \code{KVal} to \code{KExp}. They can be seen as a sequence of |    489 type \code{KVal} to \code{KExp}. They can be seen as a sequence of | 
|    484 \code{KLet}s where there is a ``hole'' that needs to be |    490 \code{KLet}s where there is a ``hole'' that needs to be | 
|    496 where in the second line is a $\Box$ which still expects a \code{KVal} |    502 where in the second line is a $\Box$ which still expects a \code{KVal} | 
|    497 to be filled in before it becomes a ``proper'' \code{KExp}. When we  |    503 to be filled in before it becomes a ``proper'' \code{KExp}. When we  | 
|    498 apply an argument to the continuation (remember they are functions) |    504 apply an argument to the continuation (remember they are functions) | 
|    499 we essentially fill something into the corresponding hole. |    505 we essentially fill something into the corresponding hole. | 
|    500  |    506  | 
|    501 Lets look at concrete code. To simplify matters first,  |    507 Lets look at some concrete code. To simplify matters,  | 
|    502 suppose our source language consists just of three kinds of expressions |    508 suppose our source language consists just of three kinds of expressions | 
|    503  |    509  | 
|    504 \begin{lstlisting}[language=Scala,numbers=none] |    510 \begin{lstlisting}[language=Scala,numbers=none] | 
|    505 enum Expr { |    511 enum Expr { | 
|    506     case Num(n: Int) |    512     case Num(n: Int) | 
|    517   e match { ... } |    523   e match { ... } | 
|    518 \end{lstlisting} |    524 \end{lstlisting} | 
|    519  |    525  | 
|    520 \noindent  |    526 \noindent  | 
|    521 where \code{k} is the continuation and \code{e} is the expression to |    527 where \code{k} is the continuation and \code{e} is the expression to | 
|    522 be compiled. In case we have numbers, we can just pass them to the |    528 be compiled. The result of the function is a K-expression, which later | 
|    523 continuations because numbers need not be further teared apart as they |    529 can be compiled into LLVM-IR code. | 
|    524 are already atomic. Passing the number to the continuation means we  |    530  | 
|    525 apply the continuation like  |    531 In case we have numbers, then we can just pass them in CPS-translation | 
|         |    532 to the continuations because numbers need not be further teared apart | 
|         |    533 as they are already atomic. Passing the number to the continuation | 
|         |    534 means we apply the continuation like | 
|    526  |    535  | 
|    527 \begin{lstlisting}[language=Scala,numbers=none] |    536 \begin{lstlisting}[language=Scala,numbers=none] | 
|    528   case Num(i) => k(KNum(i)) |    537   case Num(i) => k(KNum(i)) | 
|    529 \end{lstlisting} |    538 \end{lstlisting} | 
|    530  |    539  | 
|    531 \noindent This would just fill in the $\Box$ in a \code{KLet}-expression. |    540 \noindent This would fill in the $\Box$ in a \code{KLet}-expression. | 
|    532 There is no need to create a temporary variable for simple numbers. |    541 Since \texttt{k} is a function from \texttt{KVar} to \texttt{KExp} we | 
|    533 More interesting is the case for arithmetic operations. |    542 also optain the correct type for CPS, namely \texttt{KExp}.  There is | 
|         |    543 no need to create a temporary variable for simple numbers.  More | 
|         |    544 interesting is the case for arithmetic operations. | 
|    534  |    545  | 
|    535 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] |    546 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] | 
|    536 case Aop(op, e1, e2) => { |    547 case Aop(op, e1, e2) => { | 
|    537   val z = Fresh("tmp") |    548   val z = Fresh("tmp") | 
|    538   CPS(e1)(r1 =>  |    549   CPS(e1)(r1 =>  | 
|    555 \noindent |    566 \noindent | 
|    556 Note that this assignment has two ``holes'', one for the left |    567 Note that this assignment has two ``holes'', one for the left | 
|    557 subexpression and one for the right subexpression. We can fill both |    568 subexpression and one for the right subexpression. We can fill both | 
|    558 holes by calling the CPS function twice---this is a bit of the black |    569 holes by calling the CPS function twice---this is a bit of the black | 
|    559 art in CPS. We can use the second call of CPS as the continuation |    570 art in CPS. We can use the second call of CPS as the continuation | 
|    560 of the first call: The reason is that |    571 of the first call. The reason is that | 
|    561  |    572  | 
|    562 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] |    573 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] | 
|    563 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))) |    574 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))) | 
|    564 \end{lstlisting} |    575 \end{lstlisting} | 
|    565  |    576  | 
|    567 we created the assignment with the fresh temporary variable |    578 we created the assignment with the fresh temporary variable | 
|    568 \texttt{z}, we need to ``communicate'' that the result of the |    579 \texttt{z}, we need to ``communicate'' that the result of the | 
|    569 computation of the arithmetic expression is stored in \texttt{z} to the |    580 computation of the arithmetic expression is stored in \texttt{z} to the | 
|    570 computations that follow. In this way we apply the continuation \texttt{k} with this |    581 computations that follow. In this way we apply the continuation \texttt{k} with this | 
|    571 new variable (essentially we are plugging in a hole further down the line). |    582 new variable (essentially we are plugging in a hole further down the line). | 
|         |    583 Hope this makes sense. | 
|    572  |    584  | 
|    573 The last case we need to consider in our small expression language are |    585 The last case we need to consider in our small expression language are | 
|    574 function calls. |    586 function calls. | 
|    575  |    587  | 
|    576 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] |    588 \begin{lstlisting}[language=Scala,numbers=left,xleftmargin=0mm] | 
|    577 case Call(fname, args) => { |    589 case Call(fname, args) => { | 
|    578   def aux(args: List[Expr], vs: List[KVal]): KExp = args match { |    590   def aux(args: List[Expr], vs: List[KVal]): KExp = args match { | 
|    579        case Nil => { |    591        case Nil => { | 
|    580          val z = Fresh("tmp") |    592          val z = Fresh("tmp") | 
|    581          KLet(z, KCall(fname, vs), k(KVar(z))) |    593          KLet(z, KCall(fname, vs), k(KVar(z))) | 
|    589 \noindent |    601 \noindent | 
|    590 For them we introduce an auxiliary function that compiles first all |    602 For them we introduce an auxiliary function that compiles first all | 
|    591 function arguments---remember in our source language we can have calls |    603 function arguments---remember in our source language we can have calls | 
|    592 such as $foo(3 + 4, g(3))$ where we first have to create temporary |    604 such as $foo(3 + 4, g(3))$ where we first have to create temporary | 
|    593 variables (and computations) for each argument. Therefore \texttt{aux} |    605 variables (and computations) for each argument. Therefore \texttt{aux} | 
|    594 analyses the arguments from left to right. In case there is an argument |    606 analyses the argument list from left to right. In case there is an | 
|    595 \texttt{a} on the front of the list (the case \texttt{a::as}), we call |    607 argument \texttt{a} on the front of the list (the case \texttt{a::as} | 
|    596 CPS recursively for the corresponding subexpression. The temporary variable |    608 in Line 7), we call CPS recursively for the corresponding | 
|    597 containing the result for this argument we add to the end of the K-values we |    609 subexpression. The temporary variable containing the result for this | 
|    598 have already analysed before. Again very conveniently we can use the |    610 argument we add to the end of the K-values we have already analysed | 
|    599 recursive call to \texttt{aux} as the continuation for the computations |    611 before. Again very conveniently we can use the recursive call to | 
|    600 that follow. If we reach the end of the argument list (the \texttt{Nil}-case), |    612 \texttt{aux} as the continuation for the computations that | 
|    601 then we collect all the K-values \texttt{v1} to \texttt{vn} and call the |    613 follow. When we reach the end of the argument list (the | 
|    602 function in the K-language like so |    614 \texttt{Nil}-case in Lines 3--6), then we collect all the K-values | 
|         |    615 \texttt{v1} to \texttt{vn} and call the function in the K-language | 
|         |    616 like so | 
|    603  |    617  | 
|    604  |    618  | 
|    605 \begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}] |    619 \begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}] | 
|    606 let z = call foo(v1,...,vn) in |    620 let z = call foo(v1,...,vn) in | 
|    607 ... |    621 ... | 
|    624 The last question we need to answer in the code (see Figure~\ref{cps}) is how we |    638 The last question we need to answer in the code (see Figure~\ref{cps}) is how we | 
|    625 start the CPS-translation function, or more precisely with which continuation we |    639 start the CPS-translation function, or more precisely with which continuation we | 
|    626 should start CPS? It turns out that this initial continuation will be the last one that is |    640 should start CPS? It turns out that this initial continuation will be the last one that is | 
|    627 called. What should be the last step in the computation? Yes, we need to return the |    641 called. What should be the last step in the computation? Yes, we need to return the | 
|    628 temporary variable where the last result is stored (if it is a simple number, then we can |    642 temporary variable where the last result is stored (if it is a simple number, then we can | 
|    629 just return this number). Therefore we cal CPS with the code |    643 just return this number). Therefore we call CPS with the code | 
|    630  |    644  | 
|    631 \begin{lstlisting}[language=Scala,numbers=none] |    645 \begin{lstlisting}[language=Scala,numbers=none] | 
|    632 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_)) |    646 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_)) | 
|    633 \end{lstlisting} |    647 \end{lstlisting} | 
|    634  |    648  | 
|    635 \noindent |    649 \noindent | 
|    636 where we give the function \code{KReturn(_)}.  |    650 where we give the function \code{KReturn(_)} as the continuation. With | 
|         |    651 this we completed the translation of expressions into our | 
|         |    652 K-language. Figure~\ref{absfun} gives the complete datatypes for the | 
|         |    653 ASTs of the Fun-language and the K-values and K-expressions for the | 
|         |    654 K-language. | 
|         |    655  | 
|         |    656 Having obtained a K-expression, it is relatively straightforward to | 
|         |    657 generate a valid program for the LLVM-IR. We leave this to the | 
|         |    658 attentive reader. What else can we do?  Well it should be relatively | 
|         |    659 easy to apply some common optimisations to the K-expressions. One | 
|         |    660 optimisations is called constant folding---for example if we have an | 
|         |    661 expression $3 + 4$ then we can replace it by just $5$ instead of | 
|         |    662 generating code to compute $5$. Now this information needs to be | 
|         |    663 propagated to the next computation step to see whether any further | 
|         |    664 constant foldings are possible. Another useful technique is common | 
|         |    665 subexpression elimination, meaning if you have twice a calculation of | 
|         |    666 a function $foo(a)$, then we want to call it only once and store the | 
|         |    667 result in a temporary variable that can be used instead of the second, | 
|         |    668 or third, call to $foo(a)$. Again I leave this to the attentive reader | 
|         |    669 to work out and implement. | 
|    637  |    670  | 
|    638  |    671  | 
|    639 \begin{figure}[p]\small |    672 \begin{figure}[p]\small | 
|    640 \begin{lstlisting}[language=Scala,numbers=none] |    673 \begin{lstlisting}[language=Scala,numbers=none] | 
|    641 // Fun language (expressions) |    674 // Fun language (expressions) | 
|    650 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp |    683 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp | 
|    651 case class Sequence(e1: Exp, e2: Exp) extends Exp |    684 case class Sequence(e1: Exp, e2: Exp) extends Exp | 
|    652 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp   |    685 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp   | 
|    653  |    686  | 
|    654  |    687  | 
|    655  |         | 
|    656 // K-language (K-expressions, K-values) |    688 // K-language (K-expressions, K-values) | 
|    657 abstract class KExp |    689 abstract class KExp | 
|    658 abstract class KVal |    690 abstract class KVal | 
|    659  |    691  | 
|    660 case class KVar(s: String) extends KVal |    692 case class KVar(s: String) extends KVal |