handouts/ho09.tex
changeset 912 e32802acf952
parent 909 a04efdd5e7a3
child 913 eef6a56c185a
equal deleted inserted replaced
911:df8660143051 912:e32802acf952
    63 allow us to benefit from the modular structure of the LLVM compiler
    63 allow us to benefit from the modular structure of the LLVM compiler
    64 and we can let the compiler generate code for different CPUs, for
    64 and we can let the compiler generate code for different CPUs, for
    65 example X86 or ARM.  That means we can be agnostic about where our
    65 example X86 or ARM.  That means we can be agnostic about where our
    66 code is actually going to run.\footnote{Anybody want to try to run our
    66 code is actually going to run.\footnote{Anybody want to try to run our
    67   programs on Android phones?}  We can also be somewhat ignorant about
    67   programs on Android phones?}  We can also be somewhat ignorant about
    68 optimising our code and about allocating memory efficiently: the LLVM
    68 optimising our code and about allocating memory efficiently---the LLVM
    69 tools will take care of all this.
    69 tools will take care of all this.
    70 
    70 
    71 However, what we have to do in order to make LLVM to play ball is to
    71 However, what we have to do in order to make LLVM to play ball is to
    72 generate code in \emph{Static Single-Assignment} format (short SSA),
    72 generate code in \emph{Static Single-Assignment} format (short SSA). A
    73 because that is what the LLVM-IR expects from us. A reason why LLVM
    73 reason why LLVM uses the SSA-format, rather than JVM-like stack
    74 uses the SSA-format, rather than JVM-like stack instructions, is that
    74 instructions, is that stack instructions are difficult to
    75 stack instructions are difficult to optimise---you cannot just
    75 optimise---you cannot just re-arrange instructions without messing
    76 re-arrange instructions without messing about with what is calculated
    76 about with what is calculated on the stack. Also it is hard to find
    77 on the stack. Also it is hard to find out if all the calculations on
    77 out if all the calculations on the stack are actually necessary and
    78 the stack are actually necessary and not by chance dead code. The JVM
    78 not by chance dead code. The JVM has for all these obstacles
    79 has for all these obstacles sophisticated machinery to make such
    79 sophisticated machinery to make such ``high-level'' code still run
    80 ``high-level'' code still run fast, but let's say that for the sake of
    80 fast, but let's say that for the sake of argument we do not want to
    81 argument we do not want to rely on it. We want to generate fast code
    81 rely on it. We want to generate fast code ourselves. This means we
    82 ourselves. This means we have to work around the intricacies of what
    82 have to work around the intricacies of what instructions CPUs can
    83 instructions CPUs can actually process fast. This is what the SSA
    83 actually process fast. This is what the SSA format is designed for.
    84 format is designed for.
       
    85 
    84 
    86 
    85 
    87 The main idea behind the SSA-format is to have sequences of very
    86 The main idea behind the SSA-format is to have sequences of very
    88 simple variable assignments where every (tmp)variable is assigned only
    87 simple variable assignments where every (tmp)variable is assigned only
    89 once. The assignments need to be simple in the sense that they can be
    88 once. The assignments need to be simple in the sense that they can be
   101 let tmp5 = add tmp0 tmp4 in
   100 let tmp5 = add tmp0 tmp4 in
   102   return tmp5 
   101   return tmp5 
   103 \end{lstlisting}
   102 \end{lstlisting}
   104 
   103 
   105 \noindent where every tmpX-variable is used only once (we could, for
   104 \noindent where every tmpX-variable is used only once (we could, for
   106 example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this
   105 example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if
   107 would not change the overall result). At the end we have a return-instruction
   106 this would not change the overall result). At the end we have a
   108 wich contains the final result of the expression.
   107 return-instruction wich contains the final result of the
       
   108 expression. As can be seen, the task we have to solve is to take apart
       
   109 compound expressions as shown above and transfrom them into a sequence
       
   110 of simple assignments. Note that this for example means we have to
       
   111 fix the order in which the expression is calculated. 
   109 
   112 
   110 There are sophisticated algorithms for imperative languages, like C,
   113 There are sophisticated algorithms for imperative languages, like C,
   111 that efficiently transform high-level programs into SSA-format. But
   114 that efficiently transform high-level programs into SSA-format. But
   112 we can ignore them here. We want to compile a functional language and
   115 we can ignore them here. We want to compile a functional language and
   113 there things get much more interesting than just sophisticated. We
   116 there things get much more interesting than just sophisticated. We
   159 addition and multiplication, are also prefixed with the type they
   162 addition and multiplication, are also prefixed with the type they
   160 operate on. Obviously these types need to match up\ldots{} but since
   163 operate on. Obviously these types need to match up\ldots{} but since
   161 we have in our programs only integers, for the moment \texttt{i32}
   164 we have in our programs only integers, for the moment \texttt{i32}
   162 everywhere will do. We do not have to generate any other types, but
   165 everywhere will do. We do not have to generate any other types, but
   163 obviously this is a limitation in our Fun-language (and which we
   166 obviously this is a limitation in our Fun-language (and which we
   164 lift in the final CW).
   167 are going to lift in the final CW).
   165  
   168  
   166 There are a few interesting instructions in the LLVM-IR which are quite
   169 There are a few interesting instructions in the LLVM-IR which are quite
   167 different than what we have seen in the JVM. Can you remember the
   170 different than what we have seen in the JVM. Can you remember the
   168 kerfuffle we had to go through with boolean expressions and negating the
   171 kerfuffle we had to go through with boolean expressions and negating the
   169 condition? In the LLVM-IR, branching  if-conditions are implemented
   172 condition? In the LLVM-IR, branching  if-conditions are implemented
   209 \noindent
   212 \noindent
   210 where the arguments can only be simple variables and numbers, but not compound
   213 where the arguments can only be simple variables and numbers, but not compound
   211 expressions.
   214 expressions.
   212 
   215 
   213 Conveniently, you can use the program \texttt{lli}, which comes with
   216 Conveniently, you can use the program \texttt{lli}, which comes with
   214 LLVM, to interpret programs written in the LLVM-IR. So you can easily
   217 LLVM, to interpret programs written in the LLVM-IR. Type on the command line
   215 check whether the code you produced actually works. To get a running
   218 
   216 program that does something interesting you need to add some boilerplate
   219 \begin{lstlisting}[language=bash,numbers=none]
   217 about printing out numbers and a main-function that is the entry point
   220 lli sqr.ll
   218 for the program (see Figure~\ref{lli} for a complete listing of the square function). Again
   221 \end{lstlisting}
       
   222 
       
   223 \noindent
       
   224 and you can see the output of the pragram generated. 
       
   225 This means you can easily check whether the code you produced actually
       
   226 works. To get a running program that does something interesting you
       
   227 need to add some boilerplate about printing out numbers and a
       
   228 main-function that is the entry point for the program (see
       
   229 Figure~\ref{lli} for a complete listing of the square function). Again
   219 this is very similar to the boilerplate we needed to add in our JVM
   230 this is very similar to the boilerplate we needed to add in our JVM
   220 compiler. 
   231 compiler.
   221 
   232 
   222 You can generate a binary for the program in Figure~\ref{lli} by using
   233 You can generate a binary for the program in Figure~\ref{lli} by using
   223 the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang}, whereby \texttt{llc} generates
   234 the \texttt{llc}-compiler and then \texttt{gcc}/\texttt{clang}, whereby \texttt{llc} generates
   224 an object file and \texttt{gcc} (that is actually \texttt{clang} on my Mac) generates the
   235 an object file and \texttt{gcc} (that is actually \texttt{clang} on my Mac) generates the
   225 executable binary:
   236 executable binary:
   234 \begin{figure}[t]\small 
   245 \begin{figure}[t]\small 
   235 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
   246 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
   236 \caption{An LLVM-IR program for calculating the square function. It
   247 \caption{An LLVM-IR program for calculating the square function. It
   237   calls the \texttt{sqr}-function in \texttt{@main} with the argument \texttt{5}
   248   calls the \texttt{sqr}-function in \texttt{@main} with the argument \texttt{5}
   238   (Line 20). The
   249   (Line 20). The
   239   code for the \texttt{sqr}-function is in Lines 13 -- 16. It stores
   250   code for the \texttt{sqr}-function is in Lines 13 -- 16. The main-function
       
   251   stores
   240   the result of sqr in the variable called \texttt{\%1} and then
   252   the result of sqr in the variable called \texttt{\%1} and then
   241   prints out the contents of this variable in Line 21. The other
   253   prints out the contents of this variable in Line 21. The other
   242   code is boilerplate for printing out integers.\label{lli}}
   254   code is boilerplate for printing out integers.\label{lli}}
   243 \end{figure}   
   255 \end{figure}   
   244 
   256 
   311 \noindent
   323 \noindent
   312 where a K-value can be a variable, a number or a ``trivial'' binary
   324 where a K-value can be a variable, a number or a ``trivial'' binary
   313 operation, such as addition or multiplication. The two arguments of a
   325 operation, such as addition or multiplication. The two arguments of a
   314 binary operation need to be K-values as well. Finally, we have
   326 binary operation need to be K-values as well. Finally, we have
   315 function calls, but again each argument of the function call needs to
   327 function calls, but again each argument of the function call needs to
   316 be a K-value.  One point we need to be careful, however, is that the
   328 be a K-value.  One constraint we need to be careful about is that the
   317 arguments of the binary operations and function calls can in fact only
   329 arguments of the binary operations and function calls can only be
   318 be variables or numbers. To encode this constraint into the type-system
   330 variables or numbers. For example
   319 of Scala, however, would make things too complicated---to satisfy this
   331 
   320 constraint is therefore on us. For our Fun-language, we will later on
   332 \begin{lstlisting}[language=Scala,numbers=none]
   321 consider some further K-values.
   333 KAop("+", KAop("*", KNum(1), KNum(2)), KNum(3))
       
   334 KCall("foo", List(KAop("+", KNum(4), KNum(5)))  
       
   335 \end{lstlisting}
       
   336 
       
   337 \noindent
       
   338 while perfect instances of K-values according to the types, are not
       
   339 allowed in the LLVM-IR. To encode this constraint into the type-system
       
   340 of Scala, however, would make things overly complicated---to satisfy
       
   341 this constraint is therefore on us. For the moment this will be all
       
   342 K-values we are considdering. Later on, we will have some more for our
       
   343 Fun-language.
   322 
   344 
   323 
   345 
   324 Our K-expressions will encode the \texttt{let} and the \texttt{return}
   346 Our K-expressions will encode the \texttt{let} and the \texttt{return}
   325 from the SSA-format (again for the moment we only consider these two
   347 from the SSA-format (again for the moment we only consider these two
   326 constructors---later on we will have if-branches as well).
   348 constructors---later on we will have if-branches as well).
   331   case KReturn(v: KVal)
   353   case KReturn(v: KVal)
   332 }
   354 }
   333 \end{lstlisting}
   355 \end{lstlisting}
   334 
   356 
   335 \noindent
   357 \noindent
   336 By having in \texttt{KLet} taking first a string (standing for an
   358 By having in \texttt{KLet} taking first a string (standing for a
   337 intermediate variable) and second a value, we can fulfil the SSA
   359 tmp-variable) and second a value, we can fulfil the SSA constraint in
   338 constraint in assignments ``by con\-struction''---there is no way we
   360 assignments ``by con\-struction''---there is no way we could write
   339 could write anything else than a K-value.  Note that the third
   361 anything else than a K-value.  Note that the third argument of a
   340 argument of a \texttt{KLet} is again a K-expression, meaning either
   362 \texttt{KLet} is again a K-expression, meaning either another
   341 another \texttt{KLet} or a \texttt{KReturn}. In this way we can
   363 \texttt{KLet} or a \texttt{KReturn}. In this way we can construct a
   342 construct a sequence of computations and return a final result.  Of
   364 sequence of computations and indicate what is the final result of the
   343 course we also have to ensure that all intermediary computations are
   365 computations.  According to the SSA-format, we also have to ensure
   344 given (fresh) names---we will use an (ugly) counter for this.
   366 that all intermediary computations are given (fresh) names---we will
       
   367 use an (ugly) counter for this.
   345 
   368 
   346 To sum up, K-values are the atomic operations that can be on the
   369 To sum up, K-values are the atomic operations that can be on the
   347 right-hand side of assignemnts. The K-language is restricted such that
   370 right-hand side of assignemnts. The K-language is restricted such that
   348 it is easy to generate the SSA-format for the LLVM-IR. What remains to
   371 it is easy to generate the SSA-format for the LLVM-IR. What remains to
   349 be done is a translation of our Fun-language into the K-language. The
   372 be done is a translation of our Fun-language into the K-language. The
   350 main difficulty is that we need to order the computation---in the
   373 main difficulty is that we need to order the computation---in the
   351 K-language we only have linear sequence of instructions. Before we
   374 K-language we only have linear sequence of instructions. Before we
   352 explain this, we have a look at some CPS-translation.
   375 explain this, we have a look at some programs in CPS-style.
   353 
   376 
   354 
   377 
   355 
   378 
   356 
   379 
   357 \subsection*{CPS-Translations}
   380 \subsection*{CPS-Translations}
   358 
   381 
   359 CPS stands for \emph{Continuation-Passing-Style}. It is a kind of programming
   382 CPS stands for \emph{Continuation-Passing-Style}. It is a kind of
   360 technique often used in advanced functional programming. Before we delve
   383 programming technique often used in advanced functional programming
   361 into the CPS-translation for our Fun-language compiler, let us look at 
   384 and in particular in compilers. Before we delve into the
       
   385 CPS-translation for our Fun-language compiler, let us look at
   362 CPS-versions of some well-known functions. Consider
   386 CPS-versions of some well-known functions. Consider
   363 
   387 
   364 \begin{lstlisting}[language=Scala, numbers=none]
   388 \begin{lstlisting}[language=Scala, numbers=none]
   365 def fact(n: Int) : Int = 
   389 def fact(n: Int) : Int = 
   366   if (n == 0) 1 else n * fact(n - 1) 
   390   if (n == 0) 1 else n * fact(n - 1) 
   367 \end{lstlisting}
   391 \end{lstlisting}
   368 
   392 
   369 \noindent 
   393 \noindent 
   370 This is clearly the usual factorial function. But now consider the
   394 This is clearly the usual factorial function. But now consider the
   371 following version of the factorial function:
   395 following slight variant of the factorial function:
       
   396 
       
   397 \begin{lstlisting}[language=Scala, numbers=left]
       
   398 def factC(n: Int, k: Int => Int) : Int = 
       
   399   if (n == 0) k(1) 
       
   400   else factC(n - 1, x => k(n * x)) 
       
   401 
       
   402 factC(3, id)
       
   403 \end{lstlisting}
       
   404 
       
   405 \noindent
       
   406 The function is is Lines 1--3. The function call is in Line 5.  We
       
   407 call this function with a number, in this case 3, and the
       
   408 identity-function (which returns just its input). The \texttt{k} is
       
   409 the continuation in this function. The recursive calls are:
   372 
   410 
   373 \begin{lstlisting}[language=Scala, numbers=none]
   411 \begin{lstlisting}[language=Scala, numbers=none]
   374 def factC(n: Int, ret: Int => Int) : Int = 
   412 factC(2, x => id(3 * x))
   375   if (n == 0) ret(1) 
   413 factC(1, x => id(3 * (2 * x)))
   376   else factC(n - 1, x => ret(n * x)) 
   414 factC(0, x => id(3 * (2 * (1 * x))))
   377 
   415 \end{lstlisting}
   378 factC(3, identity)
   416 
   379 \end{lstlisting}
   417 \noindent
   380 
   418 Having reached 0, we get out of the recursion and apply 1 to the
   381 \noindent
   419 continuation (see if-branch above in Line 2). This gives
   382 This function is called with a number, in this case 3, and the
       
   383 identity-function (which returns just its input). The recursive
       
   384 calls are:
       
   385 
   420 
   386 \begin{lstlisting}[language=Scala, numbers=none]
   421 \begin{lstlisting}[language=Scala, numbers=none]
   387 factC(2, x => identity(3 * x))
   422 id(3 * (2 * (1 * 1)))
   388 factC(1, x => identity(3 * (2 * x)))
       
   389 factC(0, x => identity(3 * (2 * (1 * x))))
       
   390 \end{lstlisting}
       
   391 
       
   392 \noindent
       
   393 Having reached 0, we get out of the recursion and apply 1 to the
       
   394 continuation (see if-branch above). This gives
       
   395 
       
   396 \begin{lstlisting}[language=Scala, numbers=none]
       
   397 identity(3 * (2 * (1 * 1)))
       
   398 = 3 * (2 * (1 * 1))
   423 = 3 * (2 * (1 * 1))
   399 = 6
   424 = 6
   400 \end{lstlisting}
   425 \end{lstlisting}
   401 
   426 
   402 \noindent
   427 \noindent
   403 which is the expected result. If this looks somewhat familiar to you,
   428 which is the expected result. If this looks somewhat familiar to you,
   404 than this is because functions with continuations can be
   429 than this is because functions with continuations can be seen as a
   405 seen as a kind of generalisation of tail-recursive functions.
   430 kind of generalisation of tail-recursive functions.  So far, however,
   406 So far we did not look at this generalisation in earnest.
   431 we did not look at this generalisation in earnest.  Anyway notice how
   407 Anyway
   432 the continuations is ``stacked up'' during the recursion and then
   408 notice how the continuations is ``stacked up'' during the recursion and
   433 ``unrolled'' when we apply 1 to the continuation. Interestingly, we
   409 then ``unrolled'' when we apply 1 to the continuation. Interestingly, we
   434 can do something similar to the Fibonacci function where in the
   410 can do something similar to the Fibonacci function where in the traditional
   435 traditional version we have two recursive calls. Consider the
   411 version we have two recursive calls. Consider the following function
   436 following function
   412 
   437 
   413 \begin{lstlisting}[language=Scala, numbers=none]
   438 \begin{lstlisting}[language=Scala, numbers=left]
   414 def fibC(n: Int, ret: Int => Int) : Int = 
   439 def fibC(n: Int, k: Int => Int) : Int = 
   415   if (n == 0 || n == 1) ret(1) 
   440   if (n == 0 || n == 1) k(1) 
   416   else fibC(n - 1,
   441   else fibC(n - 1,
   417              r1 => fibC(n - 2,
   442              r1 => fibC(n - 2,
   418                r2 => ret(r1 + r2)))
   443                r2 => k(r1 + r2)))
   419 \end{lstlisting}
   444 \end{lstlisting}
   420 
   445 
   421 \noindent
   446 \noindent
   422 Here the continuation is a nested function essentially wrapping up 
   447 Here the continuation (Lines 4--5) is a nested function essentially wrapping up 
   423 the second recursive call. Let us check how the recursion unfolds
   448 the second recursive call plus the original continuation. Let us check how the recursion unfolds
   424 when called with 3 and the identity function:
   449 when called with 3 and the identity function:
   425 
   450 
   426 \begin{lstlisting}[language=Scala, numbers=none]
   451 \begin{lstlisting}[language=Scala, numbers=none]
   427 fibC(3, id)
   452 fibC(3, id)
   428 fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
   453 fibC(2, r1 => fibC(1, r2 => id(r1 + r2)))
   437 
   462 
   438 \noindent
   463 \noindent
   439 The point of this section is that you should be playing around with these
   464 The point of this section is that you should be playing around with these
   440 simple definitions and make sure they calculate the expected result.
   465 simple definitions and make sure they calculate the expected result.
   441 In the next step, you should understand \emph{how} these functions
   466 In the next step, you should understand \emph{how} these functions
   442 calculate the result with the continuations. 
   467 calculate the result with the continuations. Now change the initial
       
   468 continuation which you call the function to
       
   469 
       
   470 \begin{lstlisting}[language=Scala, numbers=none]
       
   471 r => { println("The result plus 1 is:") ; r + 1 }
       
   472 \end{lstlisting}
       
   473 
       
   474 \noindent
       
   475 Does this still calculate the expected result?
   443 
   476 
   444 
   477 
   445 \section*{Worked Example}
   478 \section*{Worked Example}
   446 
   479 
   447 Let us now come back to the CPS-translations for the Fun-language.
   480 Let us now come back to the CPS-translations for the Fun-language.
   457 (1 + 2) * (3 + 4)
   490 (1 + 2) * (3 + 4)
   458 \end{equation}  
   491 \end{equation}  
   459 
   492 
   460 \noindent
   493 \noindent
   461 which of the subexpressions should be calculated first? We just
   494 which of the subexpressions should be calculated first? We just
   462 arbitrarily going to decide that it will be from left to right. Other
   495 arbitrarily going to decide that the calculation will be from left to
   463 languages make different choices---C famously is right to left. In our
   496 right. Other languages make different choices---C famously is right to
   464 case this means we have to tear the expression shown in \eqref{exp}
   497 left. In our case this means we have to tear the expression shown in
   465 apart as follows:
   498 \eqref{exp} apart as follows:
   466 
   499 
   467 \[
   500 \[
   468 (1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
   501 (1 + 2) \qquad\text{and}\qquad  \Box * (3 + 4)
   469 \]  
   502 \]  
   470 
   503 
   471 \noindent
   504 \noindent
   472 The first subexpression can be calculated and will give us a result,
   505 The first subexpression can be easily calculated and will give us a result,
   473 but the second one is not really an expression that makes sense. It
   506 but the second one is not really an expression that makes sense. It
   474 will only make sense as the next step in the computation when we
   507 will only make sense as the next step in the computation when we
   475 fill-in the result of $1+2$ into the ``hole'' indicated by
   508 fill-in the result of $1+2$ into the ``hole'' indicated by
   476 $\Box$. Such incomplete expressions can be represented in Scala as
   509 $\Box$. Such incomplete expressions can be represented in Scala as
   477 functions
   510 functions
   479 \begin{lstlisting}[language=Scala, numbers=none]
   512 \begin{lstlisting}[language=Scala, numbers=none]
   480 r => r * (3 + 4)
   513 r => r * (3 + 4)
   481 \end{lstlisting}  
   514 \end{lstlisting}  
   482 
   515 
   483 \noindent
   516 \noindent
   484 where \texttt{r} is any result that has been computed earlier. By the
   517 where \texttt{r} will represent any result that has been computed
   485 way, in lambda-calculus notation we would write
   518 earlier. By the way, in lambda-calculus notation we would write
   486 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
   519 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use
   487 functions (``continuations'') to represent what is coming next in a
   520 functions (``continuations'') to represent what is coming next in a
   488 sequence of instructions. In our case, continuations are functions of
   521 sequence of instructions. In our case, continuations are functions of
   489 type \code{KVal} to \code{KExp}. They can be seen as a sequence of
   522 type \code{KVal} to \code{KExp}. They can be seen as a sequence of
   490 \code{KLet}s where there is a ``hole'' that needs to be
   523 \code{KLet}s where there is a ``hole'' that needs to be
   493 \begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
   526 \begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
   494 let tmp0 = add 1 a in   
   527 let tmp0 = add 1 a in   
   495 let tmp1 = mul (*@$\Box$@*) 5 in 
   528 let tmp1 = mul (*@$\Box$@*) 5 in 
   496 let tmp2 = add 3 tmp1 in 
   529 let tmp2 = add 3 tmp1 in 
   497 let tmp3 = add tmp0 tmp2 in
   530 let tmp3 = add tmp0 tmp2 in
   498   retrun tmp3 
   531   return tmp3 
   499 \end{lstlisting}
   532 \end{lstlisting}
   500 
   533 
   501 \noindent
   534 \noindent
   502 where in the second line is a $\Box$ which still expects a \code{KVal}
   535 where in the second line is a $\Box$ which still expects a \code{KVal}
   503 to be filled in before it becomes a ``proper'' \code{KExp}. When we 
   536 to be filled in before it becomes a ``proper'' \code{KExp}. When we 
   508 suppose our source language consists just of three kinds of expressions
   541 suppose our source language consists just of three kinds of expressions
   509 
   542 
   510 \begin{lstlisting}[language=Scala,numbers=none]
   543 \begin{lstlisting}[language=Scala,numbers=none]
   511 enum Expr {
   544 enum Expr {
   512     case Num(n: Int)
   545     case Num(n: Int)
   513     case Bop(op: String, e1: Expr, e2: Expr)
   546     case Aop(op: String, e1: Expr, e2: Expr)
   514     case Call(fname: String, args: List[Expr])
   547     case Call(fname: String, args: List[Expr])
   515 }
   548 }
   516 \end{lstlisting}
   549 \end{lstlisting}
   517 
   550 
   518 \noindent
   551 \noindent
   519 The code of the CPS-translation is then of the form
   552 This allows us to generate SSA-instructions for expressions like
       
   553 
       
   554 \[
       
   555 1 + foo(bar(4 * 7), 3, id(12))  
       
   556 \]
       
   557 
       
   558 \noindent
       
   559 The code of the CPS-translation
       
   560 that generates these instructions is then of the form
   520 
   561 
   521 \begin{lstlisting}[language=Scala,numbers=none]
   562 \begin{lstlisting}[language=Scala,numbers=none]
   522 def CPS(e: Exp)(k: KVal => KExp) : KExp = 
   563 def CPS(e: Exp)(k: KVal => KExp) : KExp = 
   523   e match { ... }
   564   e match { ... }
   524 \end{lstlisting}
   565 \end{lstlisting}
   528 be compiled. The result of the function is a K-expression, which later
   569 be compiled. The result of the function is a K-expression, which later
   529 can be compiled into LLVM-IR code.
   570 can be compiled into LLVM-IR code.
   530 
   571 
   531 In case we have numbers, then we can just pass them in CPS-translation
   572 In case we have numbers, then we can just pass them in CPS-translation
   532 to the continuations because numbers need not be further teared apart
   573 to the continuations because numbers need not be further teared apart
   533 as they are already atomic. Passing the number to the continuation
   574 as they are already primitive. Passing the number to the continuation
   534 means we apply the continuation like
   575 means we apply the continuation like
   535 
   576 
   536 \begin{lstlisting}[language=Scala,numbers=none]
   577 \begin{lstlisting}[language=Scala,numbers=none]
   537   case Num(i) => k(KNum(i))
   578   case Num(i) => k(KNum(i))
   538 \end{lstlisting}
   579 \end{lstlisting}
   572 
   613 
   573 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
   614 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm]
   574 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
   615 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))
   575 \end{lstlisting}
   616 \end{lstlisting}
   576 
   617 
   577 \noindent is a function from \code{KVal} to \code{KExp}. Once
   618 \noindent is a function from \code{KVal} to \code{KExp}, which we need
   578 we created the assignment with the fresh temporary variable
   619 as type for the continuation. Once we created the assignment with the
   579 \texttt{z}, we need to ``communicate'' that the result of the
   620 fresh temporary variable \texttt{z}, we need to ``communicate'' that
   580 computation of the arithmetic expression is stored in \texttt{z} to the
   621 the result of the computation of the arithmetic expression is stored
   581 computations that follow. In this way we apply the continuation \texttt{k} with this
   622 in \texttt{z} to the computations that follow. In this way we apply
   582 new variable (essentially we are plugging in a hole further down the line).
   623 the continuation \texttt{k} with this new variable (essentially we are
   583 Hope this makes sense.
   624 plugging in a hole further down the line).  Hope this makes sense.
   584 
   625 
   585 The last case we need to consider in our small expression language are
   626 The last case we need to consider in our small expression language are
   586 function calls.
   627 function calls. For them remember each argument of the function
       
   628 call can in SSA-format only be a variable or a number.
   587 
   629 
   588 \begin{lstlisting}[language=Scala,numbers=left,xleftmargin=0mm]
   630 \begin{lstlisting}[language=Scala,numbers=left,xleftmargin=0mm]
   589 case Call(fname, args) => {
   631 case Call(fname, args) => {
   590   def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
   632   def aux(args: List[Expr], vs: List[KVal]): KExp = args match {
   591        case Nil => {
   633        case Nil => {
   597   aux(args, Nil)  
   639   aux(args, Nil)  
   598 }
   640 }
   599 \end{lstlisting}
   641 \end{lstlisting}
   600 
   642 
   601 \noindent
   643 \noindent
   602 For them we introduce an auxiliary function that compiles first all
   644 For this case we introduce an auxiliary function that compiles first all
   603 function arguments---remember in our source language we can have calls
   645 function arguments---remember in our source language we can have calls
   604 such as $foo(3 + 4, g(3))$ where we first have to create temporary
   646 such as $foo(3 + 4, g(3))$ where we first have to create temporary
   605 variables (and computations) for each argument. Therefore \texttt{aux}
   647 variables (and computations) for each argument. Therefore \texttt{aux}
   606 analyses the argument list from left to right. In case there is an
   648 analyses the argument list from left to right. In case there is an
   607 argument \texttt{a} on the front of the list (the case \texttt{a::as}
   649 argument \texttt{a} on the front of the list (the case \texttt{a::as}
   633   \hspace{10mm}...
   675   \hspace{10mm}...
   634   \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
   676   \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc}
   635 \caption{CPS-translation for a simple expression language.\label{cps}}
   677 \caption{CPS-translation for a simple expression language.\label{cps}}
   636 \end{figure}
   678 \end{figure}
   637 
   679 
   638 The last question we need to answer in the code (see Figure~\ref{cps}) is how we
   680 The last question we need to answer in the code (see Figure~\ref{cps})
   639 start the CPS-translation function, or more precisely with which continuation we
   681 is how we start the CPS-translation function, or more precisely with
   640 should start CPS? It turns out that this initial continuation will be the last one that is
   682 which continuation we should start CPS? It turns out that this initial
   641 called. What should be the last step in the computation? Yes, we need to return the
   683 continuation will be the last one that is called. What should be the
   642 temporary variable where the last result is stored (if it is a simple number, then we can
   684 last step in the computation? Yes, we need to return the temporary
   643 just return this number). Therefore we call CPS with the code
   685 variable where the last result is stored (if it is a simple number,
       
   686 then we can just return this number). Therefore we call CPS with the
       
   687 code
   644 
   688 
   645 \begin{lstlisting}[language=Scala,numbers=none]
   689 \begin{lstlisting}[language=Scala,numbers=none]
   646 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
   690 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_))
   647 \end{lstlisting}
   691 \end{lstlisting}
   648 
   692 
   649 \noindent
   693 \noindent
   650 where we give the function \code{KReturn(_)} as the continuation. With
   694 where we give the function \code{KReturn(_)} as the continuation. With
   651 this we completed the translation of expressions into our
   695 this we completed the translation of simple expressions into our
   652 K-language. Figure~\ref{absfun} gives the complete datatypes for the
   696 K-language. Play around with some more expressions and see how the
   653 ASTs of the Fun-language and the K-values and K-expressions for the
   697 CPS-translation generates the correct code. I know this is not easy to
   654 K-language.
   698 follow code when you see it the first time.  Figure~\ref{absfun} gives
       
   699 the complete datatypes for the ASTs of the Fun-language and the
       
   700 K-values and K-expressions for the K-language. The complete
       
   701 CPS-translation you can find in the code.
       
   702 
       
   703 \section*{Next Steps}
   655 
   704 
   656 Having obtained a K-expression, it is relatively straightforward to
   705 Having obtained a K-expression, it is relatively straightforward to
   657 generate a valid program for the LLVM-IR. We leave this to the
   706 generate a valid program for the LLVM-IR. We leave this to the
   658 attentive reader. What else can we do?  Well it should be relatively
   707 attentive reader. What else can we do?  Well it should be relatively
   659 easy to apply some common optimisations to the K-expressions. One
   708 easy to apply some common optimisations to the K-expressions. One