20 |
17 |
21 \section*{Handout 9 (LLVM, SSA and CPS)} |
18 \section*{Handout 9 (LLVM, SSA and CPS)} |
22 |
19 |
23 Reflecting on our two tiny compilers targetting the JVM, the code |
20 Reflecting on our two tiny compilers targetting the JVM, the code |
24 generation part was actually not so hard, no? Pretty much just some |
21 generation part was actually not so hard, no? Pretty much just some |
25 post-traversal of the abstract syntax tree, yes? One of the reasons for |
22 post-traversal of the abstract syntax tree, yes? One of the reasons |
26 this ease is that the JVM is a stack-based virtual machine and it is |
23 for this ease is that the JVM is a stack-based virtual machine and it |
27 therefore not hard to translate deeply-nested arithmetic expressions |
24 is therefore not hard to translate deeply-nested arithmetic |
28 into a sequence of instructions manipulating the stack. The problem is |
25 expressions into a sequence of instructions manipulating the |
29 that ``real'' CPUs, although supporting stack operations, are not really |
26 stack. The problem is that ``real'' CPUs, although supporting stack |
30 designed to be \emph{stack machines}. The design of CPUs is more like, |
27 operations, are not really designed to be \emph{stack machines}. The |
31 here is a chunk of memory---compiler, or better compiler writers, do |
28 design of CPUs is more like: Here are some operations and a chunk of |
32 something with it. Consequently, modern compilers need to go the extra |
29 memory---compiler, or better compiler writers, do something with |
33 mile in order to generate code that is much easier and faster to process |
30 them. Consequently, modern compilers need to go the extra mile in |
34 by CPUs. To make this all tractable for this module, we target the LLVM |
31 order to generate code that is much easier and faster to process by |
35 Intermediate Language. In this way we can take advantage of the tools |
32 actual CPUs, rather than running code on virtual machines that |
36 coming with LLVM. For example we do not have to worry about things like |
33 introduce an additional layer of indirection. To make this all |
37 register allocations.\bigskip |
34 tractable for this module, we target the LLVM Intermediate |
38 |
35 Language. In this way we can take advantage of the tools coming with |
39 \noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example |
36 LLVM.\footnote{\url{http://llvm.org}} For example we do not have to |
|
37 worry about things like register allocations. By using LLVM, however, |
|
38 we also have to pay price in the sense that generating code gets |
|
39 harder\ldots{}unfortunately. |
|
40 |
|
41 \subsection*{LLVM and LLVM-IR} |
|
42 |
|
43 \noindent LLVM is a beautiful example |
40 that projects from Academia can make a difference in the World. LLVM |
44 that projects from Academia can make a difference in the World. LLVM |
41 started in 2000 as a project by two researchers at the University of |
45 started in 2000 as a project by two researchers at the University of |
42 Illinois at Urbana-Champaign. At the time the behemoth of compilers was |
46 Illinois at Urbana-Champaign. At the time the behemoth of compilers was |
43 gcc with its myriad of front-ends for other languages (C++, Fortran, |
47 gcc with its myriad of front-ends for different programming languages (C++, Fortran, |
44 Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over |
48 Ada, Go, Objective-C, Pascal etc). The problem was that gcc morphed over |
45 time into a monolithic gigantic piece of m\ldots ehm software, which you |
49 time into a monolithic gigantic piece of m\ldots ehm complicated |
|
50 software, which you |
46 could not mess about in an afternoon. In contrast, LLVM is designed to |
51 could not mess about in an afternoon. In contrast, LLVM is designed to |
47 be a modular suite of tools with which you can play around easily and |
52 be a modular suite of tools with which you can play around easily and |
48 try out something new. LLVM became a big player once Apple hired one of |
53 try out something new. LLVM became a big player once Apple hired one of |
49 the original developers (I cannot remember the reason why Apple did not |
54 the original developers (I cannot remember the reason why Apple did not |
50 want to use gcc, but maybe they were also just disgusted by gcc's big |
55 want to use gcc, but maybe they were also just disgusted by gcc's big |
52 or less legacy. This does not mean that programming languages like C and |
57 or less legacy. This does not mean that programming languages like C and |
53 C++ are dying out any time soon---they are nicely supported by LLVM. |
58 C++ are dying out any time soon---they are nicely supported by LLVM. |
54 |
59 |
55 We will target the LLVM Intermediate Language, or LLVM Intermediate |
60 We will target the LLVM Intermediate Language, or LLVM Intermediate |
56 Representation (short LLVM-IR). The LLVM-IR looks very similar to the |
61 Representation (short LLVM-IR). The LLVM-IR looks very similar to the |
57 assembly language of Jasmin and Krakatau. It will also allow us to |
62 assembly language of Jasmin and Krakatau. Targetting LLVM-IR will also |
58 benefit from the modular structure of the LLVM compiler and let for |
63 allow us to benefit from the modular structure of the LLVM compiler |
59 example the compiler generate code for different CPUs, like X86 or ARM. |
64 and we can let, for example, the compiler generate code for different |
60 That means we can be agnostic about where our code actually runs. We can |
65 CPUs, say X86 or ARM. That means we can be agnostic about where our |
61 also be ignorant about optimising code and allocating memory |
66 code is actually going to run.\footnote{Anybody want to try to run our |
62 efficiently. |
67 programs on Android phones?} We can also be somewhat ignorant about |
63 |
68 optimising our code and about allocating memory efficiently: the LLVM |
64 However, what we have to do for LLVM is to generate code in \emph{Static |
69 tools will take care of all this. |
65 Single-Assignment} format (short SSA), because that is what the LLVM-IR |
70 |
66 expects from us. A reason why LLVM uses the SSA format, rather than |
71 However, what we have to do in order to make LLVM to play ball is to |
67 JVM-like stack instructions, is that stack instructions are difficult to |
72 generate code in \emph{Static Single-Assignment} format (short SSA), |
68 optimise---you cannot just re-arrange instructions without messing about |
73 because that is what the LLVM-IR expects from us. A reason why LLVM |
69 with what is calculated on the stack. Also it is hard to find out if all |
74 uses the SSA format, rather than JVM-like stack instructions, is that |
70 the calculations on the stack are actually necessary and not by chance |
75 stack instructions are difficult to optimise---you cannot just |
71 dead code. The JVM has for all these obstacles sophisticated machinery |
76 re-arrange instructions without messing about with what is calculated |
72 to make such ``high-level'' code still run fast, but let's say that for |
77 on the stack. Also it is hard to find out if all the calculations on |
73 the sake of argument we do not want to rely on it. We want to generate |
78 the stack are actually necessary and not by chance dead code. The JVM |
74 fast code ourselves. This means we have to work around the intricacies |
79 has for all these obstacles sophisticated machinery to make such |
75 of what instructions CPUs can actually process fast. This is what the |
80 ``high-level'' code still run fast, but let's say that for the sake of |
76 SSA format is designed for. |
81 argument we do not want to rely on it. We want to generate fast code |
|
82 ourselves. This means we have to work around the intricacies of what |
|
83 instructions CPUs can actually process fast. This is what the SSA |
|
84 format is designed for. |
77 |
85 |
78 |
86 |
79 The main idea behind the SSA format is to use very simple variable |
87 The main idea behind the SSA format is to use very simple variable |
80 assignments where every tmp-variable is assigned only once. The assignments |
88 assignments where every tmp-variable is assigned only once. The |
81 also need to be primitive in the sense that they can be just simple |
89 assignments also need to be primitive in the sense that they can be |
82 operations like addition, multiplication, jumps, comparisons and so on. |
90 just simple operations like addition, multiplication, jumps, |
83 Say, we have an expression $((1 + a) + (3 + (b * 5)))$, then the |
91 comparisons, function calls and so on. Say, we have an expression |
84 corresponding SSA format is |
92 $((1 + a) + (foo(3 + 2) + (b * 5)))$, then the corresponding SSA format is |
85 |
93 |
86 \begin{lstlisting}[language=LLVMIR,numbers=left] |
94 \begin{lstlisting}[language=LLVMIR,numbers=left] |
87 let tmp0 = add 1 a in |
95 let tmp0 = add 1 a in |
88 let tmp1 = mul b 5 in |
96 let tmp1 = add 3 2 in |
89 let tmp2 = add 3 tmp1 in |
97 let tmp2 = call foo(tmp1) |
90 let tmp3 = add tmp0 tmp2 in tmp3 |
98 let tmp3 = mul b 5 in |
91 \end{lstlisting} |
99 let tmp4 = add tmp2 tmp3 in |
92 |
100 let tmp5 = add tmp0 tmp4 in |
93 \noindent where every variable is used only once (we could not write |
101 return tmp5 |
94 \texttt{tmp1 = add 3 tmp1} in Line 3 for example). |
102 \end{lstlisting} |
|
103 |
|
104 \noindent where every tmp-variable is used only once (we could, for |
|
105 example, not write \texttt{tmp1 = add tmp2 tmp3} in Line 5 even if this |
|
106 would not change the overall result). |
95 |
107 |
96 There are sophisticated algorithms for imperative languages, like C, |
108 There are sophisticated algorithms for imperative languages, like C, |
97 that efficiently transform a high-level program into SSA format. But |
109 that efficiently transform a high-level program into SSA format. But |
98 we can ignore them here. We want to compile a functional language and |
110 we can ignore them here. We want to compile a functional language and |
99 there things get much more interesting than just sophisticated. We |
111 there things get much more interesting than just sophisticated. We |
129 %tmp = mul i32 %x, %x |
141 %tmp = mul i32 %x, %x |
130 ret i32 %tmp |
142 ret i32 %tmp |
131 } |
143 } |
132 \end{lstlisting} |
144 \end{lstlisting} |
133 |
145 |
134 \noindent First notice that all variable names, in this case \texttt{x} |
146 \noindent First notice that all ``local'' variable names, in this case |
135 and \texttt{tmp}, are prefixed with \texttt{\%} in the LLVM-IR. |
147 \texttt{x} and \texttt{tmp}, are prefixed with \texttt{\%} in the |
136 Temporary variables can be named with an identifier, such as |
148 LLVM-IR. Temporary variables can be named with an identifier, such as |
137 \texttt{tmp}, or numbers. In contrast, function names, since they are ``global'', |
149 \texttt{tmp}, or numbers. In contrast, function names, since they are |
138 need to be prefixed with an @-symbol. Also, the LLVM-IR is a fully typed |
150 ``global'', need to be prefixed with an @-symbol. Also, the LLVM-IR is |
139 language. The \texttt{i32} type stands for 32-bit integers. There are |
151 a fully typed language. The \texttt{i32} type stands for 32-bit |
140 also types for 64-bit integers (\texttt{i64}), chars (\texttt{i8}), |
152 integers. There are also types for 64-bit integers (\texttt{i64}), |
141 floats, arrays and even pointer types. In the code above, \texttt{sqr} |
153 chars (\texttt{i8}), floats, arrays and even pointer types. In the |
142 takes an argument of type \texttt{i32} and produces a result of type |
154 code above, \texttt{sqr} takes an argument of type \texttt{i32} and |
143 \texttt{i32} (the result type is in front of the function name, like in |
155 produces a result of type \texttt{i32} (the result type is in front of |
144 C). Each arithmetic operation, for example addition and multiplication, |
156 the function name, like in C). Each arithmetic operation, for example |
145 are also prefixed with the type they operate on. Obviously these types |
157 addition and multiplication, are also prefixed with the type they |
146 need to match up\ldots{} but since we have in our programs only |
158 operate on. Obviously these types need to match up\ldots{} but since |
147 integers, for the moment \texttt{i32} everywhere will do. We do not have to generate |
159 we have in our programs only integers, for the moment \texttt{i32} |
148 any other types, but obviously this is a limitation in our Fun language. |
160 everywhere will do. We do not have to generate any other types, but |
|
161 obviously this is a limitation in our Fun-language (and which we |
|
162 lift in the final CW). |
149 |
163 |
150 There are a few interesting instructions in the LLVM-IR which are quite |
164 There are a few interesting instructions in the LLVM-IR which are quite |
151 different than what we have seen in the JVM. Can you remember the |
165 different than what we have seen in the JVM. Can you remember the |
152 kerfuffle we had to go through with boolean expressions and negating the |
166 kerfuffle we had to go through with boolean expressions and negating the |
153 condition? In the LLVM-IR, branching if-conditions is implemented |
167 condition? In the LLVM-IR, branching if-conditions are implemented |
154 differently: there is a separate \texttt{br}-instruction as follows: |
168 differently: there is a separate \texttt{br}-instruction as follows: |
155 |
169 |
156 \begin{lstlisting}[language=LLVM] |
170 \begin{lstlisting}[language=LLVM] |
157 br i1 %var, label %if_br, label %else_br |
171 br i1 %var, label %if_br, label %else_br |
158 \end{lstlisting} |
172 \end{lstlisting} |
159 |
173 |
160 \noindent |
174 \noindent |
161 The type \texttt{i1} stands for booleans. If the variable is true, then |
175 The type \texttt{i1} stands for booleans. If the variable is true, |
162 this instruction jumps to the if-branch, which needs an explicit label; |
176 then this instruction jumps to the if-branch, which needs an explicit |
163 otherwise it jumps to the else-branch, again with its own label. This allows us |
177 label; otherwise it jumps to the else-branch, again with its own |
164 to keep the meaning of the boolean expression ``as is'' when compiling if's---thanks god no more negating the boolean. |
178 label. This allows us to keep the meaning of the boolean expression |
|
179 ``as is'' when compiling if's---thanks god no more negating of |
|
180 booleans. |
|
181 |
165 A value of type boolean is generated in the LLVM-IR by the |
182 A value of type boolean is generated in the LLVM-IR by the |
166 \texttt{icmp}-instruction. This instruction is for integers (hence the |
183 \texttt{icmp}-instruction. This instruction is for integers (hence the |
167 \texttt{i}) and takes the comparison operation as argument. For example |
184 \texttt{i}) and takes the comparison operation as argument. For example |
168 |
185 |
169 \begin{lstlisting}[language=LLVM] |
186 \begin{lstlisting}[language=LLVM] |
212 \end{lstlisting} |
230 \end{lstlisting} |
213 |
231 |
214 \begin{figure}[t]\small |
232 \begin{figure}[t]\small |
215 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll} |
233 \lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll} |
216 \caption{An LLVM-IR program for calculating the square function. It |
234 \caption{An LLVM-IR program for calculating the square function. It |
217 calls this function in \texttt{@main} with the argument \texttt{5}. The |
235 calls the sqr-function in \texttt{@main} with the argument \texttt{5} |
218 code for the \texttt{sqr} function is in Lines 13 -- 16. The main |
236 (Line 20). The |
219 function calls \texttt{sqr} and then prints out the result. The other |
237 code for the \texttt{sqr} function is in Lines 13 -- 16. It stores |
220 code is boilerplate for printing out integers.\label{lli}} |
238 the result of sqr in the variable called \texttt{\%i} and then |
|
239 prints out the contents of this variable in Line 21. The other |
|
240 code is boilerplate for printing out integers.\label{lli}} |
221 \end{figure} |
241 \end{figure} |
222 |
242 |
223 |
243 |
224 |
244 |
225 \subsection*{Our Own Intermediate Language} |
245 \subsection*{Our Own Intermediate Language} |
226 |
246 |
227 Remember compilers have to solve the problem of bridging the gap between |
247 Let's get back to our compiler: Remember compilers have to solve the |
228 ``high-level'' programs and ``low-level'' hardware. If the gap is too |
248 problem of bridging the gap between ``high-level'' programs and |
229 wide for one step, then a good strategy is to lay a stepping stone |
249 ``low-level'' hardware. If the gap is too wide for one step, then a |
230 somewhere in between. The LLVM-IR itself is such a stepping stone to |
250 good strategy is to lay a stepping stone somewhere in between. The |
231 make the task of generating and optimising code easier. Like a real |
251 LLVM-IR itself is such a stepping stone to make the task of generating |
232 compiler we will use our own stepping stone which I call the |
252 and optimising code easier. Like a real compiler we will use our own |
233 \emph{K-language}. For what follows recall the various kinds of |
253 stepping stone which I call the \emph{K-language}. |
234 expressions in the Fun language. For convenience the Scala code of the |
254 |
235 corresponding abstract syntax trees is shown on top of |
255 \begin{center} |
236 Figure~\ref{absfun}. Below is the code for the abstract syntax trees in |
256 \begin{tikzpicture}[scale=0.9,font=\bf, |
237 the K-language. In K, here are two kinds of syntactic entities, namely |
257 node/.style={ |
238 \emph{K-values} and \emph{K-expressions}. The central constructor of the |
258 rectangle,rounded corners=3mm, |
239 K-language is \texttt{KLet}. For this recall in SSA that arithmetic |
259 ultra thick,draw=black!50,minimum height=16mm, |
240 expressions such as $((1 + a) + (3 + (b * 5)))$ need to be broken up |
260 minimum width=20mm, |
241 into smaller ``atomic'' steps, like so |
261 top color=white,bottom color=black!20}] |
|
262 |
|
263 %\node (0) at (-3,0) {}; |
|
264 \node (A) at (0.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}source}] {Fun-Language}; |
|
265 \node (B) at (3.5,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}stepping stone}] {K-Language}; |
|
266 \node (C) at (7.0,0) [node,text width=1.6cm,text centered,label=above:{\small\it{}target}] {LLVM-IR}; |
|
267 |
|
268 \draw [->,line width=2.5mm] (A) -- node [above,pos=0.35] {} (B); |
|
269 \draw [->,line width=2.5mm] (B) -- node [above,pos=0.35] {} (C); |
|
270 \end{tikzpicture} |
|
271 \end{center} |
|
272 |
|
273 \noindent |
|
274 To see why we need a stepping stone for generating code in SSA-format, |
|
275 considder again arithmetic expressions such as |
|
276 $((1 + a) + (3 + (b * 5)))$. They need to be broken up into smaller |
|
277 ``atomic'' steps, like so |
242 |
278 |
243 \begin{lstlisting}[language=LLVMIR,numbers=none] |
279 \begin{lstlisting}[language=LLVMIR,numbers=none] |
244 let tmp0 = add 1 a in |
280 let tmp0 = add 1 a in |
245 let tmp1 = mul b 5 in |
281 let tmp1 = mul b 5 in |
246 let tmp2 = add 3 tmp1 in |
282 let tmp2 = add 3 tmp1 in |
247 let tmp3 = add tmp0 tmp2 in |
283 let tmp3 = add tmp0 tmp2 in |
248 tmp3 |
284 return tmp3 |
249 \end{lstlisting} |
285 \end{lstlisting} |
250 |
286 |
251 \noindent |
287 \noindent |
252 Here \texttt{tmp3} will contain the result of what the whole expression |
288 Here \texttt{tmp3} will contain the result of what the whole |
253 stands for. In each individual step we can only perform an ``atomic'' |
289 expression stands for. In each individual step we can only perform an |
254 operation, like addition or multiplication of a number and a variable. |
290 ``atomic'' or ``trival'' operation, like addition or multiplication of |
255 We are not allowed to have for example an if-condition on the right-hand |
291 a number or a variable. We are not allowed to have for example a |
256 side of an equals. Such constraints are enforced upon us because of how |
292 nested addition or an if-condition on the right-hand side of an |
257 the SSA format works in the LLVM-IR. By having in \texttt{KLet} taking |
293 assignment. Such constraints are forced upon us because of how the SSA |
258 first a string (standing for an intermediate result) and second a value, |
294 format works in the LLVM-IR. To simplify matters we represent |
259 we can fulfil this constraint ``by construction''---there is no way we |
295 assignments with two kinds of syntactic entities, namely |
260 could write anything else than a value. |
296 \emph{K-values} and \emph{K-expressions}. K-values are all ``things'' |
|
297 that can appear on the right-hand side of an equal. Say we have |
|
298 the following definition for K-values: |
|
299 |
|
300 \begin{lstlisting}[language=Scala,numbers=none] |
|
301 enum KVal { |
|
302 case KVar(s: String) |
|
303 case KNum(n: Int) |
|
304 case KAop(op: String, v1: KVal, v2: KVal) |
|
305 case KCall(fname: String, args: List[KVal]) |
|
306 } |
|
307 \end{lstlisting} |
|
308 |
|
309 \noindent |
|
310 where a K-value can be a variable, a number or a ``trivial'' binary |
|
311 operation, such as addition or multiplication. The two arguments of a |
|
312 binary operation need to be K-values as well. Finally, we have |
|
313 function calls, but again each argument of the function call needs to |
|
314 be a K-value. One point we need to be careful, however, is that the |
|
315 arguments of the binary operations and function calls are in fact only |
|
316 variables or numbers. To encode this constraint into the type-system |
|
317 of Scala would make things too complicated---to satisfy this |
|
318 constraint is therefore on us. For our Fun-language, we will later on |
|
319 consider some further K-values. |
|
320 |
|
321 |
|
322 Our K-expressions will encode the \texttt{let} and the \texttt{return} |
|
323 from the SSA-format (again for the moment we only consider these two |
|
324 constructors---later on we will have if-branches as well). |
|
325 |
|
326 \begin{lstlisting}[language=Scala,numbers=none] |
|
327 enum KExp { |
|
328 case KLet(x: String, v: KVal, e: KExp) |
|
329 case KReturn(v: KVal) |
|
330 } |
|
331 \end{lstlisting} |
|
332 |
|
333 \noindent |
|
334 By having in \texttt{KLet} taking first a string (standing for an |
|
335 intermediate variable) and second a value, we can fulfil the SSA |
|
336 constraint in assignments ``by construction''---there is no way we |
|
337 could write anything else than a K-value. Note that the third |
|
338 argument of a \texttt{KLet} is again a K-expression, meaning either |
|
339 another \texttt{KLet} or a \texttt{KReturn}. In this way we can |
|
340 construct a sequence of computations and return a final result. Of |
|
341 course we also have to ensure that all intermediary computations are |
|
342 given (fresh) names---we will use an (ugly) counter for this. |
261 |
343 |
262 To sum up, K-values are the atomic operations that can be on the |
344 To sum up, K-values are the atomic operations that can be on the |
263 right-hand side of equal-signs. The K-language is restricted such that |
345 right-hand side of assignemnts. The K-language is restricted such that |
264 it is easy to generate the SSA format for the LLVM-IR. |
346 it is easy to generate the SSA format for the LLVM-IR. What remains to |
265 |
347 be done is a translation of our Fun-language into the K-language. The |
266 |
348 main difficulty is that we need to order the computationa---in teh |
267 |
349 K-language we only have linear sequence of instructions to need to be |
268 \begin{figure}[p]\small |
350 followed. Before we explain this, we have a look at some |
269 \begin{lstlisting}[language=Scala,numbers=none] |
351 CPS-translation. |
270 // Fun language (expressions) |
352 |
271 abstract class Exp |
|
272 abstract class BExp |
|
273 |
|
274 case class Call(name: String, args: List[Exp]) extends Exp |
|
275 case class If(a: BExp, e1: Exp, e2: Exp) extends Exp |
|
276 case class Write(e: Exp) extends Exp |
|
277 case class Var(s: String) extends Exp |
|
278 case class Num(i: Int) extends Exp |
|
279 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp |
|
280 case class Sequence(e1: Exp, e2: Exp) extends Exp |
|
281 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp |
|
282 |
|
283 |
|
284 |
|
285 // K-language (K-expressions, K-values) |
|
286 abstract class KExp |
|
287 abstract class KVal |
|
288 |
|
289 case class KVar(s: String) extends KVal |
|
290 case class KNum(i: Int) extends KVal |
|
291 case class Kop(o: String, v1: KVal, v2: KVal) extends KVal |
|
292 case class KCall(o: String, vrs: List[KVal]) extends KVal |
|
293 case class KWrite(v: KVal) extends KVal |
|
294 |
|
295 case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp |
|
296 case class KLet(x: String, v: KVal, e: KExp) extends KExp |
|
297 case class KReturn(v: KVal) extends KExp |
|
298 \end{lstlisting} |
|
299 \caption{Abstract syntax trees for the Fun language.\label{absfun}} |
|
300 \end{figure} |
|
301 |
353 |
302 |
354 |
303 |
355 |
304 \subsection*{CPS-Translations} |
356 \subsection*{CPS-Translations} |
305 |
357 |
306 CPS stands for Continuation-Passing-Style. It is a kind of programming |
358 CPS stands for Continuation-Passing-Style. It is a kind of programming |
307 technique often used in advanced functional programming. Before we delve |
359 technique often used in advanced functional programming. Before we delve |
308 into the CPS-translation for our Fun language, let us look at |
360 into the CPS-translation for our Fun-language, let us look at |
309 CPS-versions of some well-known functions. Consider |
361 CPS-versions of some well-known functions. Consider |
310 |
362 |
311 \begin{lstlisting}[language=Scala, numbers=none] |
363 \begin{lstlisting}[language=Scala, numbers=none] |
312 def fact(n: Int) : Int = |
364 def fact(n: Int) : Int = |
313 if (n == 0) 1 else n * fact(n - 1) |
365 if (n == 0) 1 else n * fact(n - 1) |
378 id((1 + 1) + 1) |
432 id((1 + 1) + 1) |
379 (1 + 1) + 1 |
433 (1 + 1) + 1 |
380 3 |
434 3 |
381 \end{lstlisting} |
435 \end{lstlisting} |
382 |
436 |
383 Let us now come back to the CPS-translations for the Fun language. The |
437 \noindent |
|
438 The point of this section is that you are playing around with these |
|
439 simple definitions and make sure they calculate the expected result. |
|
440 In the next step, you should understand \emph{how} these functions |
|
441 calculate the result with the continuations. |
|
442 |
|
443 |
|
444 \section*{Worked Example} |
|
445 |
|
446 Let us now come back to the CPS-translations for the Fun-language. The |
384 main difficulty of generating instructions in SSA format is that large |
447 main difficulty of generating instructions in SSA format is that large |
385 compound expressions need to be broken up into smaller pieces and |
448 compound expressions need to be broken up into smaller pieces and |
386 intermediate results need to be chained into later instructions. To do |
449 intermediate results need to be chained into later instructions. To do |
387 this conveniently, CPS-translations have been developed. They use |
450 this conveniently, we use the CPS-translation mechanism. What continuations |
|
451 essentially solve is the following problem: Given an expressions |
|
452 |
|
453 \begin{equation}\label{exp} |
|
454 (1 + 2) * (3 + 4) |
|
455 \end{equation} |
|
456 |
|
457 \noindent |
|
458 which of the subexpressions should be calculated first? We just arbitrarily |
|
459 going to decide that it will be from left to right. This means we have |
|
460 to tear the expression shown in \eqref{exp} apart as follows: |
|
461 |
|
462 \[ |
|
463 (1 + 2) \qquad\text{and}\qquad \Box * (3 + 4) |
|
464 \] |
|
465 |
|
466 \noindent |
|
467 The first one will give us a result, but the second one is not |
|
468 really an expression that makes sense. It will only make sense |
|
469 as the next step in the computation when we fill-in the result of |
|
470 $1+2$ into the ``hole'' indicated by $\Box$. Such incomplete |
|
471 expressions can be represented in Scala as functions |
|
472 |
|
473 \begin{lstlisting}[language=Scala, numbers=none] |
|
474 r => r * (3 + 4) |
|
475 \end{lstlisting} |
|
476 |
|
477 \noindent |
|
478 where \texttt{r} is any result that has been computed earlier. By the |
|
479 way in lambda-calculus notation we would write |
|
480 $\lambda r.\, r * (3 + 4)$ for the same function. To sum up, we use |
388 functions (``continuations'') to represent what is coming next in a |
481 functions (``continuations'') to represent what is coming next in a |
389 sequence of instructions. In our case, continuations are functions of type |
482 sequence of instructions. In our case, continuations are functions of |
390 \code{KVal} to \code{KExp}. They can be seen as a sequence of |
483 type \code{KVal} to \code{KExp}. They can be seen as a sequence of |
391 \code{KLet}s where there is a ``hole'' that needs to be filled. Consider |
484 \code{KLet}s where there is a ``hole'' that needs to be |
392 for example |
485 filled. Consider for example |
393 |
486 |
394 \begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}] |
487 \begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}] |
395 let tmp0 = add 1 a in |
488 let tmp0 = add 1 a in |
396 let tmp1 = mul (*@$\Box$@*) 5 in |
489 let tmp1 = mul (*@$\Box$@*) 5 in |
397 let tmp2 = add 3 tmp1 in |
490 let tmp2 = add 3 tmp1 in |
398 let tmp3 = add tmp0 tmp2 in |
491 let tmp3 = add tmp0 tmp2 in |
399 tmp3 |
492 retrun tmp3 |
400 \end{lstlisting} |
493 \end{lstlisting} |
401 |
494 |
402 \noindent |
495 \noindent |
403 where in the second line is a $\Box$ which still expects a \code{KVal} |
496 where in the second line is a $\Box$ which still expects a \code{KVal} |
404 to be filled in before it becomes a ``proper'' \code{KExp}. When we |
497 to be filled in before it becomes a ``proper'' \code{KExp}. When we |
405 apply an argument to the continuation (remember they are functions) |
498 apply an argument to the continuation (remember they are functions) |
406 we essentially fill something into the corresponding hole. The code |
499 we essentially fill something into the corresponding hole. |
407 of the CPS-translation is |
500 |
|
501 Lets look at concrete code. To simplify matters first, |
|
502 suppose our source language consists just of three kinds of expressions |
|
503 |
|
504 \begin{lstlisting}[language=Scala,numbers=none] |
|
505 enum Expr { |
|
506 case Num(n: Int) |
|
507 case Bop(op: String, e1: Expr, e2: Expr) |
|
508 case Call(fname: String, args: List[Expr]) |
|
509 } |
|
510 \end{lstlisting} |
|
511 |
|
512 \noindent |
|
513 The code of the CPS-translation is then of the form |
408 |
514 |
409 \begin{lstlisting}[language=Scala,numbers=none] |
515 \begin{lstlisting}[language=Scala,numbers=none] |
410 def CPS(e: Exp)(k: KVal => KExp) : KExp = |
516 def CPS(e: Exp)(k: KVal => KExp) : KExp = |
411 e match { ... } |
517 e match { ... } |
412 \end{lstlisting} |
518 \end{lstlisting} |
413 |
519 |
414 \noindent |
520 \noindent |
415 where \code{k} is the continuation and \code{e} is the expression |
521 where \code{k} is the continuation and \code{e} is the expression to |
416 to be compiled. In case we have numbers or variables, we can just |
522 be compiled. In case we have numbers, we can just pass them to the |
|
523 continuations because numbers need not be further teared apart as they |
|
524 are already atomic. Passing the number to the continuation means we |
417 apply the continuation like |
525 apply the continuation like |
418 |
526 |
419 \begin{center} |
527 \begin{lstlisting}[language=Scala,numbers=none] |
420 \code{k(KNum(n))} \qquad \code{k(KVar(x))} |
528 case Num(i) => k(KNum(i)) |
421 \end{center} |
529 \end{lstlisting} |
422 |
530 |
423 \noindent This would just fill in the $\Box$ in a \code{KLet}-expression. |
531 \noindent This would just fill in the $\Box$ in a \code{KLet}-expression. |
424 More interesting is the case for an arithmetic expression. |
532 There is no need to create a temporary variable for simple numbers. |
425 |
533 More interesting is the case for arithmetic operations. |
426 \begin{lstlisting}[language=Scala,numbers=none] |
534 |
427 case Aop(o, e1, e2) => { |
535 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] |
428 val z = Fresh("tmp") |
536 case Aop(op, e1, e2) => { |
429 CPS(e1)(y1 => |
537 val z = Fresh("tmp") |
430 CPS(e2)(y2 => KLet(z, Kop(o, y1, y2), k(KVar(z))))) |
538 CPS(e1)(r1 => |
|
539 CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z))))) |
431 } |
540 } |
432 \end{lstlisting} |
541 \end{lstlisting} |
433 |
542 |
434 \noindent |
543 \noindent |
435 For more such rules, have a look to the code of the fun-llvm |
544 What we essentially have to do in this case is the following: compile |
436 compiler. |
545 the subexpressions \texttt{e1} and \texttt{e2}. They will produce some |
|
546 result that is stored in two temporary variables (assuming they are more |
|
547 complicated than just numbers). We need to use these two temporary |
|
548 variables and feed them into a new assignment of the form |
|
549 |
|
550 \begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}] |
|
551 let z = op (*@$\Box_\texttt{r1}$@*) (*@$\Box_\texttt{r2}$@*) in |
|
552 ... |
|
553 \end{lstlisting} |
|
554 |
|
555 \noindent |
|
556 Note that this assignment has two ``holes'', one for the left |
|
557 subexpression and one for the right subexpression. We can fill both |
|
558 holes by calling the CPS function twice---this is a bit of the black |
|
559 art in CPS. We can use the second call of CPS as the continuation |
|
560 of the first call: The reason is that |
|
561 |
|
562 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] |
|
563 r1 => CPS(e2)(r2 => KLet(z, KAop(op, r1, r2), k(KVar(z)))) |
|
564 \end{lstlisting} |
|
565 |
|
566 \noindent is a function from \code{KVal} to \code{KExp}. Once |
|
567 we created the assignment with the fresh temporary variable |
|
568 \texttt{z}, we need to ``communicate'' that the result of the |
|
569 computation of the arithmetic expression is stored in \texttt{z} to the |
|
570 computations that follow. In this way we apply the continuation \texttt{k} with this |
|
571 new variable (essentially we are plugging in a hole further down the line). |
|
572 |
|
573 The last case we need to consider in our small expression language are |
|
574 function calls. |
|
575 |
|
576 \begin{lstlisting}[language=Scala,numbers=none,xleftmargin=0mm] |
|
577 case Call(fname, args) => { |
|
578 def aux(args: List[Expr], vs: List[KVal]): KExp = args match { |
|
579 case Nil => { |
|
580 val z = Fresh("tmp") |
|
581 KLet(z, KCall(fname, vs), k(KVar(z))) |
|
582 } |
|
583 case a::as => CPS(a)(r => aux(as, vs ::: List(r))) |
|
584 } |
|
585 aux(args, Nil) |
|
586 } |
|
587 \end{lstlisting} |
|
588 |
|
589 \noindent |
|
590 For them we introduce an auxiliary function that compiles first all |
|
591 function arguments---remember in our source language we can have calls |
|
592 such as $foo(3 + 4, g(3))$ where we first have to create temporary |
|
593 variables (and computations) for each argument. Therefore \texttt{aux} |
|
594 analyses the arguments from left to right. In case there is an argument |
|
595 \texttt{a} on the front of the list (the case \texttt{a::as}), we call |
|
596 CPS recursively for the corresponding subexpression. The temporary variable |
|
597 containing the result for this argument we add to the end of the K-values we |
|
598 have already analysed before. Again very conveniently we can use the |
|
599 recursive call to \texttt{aux} as the continuation for the computations |
|
600 that follow. If we reach the end of the argument list (the \texttt{Nil}-case), |
|
601 then we collect all the K-values \texttt{v1} to \texttt{vn} and call the |
|
602 function in the K-language like so |
|
603 |
|
604 |
|
605 \begin{lstlisting}[language=LLVMIR,numbers=none,escapeinside={(*@}{@*)}] |
|
606 let z = call foo(v1,...,vn) in |
|
607 ... |
|
608 \end{lstlisting} |
|
609 |
|
610 \noindent |
|
611 Again we need to communicate the result of the function call, namely the |
|
612 fresh temporary variable \texttt{z}, to the rest of the computation. Therefore |
|
613 we apply the continuation \texttt{k} with this variable. |
|
614 |
|
615 |
|
616 |
|
617 \begin{figure}[p]\small |
|
618 \lstinputlisting[numbers=left,firstline=1,lastline=24]{../progs/fun/simple-cps.sc} |
|
619 \hspace{10mm}... |
|
620 \lstinputlisting[numbers=left,firstline=32,firstnumber=32,lastline=51]{../progs/fun/simple-cps.sc} |
|
621 \caption{CPS-translation for a simple expression language.\label{cps}} |
|
622 \end{figure} |
|
623 |
|
624 The last question we need to answer in the code (see Figure~\ref{cps}) is how we |
|
625 start the CPS-translation function, or more precisely with which continuation we |
|
626 should start CPS? It turns out that this initial continuation will be the last one that is |
|
627 called. What should be the last step in the computation? Yes, we need to return the |
|
628 temporary variable where the last result is stored (if it is a simple number, then we can |
|
629 just return this number). Therefore we cal CPS with the code |
|
630 |
|
631 \begin{lstlisting}[language=Scala,numbers=none] |
|
632 def CPSi(e: Expr) : KExp = CPS(e)(KReturn(_)) |
|
633 \end{lstlisting} |
|
634 |
|
635 \noindent |
|
636 where we give the function \code{KReturn(_)}. |
|
637 |
|
638 |
|
639 \begin{figure}[p]\small |
|
640 \begin{lstlisting}[language=Scala,numbers=none] |
|
641 // Fun language (expressions) |
|
642 abstract class Exp |
|
643 abstract class BExp |
|
644 |
|
645 case class Call(name: String, args: List[Exp]) extends Exp |
|
646 case class If(a: BExp, e1: Exp, e2: Exp) extends Exp |
|
647 case class Write(e: Exp) extends Exp |
|
648 case class Var(s: String) extends Exp |
|
649 case class Num(i: Int) extends Exp |
|
650 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp |
|
651 case class Sequence(e1: Exp, e2: Exp) extends Exp |
|
652 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp |
|
653 |
|
654 |
|
655 |
|
656 // K-language (K-expressions, K-values) |
|
657 abstract class KExp |
|
658 abstract class KVal |
|
659 |
|
660 case class KVar(s: String) extends KVal |
|
661 case class KNum(i: Int) extends KVal |
|
662 case class Kop(o: String, v1: KVal, v2: KVal) extends KVal |
|
663 case class KCall(o: String, vrs: List[KVal]) extends KVal |
|
664 case class KWrite(v: KVal) extends KVal |
|
665 |
|
666 case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp |
|
667 case class KLet(x: String, v: KVal, e: KExp) extends KExp |
|
668 case class KReturn(v: KVal) extends KExp |
|
669 \end{lstlisting} |
|
670 \caption{Abstract syntax trees for the Fun-language.\label{absfun}} |
|
671 \end{figure} |
|
672 |
|
673 |
437 |
674 |
438 \noindent |
675 \noindent |
439 \end{document} |
676 \end{document} |
440 |
677 |
441 |
678 |