56 |
56 |
57 def gcd(a, b) = if b == 0 then a else gcd(b, a % b); |
57 def gcd(a, b) = if b == 0 then a else gcd(b, a % b); |
58 \end{lstlisting} |
58 \end{lstlisting} |
59 |
59 |
60 \noindent Compare the code of the fib-program with the same program |
60 \noindent Compare the code of the fib-program with the same program |
61 written in the While-language\ldots Fun is definitely more comfortable. |
61 written in the WHILE-language\ldots Fun is definitely more comfortable. |
62 We will still focus on programs involving integers only, that means for |
62 We will still focus on programs involving integers only, that means for |
63 example that every function in Fun is expected to return an integer. The |
63 example that every function in Fun is expected to return an integer. The |
64 point of the Fun language is to compile each function to a separate |
64 point of the Fun language is to compile each function to a separate |
65 method in JVM bytecode (not just a big monolithic code chunk). The means |
65 method in JVM bytecode (not just a big monolithic code chunk). The means |
66 we need to adapt to some of the conventions of the JVM about methods. |
66 we need to adapt to some of the conventions of the JVM about methods. |
67 |
67 |
68 The grammar of the Fun-language is slightly simpler than the |
68 The grammar of the Fun-language is slightly simpler than the |
69 While-language, because the main syntactic category are |
69 WHILE-language, because the main syntactic category are |
70 expressions (we do not have statements in Fun). The grammar rules are |
70 expressions (we do not have statements in Fun). The grammar rules are |
71 as follows:\footnote{We of course have a slightly different (non-left-recursive) |
71 as follows:\footnote{We of course have a slightly different (non-left-recursive) |
72 grammar for our parsing combinators. But for simplicity sake we leave |
72 grammar for our parsing combinators. But for simplicity sake we leave |
73 these details to the implementation.} |
73 these details to the implementation.} |
74 |
74 |
133 args: List[String], |
133 args: List[String], |
134 body: Exp) extends Decl |
134 body: Exp) extends Decl |
135 case class Main(e: Exp) extends Decl |
135 case class Main(e: Exp) extends Decl |
136 \end{lstlisting}} |
136 \end{lstlisting}} |
137 |
137 |
138 Let us first look at some clauses for compiling expressions. The |
138 The rest of the hand out is about compiling this language. Let us first |
139 compilation of arithmetic and boolean expressions is just like for the |
139 look at some clauses for compiling expressions. The compilation of |
140 While-language and does not need any modification (recall that the |
140 arithmetic and boolean expressions is just like for the WHILE-language |
|
141 and does not need any modification (recall that the |
141 \textit{compile}-function for boolean expressions takes a third argument |
142 \textit{compile}-function for boolean expressions takes a third argument |
142 for the label where the control-flow should jump when the boolean |
143 for the label where the control-flow should jump when the boolean |
143 expression is \emph{not} true---this is needed for compiling |
144 expression is \emph{not} true---this is needed for compiling |
144 \pcode{if}s). One additional feature in the Fun-language are sequences. |
145 \pcode{if}s). One additional feature in the Fun-language are sequences. |
145 Their purpose is to do one calculation after another or printing out an |
146 Their purpose is to do one calculation after another or printing out an |
146 intermediate result. The reason why we need to be careful however is the |
147 intermediate result. The reason why we need to be careful however is the |
147 convention that every expression can only produce s single result |
148 convention that every expression can only produce a single result |
148 (including sequences). Since this result will be on the top of the |
149 (including sequences). Since this result will be on the top of the |
149 stack, we need to generate a \pcode{pop}-instruction for sequences in |
150 stack, we need to generate a \pcode{pop}-instruction for sequences in |
150 order to clean-up the stack. For example, for an expression of the form |
151 order to clean-up the stack. For example, for an expression of the form |
151 \pcode{exp1 ; exp2} we need to generate code where after the first code |
152 \pcode{exp1 ; exp2} we need to generate code where after the first code |
152 chunk a \pcode{pop}-instruction is needed. |
153 chunk a \pcode{pop}-instruction is needed. |
193 dup |
194 dup |
194 invokestatic XXX/XXX/write(I)V |
195 invokestatic XXX/XXX/write(I)V |
195 \end{lstlisting} |
196 \end{lstlisting} |
196 |
197 |
197 \noindent We also need to first generate code for the |
198 \noindent We also need to first generate code for the |
198 argument-expression of write, which in the While-language was |
199 argument-expression of write, which in the WHILE-language was |
199 only allowed to be a single variable. |
200 only allowed to be a single variable. |
200 |
201 |
201 Most of the new code in the compiler for the Fun-language |
202 Most of the new code in the compiler for the Fun-language |
202 comes from function definitions and function calls. For this |
203 comes from function definitions and function calls. For this |
203 have a look again at the helper function in |
204 have a look again at the helper function in |
345 the result on top of the stack as the result of the |
346 the result on top of the stack as the result of the |
346 add-function. |
347 add-function. |
347 |
348 |
348 \subsection*{Tail-Call Optimisations} |
349 \subsection*{Tail-Call Optimisations} |
349 |
350 |
350 Let us now briefly touch upon the vast topic of compiler optimisations. |
351 Let us now briefly touch again upon the vast topic of compiler |
351 As an example, let's perform tail-call optimisations for our |
352 optimisations. As an example, let's perform tail-call optimisations for |
352 Fun-language. Consider the following version of the factorial function: |
353 our Fun-language. Consider the following version of the factorial |
|
354 function: |
353 |
355 |
354 \begin{lstlisting} |
356 \begin{lstlisting} |
355 def facT(n, acc) = |
357 def facT(n, acc) = |
356 if n == 0 then acc |
358 if n == 0 then acc |
357 else facT(n - 1, n * acc); |
359 else facT(n - 1, n * acc); |
358 \end{lstlisting} |
360 \end{lstlisting} |
359 |
361 |
360 \noindent |
362 \noindent |
361 The corrsponding JVM code for this function is below: |
363 The corresponding JVM code for this function is below: |
362 |
364 |
363 \begin{lstlisting}[language=JVMIS, numbers=left] |
365 \begin{lstlisting}[language=JVMIS, numbers=left] |
364 .method public static facT(II)I |
366 .method public static facT(II)I |
365 .limit locals 2 |
367 .limit locals 2 |
366 .limit stack 6 |
368 .limit stack 6 |
390 onto the stack. That is how we communicate arguments to a function. To |
392 onto the stack. That is how we communicate arguments to a function. To |
391 see the the difficulty, imagine you call this function 1000 times |
393 see the the difficulty, imagine you call this function 1000 times |
392 recursively. Each call results in some hefty overhead on the |
394 recursively. Each call results in some hefty overhead on the |
393 stack---ultimately leading to a stack overflow. Well, it is possible to |
395 stack---ultimately leading to a stack overflow. Well, it is possible to |
394 avoid this overhead completely in many circumstances. This is what |
396 avoid this overhead completely in many circumstances. This is what |
395 tail-call optimisations are about. |
397 \emph{tail-call optimisations} are about. |
396 |
398 |
397 Note that the call to \code{facT} in the program is the last instruction |
399 Note that the call to \code{facT} in the program is the last instruction |
398 before the \code{ireturn} (the label in Line 17 does not count since it |
400 before the \code{ireturn} (the label in Line 17 does not count since it |
399 is not an instruction). Also remember, before we make the recursive call |
401 is not an instruction). Also remember, before we make the recursive call |
400 the arguments of \code{facT} need to put on the stack. Once we are |
402 the arguments of \code{facT} need to be put on the stack. Once we are |
401 ``inside'' the arguments on the stack turn into local variables. Therefore |
403 ``inside'' the function, the arguments on the stack turn into local |
|
404 variables. Therefore |
402 \code{n} and \code{acc} are referenced inside the function with \pcode{iload 0} |
405 \code{n} and \code{acc} are referenced inside the function with \pcode{iload 0} |
403 and \pcode{iload 1} respectively. |
406 and \pcode{iload 1} respectively. |
404 |
407 |
405 The idea of tail-call optimisation is to eliminate the expensive |
408 The idea of tail-call optimisation is to eliminate the expensive |
406 recursive functions call and replace it by a simple jump back to the |
409 recursive functions call and replace it by a simple jump back to the |