# HG changeset patch # User Christian Urban # Date 1574034834 0 # Node ID 605d971e98fd175436f1bbacba7c09bf66b3ff91 # Parent 8c7ccdebcb89ab07e914f5675c70198705f513be updated diff -r 8c7ccdebcb89 -r 605d971e98fd handouts/ho08.pdf Binary file handouts/ho08.pdf has changed diff -r 8c7ccdebcb89 -r 605d971e98fd handouts/ho08.tex --- a/handouts/ho08.tex Sun Nov 17 16:12:16 2019 +0000 +++ b/handouts/ho08.tex Sun Nov 17 23:53:54 2019 +0000 @@ -5,6 +5,17 @@ \usepackage{../grammar} \usepackage{../graphics} \usepackage{amssymb} +\usepackage{xcolor} + +\usepackage[most]{tcolorbox} + +\newtcbox{\hm}[1][]{% + size=fbox, + tcbox raise base, nobeforeafter, + enhanced, colframe=gray, + colback=gray!30, boxrule=1pt, + fontupper=\ttfamily, + #1} % modern compilers are different % https://channel9.msdn.com/Blogs/Seth-Juarez/Anders-Hejlsberg-on-Modern-Compiler-Construction @@ -13,6 +24,8 @@ %https://dfilaretti.github.io/2017-04-30/halting-problem + + \begin{document} \lstset{morekeywords={def,if,then,else,write,read},keywordstyle=\color{codepurple}\bfseries} @@ -26,7 +39,7 @@ inside the main function. In this handout we like to have a look at a slightly more comfortable language, which I call Fun-language, and a tiny-teeny bit more realistic compiler. -The Fun-language is a functional programming language. A small +The Fun-language is a small functional programming language. A small collection of programs we want to be able to write and compile is as follows: @@ -44,15 +57,13 @@ def gcd(a, b) = if b == 0 then a else gcd(b, a % b); \end{lstlisting} -\noindent Compare the code of the fib-program with the same -program written in the While-language\ldots Fun is definitely -more comfortable. We will still focus on programs involving -integers only, that means for example that every function is -expected to return an integer. The point of the Fun language -is to compile each function to a separate method in JVM -bytecode (not just a big monolithic code chunk). The means we -need to adapt to some of the conventions of the JVM about -methods. +\noindent Compare the code of the fib-program with the same program +written in the While-language\ldots Fun is definitely more comfortable. +We will still focus on programs involving integers only, that means for +example that every function in Fun is expected to return an integer. The +point of the Fun language is to compile each function to a separate +method in JVM bytecode (not just a big monolithic code chunk). The means +we need to adapt to some of the conventions of the JVM about methods. The grammar of the Fun-language is slightly simpler than the While-language, because the main syntactic category are @@ -63,7 +74,7 @@ \begin{plstx}[rhs style=,margin=1.5cm] : \meta{Exp} ::= \meta{Id} | \meta{Num} {\hspace{3.7cm}} | \meta{Exp} + \meta{Exp} | ... | (\meta{Exp}) {\hspace{3.7cm}} - | \code{if} \meta{BExp} \code{then} \meta{Exp} \code{else} \meta{Exp} + | \code{if} \; \meta{BExp} \; \code{then} \; \meta{Exp} \; \code{else} \; \meta{Exp} | \code{write} \meta{Exp} {\hspace{5cm}} | \meta{Exp} ; \meta{Exp} {\hspace{5cm}} | \textit{FunName} (\meta{Exp}, ..., \meta{Exp})\\ @@ -120,23 +131,21 @@ case class Main(e: Exp) extends Decl \end{lstlisting}} -Let us first look at some clauses for compiling expressions. -The compilation of arithmetic and boolean expressions is just -like for the While-language and do not need any modification -(recall that the \textit{compile}-function for boolean -expression takes a third argument for the label where the -control-flow should jump when the boolean expression is -\emph{not} true---this is needed for compiling \pcode{if}s). -One additional feature in the Fun-language are sequences. -Their purpose is to do one calculation after another. The -reason why we need to be careful however is the convention -that every expression can only produce s single result -(including sequences). Since this result will be on the -top of the stack, we need to generate a -\pcode{pop}-instruction in order to clean-up the stack. Given -the expression of the form \pcode{exp1 ; exp2} we need to -generate code where after the first code chunk -a \pcode{pop}-instruction is needed. +Let us first look at some clauses for compiling expressions. The +compilation of arithmetic and boolean expressions is just like for the +While-language and does not need any modification (recall that the +\textit{compile}-function for boolean expressions takes a third argument +for the label where the control-flow should jump when the boolean +expression is \emph{not} true---this is needed for compiling +\pcode{if}s). One additional feature in the Fun-language are sequences. +Their purpose is to do one calculation after another or printing out an +intermediate result. The reason why we need to be careful however is the +convention that every expression can only produce s single result +(including sequences). Since this result will be on the top of the +stack, we need to generate a \pcode{pop}-instruction for sequences in +order to clean-up the stack. For example, for an expression of the form +\pcode{exp1 ; exp2} we need to generate code where after the first code +chunk a \pcode{pop}-instruction is needed. \begin{lstlisting}[language=JVMIS, numbers=none,mathescape] $\textrm{\textit{compile}}($exp1$)$ @@ -332,6 +341,135 @@ the result on top of the stack as the result of the add-function. +\subsection*{Tail-Call Optimisations} + +Let us now briefly touch upon the vast topic of compiler optimisations. +As an example, let's perform tail-call optimisations for our +Fun-language. Consider the following version of the factorial function: + +\begin{lstlisting} +def facT(n, acc) = + if n == 0 then acc + else facT(n - 1, n * acc); +\end{lstlisting} + +\noindent +The corrsponding JVM code for this function is below: + +\begin{lstlisting}[language=JVMIS, numbers=left] +.method public static facT(II)I +.limit locals 2 +.limit stack 6 + iload 0 + ldc 0 + if_icmpne If_else_2 + iload 1 + goto If_end_3 +If_else_2: + iload 0 + ldc 1 + isub + iload 0 + iload 1 + imul + invokestatic fact/fact/facT(II)I +If_end_3: + ireturn +.end method +\end{lstlisting} + +\noindent +The interesting part is in Lines 10 to 16. Since the \code{facT} +function is recursive, we have also a recursive call in Line 16 in the +JVM code. The problem is that before we can make the recursive call, we +need to put the two arguments, namely \code{n - 1} and \code{n * acc}, +onto the stack. That is how we communicate arguments to a function. To +see the the difficulty, imagine you call this function 1000 times +recursively. Each call results in some hefty overhead on the +stack---ultimately leading to a stack overflow. Well, it is possible to +avoid this overhead completely in many circumstances. This is what +tail-call optimisations are about. + +Note that the call to \code{facT} in the program is the last instruction +before the \code{ireturn} (the label in Line 17 does not count since it +is not an instruction). Also remember, before we make the recursive call +the arguments of \code{facT} need to put on the stack. Once we are +``inside'' the arguments on the stack turn into local variables. Therefore +\code{n} and \code{acc} are referenced inside the function with \pcode{iload 0} +and \pcode{iload 1} respectively. + +The idea of tail-call optimisation is to eliminate the expensive recursive +functions call and replace it by a simple jump back to the beginning of the +function. To make this work we have to change how we communicate the arguments +to the next level of ``recursion'': we cannot use the stack, but have to load +the argument into the corresponding local variables. This gives the following +code + +\begin{lstlisting}[language=JVMIS, numbers=left, escapeinside={(*@}{@*)}] +.method public static facT(II)I +.limit locals 2 +.limit stack 6 +(*@\hm{facT\_Start:} @*) + iload 0 + ldc 0 + if_icmpne If_else_2 + iload 1 + goto If_end_3 +If_else_2: + iload 0 + ldc 1 + isub + iload 0 + iload 1 + imul + (*@\hm{istore 1} @*) + (*@\hm{istore 0} @*) + (*@\hm{goto facT\_Start} @*) +If_end_3: + ireturn +.end method +\end{lstlisting} + +\noindent +In Line 4 we introduce a label for indicating where the start of the +function is. Important are Lines 17 and 18 where we store the values +from the stack into local variables. When we then jump back to the +beginning of the function (in Line 19) it will look like to the +function as if the function had been called the normal way via values +on the stack. But because of the jump, clearly, no memory on the +stack is needed. In effect we replaced a recursive call with a simple +loop. + +Can this optimisation always be applied? Unfortunately not. The +recursive call needs to be in tail-call position, that is the last +operation needs to be the recursive call. This is for example +not the case with the usual formulation of the factorial function. +Consider again the Fun-program + +\begin{lstlisting}[numbers=none] +def fact(n) = if n == 0 then 1 + else n * fact(n - 1) +\end{lstlisting} + +\noindent +In this version of the factorial function the recursive call is +\emph{not} the last operation (which can also been seen very clearly +in the generated JVM code). Because of this, the plumbing of local +variables would not work and in effect the optimisation is not applicable. +Very roughly speaking the tail-position of a function is in the two +highlighted places + +\begin{itemize} +\item \texttt{if Bexp then \hm{Exp} else \hm{Exp}} +\item \texttt{Exp ; \hm{Exp}} +\end{itemize} + +To sum up, the compiler needs to recognise when a recursive call +is in tail-position. It then can apply the tail-call optimisations +technique, which is well known and widely implemented in compilers +for functional programming languages. + + \end{document} %%% Local Variables: diff -r 8c7ccdebcb89 -r 605d971e98fd progs/compile2.scala --- a/progs/compile2.scala Sun Nov 17 16:12:16 2019 +0000 +++ b/progs/compile2.scala Sun Nov 17 23:53:54 2019 +0000 @@ -514,9 +514,6 @@ bf_run("""++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++ ..+++.>>.<-.<.+++.------.--------.>>+.>++.""", "hello") -println("BF SIER Prog") -println(bf_str(bf1).replaceAll("\\s", "").split(";").toList.length) - val bf2 = """+++++++++++ >+>>>>++++++++++++++++++++++++++++++++++++++++++++ @@ -532,10 +529,6 @@ bf_run(bf2, "fibs") -println("BF FIB Prog") -println(bf_str(bf2).replaceAll("\\s", "").split(";").toList.length) - -/* bf_run(""" A mandelbrot set fractal viewer in brainf*** written by Erik Bosman +++++++++++++[->++>>>+++++>++>+<<<<<<]>>>>>++++++>--->>>>>>>>>>+++++++++++++++[[ @@ -683,5 +676,3 @@ +[-[->>>>>>>>>+<<<<<<<<<]>>>>>>>>>]>>>>>->>>>>>>>>>>>>>>>>>>>>>>>>>>-<<<<<<[<<<< <<<<<]]>>>]""", "mand") - -*/