handouts/ho07.tex
changeset 711 6f3f3dd01786
parent 710 183663740fb7
child 712 e71eb9ce2373
equal deleted inserted replaced
710:183663740fb7 711:6f3f3dd01786
    89 \noindent The first instruction loads the constant $1$ onto the stack,
    89 \noindent The first instruction loads the constant $1$ onto the stack,
    90 the next one loads $2$, the third instruction adds both numbers together
    90 the next one loads $2$, the third instruction adds both numbers together
    91 replacing the top two elements of the stack with the result $3$. For
    91 replacing the top two elements of the stack with the result $3$. For
    92 simplicity, we will consider throughout only arithmetic involving
    92 simplicity, we will consider throughout only arithmetic involving
    93 integer numbers. This means our main JVM instructions for arithmetic
    93 integer numbers. This means our main JVM instructions for arithmetic
    94 will be \code{iadd}, \code{isub}, \code{imul}, \code{idiv} and so on.
    94 will be \instr{iadd}, \instr{isub}, \instr{imul}, \instr{idiv} and so on.
    95 The \code{i} stands for integer instructions in the JVM (alternatives
    95 The \code{i} stands for integer instructions in the JVM (alternatives
    96 are \code{d} for doubles, \code{l} for longs and \code{f} for floats
    96 are \code{d} for doubles, \code{l} for longs and \code{f} for floats
    97 etc).
    97 etc).
    98 
    98 
    99 Recall our grammar for arithmetic expressions (\meta{E} is the
    99 Recall our grammar for arithmetic expressions (\meta{E} is the
   156 \textit{compile}-function, which takes the abstract syntax tree as an
   156 \textit{compile}-function, which takes the abstract syntax tree as an
   157 argument:
   157 argument:
   158 
   158 
   159 \begin{center}
   159 \begin{center}
   160 \begin{tabular}{lcl}
   160 \begin{tabular}{lcl}
   161 $\textit{compile}(n)$ & $\dn$ & $\pcode{ldc}\; n$\\
   161 $\textit{compile}(n)$ & $\dn$ & $\instr{ldc}\; n$\\
   162 $\textit{compile}(a_1 + a_2)$ & $\dn$ &
   162 $\textit{compile}(a_1 + a_2)$ & $\dn$ &
   163 $\textit{compile}(a_1) \;@\;\textit{compile}(a_2)\;@\; \pcode{iadd}$\\
   163 $\textit{compile}(a_1) \;@\;\textit{compile}(a_2)\;@\; \instr{iadd}$\\
   164 $\textit{compile}(a_1 - a_2)$ & $\dn$ & 
   164 $\textit{compile}(a_1 - a_2)$ & $\dn$ & 
   165 $\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{isub}$\\
   165 $\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \instr{isub}$\\
   166 $\textit{compile}(a_1 * a_2)$ & $\dn$ & 
   166 $\textit{compile}(a_1 * a_2)$ & $\dn$ & 
   167 $\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{imul}$\\
   167 $\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \instr{imul}$\\
   168 $\textit{compile}(a_1 \backslash a_2)$ & $\dn$ & 
   168 $\textit{compile}(a_1 \backslash a_2)$ & $\dn$ & 
   169 $\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{idiv}$\\
   169 $\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \instr{idiv}$\\
   170 \end{tabular}
   170 \end{tabular}
   171 \end{center}
   171 \end{center}
   172 
   172 
   173 \noindent
   173 \noindent
   174 This is all fine, but our arithmetic expressions can contain variables
   174 This is all fine, but our arithmetic expressions can contain variables
   202 expressions will therefore take two arguments: the abstract syntax tree
   202 expressions will therefore take two arguments: the abstract syntax tree
   203 and an environment, $E$, that maps identifiers to index-numbers.
   203 and an environment, $E$, that maps identifiers to index-numbers.
   204 
   204 
   205 \begin{center}
   205 \begin{center}
   206 \begin{tabular}{lcl}
   206 \begin{tabular}{lcl}
   207 $\textit{compile}(n, E)$ & $\dn$ & $\pcode{ldc}\;n$\\
   207 $\textit{compile}(n, E)$ & $\dn$ & $\instr{ldc}\;n$\\
   208 $\textit{compile}(a_1 + a_2, E)$ & $\dn$ & 
   208 $\textit{compile}(a_1 + a_2, E)$ & $\dn$ & 
   209 $\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \pcode{iadd}$\\
   209 $\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \instr{iadd}$\\
   210 $\textit{compile}(a_1 - a_2, E)$ & $\dn$ &
   210 $\textit{compile}(a_1 - a_2, E)$ & $\dn$ &
   211 $\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{isub}$\\
   211 $\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \instr{isub}$\\
   212 $\textit{compile}(a_1 * a_2, E)$ & $\dn$ &
   212 $\textit{compile}(a_1 * a_2, E)$ & $\dn$ &
   213 $\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{imul}$\\
   213 $\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \instr{imul}$\\
   214 $\textit{compile}(a_1 \backslash a_2, E)$ & $\dn$ & 
   214 $\textit{compile}(a_1 \backslash a_2, E)$ & $\dn$ & 
   215 $\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{idiv}$\\
   215 $\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \instr{idiv}$\\
   216 $\textit{compile}(x, E)$ & $\dn$ & $\pcode{iload}\;E(x)$\\
   216 $\textit{compile}(x, E)$ & $\dn$ & $\instr{iload}\;E(x)$\\
   217 \end{tabular}
   217 \end{tabular}
   218 \end{center}
   218 \end{center}
   219 
   219 
   220 \noindent In the last line we generate the code for variables where
   220 \noindent In the last line we generate the code for variables where
   221 $E(x)$ stands for looking up the environment to which index the variable
   221 $E(x)$ stands for looking up the environment to which index the variable
   251 $\textit{x := a}$:
   251 $\textit{x := a}$:
   252 
   252 
   253 \begin{center}
   253 \begin{center}
   254 \begin{tabular}{lcl}
   254 \begin{tabular}{lcl}
   255 $\textit{compile}(x := a, E)$ & $\dn$ & 
   255 $\textit{compile}(x := a, E)$ & $\dn$ & 
   256 $(\textit{compile}(a, E) \;@\;\pcode{istore}\;index, E')$
   256 $(\textit{compile}(a, E) \;@\;\instr{istore}\;index, E')$
   257 \end{tabular}
   257 \end{tabular}
   258 \end{center}
   258 \end{center}
   259 
   259 
   260 \noindent We first generate code for the right-hand side of the
   260 \noindent We first generate code for the right-hand side of the
   261 assignment (that is the arithmetic expression $a$) and then add an
   261 assignment (that is the arithmetic expression $a$) and then add an
   262 \pcode{istore}-instruction at the end. By convention running the code
   262 \instr{istore}-instruction at the end. By convention running the code
   263 for the arithmetic expression $a$ will leave the result on top of the
   263 for the arithmetic expression $a$ will leave the result on top of the
   264 stack. After that the \pcode{istore} instruction, the result will be
   264 stack. After that the \instr{istore} instruction, the result will be
   265 stored in the index corresponding to the variable $x$. If the variable
   265 stored in the index corresponding to the variable $x$. If the variable
   266 $x$ has been used before in the program, we just need to look up what
   266 $x$ has been used before in the program, we just need to look up what
   267 the index is and return the environment unchanged (that is in this case
   267 the index is and return the environment unchanged (that is in this case
   268 $E' = E$). However, if this is the first encounter of the variable $x$
   268 $E' = E$). However, if this is the first encounter of the variable $x$
   269 in the program, then we have to augment the environment and assign $x$
   269 in the program, then we have to augment the environment and assign $x$
   297 where we can store values of variables.
   297 where we can store values of variables.
   298 
   298 
   299 A bit more complicated is the generation of code for
   299 A bit more complicated is the generation of code for
   300 \pcode{if}-statements, say
   300 \pcode{if}-statements, say
   301 
   301 
   302 \begin{lstlisting}[mathescape,language={},numbers=none]
   302 \begin{lstlisting}[mathescape,language={WHILE},numbers=none]
   303 if $b$ then $cs_1$ else $cs_2$
   303 if $b$ then $cs_1$ else $cs_2$
   304 \end{lstlisting}
   304 \end{lstlisting}
   305 
   305 
   306 \noindent where $b$ is a boolean expression and where both $cs_{1/2}$
   306 \noindent where $b$ is a boolean expression and where both $cs_{1/2}$
   307 are the statements for each of the \pcode{if}-branches. Let us assume we
   307 are the statements for each of the \pcode{if}-branches. Let us assume we
   336 $cs_1$. After this however, we must not run the code for
   336 $cs_1$. After this however, we must not run the code for
   337 $cs_2$, but always jump to after the last instruction of $cs_2$
   337 $cs_2$, but always jump to after the last instruction of $cs_2$
   338 (the code for the \pcode{else}-branch). Note that this jump is
   338 (the code for the \pcode{else}-branch). Note that this jump is
   339 unconditional, meaning we always have to jump to the end of
   339 unconditional, meaning we always have to jump to the end of
   340 $cs_2$. The corresponding instruction of the JVM is
   340 $cs_2$. The corresponding instruction of the JVM is
   341 \pcode{goto}. In case $b$ turns out to be false we need the
   341 \instr{goto}. In case $b$ turns out to be false we need the
   342 control-flow
   342 control-flow
   343 
   343 
   344 \begin{center}
   344 \begin{center}
   345 \begin{tikzpicture}[node distance=2mm and 4mm,line cap=round,
   345 \begin{tikzpicture}[node distance=2mm and 4mm,line cap=round,
   346  block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm,
   346  block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm,
   368 if-condition is false) from the end of the code for the 
   368 if-condition is false) from the end of the code for the 
   369 boolean to the beginning of the instructions $cs_2$. Once we 
   369 boolean to the beginning of the instructions $cs_2$. Once we 
   370 are finished with running $cs_2$ we can continue with whatever
   370 are finished with running $cs_2$ we can continue with whatever
   371 code comes after the if-statement.
   371 code comes after the if-statement.
   372 
   372 
   373 The \pcode{goto} and the conditional jumps need addresses to
   373 The \instr{goto} and the conditional jumps need addresses to
   374 where the jump should go. Since we are generating assembly
   374 where the jump should go. Since we are generating assembly
   375 code for the JVM, we do not actually have to give (numeric)
   375 code for the JVM, we do not actually have to give (numeric)
   376 addresses, but can just attach (symbolic) labels to our code.
   376 addresses, but can just attach (symbolic) labels to our code.
   377 These labels specify a target for a jump. Therefore the labels
   377 These labels specify a target for a jump. Therefore the labels
   378 need to be unique, as otherwise it would be ambiguous where a
   378 need to be unique, as otherwise it would be ambiguous where a
   379 jump should go to. A label, say \pcode{L}, is attached to code
   379 jump should go to. A label, say \pcode{L}, is attached to code
   380 like
   380 like
   381 
   381 
   382 \begin{lstlisting}[mathescape,numbers=none]
   382 \begin{lstlisting}[mathescape,numbers=none]
   383 L:
   383 L:
   384   $instr_1$
   384   $\textit{instr\_1}$
   385   $instr_2$
   385   $\textit{instr\_2}$
   386     $\vdots$
   386     $\vdots$
   387 \end{lstlisting}
   387 \end{lstlisting}
   388  
   388  
   389 \noindent where the label needs to be followed by a colon. The task of
   389 \noindent where the label needs to be followed by a colon. The task of
   390 the assembler (in our case Jasmin or Krakatau) is to resolve the labels
   390 the assembler (in our case Jasmin or Krakatau) is to resolve the labels
   399 for example, is as follows:
   399 for example, is as follows:
   400 
   400 
   401 \begin{center}
   401 \begin{center}
   402 \begin{tabular}{lcl}
   402 \begin{tabular}{lcl}
   403 $\textit{compile}(a_1 = a_2, E, lab)$ & $\dn$\\ 
   403 $\textit{compile}(a_1 = a_2, E, lab)$ & $\dn$\\ 
   404 \multicolumn{3}{l}{$\qquad\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \pcode{if_icmpne}\;lab$}
   404 \multicolumn{3}{l}{$\qquad\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \instr{if_icmpne}\;lab$}
   405 \end{tabular}
   405 \end{tabular}
   406 \end{center}
   406 \end{center}
   407 
   407 
   408 \noindent where we are first generating code for the
   408 \noindent where we are first generating code for the
   409 subexpressions $a_1$ and $a_2$. This will mean after running
   409 subexpressions $a_1$ and $a_2$. This will mean after running
   427 condition---equals becomes not-equal, less becomes greater-or-equal. 
   427 condition---equals becomes not-equal, less becomes greater-or-equal. 
   428 Other jump instructions for boolean operators are
   428 Other jump instructions for boolean operators are
   429 
   429 
   430 \begin{center}
   430 \begin{center}
   431 \begin{tabular}{l@{\hspace{10mm}}c@{\hspace{10mm}}l}
   431 \begin{tabular}{l@{\hspace{10mm}}c@{\hspace{10mm}}l}
   432 $\not=$ & $\Rightarrow$ & \pcode{if_icmpeq}\\
   432 $\not=$ & $\Rightarrow$ & \instr{if_icmpeq}\\
   433 $<$ & $\Rightarrow$ & \pcode{if_icmpge}\\
   433 $<$ & $\Rightarrow$ & \instr{if_icmpge}\\
   434 $\le$ & $\Rightarrow$ & \pcode{if_icmpgt}\\
   434 $\le$ & $\Rightarrow$ & \instr{if_icmpgt}\\
   435 \end{tabular}
   435 \end{tabular}
   436 \end{center}
   436 \end{center}
   437 
   437 
   438 \noindent and so on.   If you do not like this design (it can be the
   438 \noindent and so on.   If you do not like this design (it can be the
   439 source of some nasty, hard-to-detect errors), you can also change the
   439 source of some nasty, hard-to-detect errors), you can also change the
   698     return
   698     return
   699 .end method
   699 .end method
   700 \end{lstlisting}
   700 \end{lstlisting}
   701 \end{framed}
   701 \end{framed}
   702 \caption{The boilerplate code needed for running generated code. It
   702 \caption{The boilerplate code needed for running generated code. It
   703   hardwires limits for stack space and number of local
   703   hardwires limits for stack space and for the number of local
   704   variables.\label{boiler}}
   704   variables.\label{boiler}}
   705 \end{figure}
   705 \end{figure}
   706 
   706 
   707 
   707 
   708 To sum up, in Figure~\ref{test} is the complete code generated
   708 To sum up, in Figure~\ref{test} is the complete code generated
   777 multiple arrays in a WHILE-program. For looking up an element in an
   777 multiple arrays in a WHILE-program. For looking up an element in an
   778 array we can use the following JVM code
   778 array we can use the following JVM code
   779 
   779 
   780 \begin{lstlisting}[mathescape,language=JVMIS]
   780 \begin{lstlisting}[mathescape,language=JVMIS]
   781 aload loc_var 
   781 aload loc_var 
   782 index_aexp 
   782 $\textit{index\_aexp}$ 
   783 iaload
   783 iaload
   784 \end{lstlisting}
   784 \end{lstlisting}
   785 
   785 
   786 \noindent
   786 \noindent
   787 The first instruction loads the ``pointer'', or local variable, to the
   787 The first instruction loads the ``pointer'', or local variable, to the
   792 JVM to load the corresponding element onto the stack. Updating an array
   792 JVM to load the corresponding element onto the stack. Updating an array
   793 at an index with a value is as follows.
   793 at an index with a value is as follows.
   794 
   794 
   795 \begin{lstlisting}[mathescape,language=JVMIS]
   795 \begin{lstlisting}[mathescape,language=JVMIS]
   796 aload loc_var 
   796 aload loc_var 
   797 index_aexp 
   797 $\textit{index\_aexp}$ 
   798 value_aexp 
   798 $\textit{value\_aexp}$ 
   799 iastore
   799 iastore
   800 \end{lstlisting}
   800 \end{lstlisting}
   801 
   801 
   802 \noindent
   802 \noindent
   803 Again the first instruction loads the local variable of
   803 Again the first instruction loads the local variable of
   868 generate a \code{skip} instruction just before finishing with the
   868 generate a \code{skip} instruction just before finishing with the
   869 closing \code{"\}"}. The reason is that we are rather pedantic about
   869 closing \code{"\}"}. The reason is that we are rather pedantic about
   870 semicolons in our WHILE-grammar: the last command cannot have a
   870 semicolons in our WHILE-grammar: the last command cannot have a
   871 semicolon---adding a \code{skip} works around this snag. 
   871 semicolon---adding a \code{skip} works around this snag. 
   872 
   872 
   873 Putting all this together and we can generate WHILE-programs with more
   873 Putting this all together and we can generate WHILE-programs with more
   874 than 15K JVM-instructions; run the compiled JVM code for such
   874 than 15K JVM-instructions; run the compiled JVM code for such
   875 programs and marvel at the output\ldots\medskip
   875 programs and marvel at the output\ldots\medskip
   876 
   876 
   877 \noindent
   877 \noindent
   878 \ldots{}Hooooray, we can finally run the BF-mandelbrot program on the JVM: it
   878 \ldots{}Hooooray, after a few more tweaks we can finally run the
   879 completes within 20 or so seconds (after nearly 10 minutes of parsing
   879 BF-mandelbrot program on the JVM (after nearly 10 minutes of parsing the
   880 the corresponding WHILE-program; the size of the resulting class files
   880 corresponding WHILE-program; the size of the resulting class file is
   881 is around 32K). Try replicating the 20 secs with an interpreter! The
   881 around 32K---not too bad). The generation of the picture completes
       
   882 within 20 or so seconds. Try replicating this with an interpreter! The
   882 good point is that we now have a sufficiently complicated program in our
   883 good point is that we now have a sufficiently complicated program in our
   883 WHILE-language in order to do some benchmarking. Which means we now face
   884 WHILE-language in order to do some benchmarking. Which means we now face
   884 the question about what to do next\ldots
   885 the question about what to do next\ldots
   885 
   886 
   886 \subsection*{Optimisations \& Co}
   887 \subsection*{Optimisations \& Co}
   887 
   888 
   888 Every compiler that deserves its name performs some optimisation on the
   889 Every compiler that deserves its name performs optimisations on the
   889 code. If we make the extra effort of writing a compiler for a language,
   890 code. If we make the extra effort of writing a compiler for a language,
   890 then obviously we want to have our code to run as fast as possible.
   891 then obviously we want to have our code to run as fast as possible. 
   891 
   892 So we should look into this.
   892 So let's optimise a bit the code we generate. There is actually one
   893 
   893 aspect in our generated code where we can make easily efficiency gains:
   894 There is actually one aspect in our generated code where we can make
   894 this has to do with some of the quirks of the JVM. Whenever we push a
   895 easily efficiency gains: this has to do with some of the quirks of the
   895 constant onto the stack, we used the JVM instruction \code{ldc
   896 JVM. Whenever we push a constant onto the stack, we used the JVM
   896 some_const}. This is a rather generic instructions in the sense that it
   897 instruction \instr{ldc some_const}. This is a rather generic instruction
   897 works not just for integers but also for strings, objects and so on.
   898 in the sense that it works not just for integers but also for strings,
   898 What this instruction does is to put the constant into a constant pool
   899 objects and so on. What this instruction does is putting the constant
   899 and then to use an index to this constant pool. This means \code{ldc}
   900 into a \emph{constant pool} and then to use an index into this constant
   900 will be represented by at least two bytes in the class file. While this
   901 pool. This means \instr{ldc} will be represented by at least two bytes
   901 is sensible for ``large'' constants like strings, it is a bit of
   902 in the class file. While this is sensible for ``large'' constants like
   902 overkill for small integers (which many integers will be when compiling
   903 strings, it is a bit of overkill for small integers (which many integers
   903 a BF-program). To counter this ``waste'', the JVM has specific
   904 will be when compiling a BF-program). To counter this ``waste'', the JVM
   904 instructions for small integers, for example
   905 has specific instructions for small integers, for example
   905  
   906  
   906 \begin{itemize}
   907 \begin{itemize}
   907 \item \code{iconst_0},\ldots, \code{iconst_5}
   908 \item \instr{iconst_0},\ldots, \instr{iconst_5}
   908 \item \code{bipush n}
   909 \item \instr{bipush n}
   909 \end{itemize}
   910 \end{itemize}
   910 
   911 
   911 \noindent
   912 \noindent
   912 where the \code{n} is \code{bipush} is between -128 and 128.   By having
   913 where the \code{n} is \instr{bipush} is between -128 and 128.   By
   913 dedicated instructions such as \code{iconst_0} to \code{iconst_5} (and
   914 having dedicated instructions such as \instr{iconst_0} to
   914 \code{iconst_m1}), we can make the generated code size smaller as these
   915 \instr{iconst_5} (and \instr{iconst_m1}), we can make the generated code
   915 instructions only require 1 Byte (as opposed the generic \code{ldc}
   916 size smaller as these instructions only require 1 byte (as opposed the
   916 which needs 1 Byte plus another for the index into the constant pool).
   917 generic \instr{ldc} which needs 1 byte plus another for the index into
   917 While in theory the use of such special instructions should make the
   918 the constant pool). While in theory the use of such special instructions
   918 code only smaller, it actually makes the code also run faster. Probably
   919 should make the code only smaller, it actually makes the code also run
   919 because the JVM has to process less code and uses a specific instruction
   920 faster. Probably because the JVM has to process less code and uses a
   920 in the underlying CPU.  The story with \code{bipush} is slightly
   921 specific instruction in the underlying CPU.  The story with
   921 different, because it also uses two Bytes---so it does not result in a
   922 \instr{bipush} is slightly different, because it also uses two
   922 reduction in code size. But probably it uses  specific instruction in
   923 bytes---so it does not result in a reduction in code size. But againm,
   923 the underlying CPU which make the JVM code run faster.  
   924 it probably uses a specific instruction in the underlying CPU that make
   924 
   925 the JVM code run faster. This means when generating code we can use
   925 
   926 the following helper function
   926 
   927 
       
   928 \begin{lstlisting}[language=Scala]
       
   929 def compile_num(i: Int) = 
       
   930   if (0 <= i && i <= 5) i"iconst_$i" else 
       
   931   if (-128 <= i && i <= 127) i"bipush $i" else i"ldc $i"
       
   932 \end{lstlisting}
       
   933 
       
   934 \noindent 
       
   935 that generates the more efficient instructions for pushing a constant
       
   936 onto the stack. Note the JVM also has special instructions  that
       
   937 load and store the first three local variables. The assumption is that 
       
   938 most operations and arguments in a method will only use very few 
       
   939 local variables. So the JVM has the following instructions:
   927 
   940 
   928 \begin{itemize}
   941 \begin{itemize}
   929 \item \code{iload_0},\ldots, \code{iload_3}
   942 \item \instr{iload_0},\ldots, \instr{iload_3}
   930 \item \code{istore_0},\ldots, \code{istore_3}
   943 \item \instr{istore_0},\ldots, \instr{istore_3}
   931 \item \code{aload_0},\ldots, \code{aload_3}
   944 \item \instr{aload_0},\ldots, \instr{aload_3}
   932 \item \code{astore_0},\ldots, \code{astore_3}
   945 \item \instr{astore_0},\ldots, \instr{astore_3}
   933 \end{itemize}
   946 \end{itemize}
   934 
   947 
   935 % 33296 bytes -> 21787
   948 
   936 % 21 ->  16 seconds
   949 \noindent Having implemented these optimisations, the code size of the
       
   950 BF-Mandelbrot program reduces and also it runs faster. According to my
       
   951 very rough experiments:
       
   952 
       
   953 \begin{center}
       
   954 \begin{tabular}{lll}
       
   955 & class-size & runtime\\\hline
       
   956 Mandelbrot:\\
       
   957 \hspace{5mm}unoptimised: & 33296 & 21 secs\\
       
   958 \hspace{5mm}optimised:   & 21787 & 16 secs\\
       
   959 \end{tabular}
       
   960 \end{center}
       
   961 
       
   962 \noindent 
       
   963 Quite good! Such optimisations are called \emph{peephole optimisations},
       
   964 because it is type of optimisations that involve changing a small set of
       
   965 instructions into an equivalent set that has better performance. 
       
   966 
       
   967 If you look careful at our code you will quickly find another source of
       
   968 inefficiency in programs like
       
   969 
       
   970 \begin{lstlisting}[mathescape,language=While]
       
   971 x := ...;
       
   972 write x
       
   973 \end{lstlisting}
       
   974 
       
   975 \noindent
       
   976 where our code first calculates the new result the for \texttt{x} on the
       
   977 stack, then pops off the result into a local variable, and after that
       
   978 loads the local variable back onto the stack for writing out a number.
       
   979 If we can detect such situations, then we can leave the value of
       
   980 \texttt{x} on the stack with for example the much cheaper instruction
       
   981 \instr{dup}. Now the problem with this optimisation is that it is quite
       
   982 easy for the snippet above, but what about instances where there is
       
   983 further WHILE-code in \emph{between} these two statements? Sometimes we
       
   984 will be able to optimise, sometimes we will not. The compiler needs to
       
   985 find out which situation applies. This can become quickly much more
       
   986 complicated. So we leave this kind of optimisations here and look at
       
   987 something more interesting and possibly surprising.
       
   988 
   937 
   989 
   938 As you have probably seen, the compiler writer has a lot of freedom
   990 As you have probably seen, the compiler writer has a lot of freedom
   939 about how to generate code from what the programmer wrote as program.
   991 about how to generate code from what the programmer wrote as program.
   940 The only condition is that generated code should behave as expected by
   992 The only condition is that generated code should behave as expected by
   941 the programmer. Then all is fine\ldots mission accomplished! But
   993 the programmer. Then all is fine\ldots mission accomplished! But
   942 sometimes the compiler writer is expected to go an extra mile, or even
   994 sometimes the compiler writer is expected to go an extra mile, or even
   943 miles. Suppose we are given the following WHILE-program:
   995 miles and change the meaning of a program in unexpected ways. Suppose we
       
   996 are given the following WHILE-program:
   944 
   997 
   945 \begin{lstlisting}[mathescape,language=While]
   998 \begin{lstlisting}[mathescape,language=While]
   946 new(arr[10]);
   999 new(arr[10]);
   947 arr[14] := 3 + arr[13]
  1000 arr[14] := 3 + arr[13]
   948 \end{lstlisting}
  1001 \end{lstlisting}
   949 
  1002 
   950 \noindent 
  1003 \noindent 
   951 While admittedly this is a contrived program, and probably not meant to
  1004 Admittedly this is a contrived program, and probably not meant to be
   952 be like this by any sane programmer, it is supposed to make the
  1005 like this by any sane programmer, but it is supposed to make the
   953 following point: We generate an array of size 10, and then try to access
  1006 following point: We generate an array of size 10, and then try to access
   954 the non-existing element at index 13 and even updating element with
  1007 the non-existing element at index 13 and even updating element with
   955 index 14. Obviously this is baloney. However, our compiler generates
  1008 index 14. Obviously this is baloney. Still, our compiler generates code
   956 code for this program without any questions asked. We can even run this
  1009 for this program without any questions asked. We can even run this code
   957 code on the JVM\ldots of course the result is an exception trace where
  1010 on the JVM\ldots of course the result is an exception trace where the
   958 the JVM yells at us for doing naughty things. (This is much better than
  1011 JVM yells at us for doing naughty things. (This is much better than C,
   959 C, for example, where such errors are not prevented and as a result
  1012 for example, where such errors are not prevented and as a result
   960 insidious attacks can be mounted against such kind C-programs. I assume
  1013 insidious attacks can be mounted against such kind C-programs. I assume
   961 everyone has heard about \emph{Buffer Overflow Attacks}.) 
  1014 everyone has heard about \emph{Buffer Overflow Attacks}.) Now what
   962 
  1015 should we do in such situations? Index over- or underflows are
   963 Imagine we do not want to rely in our compiler on the JVM for producing
  1016 notoriously difficult to detect statically (at compiletime), so it seem
   964 an annoying, but safe exception trace, rather we want to handle such
  1017 raising an exception at run-time like the JVM is the best compromise.
   965 situations ourselves. Lets assume we want to handle them in the
  1018 
       
  1019 Well, imagine we do not want to rely in our compiler on the JVM for
       
  1020 producing an annoying, but safe exception trace, rather we want to
       
  1021 handle such situations ourselves according to what we thing should
       
  1022 happen in such cases. Let's assume we want to handle them in the
   966 following way: if the programmer access a field out-of-bounds, we just
  1023 following way: if the programmer access a field out-of-bounds, we just
   967 return the default 0, and if a programmer wants to update an
  1024 return a default 0, and if a programmer wants to update an
   968 out-of-bounds filed, we want to ``quietly'' ignore this update.
  1025 out-of-bounds field, we want to ``quietly'' ignore this update.
   969 
  1026 
   970 
  1027 
   971 arraylength
  1028 arraylength
   972 
  1029 
   973 
  1030