afl-material: comparison handouts/ho07.tex


-:6f3f3dd01786
+:e71eb9ce2373
 \end{center}
 \noindent
 The input will be WHILE-programs; the output will be assembly files
 (with the file extension .j). Assembly files essentially contain
-human-readable machine code, meaning they are not just bits and bytes,
+human-readable low-level code, meaning they are not just bits and bytes,
 but rather something you can read and understand---with a bit of
 practice of course. An \emph{assembler} will then translate the assembly
-files into unreadable class- or binary-files the JVM can run.
+files into unreadable class- or binary-files the JVM or CPU can run.
 Unfortunately, the Java ecosystem does not come with an assembler which
 would be handy for our compiler-endeavour  (unlike Microsoft's  Common
 Language Infrastructure for the .Net platform which has an assembler
-out-of-the-box). As a substitute we shall use the 3rd-party
+out-of-the-box). As a substitute we shall use the 3rd-party programs
-programs Jasmin and Krakatau
+Jasmin and Krakatau
 \begin{itemize}
 \item \url{http://jasmin.sourceforge.net}
 \item \url{https://github.com/Storyyeller/Krakatau}
 \end{itemize}
 \noindent where \meta{Id} stands for variables and \meta{Num}
 for numbers. For the moment let us omit variables from arithmetic
 expressions. Our parser will take this grammar and given an input
-program produce an abstract syntax tree. For example we will obtain for
+program produce an abstract syntax tree. For example we obtain for
 the expression $1 + ((2 * 3) + (4 - 3))$ the following tree.
 \begin{center}
 \begin{tikzpicture}
 \Tree [.$+$ [.$1$ ] [.$+$ [.$*$ $2$ $3$ ] [.$-$ $4$ $3$ ]]]
 that assignments in the WHILE-language might change the
 environment---clearly if a variable is used for the first
 time, we need to allocate a new index and if it has been used
 before, then we need to be able to retrieve the associated index.
 This is reflected in the clause for compiling assignments, say
-$\textit{x := a}$:
+$x := a$:
 \begin{center}
 \begin{tabular}{lcl}
 $\textit{compile}(x := a, E)$ & $\dn$ &
 $(\textit{compile}(a, E) \;@\;\instr{istore}\;index, E')$
 \noindent We first generate code for the right-hand side of the
 assignment (that is the arithmetic expression $a$) and then add an
 \instr{istore}-instruction at the end. By convention running the code
 for the arithmetic expression $a$ will leave the result on top of the
-stack. After that the \instr{istore} instruction, the result will be
+stack. After that the \instr{istore}-instruction, the result will be
 stored in the index corresponding to the variable $x$. If the variable
 $x$ has been used before in the program, we just need to look up what
 the index is and return the environment unchanged (that is in this case
 $E' = E$). However, if this is the first encounter of the variable $x$
 in the program, then we have to augment the environment and assign $x$
 where the jump should go. Since we are generating assembly
 code for the JVM, we do not actually have to give (numeric)
 addresses, but can just attach (symbolic) labels to our code.
 These labels specify a target for a jump. Therefore the labels
 need to be unique, as otherwise it would be ambiguous where a
-jump should go to. A label, say \pcode{L}, is attached to code
+jump should go to. A label, say \pcode{L}, is attached to assembly
-like
+code like
 \begin{lstlisting}[mathescape,numbers=none]
 L:
 $\textit{instr\_1}$
 $\textit{instr\_2}$
 for the jump addresses (just before the else-branch and just
 after). In the next two lines we generate the instructions for
 the two branches, $is_1$ and $is_2$. The final code will
 be first the code for $b$ (including the label
 just-before-the-else-branch), then the \pcode{goto} for after
-the else-branch, the label $L_\textit{ifesle}$, followed by
+the else-branch, the label $L_\textit{ifelse}$, followed by
 the instructions for the else-branch, followed by the
 after-the-else-branch label. Consider for example the
 if-statement:
 \begin{lstlisting}[mathescape,numbers=none,language=While]
 \end{tikzpicture}
 \noindent The first three lines correspond to the boolean
 expression $1 = 1$. The jump for when this boolean expression
 is false is in Line~3. Lines 4-6 correspond to the if-branch;
-the else-branch is in Lines 8 and 9. Note carefully how the
+the else-branch is in Lines 8 and 9.
-environment $E$ is threaded through the recursive calls of
-\textit{compile}. The function receives an environment $E$,
+Note carefully how the environment $E$ is threaded through the recursive
-but it might extend it when compiling the if-branch, yielding
+calls of \textit{compile}. The function receives an environment $E$, but
-$E'$. This happens for example in the if-statement above
+it might extend it when compiling the if-branch, yielding $E'$. This
-whenever the variable \code{x} has not been used before.
+happens for example in the if-statement above whenever the variable
-Similarly with the environment $E''$ for the second call to
+\code{x} has not been used before. Similarly with the environment $E''$
-\textit{compile}. $E''$ is also the environment that needs to
+for the second call to \textit{compile}. $E''$ is also the environment
-be returned as part of the answer.
+that needs to be returned as part of the answer.
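The threading of the environment through the two recursive calls can be sketched in Scala roughly as follows. This is only a minimal sketch under assumptions: the mini-AST (\texttt{Assign}, \texttt{If}), the names \texttt{compileStmt} and \texttt{compileBlock}, and the choice of \texttt{env.size} as the next free index are hypothetical and not taken from the handout; labels, jumps and the code for the right-hand side of assignments are left out so that only the environment handling remains visible.

```scala
object EnvSketch {
  // environment: variable name -> index of the local variable (assumed representation)
  type Env = Map[String, Int]

  // hypothetical mini-AST; the right-hand side of assignments is omitted
  sealed trait Stmt
  case class Assign(x: String) extends Stmt
  case class If(thenBranch: List[Stmt], elseBranch: List[Stmt]) extends Stmt

  // returns the generated instructions together with the (possibly extended) environment
  def compileStmt(s: Stmt, env: Env): (List[String], Env) = s match {
    case Assign(x) =>
      // reuse the index if x is known, otherwise allocate the next free one
      val index = env.getOrElse(x, env.size)
      (List(s"istore $index"), env + (x -> index))
    case If(bl1, bl2) =>
      val (is1, env1) = compileBlock(bl1, env)   // E  yields E'
      val (is2, env2) = compileBlock(bl2, env1)  // E' yields E''
      (is1 ++ is2, env2)                         // E'' is part of the answer
  }

  // compiles a block statement by statement, passing the environment along
  def compileBlock(bl: List[Stmt], env: Env): (List[String], Env) =
    bl.foldLeft((List.empty[String], env)) { case ((is, e), s) =>
      val (is1, e1) = compileStmt(s, e)
      (is ++ is1, e1)
    }
}
```

In this sketch a variable first assigned in the if-branch keeps its index when it is assigned again in the else-branch, because the second call starts from the environment returned by the first.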
 The compilation of the while-loops, say
 \pcode{while} $b$ \pcode{do} $cs$, is very similar. In case
 the condition is true and we need to do another iteration,
 and the control-flow needs to be as follows
 \pcode{write}. It takes a single integer argument indicated by the
 \pcode{(I)} and returns no result, indicated by the \pcode{V} (for
 void). Since the method has only one argument, we only need a single
 local variable (Line~2) and a stack with two cells will be sufficient
 (Line 3). Line 4 instructs the JVM to get the value of the member
-\pcode{out} of the class \pcode{java/lang/System}. It expects the value
+\pcode{out} from the class \pcode{java/lang/System}. It expects the value
 to be of type \pcode{java/io/PrintStream}. A reference to this value
 will be placed on the stack.\footnote{Note the syntax \texttt{L
 \ldots{};} for the \texttt{PrintStream} type is not a typo. Somehow the
 designers of Jasmin decided that this syntax is pleasing to the eye. So
 if you wanted to have strings in your Jasmin code, you would need to
 not support ``dynamic'' arrays, that is the size of our arrays will
 always be fixed. The second construct is for referencing an array cell
 inside an arithmetic expression---we need to be able to look up the
 contents of an array at an index determined by an arithmetic expression.
 Similarly in the line below, we need to be able to update the content of
-an array at an calculated index.
+an array at a calculated index.
 For creating a new array we can generate the following three JVM
 instructions:
 \begin{lstlisting}[mathescape,language=JVMIS]
 | \texttt{new}(\meta{Id}\,[\,\meta{Num}\,])
 | \meta{Id}\,[\,\meta{E}\,]\,:=\,\meta{E}\\
 \end{plstx}
 With this in place we can turn back to the idea of creating
-WHILE-programs by translating BF programs. This is a relatively easy
+WHILE-programs by translating BF-programs. This is a relatively easy
 task because BF has only eight instructions (we will actually implement
 seven because we can omit the read-in instruction from BF). What makes
 this translation easy is that BF-loops can be straightforwardly
 represented as while-loops. The Scala code for the translation is as
 follows:
 WHILE-language in order to do some benchmarking. Which means we now face
 the question about what to do next\ldots
 \subsection*{Optimisations \& Co}
-Every compiler that deserves its name performs optimisations on the
+Every compiler that deserves its name has to perform some optimisations
-code. If we make the extra effort of writing a compiler for a language,
+on the code: if we put in the extra effort of writing a compiler for a
-then obviously we want to have our code to run as fast as possible.
+language, then obviously we want to have our code to run as fast as
-So we should look into this.
+possible. So we should look into this in more detail.
 There is actually one aspect in our generated code where we can make
-easily efficiency gains: this has to do with some of the quirks of the
+efficiency gains easily. This has to do with some of the quirks of the
 JVM. Whenever we push a constant onto the stack, we used the JVM
 instruction \instr{ldc some_const}. This is a rather generic instruction
 in the sense that it works not just for integers but also for strings,
 objects and so on. What this instruction does is put the constant
-into a \emph{constant pool} and then to use an index into this constant
+into a \emph{constant pool} and then use an index into this constant
 pool. This means \instr{ldc} will be represented by at least two bytes
-in the class file. While this is sensible for ``large'' constants like
+in the class file. While this is a sensible strategy for ``large''
-strings, it is a bit of overkill for small integers (which many integers
+constants like strings, it is a bit of overkill for small integers
-will be when compiling a BF-program). To counter this ``waste'', the JVM
+(which many integers will be when compiling a BF-program). To counter
-has specific instructions for small integers, for example
+this ``waste'', the JVM has specific instructions for small integers,
+for example
 \begin{itemize}
 \item \instr{iconst_0},\ldots, \instr{iconst_5}
 \item \instr{bipush n}
 \end{itemize}
 size smaller as these instructions only require 1 byte (as opposed to the
 generic \instr{ldc} which needs 1 byte plus another for the index into
 the constant pool). While in theory the use of such special instructions
 should make the code only smaller, it actually makes the code also run
 faster. Probably because the JVM has to process less code and uses a
-specific instruction in the underlying CPU.  The story with
+specific instruction for the underlying CPU.  The story with
 \instr{bipush} is slightly different, because it also uses two
-bytes---so it does not result in a reduction in code size. But againm,
+bytes---so it does not necessarily result in a reduction of code size.
-it probably uses a specific instruction in the underlying CPU that make
+Instead, it probably uses a specific instruction in the underlying CPU
-the JVM code run faster. This means when generating code we can use
+that makes the JVM code run faster.\footnote{This is all ``probable''
-the following helper function
+because I have not read the 700 pages of JVM documentation by Oracle and
+also have no clue how the JVM is implemented.} This means when
+generating code for pushing constants onto the stack, we can use the
+following Scala helper-function
 \begin{lstlisting}[language=Scala]
 def compile_num(i: Int) =
 if (0 <= i && i <= 5) i"iconst_$i" else
-if (-128 <= i && i <= 127) i"bipush $i" else i"ldc $i"
+if (-128 <= i && i <= 127) i"bipush $i"
+else i"ldc $i"
 \end{lstlisting}
 \noindent
-that generates the more efficient instructions for pushing a constant
+It generates the more efficient instructions when pushing a small integer
-onto the stack. Note the JVM also has special instructions  that
+constant onto the stack. The default is \instr{ldc} for any other constants.
-load and store the first three local variables. The assumption is that
-most operations and arguments in a method will only use very few
+The JVM also has such special instructions for
-local variables. So the JVM has the following instructions:
+loading and storing the first four local variables. The assumption is
+that most operations and arguments in a method will only use very few
+local variables. So we can use the following instructions:
 \begin{itemize}
 \item \instr{iload_0},\ldots, \instr{iload_3}
 \item \instr{istore_0},\ldots, \instr{istore_3}
 \item \instr{aload_0},\ldots, \instr{aload_3}
 \item \instr{astore_0},\ldots, \instr{astore_3}
 \end{itemize}
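In the same spirit as \texttt{compile\_num}, helper functions for integer loads and stores might look as follows. This is a sketch under assumptions: the names \texttt{compileIload} and \texttt{compileIstore} are hypothetical, and plain \texttt{s"..."} string interpolation is used in place of the handout's \texttt{i"..."} interpolator. Indices 0 to 3 get the short forms listed above; any other index falls back to the generic instruction.

```scala
object LocalVarSketch {
  // short one-byte form for indices 0..3, generic form otherwise
  def compileIload(index: Int): String =
    if (0 <= index && index <= 3) s"iload_$index" else s"iload $index"

  def compileIstore(index: Int): String =
    if (0 <= index && index <= 3) s"istore_$index" else s"istore $index"
}
```

The reference variants \texttt{aload} and \texttt{astore} could be treated in the same way.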
 \noindent Having implemented these optimisations, the code size of the
-BF-Mandelbrot program reduces and also it runs faster. According to my
+BF-Mandelbrot program reduces and also the class-file runs faster (the
-very rough experiments:
+parsing part is still very slow). According to my very rough
+experiments:
 \begin{center}
 \begin{tabular}{lll}
 & class-size & runtime\\\hline
 Mandelbrot:\\
 \end{tabular}
 \end{center}
 \noindent
 Quite good! Such optimisations are called \emph{peephole optimisations},
-because it is type of optimisations that involve changing a small set of
+because they involve changing one or a small set of instructions into an
-instructions into an equivalent set that has better performance.
+equivalent set that has better performance.
-If you look careful at our code you will quickly find another source of
+If you look carefully at our generated code you will quickly find another
-inefficiency in programs like
+source of inefficiency in programs like
 \begin{lstlisting}[mathescape,language=While]
 x := ...;
 write x
 \end{lstlisting}
 \noindent
 where our code first calculates the new result for \texttt{x} on the
 stack, then pops off the result into a local variable, and after that
 loads the local variable back onto the stack for writing out a number.
+\begin{lstlisting}[mathescape,language=JVMIS]
+...
+istore 0
+iload 0
+...
+\end{lstlisting}
+\noindent
 If we can detect such situations, then we can leave the value of
 \texttt{x} on the stack with for example the much cheaper instruction
 \instr{dup}. Now the problem with this optimisation is that it is quite
 easy for the snippet above, but what about instances where there is
 further WHILE-code in \emph{between} these two statements? Sometimes we
 will be able to optimise, sometimes we will not. The compiler needs to
-find out which situation applies. This can become quickly much more
+find out which situation applies. This can quickly become much more
 complicated. So we leave this kind of optimisation here and look at
 something more interesting and possibly surprising.
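For the easy case, where the store is immediately followed by a load of the same variable, a peephole pass over the generated instructions could be sketched as below. The sketch assumes that instructions are kept as a list of strings of the form \texttt{istore n} and \texttt{iload n}; the function name \texttt{peephole} is hypothetical. Because labels would appear as separate list entries, two adjacent entries can never have a jump target between them.

```scala
object PeepholeSketch {
  // rewrites "istore n; iload n" into "dup; istore n":
  // the value is duplicated first, one copy is stored, the other stays on the stack
  def peephole(instrs: List[String]): List[String] = instrs match {
    case s :: l :: rest
        if s.startsWith("istore ") && l == "iload " + s.stripPrefix("istore ") =>
      "dup" :: s :: peephole(rest)
    case i :: rest => i :: peephole(rest)
    case Nil => Nil
  }
}
```

The function is not tail-recursive, which is fine for a sketch but would need attention for very long instruction lists.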
+As you might have seen, the compiler writer has a lot of freedom about
-As you have probably seen, the compiler writer has a lot of freedom
+how to generate code from what the programmer wrote as program. The only
-about how to generate code from what the programmer wrote as program.
+condition is that generated code should behave as expected by the
-The only condition is that generated code should behave as expected by
+programmer. Then all is fine with the code above\ldots mission
-the programmer. Then all is fine\ldots mission accomplished! But
+accomplished! But sometimes the compiler writer is expected to go an
-sometimes the compiler writer is expected to go an extra mile, or even
+extra mile, or even miles and change(!) the meaning of a program.
-miles and change the meaning of a program in unexpected ways. Suppose we
+Suppose we are given the following WHILE-program:
-are given the following WHILE-program:
 \begin{lstlisting}[mathescape,language=While]
 new(arr[10]);
 arr[14] := 3 + arr[13]
 \end{lstlisting}
 \noindent
 Admittedly this is a contrived program, and probably not meant to be
 like this by any sane programmer, but it is supposed to make the
-following point: We generate an array of size 10, and then try to access
+following point: The program generates an array of size 10, and then
-the non-existing element at index 13 and even updating element with
+tries to access the non-existing element at index 13 and even to update
-index 14. Obviously this is baloney. Still, our compiler generates code
+the element with index 14. Obviously this is baloney. Still, our
-for this program without any questions asked. We can even run this code
+compiler generates code for this program without any questions asked. We
-on the JVM\ldots of course the result is an exception trace where the
+can even run this code on the JVM\ldots of course the result is an
-JVM yells at us for doing naughty things. (This is much better than C,
+exception trace where the JVM yells at us for doing naughty
-for example, where such errors are not prevented and as a result
+things.\footnote{Still this is much better than C, for example, where
-insidious attacks can be mounted against such kind C-programs. I assume
+such errors are not prevented and as a result insidious attacks can be
-everyone has heard about \emph{Buffer Overflow Attacks}.) Now what
+mounted against such C-programs. I assume everyone has heard about
-should we do in such situations? Index over- or underflows are
+\emph{Buffer Overflow Attacks}.} Now what should we do in such
-notoriously difficult to detect statically (at compiletime), so it seem
+situations? Over- and underflows of indices are notoriously difficult to
-raising an exception at run-time like the JVM is the best compromise.
+detect statically (at compile-time). So it might seem that raising an
+exception at run-time, like the JVM does, is the best compromise.
 Well, imagine we do not want to rely in our compiler on the JVM for
 producing an annoying, but safe exception trace, rather we want to
-handle such situations ourselves according to what we thing should
+handle such situations ourselves according to what we think should
-happen in such cases. Let's assume we want to handle them in the
+happen in such cases. Let us assume we want to handle them in the
 following way: if the programmer accesses a field out-of-bounds, we just
-return a default 0, and if a programmer wants to update an
+return a default 0, and if a programmer wants to update an out-of-bounds
-out-of-bounds field, we want to ``quietly'' ignore this update.
+field, we want to ``quietly'' ignore this update. One way to achieve
+this would be to rewrite the WHILE-programs and insert the necessary
+if-conditions for safely reading and writing arrays. Another way
-arraylength
+is to modify the code we generate.
+\begin{lstlisting}[mathescape,language=JVMIS2]
+$\textit{index\_aexp}$
+aload loc_var
+dup2
+arraylength
+if_icmplt L1
+pop2
+iconst_0
+goto L2
+L1:
+swap
+iaload
+L2:
+\end{lstlisting}
+\begin{lstlisting}[mathescape,language=JVMIS2]
+$\textit{index\_aexp}$
+aload loc_var
+dup2
+arraylength
+if_icmplt L1
+pop2
+goto L2
+L1:
+swap
+$\textit{value\_aexp}$
+iastore
+L2:
+\end{lstlisting}
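A Scala function that emits the guarded array read could be sketched as follows. Everything named here is an assumption for illustration: \texttt{safeLoad}, the label generator \texttt{fresh} and the label prefixes are hypothetical. The sketch uses \texttt{if\_icmplt}, so that an index equal to the array length counts as out-of-bounds; negative indices are not checked and would still raise the JVM exception.

```scala
object SafeArraySketch {
  private var counter = 0

  // produces a fresh label so that several guarded accesses do not clash
  def fresh(prefix: String): String = {
    counter += 1
    s"${prefix}_$counter"
  }

  // indexCode: instructions leaving the index on the stack
  // locVar:    local variable holding the array reference
  def safeLoad(indexCode: List[String], locVar: Int): List[String] = {
    val inBounds = fresh("In_bounds")
    val end = fresh("Load_end")
    indexCode ++ List(
      s"aload $locVar",        // stack: index, arrayref
      "dup2",                  // stack: index, arrayref, index, arrayref
      "arraylength",           // stack: index, arrayref, index, length
      s"if_icmplt $inBounds",  // jump if index < length
      "pop2",                  // out-of-bounds: drop index and arrayref
      "iconst_0",              // default result 0
      s"goto $end",
      s"$inBounds:",
      "swap",                  // stack: arrayref, index
      "iaload",
      s"$end:")
  }
}
```

The guarded write would follow the same pattern, with the code for the value and \texttt{iastore} in place of \texttt{iaload} and without the default constant.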
 \end{document}
 %%% Local Variables:

changeset 712	e71eb9ce2373
parent 711	6f3f3dd01786
child 713	0ea14d84efe3