diff -r 0855c4478f27 -r 38ea26f227af cws/cw05.tex
--- a/cws/cw05.tex	Thu Dec 06 16:08:11 2018 +0000
+++ b/cws/cw05.tex	Thu Dec 06 18:37:17 2018 +0000
@@ -319,11 +319,152 @@
   \end{figure}
 \end{itemize}\bigskip  
 
+\newpage
+
 \subsection*{Part 2 (4 Marks)}
 
-While it is fun to look at bf-programs, like the Sierpinski triangle or the Mandelbrot
-program, being interpreted, it is much more fun to write a compiler for the bf-language.
+I am sure you agree while it is fun to look at bf-programs, like the
+Sierpinski triangle or the Mandelbrot program, being interpreted, it
+is much more fun to write a compiler for the bf-language.
+
+
+\subsubsection*{Tasks (file bfc.scala)}
+
+\begin{itemize}
+\item[(5)] Compilers in general attempt to make programs run
+  faster by precomputing as much information as possible
+  before running the program. In our case we can precompute the
+  addresses where we need to jump at the beginning and end of
+  loops. 
+
+  For this write a function \texttt{jtable} that precomputes the ``jump
+  table'' for a bf-program. This function takes a bf-program 
+  as an argument and returns a \texttt{Map[Int, Int]}. The 
+  purpose of this Map is to record the information
+  that given on the pc-position $pc$ is a '\texttt{[}' or a '\texttt{]}',
+  then to which pc-position do we need to jump next?
+ 
+  For example for the program
+    
+  \begin{center}
+    \texttt{+++++[->++++++++++<]>--<+++[->>++++++++++}
+    \texttt{<<]>>++<<----------[+>.>.<+<]}
+  \end{center}
+
+  we obtain the Map (note the precise numbers might differ depending on white
+  spaces etc.~in the bf-program):
+
+  \begin{center}
+  \texttt{Map(69 -> 61, 5 -> 20, 60 -> 70, 27 -> 44, 43 -> 28, 19 -> 6)}
+  \end{center}  
+  
+  This Map states that for the '\texttt{[}' on position 5, we need to
+  jump to position 20, which is just after the corresponding '\texttt{]}'.
+  Similarly, for the '\texttt{]}' on position 19, we need to jump to
+  position 6, which is just after the '\texttt{[}' on position 5, and so
+  on. The idea is to not calculate this information each time
+  we hit a bracket, but just look up this information in the 
+  \texttt{jtable}. 
+
+  Then adapt the \texttt{compute} and \texttt{run} functions
+  from Part 1 in order to take advantage of the information
+  stored in the \texttt{jtable}. This means whenever \texttt{jumpLeft}
+  and \texttt{jumpRight} was called previously, you should look
+  up the jump address in the \texttt{jtable}. Feel free to reuse
+  the function \texttt{jumpLeft} and \texttt{jumpRight} for
+  calculating the \texttt{jtable}.\hfill{[1 Mark]}
+
+\item[(6)] Compilers try to eliminate any ``dead'' code that could slow
+  down programs and also perform what is often called
+  \emph{peephole optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}} While it is difficult for compilers to comprehend what
+  is intended with whole programs, they are very good at finding out
+  what small snippets of code do, and then try to generate
+  faster code for such snippets.
+
+  In our case, dead code is everything that is not a bf-command.
+  Therefore write a function \texttt{optimise} which deletes such
+  dead code. Moreover this function should replace every substring
+  of the form \pcode{[-]} by a new command \texttt{0}. 
+  The idea is that the loop \pcode{[-]} just resets the
+  memory at the current location to 0. It is more efficient
+  to do this in a single step, rather than incrementally as in
+  the original bf-programs.
 
+  In the extended \texttt{compute3} and \texttt{run3} functions you should
+  implement this command by writing 0 to \pcode{mem(mp)}, that is use
+  \pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}.
+  The easiest way to modify a string in this way is to use the regular
+  expression \pcode{"""[^<>+-.,\\[\\]]"""}, which recognises everything that is 
+  not a bf-command. Similarly, the
+  regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]} and 
+  by using the Scala method \pcode{.replaceAll} you can replace it with the 
+  string \pcode{"0"} standing for the new bf-command.\hfill{[1 Mark]}
+
+\item[(7)] Finally, real compilers try to take advantage of CPUs which often
+  provide complex operations in hardware that can combine many smaller
+  instructions into a single faster instruction.
+
+  In our case we can optimise the several single increments of a
+  memory cell, for example \pcode{++++}, by a single ``increment by
+  4''. For this optimisation we just have to make sure these single
+  increments are all next to each other. Similarly optimisations should apply
+  for the bf-commands \pcode{-}, \pcode{<} and
+  \pcode{>}, which can all be replaced by extended versions that take
+  the amount of the increment (decrement) into account. We will do
+  this by introducing two-character bf-commands. For example
+
+  \begin{center}
+    \begin{tabular}{l|l}
+      original bf-cmds & replacement\\
+      \hline
+      \pcode{+} & \pcode{+A}\\
+      \pcode{++} & \pcode{+B}\\
+      \pcode{+++} & \pcode{+C}\\
+      \ldots{} & \ldots{}\\
+      \pcode{+++....++} & \pcode{+Z}\\
+      \hspace{5mm}(these are 26 \pcode{+}'s)\\
+    \end{tabular} 
+  \end{center}  
+
+
+  Similarly for \pcode{-}, \pcode{<} and \pcode{>}. If there are more
+  than 26 \pcode{+}'s in a row, then more than on ``two-character''
+  bf-commands need to be generated (the idea is that more than
+  26 copies of a single bf-command in a row is a rare occurrence in
+  actual bf-programs). All other bf-commands should be unaffected by this
+  change. 
+
+  For this write a function \texttt{combine} which replaces sequences
+  of repeated increment and decrement commands by appropriate
+  two-character commands.  In the functions \pcode{compute4} and
+  \pcode{run4}, the ``combine'' and the optimisation from (6) should
+  be performed. Make sure that when a two-character bf-command is
+  encountered you need to increase the \pcode{pc}-counter by two in
+  order to process the next command. For example
+
+  \begin{center}
+  \pcode{combine(optimise(load_bff("benchmark.bf")))}  
+  \end{center}  
+
+  generates the improved program
+
+  \begin{center}
+  \pcode{>A+B[<A+M>A-A]<A[[}\hspace{3mm}\ldots{}
+  \end{center}  
+
+  for the original benchmark program
+
+  \begin{center}
+    \pcode{>++[<+++++++++++++>-]<[[}\hspace{3mm}\ldots
+  \end{center}    
+
+  As you can see, the compiler bets on saving so much time on the
+  \pcode{+B} and \pcode{+M} steps so that the optimisations is
+  worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a
+  penalty). Luckily, after you have performed all
+  optimisations in (5) - (7), you can expect that the
+  \pcode{benchmark.bf} program runs four to five times faster.\hfill{[2 Marks]}
+\end{itemize}  
 
 \end{document}