diff -r e461b5325b5e -r db4d2fcd8063 cws/cw05.tex --- a/cws/cw05.tex Thu Dec 06 21:49:43 2018 +0000 +++ b/cws/cw05.tex Thu Dec 06 22:51:46 2018 +0000 @@ -97,7 +97,7 @@ set of instructions---just 8 instructions in total and all of which are single characters. Despite the minimalism, this language has been shown to be Turing complete\ldots{}if this doesn't ring any bell with -you: it roughly means that every algorithm we know can, in principle, +you: it roughly means that every(!) algorithm can, in principle, be implemented in brainf***. It just takes a lot of determination and quite a lot of memory resources. Some relatively sophisticated sample programs in brainf*** are given in the file \texttt{bf.scala}, including @@ -136,9 +136,9 @@ \subsubsection*{Tasks (file bf.scala)} \begin{itemize} -\item[(1)] Write a function that takes a file name as an argument +\item[(1)] Write a function that takes a filename (a string) as an argument and requests the corresponding file from disk. It returns the - content of the file as a String. If the file does not exists, + content of the file as a string. If the file does not exists, the function should return the empty string.\\ \mbox{}\hfill[1 Mark] @@ -150,21 +150,21 @@ convention is that if we query the memory at a location that is \emph{not} defined in the \texttt{Map}, we return \texttt{0}. Write a function, \texttt{sread}, that takes a memory (a \texttt{Map}) and - a memory pointer (an \texttt{Int}) as argument, and `safely' reads the + a memory pointer (an \texttt{Int}) as arguments, and `safely' reads the corresponding memory location. If the \texttt{Map} is not defined at the memory pointer, \texttt{sread} returns \texttt{0}. Write another function \texttt{write}, which takes a memory, a - memory pointer and an integer value as argument and updates the - \texttt{Map} with the value at the given memory location. As usual + memory pointer and an integer value as arguments and updates the + \texttt{Map} with the value at the given memory location. As usual, the \texttt{Map} is not updated `in-place' but a new map is created - with the same data, except the value is stored at the given memory + with the same data, except the new value is stored at the given memory pointer.\hfill[1 Mark] \item[(3)] Write two functions, \texttt{jumpRight} and - \texttt{jumpLeft} that are needed to implement the loop constructs - of brainf***. They take a program (a \texttt{String}) and a program - counter (an \texttt{Int}) as argument and move right (respectively + \texttt{jumpLeft}, that are needed to implement the loop constructs + in brainf***. They take a program (a \texttt{String}) and a program + counter (an \texttt{Int}) as arguments and move right (respectively left) in the string in order to find the \textbf{matching} opening/closing bracket. For example, given the following program with the program counter indicated by an arrow: @@ -180,7 +180,7 @@ \texttt{--[..+>--]\barbelow{,}>,++} \end{center} - meaning it jumps to after the loop. Similarly, if you are in 8th position + meaning it jumps to after the loop. Similarly, if you are in 8th position, then \texttt{jumpLeft} is supposed to jump to just after the opening bracket (that is jumping to the beginning of the loop): @@ -340,9 +340,9 @@ For this write a function \texttt{jtable} that precomputes the ``jump table'' for a bf-program. This function takes a bf-program as an argument and returns a \texttt{Map[Int, Int]}. The - purpose of this Map is to record the information - that given on the pc-position $pc$ is a '\texttt{[}' or a '\texttt{]}', - then to which pc-position do we need to jump next? + purpose of this Map is to record the information, in cases + a pc-position points to a '\texttt{[}' or a '\texttt{]}', + to which pc-position do we need to jump next? For example for the program @@ -374,20 +374,22 @@ the function \texttt{jumpLeft} and \texttt{jumpRight} for calculating the \texttt{jtable}.\hfill{[1 Mark]} -\item[(6)] Compilers try to eliminate any ``dead'' code that could slow - down programs and also perform what is often called - \emph{peephole optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}} While it is difficult for compilers to comprehend what - is intended with whole programs, they are very good at finding out - what small snippets of code do, and then try to generate - faster code for such snippets. +\item[(6)] Compilers try to eliminate any ``dead'' code that could + slow down programs and also perform what is often called + \emph{peephole + optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}} + For the latter consider that it is difficult for compilers to + comprehend what is intended with whole programs, they are very good + at finding out what small snippets of code do, and then try to + generate faster code for such snippets. In our case, dead code is everything that is not a bf-command. Therefore write a function \texttt{optimise} which deletes such - dead code. Moreover this function should replace every substring + dead code from a bf-program. Moreover this function should replace every substring of the form \pcode{[-]} by a new command \texttt{0}. The idea is that the loop \pcode{[-]} just resets the memory at the current location to 0. It is more efficient - to do this in a single step, rather than incrementally as in + to do this in a single step, rather than stepwise in a loop as in the original bf-programs. In the extended \texttt{compute3} and \texttt{run3} functions you should @@ -396,18 +398,18 @@ The easiest way to modify a string in this way is to use the regular expression \pcode{"""[^<>+-.,\\[\\]]"""}, which recognises everything that is not a bf-command. Similarly, the - regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]} and - by using the Scala method \pcode{.replaceAll} you can replace it with the - string \pcode{"0"} standing for the new bf-command.\hfill{[1 Mark]} + regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}. By using the Scala method \pcode{.replaceAll} you can replace substrings + with new strings.\\ + \mbox{}\hfill{[1 Mark]} \item[(7)] Finally, real compilers try to take advantage of CPUs which often provide complex operations in hardware that can combine many smaller instructions into a single faster instruction. - In our case we can optimise the several single increments of a + In our case we can optimise the several single increments performed at a memory cell, for example \pcode{++++}, by a single ``increment by 4''. For this optimisation we just have to make sure these single - increments are all next to each other. Similarly optimisations should apply + increments are all next to each other. Similar optimisations should apply for the bf-commands \pcode{-}, \pcode{<} and \pcode{>}, which can all be replaced by extended versions that take the amount of the increment (decrement) into account. We will do @@ -427,20 +429,22 @@ \end{center} - Similarly for \pcode{-}, \pcode{<} and \pcode{>}. If there are more - than 26 \pcode{+}'s in a row, then more than on ``two-character'' + If there are more + than 26 \pcode{+}'s in a row, then more than one ``two-character'' bf-commands need to be generated (the idea is that more than 26 copies of a single bf-command in a row is a rare occurrence in - actual bf-programs). All other bf-commands should be unaffected by this + actual bf-programs). Similar replacements apply + for \pcode{-}, \pcode{<} and \pcode{>}, but + all other bf-commands should be unaffected by this change. For this write a function \texttt{combine} which replaces sequences of repeated increment and decrement commands by appropriate - two-character commands. In the functions \pcode{compute4} and + two-character commands. In the functions \pcode{compute4} and \pcode{run4}, the ``combine'' and the optimisation from (6) should be performed. Make sure that when a two-character bf-command is encountered you need to increase the \pcode{pc}-counter by two in - order to process the next command. For example + order to progress to the next command. For example \begin{center} \pcode{combine(optimise(load_bff("benchmark.bf")))} @@ -459,7 +463,7 @@ \end{center} As you can see, the compiler bets on saving so much time on the - \pcode{+B} and \pcode{+M} steps so that the optimisations is + \pcode{+B} and \pcode{+M} steps such that the optimisations is worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a penalty). Luckily, after you have performed all optimisations in (5) - (7), you can expect that the