cws/main_cw05.tex
changeset 347 4de31fdc0d67
parent 337 c0d9e6548b08
child 348 b5b6ed38c2f2
equal deleted inserted replaced
346:663c2a9108d1 347:4de31fdc0d67
       
     1 % !TEX program = xelatex
       
     2 \documentclass{article}
       
     3 \usepackage{../style}
       
     4 \usepackage{../langs}
       
     5 \usepackage{disclaimer}
       
     6 \usepackage{tikz}
       
     7 \usepackage{pgf}
       
     8 \usepackage{pgfplots}
       
     9 \usepackage{stackengine}
       
    10 %% \usepackage{accents}
       
    11 \newcommand\barbelow[1]{\stackunder[1.2pt]{#1}{\raisebox{-4mm}{\boldmath$\uparrow$}}}
       
    12 
       
    13 %% change Console to scala.io.StdIn.readByte()
       
    14 
       
    15 
       
    16 \begin{document}
       
    17 
       
    18 \section*{Part 10 (Scala)}
       
    19 
       
    20 \mbox{}\hfill\textit{``If there's one feature that makes Scala, `Scala',}\\
       
    21 \mbox{}\hfill\textit{ I would pick implicits.''}\smallskip\\
       
    22 \mbox{}\hfill\textit{ --- Martin Odersky (creator of the Scala language)}\bigskip\bigskip
       
    23 
       
    24 
       
    25 \noindent
       
    26 This part is about a small (esoteric) programming language called
       
    27 brainf***. Actually, we will implement an interpreter for our own version
       
    28 of this language called brainf*ck++.\bigskip
       
    29 
       
    30 \IMPORTANT{This part is worth 10\% and you need to submit it on \cwTEN{} at 4pm.}
       
    31 
       
    32 \noindent
       
    33 Also note that the running time of each part will be restricted to a
       
    34 maximum of 30 seconds on my laptop.
       
    35 
       
    36 \DISCLAIMER{}
       
    37 \newpage
       
    38 
       
    39 \subsection*{Reference Implementation}
       
    40 
       
    41 As usual, this Scala assignment comes with a reference implementation in
       
    42 form of two \texttt{jar}-files. You can download them from KEATS. They
       
    43 allow you to run any test cases on your own computer. For example you
       
    44 can call Scala on the command line with the option \texttt{-cp bf.jar}
       
    45 and then query any function from the \texttt{bf.scala} template file.
       
    46 You have to prefix the calls with \texttt{CW10a} and \texttt{CW10b},
       
    47 respectively. For example
       
    48 
       
    49 
       
    50 \begin{lstlisting}[language={},xleftmargin=1mm,numbers=none,basicstyle=\ttfamily\small]
       
    51 $ scala -cp bf.jar
       
    52 scala> import CW10a._  
       
    53 scala> run(load_bff("sierpinski.bf")) ; ()
       
    54                                *
       
    55                               * *
       
    56                              *   *
       
    57                             * * * *
       
    58                            *       *
       
    59                           * *     * *
       
    60                          *   *   *   *
       
    61                         * * * * * * * *
       
    62                        *               *
       
    63                       * *             * *
       
    64                      *   *           *   *
       
    65                     * * * *         * * * *
       
    66                    *       *       *       *
       
    67                   * *     * *     * *     * *
       
    68                  *   *   *   *   *   *   *   *
       
    69                 * * * * * * * * * * * * * * * *
       
    70                *                               *
       
    71               * *                             * *
       
    72              *   *                           *   *
       
    73             * * * *                         * * * *
       
    74            *       *                       *       *
       
    75           * *     * *                     * *     * *
       
    76          *   *   *   *                   *   *   *   *
       
    77         * * * * * * * *                 * * * * * * * *
       
    78        *               *               *               *
       
    79       * *             * *             * *             * *
       
    80      *   *           *   *           *   *           *   *
       
    81     * * * *         * * * *         * * * *         * * * *
       
    82    *       *       *       *       *       *       *       *
       
    83   * *     * *     * *     * *     * *     * *     * *     * *
       
    84  *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
       
    85 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
       
    86 \end{lstlisting}%$
       
    87 
       
    88 \newpage
       
    89 
       
    90 \subsection*{Part A (6 Marks)}
       
    91 
       
    92 Coming from Java or C++, you might think Scala is a rather esoteric
       
    93 programming language.  But remember, some serious companies have built
       
    94 their business on
       
    95 Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}}
       
    96 I claim functional programming is not a fad.  And there are far, far
       
    97 more esoteric languages out there. One is called \emph{brainf***}. 
       
    98 \here{https://esolangs.org/wiki/Brainfuck}
       
    99 You
       
   100 are asked in this part to implement an interpreter for
       
   101 a slight extension of this language.
       
   102 
       
   103 Urban M\"uller developed the original version of brainf*** in 1993.  A close
       
   104 relative of this language was already introduced in 1964 by Corado
       
   105 B\"ohm, an Italian computer pioneer. The main feature of brainf*** is
       
   106 its minimalistic set of instructions---just 8 instructions in total
       
   107 and all of which are single characters. Despite the minimalism, this
       
   108 language has been shown to be Turing complete\ldots{}if this doesn't
       
   109 ring any bell with you: it roughly means that every(!) algorithm can,
       
   110 in principle, be implemented in brainf***. It just takes a lot of
       
   111 determination and quite a lot of memory resources.
       
   112 
       
   113 Some relatively sophisticated sample programs in brainf*** are given
       
   114 in the file \texttt{bf.scala}, including a brainf*** program for the
       
   115 Sierpinski triangle and the Mandelbrot set.  There seems to be even a
       
   116 dedicated Windows IDE for bf programs, though I am not sure whether
       
   117 this is just an elaborate April fools' joke---judge yourself:
       
   118 
       
   119 \begin{center}
       
   120 \url{https://www.microsoft.com/en-us/p/brainf-ck/9nblgggzhvq5}
       
   121 \end{center}  \bigskip
       
   122 
       
   123 
       
   124 \noindent
       
   125 As mentioned above, the original brainf*** has 8 single-character
       
   126 commands. Our version of bf++ will contain the commands \texttt{'>'},
       
   127 \texttt{'<'}, \texttt{'+'}, \texttt{'-'}, \texttt{'.'}, \texttt{'['}
       
   128 and \texttt{']'} from the original, and in addition the commands
       
   129 \texttt{'@'}, \texttt{'*'} and \texttt{'\#'}.  Every other character
       
   130 is considered a comment.
       
   131 
       
   132 Our interpreter for bf++ operates on memory cells containing
       
   133 integers. For this it uses a single memory pointer, called
       
   134 \texttt{mp}, that points at each stage to one memory cell.
       
   135 
       
   136 \begin{center}
       
   137 \begin{tikzpicture}
       
   138   \draw [line width=1mm, rounded corners] (0,0) rectangle (5, 0.5);
       
   139   \draw (0.5, 0) -- (0.5, 0.5);
       
   140   \draw (1.0, 0) -- (1.0, 0.5);
       
   141 
       
   142   \draw (2.5, 0) -- (2.5, 0.5);
       
   143   \draw (2.0, 0) -- (2.0, 0.5);
       
   144 
       
   145   \draw (4.5, 0) -- (4.5, 0.5);
       
   146   \draw (4.0, 0) -- (4.0, 0.5);
       
   147 
       
   148   \draw (1.5,0.25) node {$\cdots$};
       
   149   \draw (3.0,0.25) node {$\cdots$};
       
   150 
       
   151   \draw [->, thick] (2.25, -0.5) -- (2.25, -0.15);
       
   152   \draw (2.25,-0.8) node {\texttt{mp}};
       
   153 
       
   154   \draw (0.7,0.7) node {\sf\footnotesize memory};
       
   155 \end{tikzpicture}    
       
   156 \end{center}  
       
   157 
       
   158 \noindent
       
   159 This pointer can be moved forward by one memory cell by using the
       
   160 command \texttt{'>'}, and backward by using \texttt{'<'}. The commands
       
   161 \texttt{'+'} and \texttt{'-'} increase, respectively decrease, by 1
       
   162 the content of the memory cell to which the memory pointer currently
       
   163 points to. The command for output in bf++ is \texttt{'.'} whereby output works
       
   164 by reading the content of the memory cell to which the memory pointer
       
   165 points to and printing it out as an ASCII character.\footnote{In the
       
   166   original version of bf, there is also a command for input, but we
       
   167   omit it here. All our programs will be ``autonomous''.}  The
       
   168 commands \texttt{'['} and \texttt{']'} are looping
       
   169 constructs. Everything in between \texttt{'['} and \texttt{']'} is
       
   170 repeated until a counter (memory cell) reaches zero.  A typical
       
   171 program in brainf*** looks as follows:
       
   172 
       
   173 \begin{center}
       
   174 \begin{verbatim}
       
   175    ++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.++
       
   176    +++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
       
   177 \end{verbatim}
       
   178 \end{center}  
       
   179 
       
   180 \noindent
       
   181 This one prints out Hello World\ldots{}obviously \texttt{;o)} We also
       
   182 add 3 new commands in the bf++-version of the bf-language. The purpose
       
   183 of these commands we explain later.
       
   184 
       
   185 
       
   186 \subsubsection*{Tasks (file bf.scala)}
       
   187 
       
   188 \begin{itemize}
       
   189 \item[(1)]  Write a function that takes a filename (a string) as an argument
       
   190   and requests the corresponding file from disk. It returns the
       
   191   content of the file as a string. If the file does not exists,
       
   192   the function should return the empty string.
       
   193   \mbox{}\hfill[1 Mark]
       
   194   
       
   195 \item[(2)] Brainf**k++ memory is represented by a \texttt{Map} from
       
   196   integers to integers. The empty memory is represented by
       
   197   \texttt{Map()}, that is nothing is stored in the
       
   198   memory; \texttt{Map(0 -> 1, 2 -> 3)} stores \texttt{1} at
       
   199   memory location \texttt{0}, and at \texttt{2} it stores \texttt{3}. The
       
   200   convention is that if we query the memory at a location that is
       
   201   \emph{not} defined in the \texttt{Map}, we return \texttt{0}. Write
       
   202   a `safe-read' function, \texttt{sread}, that takes a memory (a \texttt{Map}) and
       
   203   a memory pointer (an \texttt{Int}) as arguments, and `safely' reads the
       
   204   corresponding memory location. If the \texttt{Map} is not defined at
       
   205   the memory pointer, \texttt{sread} returns \texttt{0}.
       
   206 
       
   207   Write another function \texttt{write}, which takes a memory, a
       
   208   memory pointer and an integer value as arguments and updates the
       
   209   \texttt{Map} with the value at the given memory location. As usual,
       
   210   the \texttt{Map} is not updated `in-place' but a new map is created
       
   211   with the same data, except the new value is stored at the given memory
       
   212   pointer.\hfill[1 Mark]
       
   213 
       
   214 \item[(3)] Write two functions, \texttt{jumpRight} and
       
   215   \texttt{jumpLeft}, that are needed to implement the loop constructs
       
   216   in brainf**k++. They take a program (a \texttt{String}) and a program
       
   217   counter (an \texttt{Int}) as arguments and move right (respectively
       
   218   left) in the string in order to find the \textbf{matching}
       
   219   opening/closing bracket. For example, given the following program
       
   220   with the program counter indicated by an arrow:
       
   221 
       
   222   \begin{center}
       
   223   \texttt{--[\barbelow{.}.+>--].>.++}
       
   224   \end{center}
       
   225 
       
   226   then the matching closing bracket is in 9th position (counting from 0) and
       
   227   \texttt{jumpRight} is supposed to return the position just after this
       
   228   
       
   229   \begin{center}
       
   230   \texttt{--[..+>--]\barbelow{.}>.++}
       
   231   \end{center}
       
   232 
       
   233   meaning it jumps to after the loop. Similarly, if you are in 8th position,
       
   234   then \texttt{jumpLeft} is supposed to jump to just after the opening
       
   235   bracket (that is jumping to the beginning of the loop):
       
   236 
       
   237   \begin{center}
       
   238     \texttt{--[..+>-\barbelow{-}].>.++}
       
   239     \qquad$\stackrel{\texttt{jumpLeft}}{\longrightarrow}$\qquad
       
   240     \texttt{--[\barbelow{.}.+>--].>.++}
       
   241   \end{center}
       
   242 
       
   243   Unfortunately we have to take into account that there might be
       
   244   other opening and closing brackets on the `way' to find the
       
   245   matching bracket. For example in the brain*ck++ program
       
   246 
       
   247   \begin{center}
       
   248   \texttt{--[\barbelow{.}.[+>]--].>.++}
       
   249   \end{center}
       
   250 
       
   251   we do not want to return the index for the \texttt{'-'} in the 9th
       
   252   position, but the program counter for \texttt{'.'} in 12th
       
   253   position. The easiest to find out whether a bracket is matched is by
       
   254   using levels (which are the third argument in \texttt{jumpLeft} and
       
   255   \texttt{jumpLeft}). In case of \texttt{jumpRight} you increase the
       
   256   level by one whenever you find an opening bracket and decrease by
       
   257   one for a closing bracket. Then in \texttt{jumpRight} you are looking
       
   258   for the closing bracket on level \texttt{0}. For \texttt{jumpLeft} you
       
   259   do the opposite. In this way you can find \textbf{matching} brackets
       
   260   in strings such as
       
   261 
       
   262   \begin{center}
       
   263   \texttt{--[\barbelow{.}.[[-]+>[.]]--].>.++}
       
   264   \end{center}
       
   265 
       
   266   for which \texttt{jumpRight} should produce the position:
       
   267 
       
   268   \begin{center}
       
   269   \texttt{--[..[[-]+>[.]]--]\barbelow{.}>.++}
       
   270   \end{center}
       
   271 
       
   272   It is also possible that the position returned by \texttt{jumpRight} or
       
   273   \texttt{jumpLeft} is outside the string in cases where there are
       
   274   no matching brackets. For example
       
   275 
       
   276   \begin{center}
       
   277   \texttt{--[\barbelow{.}.[[-]+>[.]]--.>.++}
       
   278   \qquad$\stackrel{\texttt{jumpRight}}{\longrightarrow}$\qquad
       
   279   \texttt{--[..[[-]+>[.]]-->.++\barbelow{\;\phantom{+}}}
       
   280   \end{center}
       
   281   \hfill[2 Marks]
       
   282 
       
   283 
       
   284 \item[(4)] Write a recursive function \texttt{compute} that runs a
       
   285   brain*u*k++ program. It takes a program, a program counter, a memory
       
   286   pointer and a memory as arguments. If the program counter is outside
       
   287   the program string, the execution stops and \texttt{compute} returns the
       
   288   memory. If the program counter is inside the string, it reads the
       
   289   corresponding character and updates the program counter \texttt{pc},
       
   290   memory pointer \texttt{mp} and memory \texttt{mem} according to the
       
   291   rules shown in Figure~\ref{comms}. It then calls recursively
       
   292   \texttt{compute} with the updated data. The most convenient way to
       
   293   implement the brainf**k++ rules in Scala is to use pattern-matching
       
   294   and to calculate a triple consisting of the updated \texttt{pc},
       
   295   \texttt{mp} and \texttt{mem}.
       
   296 
       
   297   Write another function \texttt{run} that calls \texttt{compute} with a
       
   298   given brainfu*k++ program and memory, and the program counter and memory pointer
       
   299   set to~$0$. Like \texttt{compute}, it returns the memory after the execution
       
   300   of the program finishes. You can test your brainf**k++ interpreter with the
       
   301   Sierpinski triangle or the Hello world programs (they seem to be particularly
       
   302   useful for debugging purposes), or have a look at
       
   303 
       
   304   \begin{center}
       
   305   \url{https://esolangs.org/wiki/Brainfuck}
       
   306   \end{center}
       
   307 
       
   308   \noindent for more bf/bf++-programs and the test cases given in \texttt{bf.scala}.\\
       
   309   \mbox{}\hfill[2 Marks]
       
   310   
       
   311   \begin{figure}[p]
       
   312   \begin{center}
       
   313     \begin{tabular}{|@{\hspace{0.5mm}}p{0.8cm}|l|}
       
   314       \hline
       
   315       \hfill\texttt{'>'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   316                        $\bullet$ & $\texttt{pc} + 1$\\
       
   317                        $\bullet$ & $\texttt{mp} + 1$\\
       
   318                        $\bullet$ & \texttt{mem} unchanged
       
   319                      \end{tabular}\\\hline   
       
   320       \hfill\texttt{'<'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   321                        $\bullet$ & $\texttt{pc} + 1$\\
       
   322                        $\bullet$ & $\texttt{mp} - 1$\\
       
   323                        $\bullet$ & \texttt{mem} unchanged
       
   324                      \end{tabular}\\\hline   
       
   325       \hfill\texttt{'+'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   326                        $\bullet$ & $\texttt{pc} + 1$\\
       
   327                        $\bullet$ & $\texttt{mp}$ unchanged\\
       
   328                        $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) + 1}\\
       
   329                      \end{tabular}\\\hline   
       
   330       \hfill\texttt{'-'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   331                        $\bullet$ & $\texttt{pc} + 1$\\
       
   332                        $\bullet$ & $\texttt{mp}$ unchanged\\
       
   333                        $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) - 1}\\
       
   334                      \end{tabular}\\\hline   
       
   335       \hfill\texttt{'.'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   336                        $\bullet$ & $\texttt{pc} + 1$\\
       
   337                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
       
   338                        $\bullet$ & print out \,\texttt{mem(mp)} as a character\\
       
   339                      \end{tabular}\\\hline   
       
   340       %\hfill\texttt{','} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   341       %                 $\bullet$ & $\texttt{pc} + 1$\\
       
   342       %                 $\bullet$ & $\texttt{mp}$ unchanged\\
       
   343       %                 $\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\
       
   344       %                 \multicolumn{2}{@{}l}{the input is given by \texttt{Console.in.read().toByte}}
       
   345       %               \end{tabular}\\\hline   
       
   346       \hfill\texttt{'['} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   347                        \multicolumn{2}{@{}l}{if \texttt{mem(mp) == 0} then}\\
       
   348                        $\bullet$ & $\texttt{pc = jumpRight(prog, pc + 1, 0)}$\\
       
   349                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
       
   350                        \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) != 0} then}\\
       
   351                        $\bullet$ & $\texttt{pc} + 1$\\
       
   352                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
       
   353                      \end{tabular}
       
   354                      \\\hline   
       
   355       \hfill\texttt{']'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   356                        \multicolumn{2}{@{}l}{if \texttt{mem(mp) != 0} then}\\
       
   357                        $\bullet$ & $\texttt{pc = jumpLeft(prog, pc - 1, 0)}$\\
       
   358                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
       
   359                        \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) == 0} then}\\
       
   360                        $\bullet$ & $\texttt{pc} + 1$\\
       
   361                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
       
   362                            \end{tabular}\\\hline
       
   363       \hfill\texttt{'*'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   364                        $\bullet$ & $\texttt{pc} + 1$\\
       
   365                        $\bullet$ & $\texttt{mp}$ unchanged\\
       
   366                        $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) * mem(mp - 1)}\\
       
   367                              \multicolumn{2}{@{}l}{this multiplies the content of the memory cells at
       
   368                              \texttt{mp} and \texttt{mp - 1}}\\
       
   369                              \multicolumn{2}{@{}l}{and stores the result at \texttt{mp}}
       
   370                            \end{tabular}\\\hline
       
   371       \hfill\texttt{'@'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   372                        $\bullet$ & $\texttt{pc} + 1$\\
       
   373                        $\bullet$ & $\texttt{mp}$ unchanged\\
       
   374                              $\bullet$ & \texttt{mem} updated with
       
   375                                          \texttt{mem(mp) -> mem(mp - 1)}\\
       
   376                              \multicolumn{2}{@{}l}{this updates the memory cell having the index stored at \texttt{mem(mp)},}\\
       
   377                              \multicolumn{2}{@{}l}{with the value stored at \texttt{mem(mp - 1)},}
       
   378                            \end{tabular}\\\hline
       
   379       \hfill\texttt{'\#'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   380                        $\bullet$ & $\texttt{pc} + 1$\\
       
   381                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
       
   382                        $\bullet$ & print out \,\texttt{mem(mp)} as a number\\
       
   383                      \end{tabular}\\\hline  
       
   384       any other char & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   385                          $\bullet$ & $\texttt{pc} + 1$\\
       
   386                          $\bullet$ & \texttt{mp} and \texttt{mem} unchanged
       
   387                        \end{tabular}\\
       
   388       \hline                 
       
   389     \end{tabular}
       
   390     \\\mbox{}\\[-10mm]\mbox{}
       
   391   \end{center}
       
   392   \caption{The rules for how commands in the brainf***++ language update the
       
   393     program counter \texttt{pc},
       
   394     the memory pointer \texttt{mp} and the memory \texttt{mem}.\label{comms}}
       
   395   \end{figure}
       
   396 \end{itemize}\bigskip  
       
   397 
       
   398 %%\newpage
       
   399 
       
   400 \subsection*{Part B (4 Marks)}
       
   401 
       
   402 I am sure you agree while it is fun to marvel at bf++-programs, like the
       
   403 Sierpinski triangle or the Mandelbrot program, being interpreted, it
       
   404 is much more fun to write a compiler for the bf++-language.
       
   405 
       
   406 
       
   407 \subsubsection*{Tasks (file bfc.scala)}
       
   408 
       
   409 \begin{itemize}
       
   410 \item[(5)] Compilers, in general, attempt to make programs run
       
   411   faster by precomputing as much information as possible
       
   412   before running the program. In our case we can precompute the
       
   413   addresses where we need to jump at the beginning and end of
       
   414   loops. 
       
   415 
       
   416   For this write a function \texttt{jtable} that precomputes the ``jump
       
   417   table'' for a bf++-program. This function takes a bf++-program 
       
   418   as an argument and returns a \texttt{Map[Int, Int]}. The 
       
   419   purpose of this Map is to record the information, in cases
       
   420   a pc-position points to a '\texttt{[}' or a '\texttt{]}',
       
   421   to which pc-position do we need to jump next?
       
   422  
       
   423   For example for the program
       
   424     
       
   425   \begin{center}
       
   426     \texttt{+++++[->++++++++++<]>--<+++[->>++++++++++}
       
   427     \texttt{<<]>>++<<----------[+>.>.<+<]}
       
   428   \end{center}
       
   429 
       
   430   we obtain the Map (note the precise numbers might differ depending on white
       
   431   spaces etc.~in the bf-program):
       
   432 
       
   433   \begin{center}
       
   434   \texttt{Map(69 -> 61, 5 -> 20, 60 -> 70, 27 -> 44, 43 -> 28, 19 -> 6)}
       
   435   \end{center}  
       
   436   
       
   437   This Map states that for the '\texttt{[}' on position 5, we need to
       
   438   jump to position 20, which is just after the corresponding '\texttt{]}'.
       
   439   Similarly, for the '\texttt{]}' on position 19, we need to jump to
       
   440   position 6, which is just after the '\texttt{[}' on position 5, and so
       
   441   on. The idea is to not calculate this information each time
       
   442   we hit a bracket, but just look up this information in the 
       
   443   \texttt{jtable}. 
       
   444 
       
   445   Then adapt the \texttt{compute} and \texttt{run} functions
       
   446   from Part 1 in order to take advantage of the information
       
   447   stored in the \texttt{jtable}. This means whenever \texttt{jumpLeft}
       
   448   and \texttt{jumpRight} was called previously, you should look
       
   449   up the jump address in the \texttt{jtable}. Feel free to reuse
       
   450   the function \texttt{jumpLeft} and \texttt{jumpRight} for
       
   451   calculating the \texttt{jtable}.\hfill{[1 Mark]}
       
   452 
       
   453 \item[(6)] Compilers try to eliminate any ``dead'' code that could
       
   454   slow down programs and also perform what is often called
       
   455   \emph{peephole
       
   456     optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}}
       
   457   For the latter consider that it is difficult for compilers to
       
   458   comprehend what is intended with whole programs, but they are very good
       
   459   at finding out what small snippets of code do, and then try to
       
   460   generate faster code for such snippets.
       
   461 
       
   462   In our case, dead code is everything that is not a bf++-command.
       
   463   Therefore write a function \texttt{optimise} which deletes such
       
   464   dead code from a bf++-program. Moreover this function should replace every substring
       
   465   of the form \pcode{[-]} by a new command \texttt{0}. 
       
   466   The idea is that the loop \pcode{[-]} just resets the
       
   467   memory at the current location to 0. It is more efficient
       
   468   to do this in a single step, rather than stepwise in a loop as in
       
   469   the original bf++-programs.
       
   470 
       
   471   In the extended \texttt{compute3} and \texttt{run3} functions you should
       
   472   implement this command by writing 0 to \pcode{mem(mp)}, that is use
       
   473   \pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}.
       
   474   The easiest way to modify a string in this way is to use the regular
       
   475   expression \pcode{"""[^<>+-.\\[\\]@\#*]"""}, which recognises everything that is 
       
   476   not a bf++-command. Similarly, the
       
   477   regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}.  By using the Scala method \pcode{.replaceAll} you can replace substrings
       
   478   with new strings.\\
       
   479   \mbox{}\hfill{[1 Mark]}
       
   480 
       
   481 \item[(7)] Finally, real compilers try to take advantage of modern
       
   482   CPUs which often provide complex operations in hardware that can
       
   483   combine many smaller instructions into a single faster instruction.
       
   484 
       
   485   In our case we can optimise the several single increments performed at a
       
   486   memory cell, for example \pcode{++++}, by a single ``increment by
       
   487   4''. For this optimisation we just have to make sure these single
       
   488   increments are all next to each other. Similar optimisations should apply
       
   489   for the bf-commands \pcode{-}, \pcode{<} and
       
   490   \pcode{>}, which can all be replaced by extended versions that take
       
   491   the amount of the increment (decrement) into account. We will do
       
   492   this by introducing two-character bf++-commands. For example
       
   493 
       
   494   \begin{center}
       
   495     \begin{tabular}{l|l}
       
   496       original bf-cmds & replacement\\
       
   497       \hline
       
   498       \pcode{+} & \pcode{+A}\\
       
   499       \pcode{++} & \pcode{+B}\\
       
   500       \pcode{+++} & \pcode{+C}\\
       
   501       \ldots{} & \ldots{}\\
       
   502       \pcode{+++....++} & \pcode{+Z}\\
       
   503       \hspace{5mm}(these are 26 \pcode{+}'s)\\
       
   504     \end{tabular} 
       
   505   \end{center}  
       
   506 
       
   507 
       
   508   If there are more
       
   509   than 26 \pcode{+}'s in a row, then more than one ``two-character''
       
   510   bf-commands need to be generated (the idea is that more than
       
   511   26 copies of a single bf++-command in a row is a rare occurrence in
       
   512   actual bf++-programs). Similar replacements apply
       
   513   for \pcode{-}, \pcode{<} and \pcode{>}, but
       
   514   all other bf++-commands should be unaffected by this
       
   515   change. 
       
   516 
       
   517   For this write a function \texttt{combine} which replaces sequences
       
   518   of repeated increment and decrement commands by appropriate
       
   519   two-character commands. In the functions \pcode{compute4} and
       
   520   \pcode{run4}, the ``combine'' and the optimisation from (6) should
       
   521   be performed. Make sure that when a two-character bf++-command is
       
   522   encountered you need to increase the \pcode{pc}-counter by two in
       
   523   order to progress to the next command. For example
       
   524 
       
   525   \begin{center}
       
   526   \pcode{combine(optimise(load_bff("benchmark.bf")))}  
       
   527   \end{center}  
       
   528 
       
   529   generates the improved program
       
   530 
       
   531   \begin{center}
       
   532   \pcode{>A+B[<A+M>A-A]<A[[}\hspace{3mm}\ldots{}
       
   533   \end{center}  
       
   534 
       
   535   for the original benchmark program
       
   536 
       
   537   \begin{center}
       
   538     \pcode{>++[<+++++++++++++>-]<[[}\hspace{3mm}\ldots
       
   539   \end{center}    
       
   540 
       
   541   As you can see, the compiler bets on saving a lot of time on the
       
   542   \pcode{+B} and \pcode{+M} steps so that the optimisations is
       
   543   worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a
       
   544   penalty). Luckily, after you have performed all
       
   545   optimisations in (5) - (7), you can expect that the
       
   546   \pcode{benchmark.bf} program runs four to five times faster.
       
   547   You can also test whether your compiler produces the correct result
       
   548   by  testing for example
       
   549 
       
   550   \begin{center}
       
   551   \pcode{run(load_bff("sierpinski.bf")) == run4(load_bff("sierpinski.bf"))}
       
   552   \end{center}
       
   553 
       
   554   which should return true for all the different compiler stages. \\ 
       
   555   \mbox{}\hfill{[2 Marks]}
       
   556 \end{itemize}  
       
   557 
       
   558 \end{document}
       
   559 
       
   560 
       
   561 %%% Local Variables: 
       
   562 %%% mode: latex
       
   563 %%% TeX-master: t
       
   564 %%% End: