251
+ − 1
% !TEX program = xelatex
6
+ − 2
\documentclass{article}
62
+ − 3
\usepackage{../style}
78
+ − 4
\usepackage{../langs}
218
+ − 5
\usepackage{disclaimer}
+ − 6
\usepackage{tikz}
+ − 7
\usepackage{pgf}
+ − 8
\usepackage{pgfplots}
+ − 9
\usepackage{stackengine}
+ − 10
%% \usepackage{accents}
+ − 11
\newcommand\barbelow[1]{\stackunder[1.2pt]{#1}{\raisebox{-4mm}{\boldmath$\uparrow$}}}
+ − 12
6
+ − 13
+ − 14
\begin{document}
+ − 15
307
+ − 16
\section*{Part 10 (Scala)}
6
+ − 17
306
+ − 18
\mbox{}\hfill\textit{``If there's one feature that makes Scala,}\\
+ − 19
\mbox{}\hfill\textit{`Scala', I would pick implicits.''}\smallskip\\
+ − 20
\mbox{}\hfill\textit{ --- Martin Odersky (creator of the Scala language)}\bigskip\medskip
278
+ − 21
+ − 22
+ − 23
\noindent
307
+ − 24
This part is about a small programming
+ − 25
language called brainf***. The part is worth 10\% and you need to
287
+ − 26
submit on \cwTEN{} at 4pm.\bigskip
218
+ − 27
+ − 28
\IMPORTANT{}
62
+ − 29
+ − 30
\noindent
218
+ − 31
Also note that the running time of each part will be restricted to a
+ − 32
maximum of 30 seconds on my laptop.
+ − 33
+ − 34
\DISCLAIMER{}
268
+ − 35
\newpage
86
+ − 36
230
+ − 37
\subsection*{Reference Implementation}
+ − 38
268
+ − 39
As usual, this Scala assignment comes with a reference implementation in
+ − 40
form of two \texttt{jar}-files. You can download them from KEATS. They
+ − 41
allow you to run any test cases on your own computer. For example you
+ − 42
can call Scala on the command line with the option \texttt{-cp bf.jar}
+ − 43
and then query any function from the \texttt{bf.scala} template file.
+ − 44
You have to prefix the calls with \texttt{CW10a} and \texttt{CW10b},
+ − 45
respectively. For example
230
+ − 46
+ − 47
+ − 48
\begin{lstlisting}[language={},xleftmargin=1mm,numbers=none,basicstyle=\ttfamily\small]
+ − 49
$ scala -cp bf.jar
+ − 50
scala> import CW10a._
292
+ − 51
scala> run(load_bff("sierpinski.bf")) ; ()
230
+ − 52
*
+ − 53
* *
+ − 54
* *
+ − 55
* * * *
+ − 56
* *
+ − 57
* * * *
+ − 58
* * * *
+ − 59
* * * * * * * *
+ − 60
* *
+ − 61
* * * *
+ − 62
* * * *
+ − 63
* * * * * * * *
+ − 64
* * * *
+ − 65
* * * * * * * *
+ − 66
* * * * * * * *
+ − 67
* * * * * * * * * * * * * * * *
+ − 68
* *
+ − 69
* * * *
+ − 70
* * * *
+ − 71
* * * * * * * *
+ − 72
* * * *
+ − 73
* * * * * * * *
+ − 74
* * * * * * * *
+ − 75
* * * * * * * * * * * * * * * *
+ − 76
* * * *
+ − 77
* * * * * * * *
+ − 78
* * * * * * * *
+ − 79
* * * * * * * * * * * * * * * *
+ − 80
* * * * * * * *
+ − 81
* * * * * * * * * * * * * * * *
+ − 82
* * * * * * * * * * * * * * * *
+ − 83
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
+ − 84
\end{lstlisting}%$
+ − 85
268
+ − 86
\newpage
218
+ − 87
307
+ − 88
\subsection*{Part A (6 Marks)}
218
+ − 89
229
+ − 90
Coming from Java or C++, you might think Scala is a rather esoteric
218
+ − 91
programming language. But remember, some serious companies have built
+ − 92
their business on
+ − 93
Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}}
230
+ − 94
I claim functional programming is not a fad. And there are far, far
+ − 95
more esoteric languages out there. One is called \emph{brainf***}. You
+ − 96
are asked in this part to implement an interpreter for
+ − 97
this language.
218
+ − 98
+ − 99
Urban M\"uller developed brainf*** in 1993. A close relative of this
+ − 100
language was already introduced in 1964 by Corado B\"ohm, an Italian
229
+ − 101
computer pioneer. The main feature of brainf*** is its minimalistic
+ − 102
set of instructions---just 8 instructions in total and all of which
+ − 103
are single characters. Despite the minimalism, this language has been
+ − 104
shown to be Turing complete\ldots{}if this doesn't ring any bell with
237
+ − 105
you: it roughly means that every(!) algorithm can, in principle,
229
+ − 106
be implemented in brainf***. It just takes a lot of determination and
+ − 107
quite a lot of memory resources. Some relatively sophisticated sample
+ − 108
programs in brainf*** are given in the file \texttt{bf.scala}, including
247
+ − 109
a brainf*** program for the Sierpinski triangle and the Mandelbrot set.
+ − 110
There seems to be even a dedicated Windows IDE for bf programs, though
+ − 111
I am not sure whether this is just an elaborate April fools' joke---judge
+ − 112
yourself:
+ − 113
+ − 114
\begin{center}
+ − 115
\url{https://www.microsoft.com/en-us/p/brainf-ck/9nblgggzhvq5}
268
+ − 116
\end{center} \bigskip
247
+ − 117
218
+ − 118
+ − 119
\noindent
+ − 120
As mentioned above, brainf*** has 8 single-character commands, namely
+ − 121
\texttt{'>'}, \texttt{'<'}, \texttt{'+'}, \texttt{'-'}, \texttt{'.'},
+ − 122
\texttt{','}, \texttt{'['} and \texttt{']'}. Every other character is
+ − 123
considered a comment. Brainf*** operates on memory cells containing
+ − 124
integers. For this it uses a single memory pointer that points at each
+ − 125
stage to one memory cell. This pointer can be moved forward by one
+ − 126
memory cell by using the command \texttt{'>'}, and backward by using
+ − 127
\texttt{'<'}. The commands \texttt{'+'} and \texttt{'-'} increase,
+ − 128
respectively decrease, by 1 the content of the memory cell to which
+ − 129
the memory pointer currently points to. The commands for input/output
+ − 130
are \texttt{','} and \texttt{'.'}. Output works by reading the content
+ − 131
of the memory cell to which the memory pointer points to and printing
+ − 132
it out as an ASCII character. Input works the other way, taking some
+ − 133
user input and storing it in the cell to which the memory pointer
+ − 134
points to. The commands \texttt{'['} and \texttt{']'} are looping
+ − 135
constructs. Everything in between \texttt{'['} and \texttt{']'} is
+ − 136
repeated until a counter (memory cell) reaches zero. A typical
+ − 137
program in brainf*** looks as follows:
+ − 138
+ − 139
\begin{center}
+ − 140
\begin{verbatim}
230
+ − 141
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.++
+ − 142
+++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
218
+ − 143
\end{verbatim}
+ − 144
\end{center}
+ − 145
+ − 146
\noindent
230
+ − 147
This one prints out Hello World\ldots{}obviously ;o)
218
+ − 148
+ − 149
\subsubsection*{Tasks (file bf.scala)}
109
+ − 150
+ − 151
\begin{itemize}
237
+ − 152
\item[(1)] Write a function that takes a filename (a string) as an argument
229
+ − 153
and requests the corresponding file from disk. It returns the
237
+ − 154
content of the file as a string. If the file does not exists,
229
+ − 155
the function should return the empty string.\\
+ − 156
\mbox{}\hfill[1 Mark]
+ − 157
+ − 158
\item[(2)] Brainf*** memory is represented by a \texttt{Map} from
218
+ − 159
integers to integers. The empty memory is represented by
+ − 160
\texttt{Map()}, that is nothing is stored in the
229
+ − 161
memory; \texttt{Map(0 -> 1, 2 -> 3)} stores \texttt{1} at
+ − 162
memory location \texttt{0}, and at \texttt{2} it stores \texttt{3}. The
218
+ − 163
convention is that if we query the memory at a location that is
+ − 164
\emph{not} defined in the \texttt{Map}, we return \texttt{0}. Write
268
+ − 165
a `safe-read' function, \texttt{sread}, that takes a memory (a \texttt{Map}) and
237
+ − 166
a memory pointer (an \texttt{Int}) as arguments, and `safely' reads the
218
+ − 167
corresponding memory location. If the \texttt{Map} is not defined at
+ − 168
the memory pointer, \texttt{sread} returns \texttt{0}.
+ − 169
+ − 170
Write another function \texttt{write}, which takes a memory, a
237
+ − 171
memory pointer and an integer value as arguments and updates the
+ − 172
\texttt{Map} with the value at the given memory location. As usual,
218
+ − 173
the \texttt{Map} is not updated `in-place' but a new map is created
237
+ − 174
with the same data, except the new value is stored at the given memory
218
+ − 175
pointer.\hfill[1 Mark]
+ − 176
229
+ − 177
\item[(3)] Write two functions, \texttt{jumpRight} and
237
+ − 178
\texttt{jumpLeft}, that are needed to implement the loop constructs
+ − 179
in brainf***. They take a program (a \texttt{String}) and a program
+ − 180
counter (an \texttt{Int}) as arguments and move right (respectively
218
+ − 181
left) in the string in order to find the \textbf{matching}
+ − 182
opening/closing bracket. For example, given the following program
+ − 183
with the program counter indicated by an arrow:
+ − 184
+ − 185
\begin{center}
+ − 186
\texttt{--[\barbelow{.}.+>--],>,++}
+ − 187
\end{center}
+ − 188
+ − 189
then the matching closing bracket is in 9th position (counting from 0) and
+ − 190
\texttt{jumpRight} is supposed to return the position just after this
109
+ − 191
218
+ − 192
\begin{center}
+ − 193
\texttt{--[..+>--]\barbelow{,}>,++}
+ − 194
\end{center}
+ − 195
237
+ − 196
meaning it jumps to after the loop. Similarly, if you are in 8th position,
218
+ − 197
then \texttt{jumpLeft} is supposed to jump to just after the opening
+ − 198
bracket (that is jumping to the beginning of the loop):
109
+ − 199
218
+ − 200
\begin{center}
+ − 201
\texttt{--[..+>-\barbelow{-}],>,++}
+ − 202
\qquad$\stackrel{\texttt{jumpLeft}}{\longrightarrow}$\qquad
+ − 203
\texttt{--[\barbelow{.}.+>--],>,++}
+ − 204
\end{center}
+ − 205
+ − 206
Unfortunately we have to take into account that there might be
+ − 207
other opening and closing brackets on the `way' to find the
+ − 208
matching bracket. For example in the brainf*** program
+ − 209
+ − 210
\begin{center}
+ − 211
\texttt{--[\barbelow{.}.[+>]--],>,++}
+ − 212
\end{center}
109
+ − 213
218
+ − 214
we do not want to return the index for the \texttt{'-'} in the 9th
+ − 215
position, but the program counter for \texttt{','} in 12th
+ − 216
position. The easiest to find out whether a bracket is matched is by
+ − 217
using levels (which are the third argument in \texttt{jumpLeft} and
+ − 218
\texttt{jumpLeft}). In case of \texttt{jumpRight} you increase the
+ − 219
level by one whenever you find an opening bracket and decrease by
+ − 220
one for a closing bracket. Then in \texttt{jumpRight} you are looking
+ − 221
for the closing bracket on level \texttt{0}. For \texttt{jumpLeft} you
+ − 222
do the opposite. In this way you can find \textbf{matching} brackets
+ − 223
in strings such as
+ − 224
+ − 225
\begin{center}
+ − 226
\texttt{--[\barbelow{.}.[[-]+>[.]]--],>,++}
+ − 227
\end{center}
109
+ − 228
218
+ − 229
for which \texttt{jumpRight} should produce the position:
+ − 230
+ − 231
\begin{center}
+ − 232
\texttt{--[..[[-]+>[.]]--]\barbelow{,}>,++}
+ − 233
\end{center}
+ − 234
+ − 235
It is also possible that the position returned by \texttt{jumpRight} or
+ − 236
\texttt{jumpLeft} is outside the string in cases where there are
+ − 237
no matching brackets. For example
+ − 238
+ − 239
\begin{center}
+ − 240
\texttt{--[\barbelow{.}.[[-]+>[.]]--,>,++}
+ − 241
\qquad$\stackrel{\texttt{jumpRight}}{\longrightarrow}$\qquad
+ − 242
\texttt{--[..[[-]+>[.]]-->,++\barbelow{\;\phantom{+}}}
+ − 243
\end{center}
229
+ − 244
\hfill[2 Marks]
109
+ − 245
+ − 246
230
+ − 247
\item[(4)] Write a recursive function \texttt{compute} that runs a
218
+ − 248
brainf*** program. It takes a program, a program counter, a memory
+ − 249
pointer and a memory as arguments. If the program counter is outside
230
+ − 250
the program string, the execution stops and \texttt{compute} returns the
218
+ − 251
memory. If the program counter is inside the string, it reads the
+ − 252
corresponding character and updates the program counter \texttt{pc},
+ − 253
memory pointer \texttt{mp} and memory \texttt{mem} according to the
+ − 254
rules shown in Figure~\ref{comms}. It then calls recursively
230
+ − 255
\texttt{compute} with the updated data. The most convenient way to
+ − 256
implement the brainf**k rules in Scala is to use pattern-matching
+ − 257
and to calculate a triple consisting of the updated \texttt{pc},
229
+ − 258
\texttt{mp} and \texttt{mem}.
218
+ − 259
230
+ − 260
Write another function \texttt{run} that calls \texttt{compute} with a
218
+ − 261
given brainfu** program and memory, and the program counter and memory pointer
230
+ − 262
set to~$0$. Like \texttt{compute}, it returns the memory after the execution
218
+ − 263
of the program finishes. You can test your brainf**k interpreter with the
229
+ − 264
Sierpinski triangle or the Hello world programs (they seem to be particularly
+ − 265
useful for debugging purposes), or have a look at
109
+ − 266
218
+ − 267
\begin{center}
+ − 268
\url{https://esolangs.org/wiki/Brainfuck}
+ − 269
\end{center}\hfill[2 Marks]
109
+ − 270
218
+ − 271
\begin{figure}[p]
+ − 272
\begin{center}
230
+ − 273
\begin{tabular}{|@{\hspace{0.5mm}}p{0.8cm}|l|}
218
+ − 274
\hline
+ − 275
\hfill\texttt{'>'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 276
$\bullet$ & $\texttt{pc} + 1$\\
+ − 277
$\bullet$ & $\texttt{mp} + 1$\\
+ − 278
$\bullet$ & \texttt{mem} unchanged
+ − 279
\end{tabular}\\\hline
+ − 280
\hfill\texttt{'<'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 281
$\bullet$ & $\texttt{pc} + 1$\\
+ − 282
$\bullet$ & $\texttt{mp} - 1$\\
+ − 283
$\bullet$ & \texttt{mem} unchanged
+ − 284
\end{tabular}\\\hline
+ − 285
\hfill\texttt{'+'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 286
$\bullet$ & $\texttt{pc} + 1$\\
+ − 287
$\bullet$ & $\texttt{mp}$ unchanged\\
+ − 288
$\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) + 1}\\
+ − 289
\end{tabular}\\\hline
+ − 290
\hfill\texttt{'-'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 291
$\bullet$ & $\texttt{pc} + 1$\\
+ − 292
$\bullet$ & $\texttt{mp}$ unchanged\\
+ − 293
$\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) - 1}\\
+ − 294
\end{tabular}\\\hline
+ − 295
\hfill\texttt{'.'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 296
$\bullet$ & $\texttt{pc} + 1$\\
+ − 297
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
+ − 298
$\bullet$ & print out \,\texttt{mem(mp)} as a character\\
+ − 299
\end{tabular}\\\hline
+ − 300
\hfill\texttt{','} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 301
$\bullet$ & $\texttt{pc} + 1$\\
+ − 302
$\bullet$ & $\texttt{mp}$ unchanged\\
+ − 303
$\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\
+ − 304
\multicolumn{2}{@{}l}{the input is given by \texttt{Console.in.read().toByte}}
+ − 305
\end{tabular}\\\hline
+ − 306
\hfill\texttt{'['} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 307
\multicolumn{2}{@{}l}{if \texttt{mem(mp) == 0} then}\\
+ − 308
$\bullet$ & $\texttt{pc = jumpRight(prog, pc + 1, 0)}$\\
+ − 309
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
+ − 310
\multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) != 0} then}\\
+ − 311
$\bullet$ & $\texttt{pc} + 1$\\
+ − 312
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
+ − 313
\end{tabular}
+ − 314
\\\hline
+ − 315
\hfill\texttt{']'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 316
\multicolumn{2}{@{}l}{if \texttt{mem(mp) != 0} then}\\
+ − 317
$\bullet$ & $\texttt{pc = jumpLeft(prog, pc - 1, 0)}$\\
+ − 318
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
+ − 319
\multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) == 0} then}\\
+ − 320
$\bullet$ & $\texttt{pc} + 1$\\
+ − 321
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
+ − 322
\end{tabular}\\\hline
+ − 323
any other char & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 324
$\bullet$ & $\texttt{pc} + 1$\\
+ − 325
$\bullet$ & \texttt{mp} and \texttt{mem} unchanged
+ − 326
\end{tabular}\\
+ − 327
\hline
+ − 328
\end{tabular}
+ − 329
\end{center}
+ − 330
\caption{The rules for how commands in the brainf*** language update the program counter \texttt{pc},
230
+ − 331
the memory pointer \texttt{mp} and the memory \texttt{mem}.\label{comms}}
218
+ − 332
\end{figure}
+ − 333
\end{itemize}\bigskip
+ − 334
268
+ − 335
%%\newpage
233
+ − 336
307
+ − 337
\subsection*{Part B (4 Marks)}
218
+ − 338
233
+ − 339
I am sure you agree while it is fun to look at bf-programs, like the
+ − 340
Sierpinski triangle or the Mandelbrot program, being interpreted, it
+ − 341
is much more fun to write a compiler for the bf-language.
+ − 342
+ − 343
+ − 344
\subsubsection*{Tasks (file bfc.scala)}
+ − 345
+ − 346
\begin{itemize}
+ − 347
\item[(5)] Compilers in general attempt to make programs run
+ − 348
faster by precomputing as much information as possible
+ − 349
before running the program. In our case we can precompute the
+ − 350
addresses where we need to jump at the beginning and end of
+ − 351
loops.
+ − 352
+ − 353
For this write a function \texttt{jtable} that precomputes the ``jump
+ − 354
table'' for a bf-program. This function takes a bf-program
+ − 355
as an argument and returns a \texttt{Map[Int, Int]}. The
237
+ − 356
purpose of this Map is to record the information, in cases
+ − 357
a pc-position points to a '\texttt{[}' or a '\texttt{]}',
+ − 358
to which pc-position do we need to jump next?
233
+ − 359
+ − 360
For example for the program
+ − 361
+ − 362
\begin{center}
+ − 363
\texttt{+++++[->++++++++++<]>--<+++[->>++++++++++}
+ − 364
\texttt{<<]>>++<<----------[+>.>.<+<]}
+ − 365
\end{center}
+ − 366
+ − 367
we obtain the Map (note the precise numbers might differ depending on white
+ − 368
spaces etc.~in the bf-program):
+ − 369
+ − 370
\begin{center}
+ − 371
\texttt{Map(69 -> 61, 5 -> 20, 60 -> 70, 27 -> 44, 43 -> 28, 19 -> 6)}
+ − 372
\end{center}
+ − 373
+ − 374
This Map states that for the '\texttt{[}' on position 5, we need to
+ − 375
jump to position 20, which is just after the corresponding '\texttt{]}'.
+ − 376
Similarly, for the '\texttt{]}' on position 19, we need to jump to
+ − 377
position 6, which is just after the '\texttt{[}' on position 5, and so
+ − 378
on. The idea is to not calculate this information each time
+ − 379
we hit a bracket, but just look up this information in the
+ − 380
\texttt{jtable}.
+ − 381
+ − 382
Then adapt the \texttt{compute} and \texttt{run} functions
+ − 383
from Part 1 in order to take advantage of the information
+ − 384
stored in the \texttt{jtable}. This means whenever \texttt{jumpLeft}
+ − 385
and \texttt{jumpRight} was called previously, you should look
+ − 386
up the jump address in the \texttt{jtable}. Feel free to reuse
+ − 387
the function \texttt{jumpLeft} and \texttt{jumpRight} for
+ − 388
calculating the \texttt{jtable}.\hfill{[1 Mark]}
+ − 389
237
+ − 390
\item[(6)] Compilers try to eliminate any ``dead'' code that could
+ − 391
slow down programs and also perform what is often called
+ − 392
\emph{peephole
+ − 393
optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}}
+ − 394
For the latter consider that it is difficult for compilers to
241
+ − 395
comprehend what is intended with whole programs, but they are very good
237
+ − 396
at finding out what small snippets of code do, and then try to
+ − 397
generate faster code for such snippets.
233
+ − 398
+ − 399
In our case, dead code is everything that is not a bf-command.
+ − 400
Therefore write a function \texttt{optimise} which deletes such
237
+ − 401
dead code from a bf-program. Moreover this function should replace every substring
233
+ − 402
of the form \pcode{[-]} by a new command \texttt{0}.
+ − 403
The idea is that the loop \pcode{[-]} just resets the
+ − 404
memory at the current location to 0. It is more efficient
237
+ − 405
to do this in a single step, rather than stepwise in a loop as in
233
+ − 406
the original bf-programs.
218
+ − 407
233
+ − 408
In the extended \texttt{compute3} and \texttt{run3} functions you should
+ − 409
implement this command by writing 0 to \pcode{mem(mp)}, that is use
+ − 410
\pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}.
+ − 411
The easiest way to modify a string in this way is to use the regular
+ − 412
expression \pcode{"""[^<>+-.,\\[\\]]"""}, which recognises everything that is
+ − 413
not a bf-command. Similarly, the
237
+ − 414
regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}. By using the Scala method \pcode{.replaceAll} you can replace substrings
+ − 415
with new strings.\\
+ − 416
\mbox{}\hfill{[1 Mark]}
233
+ − 417
+ − 418
\item[(7)] Finally, real compilers try to take advantage of CPUs which often
+ − 419
provide complex operations in hardware that can combine many smaller
+ − 420
instructions into a single faster instruction.
+ − 421
237
+ − 422
In our case we can optimise the several single increments performed at a
233
+ − 423
memory cell, for example \pcode{++++}, by a single ``increment by
+ − 424
4''. For this optimisation we just have to make sure these single
237
+ − 425
increments are all next to each other. Similar optimisations should apply
233
+ − 426
for the bf-commands \pcode{-}, \pcode{<} and
+ − 427
\pcode{>}, which can all be replaced by extended versions that take
+ − 428
the amount of the increment (decrement) into account. We will do
+ − 429
this by introducing two-character bf-commands. For example
+ − 430
+ − 431
\begin{center}
+ − 432
\begin{tabular}{l|l}
+ − 433
original bf-cmds & replacement\\
+ − 434
\hline
+ − 435
\pcode{+} & \pcode{+A}\\
+ − 436
\pcode{++} & \pcode{+B}\\
+ − 437
\pcode{+++} & \pcode{+C}\\
+ − 438
\ldots{} & \ldots{}\\
+ − 439
\pcode{+++....++} & \pcode{+Z}\\
+ − 440
\hspace{5mm}(these are 26 \pcode{+}'s)\\
+ − 441
\end{tabular}
+ − 442
\end{center}
+ − 443
+ − 444
237
+ − 445
If there are more
+ − 446
than 26 \pcode{+}'s in a row, then more than one ``two-character''
233
+ − 447
bf-commands need to be generated (the idea is that more than
+ − 448
26 copies of a single bf-command in a row is a rare occurrence in
237
+ − 449
actual bf-programs). Similar replacements apply
+ − 450
for \pcode{-}, \pcode{<} and \pcode{>}, but
+ − 451
all other bf-commands should be unaffected by this
233
+ − 452
change.
+ − 453
+ − 454
For this write a function \texttt{combine} which replaces sequences
+ − 455
of repeated increment and decrement commands by appropriate
237
+ − 456
two-character commands. In the functions \pcode{compute4} and
233
+ − 457
\pcode{run4}, the ``combine'' and the optimisation from (6) should
+ − 458
be performed. Make sure that when a two-character bf-command is
+ − 459
encountered you need to increase the \pcode{pc}-counter by two in
237
+ − 460
order to progress to the next command. For example
233
+ − 461
+ − 462
\begin{center}
+ − 463
\pcode{combine(optimise(load_bff("benchmark.bf")))}
+ − 464
\end{center}
+ − 465
+ − 466
generates the improved program
+ − 467
+ − 468
\begin{center}
+ − 469
\pcode{>A+B[<A+M>A-A]<A[[}\hspace{3mm}\ldots{}
+ − 470
\end{center}
+ − 471
+ − 472
for the original benchmark program
+ − 473
+ − 474
\begin{center}
+ − 475
\pcode{>++[<+++++++++++++>-]<[[}\hspace{3mm}\ldots
+ − 476
\end{center}
+ − 477
241
+ − 478
As you can see, the compiler bets on saving a lot of time on the
+ − 479
\pcode{+B} and \pcode{+M} steps so that the optimisations is
233
+ − 480
worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a
+ − 481
penalty). Luckily, after you have performed all
+ − 482
optimisations in (5) - (7), you can expect that the
234
+ − 483
\pcode{benchmark.bf} program runs four to five times faster.
+ − 484
You can also test whether your compiler produces the correct result
+ − 485
by for example testing
+ − 486
+ − 487
\begin{center}
+ − 488
\pcode{run(load_bff("sierpinski.bf")) == run4(load_bff("sierpinski.bf"))}
+ − 489
\end{center}
+ − 490
+ − 491
which should return true for all the different compiler stages. \\
+ − 492
\mbox{}\hfill{[2 Marks]}
233
+ − 493
\end{itemize}
6
+ − 494
+ − 495
\end{document}
+ − 496
68
+ − 497
6
+ − 498
%%% Local Variables:
+ − 499
%%% mode: latex
+ − 500
%%% TeX-master: t
+ − 501
%%% End: