251
+ − 1
% !TEX program = xelatex
6
+ − 2
\documentclass{article}
62
+ − 3
\usepackage{../style}
78
+ − 4
\usepackage{../langs}
218
+ − 5
\usepackage{disclaimer}
+ − 6
\usepackage{tikz}
+ − 7
\usepackage{pgf}
+ − 8
\usepackage{pgfplots}
+ − 9
\usepackage{stackengine}
+ − 10
%% \usepackage{accents}
+ − 11
\newcommand\barbelow[1]{\stackunder[1.2pt]{#1}{\raisebox{-4mm}{\boldmath$\uparrow$}}}
+ − 12
325
+ − 13
%% change Console to scala.io.StdIn.readByte()
+ − 14
6
+ − 15
+ − 16
\begin{document}
+ − 17
307
+ − 18
\section*{Part 10 (Scala)}
6
+ − 19
336
+ − 20
\mbox{}\hfill\textit{``If there's one feature that makes Scala, `Scala',}\\
+ − 21
\mbox{}\hfill\textit{ I would pick implicits.''}\smallskip\\
+ − 22
\mbox{}\hfill\textit{ --- Martin Odersky (creator of the Scala language)}\bigskip\bigskip
278
+ − 23
+ − 24
+ − 25
\noindent
337
+ − 26
This part is about a small (esoteric) programming language called
336
+ − 27
brainf***. Actually, we will implement an interpreter for our own version
+ − 28
of this language called brainf*ck++.\bigskip
218
+ − 29
337
+ − 30
\IMPORTANT{This part is worth 10\% and you need to submit it on \cwTEN{} at 4pm.}
62
+ − 31
+ − 32
\noindent
218
+ − 33
Also note that the running time of each part will be restricted to a
+ − 34
maximum of 30 seconds on my laptop.
+ − 35
+ − 36
\DISCLAIMER{}
268
+ − 37
\newpage
86
+ − 38
230
+ − 39
\subsection*{Reference Implementation}
+ − 40
268
+ − 41
As usual, this Scala assignment comes with a reference implementation in
+ − 42
form of two \texttt{jar}-files. You can download them from KEATS. They
+ − 43
allow you to run any test cases on your own computer. For example you
+ − 44
can call Scala on the command line with the option \texttt{-cp bf.jar}
+ − 45
and then query any function from the \texttt{bf.scala} template file.
+ − 46
You have to prefix the calls with \texttt{CW10a} and \texttt{CW10b},
+ − 47
respectively. For example
230
+ − 48
+ − 49
+ − 50
\begin{lstlisting}[language={},xleftmargin=1mm,numbers=none,basicstyle=\ttfamily\small]
+ − 51
$ scala -cp bf.jar
+ − 52
scala> import CW10a._
292
+ − 53
scala> run(load_bff("sierpinski.bf")) ; ()
230
+ − 54
*
+ − 55
* *
+ − 56
* *
+ − 57
* * * *
+ − 58
* *
+ − 59
* * * *
+ − 60
* * * *
+ − 61
* * * * * * * *
+ − 62
* *
+ − 63
* * * *
+ − 64
* * * *
+ − 65
* * * * * * * *
+ − 66
* * * *
+ − 67
* * * * * * * *
+ − 68
* * * * * * * *
+ − 69
* * * * * * * * * * * * * * * *
+ − 70
* *
+ − 71
* * * *
+ − 72
* * * *
+ − 73
* * * * * * * *
+ − 74
* * * *
+ − 75
* * * * * * * *
+ − 76
* * * * * * * *
+ − 77
* * * * * * * * * * * * * * * *
+ − 78
* * * *
+ − 79
* * * * * * * *
+ − 80
* * * * * * * *
+ − 81
* * * * * * * * * * * * * * * *
+ − 82
* * * * * * * *
+ − 83
* * * * * * * * * * * * * * * *
+ − 84
* * * * * * * * * * * * * * * *
+ − 85
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
+ − 86
\end{lstlisting}%$
+ − 87
268
+ − 88
\newpage
218
+ − 89
307
+ − 90
\subsection*{Part A (6 Marks)}
218
+ − 91
229
+ − 92
Coming from Java or C++, you might think Scala is a rather esoteric
218
+ − 93
programming language. But remember, some serious companies have built
+ − 94
their business on
+ − 95
Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}}
230
+ − 96
I claim functional programming is not a fad. And there are far, far
336
+ − 97
more esoteric languages out there. One is called \emph{brainf***}.
+ − 98
\here{https://esolangs.org/wiki/Brainfuck}
+ − 99
You
230
+ − 100
are asked in this part to implement an interpreter for
336
+ − 101
a slight extension of this language.
218
+ − 102
336
+ − 103
Urban M\"uller developed the original version of brainf*** in 1993. A close
+ − 104
relative of this language was already introduced in 1964 by Corado
+ − 105
B\"ohm, an Italian computer pioneer. The main feature of brainf*** is
+ − 106
its minimalistic set of instructions---just 8 instructions in total
+ − 107
and all of which are single characters. Despite the minimalism, this
+ − 108
language has been shown to be Turing complete\ldots{}if this doesn't
+ − 109
ring any bell with you: it roughly means that every(!) algorithm can,
+ − 110
in principle, be implemented in brainf***. It just takes a lot of
+ − 111
determination and quite a lot of memory resources.
+ − 112
+ − 113
Some relatively sophisticated sample programs in brainf*** are given
+ − 114
in the file \texttt{bf.scala}, including a brainf*** program for the
+ − 115
Sierpinski triangle and the Mandelbrot set. There seems to be even a
+ − 116
dedicated Windows IDE for bf programs, though I am not sure whether
+ − 117
this is just an elaborate April fools' joke---judge yourself:
247
+ − 118
+ − 119
\begin{center}
+ − 120
\url{https://www.microsoft.com/en-us/p/brainf-ck/9nblgggzhvq5}
268
+ − 121
\end{center} \bigskip
247
+ − 122
218
+ − 123
+ − 124
\noindent
336
+ − 125
As mentioned above, the original brainf*** has 8 single-character
+ − 126
commands. Our version of bf++ will contain the commands \texttt{'>'},
+ − 127
\texttt{'<'}, \texttt{'+'}, \texttt{'-'}, \texttt{'.'}, \texttt{'['}
+ − 128
and \texttt{']'} from the original, and in addition the commands
+ − 129
\texttt{'@'}, \texttt{'*'} and \texttt{'\#'}. Every other character
+ − 130
is considered a comment.
+ − 131
+ − 132
Our interpreter for bf++ operates on memory cells containing
337
+ − 133
integers. For this it uses a single memory pointer, called
+ − 134
\texttt{mp}, that points at each stage to one memory cell.
+ − 135
+ − 136
\begin{center}
+ − 137
\begin{tikzpicture}
+ − 138
\draw [line width=1mm, rounded corners] (0,0) rectangle (5, 0.5);
+ − 139
\draw (0.5, 0) -- (0.5, 0.5);
+ − 140
\draw (1.0, 0) -- (1.0, 0.5);
+ − 141
+ − 142
\draw (2.5, 0) -- (2.5, 0.5);
+ − 143
\draw (2.0, 0) -- (2.0, 0.5);
+ − 144
+ − 145
\draw (4.5, 0) -- (4.5, 0.5);
+ − 146
\draw (4.0, 0) -- (4.0, 0.5);
+ − 147
+ − 148
\draw (1.5,0.25) node {$\cdots$};
+ − 149
\draw (3.0,0.25) node {$\cdots$};
+ − 150
+ − 151
\draw [->, thick] (2.25, -0.5) -- (2.25, -0.15);
+ − 152
\draw (2.25,-0.8) node {\texttt{mp}};
+ − 153
+ − 154
\draw (0.7,0.7) node {\sf\footnotesize memory};
+ − 155
\end{tikzpicture}
+ − 156
\end{center}
+ − 157
+ − 158
\noindent
+ − 159
This pointer can be moved forward by one memory cell by using the
+ − 160
command \texttt{'>'}, and backward by using \texttt{'<'}. The commands
+ − 161
\texttt{'+'} and \texttt{'-'} increase, respectively decrease, by 1
+ − 162
the content of the memory cell to which the memory pointer currently
+ − 163
points to. The command for output in bf++ is \texttt{'.'} whereby output works
+ − 164
by reading the content of the memory cell to which the memory pointer
+ − 165
points to and printing it out as an ASCII character.\footnote{In the
+ − 166
original version of bf, there is also a command for input, but we
+ − 167
omit it here. All our programs will be ``autonomous''.} The
336
+ − 168
commands \texttt{'['} and \texttt{']'} are looping
218
+ − 169
constructs. Everything in between \texttt{'['} and \texttt{']'} is
+ − 170
repeated until a counter (memory cell) reaches zero. A typical
+ − 171
program in brainf*** looks as follows:
+ − 172
+ − 173
\begin{center}
+ − 174
\begin{verbatim}
230
+ − 175
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.++
+ − 176
+++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
218
+ − 177
\end{verbatim}
+ − 178
\end{center}
+ − 179
+ − 180
\noindent
336
+ − 181
This one prints out Hello World\ldots{}obviously \texttt{;o)} We also
337
+ − 182
add 3 new commands in the bf++-version of the bf-language. The purpose
+ − 183
of these commands we explain later.
336
+ − 184
218
+ − 185
+ − 186
\subsubsection*{Tasks (file bf.scala)}
109
+ − 187
+ − 188
\begin{itemize}
237
+ − 189
\item[(1)] Write a function that takes a filename (a string) as an argument
229
+ − 190
and requests the corresponding file from disk. It returns the
237
+ − 191
content of the file as a string. If the file does not exists,
337
+ − 192
the function should return the empty string.
229
+ − 193
\mbox{}\hfill[1 Mark]
+ − 194
336
+ − 195
\item[(2)] Brainf**k++ memory is represented by a \texttt{Map} from
218
+ − 196
integers to integers. The empty memory is represented by
+ − 197
\texttt{Map()}, that is nothing is stored in the
229
+ − 198
memory; \texttt{Map(0 -> 1, 2 -> 3)} stores \texttt{1} at
+ − 199
memory location \texttt{0}, and at \texttt{2} it stores \texttt{3}. The
218
+ − 200
convention is that if we query the memory at a location that is
+ − 201
\emph{not} defined in the \texttt{Map}, we return \texttt{0}. Write
268
+ − 202
a `safe-read' function, \texttt{sread}, that takes a memory (a \texttt{Map}) and
237
+ − 203
a memory pointer (an \texttt{Int}) as arguments, and `safely' reads the
218
+ − 204
corresponding memory location. If the \texttt{Map} is not defined at
+ − 205
the memory pointer, \texttt{sread} returns \texttt{0}.
+ − 206
+ − 207
Write another function \texttt{write}, which takes a memory, a
237
+ − 208
memory pointer and an integer value as arguments and updates the
+ − 209
\texttt{Map} with the value at the given memory location. As usual,
218
+ − 210
the \texttt{Map} is not updated `in-place' but a new map is created
237
+ − 211
with the same data, except the new value is stored at the given memory
218
+ − 212
pointer.\hfill[1 Mark]
+ − 213
229
+ − 214
\item[(3)] Write two functions, \texttt{jumpRight} and
237
+ − 215
\texttt{jumpLeft}, that are needed to implement the loop constructs
336
+ − 216
in brainf**k++. They take a program (a \texttt{String}) and a program
237
+ − 217
counter (an \texttt{Int}) as arguments and move right (respectively
218
+ − 218
left) in the string in order to find the \textbf{matching}
+ − 219
opening/closing bracket. For example, given the following program
+ − 220
with the program counter indicated by an arrow:
+ − 221
+ − 222
\begin{center}
336
+ − 223
\texttt{--[\barbelow{.}.+>--].>.++}
218
+ − 224
\end{center}
+ − 225
+ − 226
then the matching closing bracket is in 9th position (counting from 0) and
+ − 227
\texttt{jumpRight} is supposed to return the position just after this
109
+ − 228
218
+ − 229
\begin{center}
336
+ − 230
\texttt{--[..+>--]\barbelow{.}>.++}
218
+ − 231
\end{center}
+ − 232
237
+ − 233
meaning it jumps to after the loop. Similarly, if you are in 8th position,
218
+ − 234
then \texttt{jumpLeft} is supposed to jump to just after the opening
+ − 235
bracket (that is jumping to the beginning of the loop):
109
+ − 236
218
+ − 237
\begin{center}
336
+ − 238
\texttt{--[..+>-\barbelow{-}].>.++}
218
+ − 239
\qquad$\stackrel{\texttt{jumpLeft}}{\longrightarrow}$\qquad
336
+ − 240
\texttt{--[\barbelow{.}.+>--].>.++}
218
+ − 241
\end{center}
+ − 242
+ − 243
Unfortunately we have to take into account that there might be
+ − 244
other opening and closing brackets on the `way' to find the
336
+ − 245
matching bracket. For example in the brain*ck++ program
218
+ − 246
+ − 247
\begin{center}
336
+ − 248
\texttt{--[\barbelow{.}.[+>]--].>.++}
218
+ − 249
\end{center}
109
+ − 250
218
+ − 251
we do not want to return the index for the \texttt{'-'} in the 9th
336
+ − 252
position, but the program counter for \texttt{'.'} in 12th
218
+ − 253
position. The easiest to find out whether a bracket is matched is by
+ − 254
using levels (which are the third argument in \texttt{jumpLeft} and
+ − 255
\texttt{jumpLeft}). In case of \texttt{jumpRight} you increase the
+ − 256
level by one whenever you find an opening bracket and decrease by
+ − 257
one for a closing bracket. Then in \texttt{jumpRight} you are looking
+ − 258
for the closing bracket on level \texttt{0}. For \texttt{jumpLeft} you
+ − 259
do the opposite. In this way you can find \textbf{matching} brackets
+ − 260
in strings such as
+ − 261
+ − 262
\begin{center}
336
+ − 263
\texttt{--[\barbelow{.}.[[-]+>[.]]--].>.++}
218
+ − 264
\end{center}
109
+ − 265
218
+ − 266
for which \texttt{jumpRight} should produce the position:
+ − 267
+ − 268
\begin{center}
336
+ − 269
\texttt{--[..[[-]+>[.]]--]\barbelow{.}>.++}
218
+ − 270
\end{center}
+ − 271
+ − 272
It is also possible that the position returned by \texttt{jumpRight} or
+ − 273
\texttt{jumpLeft} is outside the string in cases where there are
+ − 274
no matching brackets. For example
+ − 275
+ − 276
\begin{center}
336
+ − 277
\texttt{--[\barbelow{.}.[[-]+>[.]]--.>.++}
218
+ − 278
\qquad$\stackrel{\texttt{jumpRight}}{\longrightarrow}$\qquad
336
+ − 279
\texttt{--[..[[-]+>[.]]-->.++\barbelow{\;\phantom{+}}}
218
+ − 280
\end{center}
229
+ − 281
\hfill[2 Marks]
109
+ − 282
+ − 283
230
+ − 284
\item[(4)] Write a recursive function \texttt{compute} that runs a
336
+ − 285
brain*u*k++ program. It takes a program, a program counter, a memory
218
+ − 286
pointer and a memory as arguments. If the program counter is outside
230
+ − 287
the program string, the execution stops and \texttt{compute} returns the
218
+ − 288
memory. If the program counter is inside the string, it reads the
+ − 289
corresponding character and updates the program counter \texttt{pc},
+ − 290
memory pointer \texttt{mp} and memory \texttt{mem} according to the
+ − 291
rules shown in Figure~\ref{comms}. It then calls recursively
230
+ − 292
\texttt{compute} with the updated data. The most convenient way to
336
+ − 293
implement the brainf**k++ rules in Scala is to use pattern-matching
230
+ − 294
and to calculate a triple consisting of the updated \texttt{pc},
229
+ − 295
\texttt{mp} and \texttt{mem}.
218
+ − 296
230
+ − 297
Write another function \texttt{run} that calls \texttt{compute} with a
336
+ − 298
given brainfu*k++ program and memory, and the program counter and memory pointer
230
+ − 299
set to~$0$. Like \texttt{compute}, it returns the memory after the execution
336
+ − 300
of the program finishes. You can test your brainf**k++ interpreter with the
229
+ − 301
Sierpinski triangle or the Hello world programs (they seem to be particularly
+ − 302
useful for debugging purposes), or have a look at
109
+ − 303
218
+ − 304
\begin{center}
+ − 305
\url{https://esolangs.org/wiki/Brainfuck}
337
+ − 306
\end{center}
+ − 307
+ − 308
\noindent for more bf/bf++-programs and the test cases given in \texttt{bf.scala}.\\
+ − 309
\mbox{}\hfill[2 Marks]
109
+ − 310
218
+ − 311
\begin{figure}[p]
+ − 312
\begin{center}
230
+ − 313
\begin{tabular}{|@{\hspace{0.5mm}}p{0.8cm}|l|}
218
+ − 314
\hline
+ − 315
\hfill\texttt{'>'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 316
$\bullet$ & $\texttt{pc} + 1$\\
+ − 317
$\bullet$ & $\texttt{mp} + 1$\\
+ − 318
$\bullet$ & \texttt{mem} unchanged
+ − 319
\end{tabular}\\\hline
+ − 320
\hfill\texttt{'<'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 321
$\bullet$ & $\texttt{pc} + 1$\\
+ − 322
$\bullet$ & $\texttt{mp} - 1$\\
+ − 323
$\bullet$ & \texttt{mem} unchanged
+ − 324
\end{tabular}\\\hline
+ − 325
\hfill\texttt{'+'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 326
$\bullet$ & $\texttt{pc} + 1$\\
+ − 327
$\bullet$ & $\texttt{mp}$ unchanged\\
+ − 328
$\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) + 1}\\
+ − 329
\end{tabular}\\\hline
+ − 330
\hfill\texttt{'-'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 331
$\bullet$ & $\texttt{pc} + 1$\\
+ − 332
$\bullet$ & $\texttt{mp}$ unchanged\\
+ − 333
$\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) - 1}\\
+ − 334
\end{tabular}\\\hline
+ − 335
\hfill\texttt{'.'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 336
$\bullet$ & $\texttt{pc} + 1$\\
+ − 337
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
+ − 338
$\bullet$ & print out \,\texttt{mem(mp)} as a character\\
+ − 339
\end{tabular}\\\hline
336
+ − 340
%\hfill\texttt{','} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 341
% $\bullet$ & $\texttt{pc} + 1$\\
+ − 342
% $\bullet$ & $\texttt{mp}$ unchanged\\
+ − 343
% $\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\
+ − 344
% \multicolumn{2}{@{}l}{the input is given by \texttt{Console.in.read().toByte}}
+ − 345
% \end{tabular}\\\hline
218
+ − 346
\hfill\texttt{'['} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 347
\multicolumn{2}{@{}l}{if \texttt{mem(mp) == 0} then}\\
+ − 348
$\bullet$ & $\texttt{pc = jumpRight(prog, pc + 1, 0)}$\\
+ − 349
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
+ − 350
\multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) != 0} then}\\
+ − 351
$\bullet$ & $\texttt{pc} + 1$\\
+ − 352
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
+ − 353
\end{tabular}
+ − 354
\\\hline
+ − 355
\hfill\texttt{']'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 356
\multicolumn{2}{@{}l}{if \texttt{mem(mp) != 0} then}\\
+ − 357
$\bullet$ & $\texttt{pc = jumpLeft(prog, pc - 1, 0)}$\\
+ − 358
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
+ − 359
\multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) == 0} then}\\
+ − 360
$\bullet$ & $\texttt{pc} + 1$\\
+ − 361
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
336
+ − 362
\end{tabular}\\\hline
+ − 363
\hfill\texttt{'*'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 364
$\bullet$ & $\texttt{pc} + 1$\\
+ − 365
$\bullet$ & $\texttt{mp}$ unchanged\\
337
+ − 366
$\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) * mem(mp - 1)}\\
+ − 367
\multicolumn{2}{@{}l}{this multiplies the content of the memory cells at
+ − 368
\texttt{mp} and \texttt{mp - 1}}\\
+ − 369
\multicolumn{2}{@{}l}{and stores the result at \texttt{mp}}
+ − 370
\end{tabular}\\\hline
+ − 371
\hfill\texttt{'@'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 372
$\bullet$ & $\texttt{pc} + 1$\\
+ − 373
$\bullet$ & $\texttt{mp}$ unchanged\\
+ − 374
$\bullet$ & \texttt{mem} updated with
+ − 375
\texttt{mem(mp) -> mem(mp - 1)}\\
+ − 376
\multicolumn{2}{@{}l}{this updates the memory cell having the index stored at \texttt{mem(mp)},}\\
+ − 377
\multicolumn{2}{@{}l}{with the value stored at \texttt{mem(mp - 1)},}
336
+ − 378
\end{tabular}\\\hline
+ − 379
\hfill\texttt{'\#'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 380
$\bullet$ & $\texttt{pc} + 1$\\
337
+ − 381
$\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
+ − 382
$\bullet$ & print out \,\texttt{mem(mp)} as a number\\
336
+ − 383
\end{tabular}\\\hline
218
+ − 384
any other char & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
+ − 385
$\bullet$ & $\texttt{pc} + 1$\\
+ − 386
$\bullet$ & \texttt{mp} and \texttt{mem} unchanged
+ − 387
\end{tabular}\\
+ − 388
\hline
+ − 389
\end{tabular}
337
+ − 390
\\\mbox{}\\[-10mm]\mbox{}
218
+ − 391
\end{center}
337
+ − 392
\caption{The rules for how commands in the brainf***++ language update the
+ − 393
program counter \texttt{pc},
230
+ − 394
the memory pointer \texttt{mp} and the memory \texttt{mem}.\label{comms}}
218
+ − 395
\end{figure}
+ − 396
\end{itemize}\bigskip
+ − 397
268
+ − 398
%%\newpage
233
+ − 399
307
+ − 400
\subsection*{Part B (4 Marks)}
218
+ − 401
337
+ − 402
I am sure you agree while it is fun to marvel at bf++-programs, like the
233
+ − 403
Sierpinski triangle or the Mandelbrot program, being interpreted, it
337
+ − 404
is much more fun to write a compiler for the bf++-language.
233
+ − 405
+ − 406
+ − 407
\subsubsection*{Tasks (file bfc.scala)}
+ − 408
+ − 409
\begin{itemize}
337
+ − 410
\item[(5)] Compilers, in general, attempt to make programs run
233
+ − 411
faster by precomputing as much information as possible
+ − 412
before running the program. In our case we can precompute the
+ − 413
addresses where we need to jump at the beginning and end of
+ − 414
loops.
+ − 415
+ − 416
For this write a function \texttt{jtable} that precomputes the ``jump
337
+ − 417
table'' for a bf++-program. This function takes a bf++-program
233
+ − 418
as an argument and returns a \texttt{Map[Int, Int]}. The
237
+ − 419
purpose of this Map is to record the information, in cases
+ − 420
a pc-position points to a '\texttt{[}' or a '\texttt{]}',
+ − 421
to which pc-position do we need to jump next?
233
+ − 422
+ − 423
For example for the program
+ − 424
+ − 425
\begin{center}
+ − 426
\texttt{+++++[->++++++++++<]>--<+++[->>++++++++++}
+ − 427
\texttt{<<]>>++<<----------[+>.>.<+<]}
+ − 428
\end{center}
+ − 429
+ − 430
we obtain the Map (note the precise numbers might differ depending on white
+ − 431
spaces etc.~in the bf-program):
+ − 432
+ − 433
\begin{center}
+ − 434
\texttt{Map(69 -> 61, 5 -> 20, 60 -> 70, 27 -> 44, 43 -> 28, 19 -> 6)}
+ − 435
\end{center}
+ − 436
+ − 437
This Map states that for the '\texttt{[}' on position 5, we need to
+ − 438
jump to position 20, which is just after the corresponding '\texttt{]}'.
+ − 439
Similarly, for the '\texttt{]}' on position 19, we need to jump to
+ − 440
position 6, which is just after the '\texttt{[}' on position 5, and so
+ − 441
on. The idea is to not calculate this information each time
+ − 442
we hit a bracket, but just look up this information in the
+ − 443
\texttt{jtable}.
+ − 444
+ − 445
Then adapt the \texttt{compute} and \texttt{run} functions
+ − 446
from Part 1 in order to take advantage of the information
+ − 447
stored in the \texttt{jtable}. This means whenever \texttt{jumpLeft}
+ − 448
and \texttt{jumpRight} was called previously, you should look
+ − 449
up the jump address in the \texttt{jtable}. Feel free to reuse
+ − 450
the function \texttt{jumpLeft} and \texttt{jumpRight} for
+ − 451
calculating the \texttt{jtable}.\hfill{[1 Mark]}
+ − 452
237
+ − 453
\item[(6)] Compilers try to eliminate any ``dead'' code that could
+ − 454
slow down programs and also perform what is often called
+ − 455
\emph{peephole
+ − 456
optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}}
+ − 457
For the latter consider that it is difficult for compilers to
241
+ − 458
comprehend what is intended with whole programs, but they are very good
237
+ − 459
at finding out what small snippets of code do, and then try to
+ − 460
generate faster code for such snippets.
233
+ − 461
337
+ − 462
In our case, dead code is everything that is not a bf++-command.
233
+ − 463
Therefore write a function \texttt{optimise} which deletes such
337
+ − 464
dead code from a bf++-program. Moreover this function should replace every substring
233
+ − 465
of the form \pcode{[-]} by a new command \texttt{0}.
+ − 466
The idea is that the loop \pcode{[-]} just resets the
+ − 467
memory at the current location to 0. It is more efficient
237
+ − 468
to do this in a single step, rather than stepwise in a loop as in
337
+ − 469
the original bf++-programs.
218
+ − 470
233
+ − 471
In the extended \texttt{compute3} and \texttt{run3} functions you should
+ − 472
implement this command by writing 0 to \pcode{mem(mp)}, that is use
+ − 473
\pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}.
+ − 474
The easiest way to modify a string in this way is to use the regular
337
+ − 475
expression \pcode{"""[^<>+-.\\[\\]@\#*]"""}, which recognises everything that is
+ − 476
not a bf++-command. Similarly, the
237
+ − 477
regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}. By using the Scala method \pcode{.replaceAll} you can replace substrings
+ − 478
with new strings.\\
+ − 479
\mbox{}\hfill{[1 Mark]}
233
+ − 480
337
+ − 481
\item[(7)] Finally, real compilers try to take advantage of modern
+ − 482
CPUs which often provide complex operations in hardware that can
+ − 483
combine many smaller instructions into a single faster instruction.
233
+ − 484
237
+ − 485
In our case we can optimise the several single increments performed at a
233
+ − 486
memory cell, for example \pcode{++++}, by a single ``increment by
+ − 487
4''. For this optimisation we just have to make sure these single
237
+ − 488
increments are all next to each other. Similar optimisations should apply
233
+ − 489
for the bf-commands \pcode{-}, \pcode{<} and
+ − 490
\pcode{>}, which can all be replaced by extended versions that take
+ − 491
the amount of the increment (decrement) into account. We will do
337
+ − 492
this by introducing two-character bf++-commands. For example
233
+ − 493
+ − 494
\begin{center}
+ − 495
\begin{tabular}{l|l}
+ − 496
original bf-cmds & replacement\\
+ − 497
\hline
+ − 498
\pcode{+} & \pcode{+A}\\
+ − 499
\pcode{++} & \pcode{+B}\\
+ − 500
\pcode{+++} & \pcode{+C}\\
+ − 501
\ldots{} & \ldots{}\\
+ − 502
\pcode{+++....++} & \pcode{+Z}\\
+ − 503
\hspace{5mm}(these are 26 \pcode{+}'s)\\
+ − 504
\end{tabular}
+ − 505
\end{center}
+ − 506
+ − 507
237
+ − 508
If there are more
+ − 509
than 26 \pcode{+}'s in a row, then more than one ``two-character''
233
+ − 510
bf-commands need to be generated (the idea is that more than
337
+ − 511
26 copies of a single bf++-command in a row is a rare occurrence in
+ − 512
actual bf++-programs). Similar replacements apply
237
+ − 513
for \pcode{-}, \pcode{<} and \pcode{>}, but
337
+ − 514
all other bf++-commands should be unaffected by this
233
+ − 515
change.
+ − 516
+ − 517
For this write a function \texttt{combine} which replaces sequences
+ − 518
of repeated increment and decrement commands by appropriate
237
+ − 519
two-character commands. In the functions \pcode{compute4} and
233
+ − 520
\pcode{run4}, the ``combine'' and the optimisation from (6) should
337
+ − 521
be performed. Make sure that when a two-character bf++-command is
233
+ − 522
encountered you need to increase the \pcode{pc}-counter by two in
237
+ − 523
order to progress to the next command. For example
233
+ − 524
+ − 525
\begin{center}
+ − 526
\pcode{combine(optimise(load_bff("benchmark.bf")))}
+ − 527
\end{center}
+ − 528
+ − 529
generates the improved program
+ − 530
+ − 531
\begin{center}
+ − 532
\pcode{>A+B[<A+M>A-A]<A[[}\hspace{3mm}\ldots{}
+ − 533
\end{center}
+ − 534
+ − 535
for the original benchmark program
+ − 536
+ − 537
\begin{center}
+ − 538
\pcode{>++[<+++++++++++++>-]<[[}\hspace{3mm}\ldots
+ − 539
\end{center}
+ − 540
241
+ − 541
As you can see, the compiler bets on saving a lot of time on the
+ − 542
\pcode{+B} and \pcode{+M} steps so that the optimisations is
233
+ − 543
worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a
+ − 544
penalty). Luckily, after you have performed all
+ − 545
optimisations in (5) - (7), you can expect that the
234
+ − 546
\pcode{benchmark.bf} program runs four to five times faster.
+ − 547
You can also test whether your compiler produces the correct result
337
+ − 548
by testing for example
234
+ − 549
+ − 550
\begin{center}
+ − 551
\pcode{run(load_bff("sierpinski.bf")) == run4(load_bff("sierpinski.bf"))}
+ − 552
\end{center}
+ − 553
+ − 554
which should return true for all the different compiler stages. \\
+ − 555
\mbox{}\hfill{[2 Marks]}
233
+ − 556
\end{itemize}
6
+ − 557
+ − 558
\end{document}
+ − 559
68
+ − 560
6
+ − 561
%%% Local Variables:
+ − 562
%%% mode: latex
+ − 563
%%% TeX-master: t
+ − 564
%%% End: