|
1 % !TEX program = xelatex |
|
2 \documentclass{article} |
|
3 \usepackage{../style} |
|
4 \usepackage{../langs} |
|
5 \usepackage{disclaimer} |
|
6 \usepackage{tikz} |
|
7 \usepackage{pgf} |
|
8 \usepackage{pgfplots} |
|
9 \usepackage{stackengine} |
|
10 %% \usepackage{accents} |
|
11 \newcommand\barbelow[1]{\stackunder[1.2pt]{#1}{\raisebox{-4mm}{\boldmath$\uparrow$}}} |
|
12 |
|
13 %% change Console to scala.io.StdIn.readByte() |
|
14 |
|
15 |
|
16 \begin{document} |
|
17 |
|
18 \section*{Part 10 (Scala)} |
|
19 |
|
20 \mbox{}\hfill\textit{``If there's one feature that makes Scala, `Scala',}\\ |
|
21 \mbox{}\hfill\textit{ I would pick implicits.''}\smallskip\\ |
|
22 \mbox{}\hfill\textit{ --- Martin Odersky (creator of the Scala language)}\bigskip\bigskip |
|
23 |
|
24 |
|
25 \noindent |
|
26 This part is about a small (esoteric) programming language called |
|
27 brainf***. Actually, we will implement an interpreter for our own version |
|
28 of this language called brainf*ck++.\bigskip |
|
29 |
|
30 \IMPORTANT{This part is worth 10\% and you need to submit it on \cwTEN{} at 4pm.} |
|
31 |
|
32 \noindent |
|
33 Also note that the running time of each part will be restricted to a |
|
34 maximum of 30 seconds on my laptop. |
|
35 |
|
36 \DISCLAIMER{} |
|
37 \newpage |
|
38 |
|
39 \subsection*{Reference Implementation} |
|
40 |
|
41 As usual, this Scala assignment comes with a reference implementation in |
|
42 form of two \texttt{jar}-files. You can download them from KEATS. They |
|
43 allow you to run any test cases on your own computer. For example you |
|
44 can call Scala on the command line with the option \texttt{-cp bf.jar} |
|
45 and then query any function from the \texttt{bf.scala} template file. |
|
46 You have to prefix the calls with \texttt{CW10a} and \texttt{CW10b}, |
|
47 respectively. For example |
|
48 |
|
49 |
|
50 \begin{lstlisting}[language={},xleftmargin=1mm,numbers=none,basicstyle=\ttfamily\small] |
|
51 $ scala -cp bf.jar |
|
52 scala> import CW10a._ |
|
53 scala> run(load_bff("sierpinski.bf")) ; () |
|
54 * |
|
55 * * |
|
56 * * |
|
57 * * * * |
|
58 * * |
|
59 * * * * |
|
60 * * * * |
|
61 * * * * * * * * |
|
62 * * |
|
63 * * * * |
|
64 * * * * |
|
65 * * * * * * * * |
|
66 * * * * |
|
67 * * * * * * * * |
|
68 * * * * * * * * |
|
69 * * * * * * * * * * * * * * * * |
|
70 * * |
|
71 * * * * |
|
72 * * * * |
|
73 * * * * * * * * |
|
74 * * * * |
|
75 * * * * * * * * |
|
76 * * * * * * * * |
|
77 * * * * * * * * * * * * * * * * |
|
78 * * * * |
|
79 * * * * * * * * |
|
80 * * * * * * * * |
|
81 * * * * * * * * * * * * * * * * |
|
82 * * * * * * * * |
|
83 * * * * * * * * * * * * * * * * |
|
84 * * * * * * * * * * * * * * * * |
|
85 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * |
|
86 \end{lstlisting}%$ |
|
87 |
|
88 \newpage |
|
89 |
|
90 \subsection*{Part A (6 Marks)} |
|
91 |
|
92 Coming from Java or C++, you might think Scala is a rather esoteric |
|
93 programming language. But remember, some serious companies have built |
|
94 their business on |
|
95 Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}} |
|
96 I claim functional programming is not a fad. And there are far, far |
|
97 more esoteric languages out there. One is called \emph{brainf***}. |
|
98 \here{https://esolangs.org/wiki/Brainfuck} |
|
99 You |
|
100 are asked in this part to implement an interpreter for |
|
101 a slight extension of this language. |
|
102 |
|
103 Urban M\"uller developed the original version of brainf*** in 1993. A close |
|
104 relative of this language was already introduced in 1964 by Corado |
|
105 B\"ohm, an Italian computer pioneer. The main feature of brainf*** is |
|
106 its minimalistic set of instructions---just 8 instructions in total |
|
107 and all of which are single characters. Despite the minimalism, this |
|
108 language has been shown to be Turing complete\ldots{}if this doesn't |
|
109 ring any bell with you: it roughly means that every(!) algorithm can, |
|
110 in principle, be implemented in brainf***. It just takes a lot of |
|
111 determination and quite a lot of memory resources. |
|
112 |
|
113 Some relatively sophisticated sample programs in brainf*** are given |
|
114 in the file \texttt{bf.scala}, including a brainf*** program for the |
|
115 Sierpinski triangle and the Mandelbrot set. There seems to be even a |
|
116 dedicated Windows IDE for bf programs, though I am not sure whether |
|
117 this is just an elaborate April fools' joke---judge yourself: |
|
118 |
|
119 \begin{center} |
|
120 \url{https://www.microsoft.com/en-us/p/brainf-ck/9nblgggzhvq5} |
|
121 \end{center} \bigskip |
|
122 |
|
123 |
|
124 \noindent |
|
125 As mentioned above, the original brainf*** has 8 single-character |
|
126 commands. Our version of bf++ will contain the commands \texttt{'>'}, |
|
127 \texttt{'<'}, \texttt{'+'}, \texttt{'-'}, \texttt{'.'}, \texttt{'['} |
|
128 and \texttt{']'} from the original, and in addition the commands |
|
129 \texttt{'@'}, \texttt{'*'} and \texttt{'\#'}. Every other character |
|
130 is considered a comment. |
|
131 |
|
132 Our interpreter for bf++ operates on memory cells containing |
|
133 integers. For this it uses a single memory pointer, called |
|
134 \texttt{mp}, that points at each stage to one memory cell. |
|
135 |
|
136 \begin{center} |
|
137 \begin{tikzpicture} |
|
138 \draw [line width=1mm, rounded corners] (0,0) rectangle (5, 0.5); |
|
139 \draw (0.5, 0) -- (0.5, 0.5); |
|
140 \draw (1.0, 0) -- (1.0, 0.5); |
|
141 |
|
142 \draw (2.5, 0) -- (2.5, 0.5); |
|
143 \draw (2.0, 0) -- (2.0, 0.5); |
|
144 |
|
145 \draw (4.5, 0) -- (4.5, 0.5); |
|
146 \draw (4.0, 0) -- (4.0, 0.5); |
|
147 |
|
148 \draw (1.5,0.25) node {$\cdots$}; |
|
149 \draw (3.0,0.25) node {$\cdots$}; |
|
150 |
|
151 \draw [->, thick] (2.25, -0.5) -- (2.25, -0.15); |
|
152 \draw (2.25,-0.8) node {\texttt{mp}}; |
|
153 |
|
154 \draw (0.7,0.7) node {\sf\footnotesize memory}; |
|
155 \end{tikzpicture} |
|
156 \end{center} |
|
157 |
|
158 \noindent |
|
159 This pointer can be moved forward by one memory cell by using the |
|
160 command \texttt{'>'}, and backward by using \texttt{'<'}. The commands |
|
161 \texttt{'+'} and \texttt{'-'} increase, respectively decrease, by 1 |
|
162 the content of the memory cell to which the memory pointer currently |
|
163 points to. The command for output in bf++ is \texttt{'.'} whereby output works |
|
164 by reading the content of the memory cell to which the memory pointer |
|
165 points to and printing it out as an ASCII character.\footnote{In the |
|
166 original version of bf, there is also a command for input, but we |
|
167 omit it here. All our programs will be ``autonomous''.} The |
|
168 commands \texttt{'['} and \texttt{']'} are looping |
|
169 constructs. Everything in between \texttt{'['} and \texttt{']'} is |
|
170 repeated until a counter (memory cell) reaches zero. A typical |
|
171 program in brainf*** looks as follows: |
|
172 |
|
173 \begin{center} |
|
174 \begin{verbatim} |
|
175 ++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.++ |
|
176 +++++..+++.>>.<-.<.+++.------.--------.>>+.>++. |
|
177 \end{verbatim} |
|
178 \end{center} |
|
179 |
|
180 \noindent |
|
181 This one prints out Hello World\ldots{}obviously \texttt{;o)} We also |
|
182 add 3 new commands in the bf++-version of the bf-language. The purpose |
|
183 of these commands we explain later. |
|
184 |
|
185 |
|
186 \subsubsection*{Tasks (file bf.scala)} |
|
187 |
|
188 \begin{itemize} |
|
189 \item[(1)] Write a function that takes a filename (a string) as an argument |
|
190 and requests the corresponding file from disk. It returns the |
|
191 content of the file as a string. If the file does not exists, |
|
192 the function should return the empty string. |
|
193 \mbox{}\hfill[1 Mark] |
|
194 |
|
195 \item[(2)] Brainf**k++ memory is represented by a \texttt{Map} from |
|
196 integers to integers. The empty memory is represented by |
|
197 \texttt{Map()}, that is nothing is stored in the |
|
198 memory; \texttt{Map(0 -> 1, 2 -> 3)} stores \texttt{1} at |
|
199 memory location \texttt{0}, and at \texttt{2} it stores \texttt{3}. The |
|
200 convention is that if we query the memory at a location that is |
|
201 \emph{not} defined in the \texttt{Map}, we return \texttt{0}. Write |
|
202 a `safe-read' function, \texttt{sread}, that takes a memory (a \texttt{Map}) and |
|
203 a memory pointer (an \texttt{Int}) as arguments, and `safely' reads the |
|
204 corresponding memory location. If the \texttt{Map} is not defined at |
|
205 the memory pointer, \texttt{sread} returns \texttt{0}. |
|
206 |
|
207 Write another function \texttt{write}, which takes a memory, a |
|
208 memory pointer and an integer value as arguments and updates the |
|
209 \texttt{Map} with the value at the given memory location. As usual, |
|
210 the \texttt{Map} is not updated `in-place' but a new map is created |
|
211 with the same data, except the new value is stored at the given memory |
|
212 pointer.\hfill[1 Mark] |
|
213 |
|
214 \item[(3)] Write two functions, \texttt{jumpRight} and |
|
215 \texttt{jumpLeft}, that are needed to implement the loop constructs |
|
216 in brainf**k++. They take a program (a \texttt{String}) and a program |
|
217 counter (an \texttt{Int}) as arguments and move right (respectively |
|
218 left) in the string in order to find the \textbf{matching} |
|
219 opening/closing bracket. For example, given the following program |
|
220 with the program counter indicated by an arrow: |
|
221 |
|
222 \begin{center} |
|
223 \texttt{--[\barbelow{.}.+>--].>.++} |
|
224 \end{center} |
|
225 |
|
226 then the matching closing bracket is in 9th position (counting from 0) and |
|
227 \texttt{jumpRight} is supposed to return the position just after this |
|
228 |
|
229 \begin{center} |
|
230 \texttt{--[..+>--]\barbelow{.}>.++} |
|
231 \end{center} |
|
232 |
|
233 meaning it jumps to after the loop. Similarly, if you are in 8th position, |
|
234 then \texttt{jumpLeft} is supposed to jump to just after the opening |
|
235 bracket (that is jumping to the beginning of the loop): |
|
236 |
|
237 \begin{center} |
|
238 \texttt{--[..+>-\barbelow{-}].>.++} |
|
239 \qquad$\stackrel{\texttt{jumpLeft}}{\longrightarrow}$\qquad |
|
240 \texttt{--[\barbelow{.}.+>--].>.++} |
|
241 \end{center} |
|
242 |
|
243 Unfortunately we have to take into account that there might be |
|
244 other opening and closing brackets on the `way' to find the |
|
245 matching bracket. For example in the brain*ck++ program |
|
246 |
|
247 \begin{center} |
|
248 \texttt{--[\barbelow{.}.[+>]--].>.++} |
|
249 \end{center} |
|
250 |
|
251 we do not want to return the index for the \texttt{'-'} in the 9th |
|
252 position, but the program counter for \texttt{'.'} in 12th |
|
253 position. The easiest to find out whether a bracket is matched is by |
|
254 using levels (which are the third argument in \texttt{jumpLeft} and |
|
255 \texttt{jumpLeft}). In case of \texttt{jumpRight} you increase the |
|
256 level by one whenever you find an opening bracket and decrease by |
|
257 one for a closing bracket. Then in \texttt{jumpRight} you are looking |
|
258 for the closing bracket on level \texttt{0}. For \texttt{jumpLeft} you |
|
259 do the opposite. In this way you can find \textbf{matching} brackets |
|
260 in strings such as |
|
261 |
|
262 \begin{center} |
|
263 \texttt{--[\barbelow{.}.[[-]+>[.]]--].>.++} |
|
264 \end{center} |
|
265 |
|
266 for which \texttt{jumpRight} should produce the position: |
|
267 |
|
268 \begin{center} |
|
269 \texttt{--[..[[-]+>[.]]--]\barbelow{.}>.++} |
|
270 \end{center} |
|
271 |
|
272 It is also possible that the position returned by \texttt{jumpRight} or |
|
273 \texttt{jumpLeft} is outside the string in cases where there are |
|
274 no matching brackets. For example |
|
275 |
|
276 \begin{center} |
|
277 \texttt{--[\barbelow{.}.[[-]+>[.]]--.>.++} |
|
278 \qquad$\stackrel{\texttt{jumpRight}}{\longrightarrow}$\qquad |
|
279 \texttt{--[..[[-]+>[.]]-->.++\barbelow{\;\phantom{+}}} |
|
280 \end{center} |
|
281 \hfill[2 Marks] |
|
282 |
|
283 |
|
284 \item[(4)] Write a recursive function \texttt{compute} that runs a |
|
285 brain*u*k++ program. It takes a program, a program counter, a memory |
|
286 pointer and a memory as arguments. If the program counter is outside |
|
287 the program string, the execution stops and \texttt{compute} returns the |
|
288 memory. If the program counter is inside the string, it reads the |
|
289 corresponding character and updates the program counter \texttt{pc}, |
|
290 memory pointer \texttt{mp} and memory \texttt{mem} according to the |
|
291 rules shown in Figure~\ref{comms}. It then calls recursively |
|
292 \texttt{compute} with the updated data. The most convenient way to |
|
293 implement the brainf**k++ rules in Scala is to use pattern-matching |
|
294 and to calculate a triple consisting of the updated \texttt{pc}, |
|
295 \texttt{mp} and \texttt{mem}. |
|
296 |
|
297 Write another function \texttt{run} that calls \texttt{compute} with a |
|
298 given brainfu*k++ program and memory, and the program counter and memory pointer |
|
299 set to~$0$. Like \texttt{compute}, it returns the memory after the execution |
|
300 of the program finishes. You can test your brainf**k++ interpreter with the |
|
301 Sierpinski triangle or the Hello world programs (they seem to be particularly |
|
302 useful for debugging purposes), or have a look at |
|
303 |
|
304 \begin{center} |
|
305 \url{https://esolangs.org/wiki/Brainfuck} |
|
306 \end{center} |
|
307 |
|
308 \noindent for more bf/bf++-programs and the test cases given in \texttt{bf.scala}.\\ |
|
309 \mbox{}\hfill[2 Marks] |
|
310 |
|
311 \begin{figure}[p] |
|
312 \begin{center} |
|
313 \begin{tabular}{|@{\hspace{0.5mm}}p{0.8cm}|l|} |
|
314 \hline |
|
315 \hfill\texttt{'>'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
316 $\bullet$ & $\texttt{pc} + 1$\\ |
|
317 $\bullet$ & $\texttt{mp} + 1$\\ |
|
318 $\bullet$ & \texttt{mem} unchanged |
|
319 \end{tabular}\\\hline |
|
320 \hfill\texttt{'<'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
321 $\bullet$ & $\texttt{pc} + 1$\\ |
|
322 $\bullet$ & $\texttt{mp} - 1$\\ |
|
323 $\bullet$ & \texttt{mem} unchanged |
|
324 \end{tabular}\\\hline |
|
325 \hfill\texttt{'+'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
326 $\bullet$ & $\texttt{pc} + 1$\\ |
|
327 $\bullet$ & $\texttt{mp}$ unchanged\\ |
|
328 $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) + 1}\\ |
|
329 \end{tabular}\\\hline |
|
330 \hfill\texttt{'-'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
331 $\bullet$ & $\texttt{pc} + 1$\\ |
|
332 $\bullet$ & $\texttt{mp}$ unchanged\\ |
|
333 $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) - 1}\\ |
|
334 \end{tabular}\\\hline |
|
335 \hfill\texttt{'.'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
336 $\bullet$ & $\texttt{pc} + 1$\\ |
|
337 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\ |
|
338 $\bullet$ & print out \,\texttt{mem(mp)} as a character\\ |
|
339 \end{tabular}\\\hline |
|
340 %\hfill\texttt{','} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
341 % $\bullet$ & $\texttt{pc} + 1$\\ |
|
342 % $\bullet$ & $\texttt{mp}$ unchanged\\ |
|
343 % $\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\ |
|
344 % \multicolumn{2}{@{}l}{the input is given by \texttt{Console.in.read().toByte}} |
|
345 % \end{tabular}\\\hline |
|
346 \hfill\texttt{'['} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
347 \multicolumn{2}{@{}l}{if \texttt{mem(mp) == 0} then}\\ |
|
348 $\bullet$ & $\texttt{pc = jumpRight(prog, pc + 1, 0)}$\\ |
|
349 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\ |
|
350 \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) != 0} then}\\ |
|
351 $\bullet$ & $\texttt{pc} + 1$\\ |
|
352 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\ |
|
353 \end{tabular} |
|
354 \\\hline |
|
355 \hfill\texttt{']'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
356 \multicolumn{2}{@{}l}{if \texttt{mem(mp) != 0} then}\\ |
|
357 $\bullet$ & $\texttt{pc = jumpLeft(prog, pc - 1, 0)}$\\ |
|
358 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\ |
|
359 \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) == 0} then}\\ |
|
360 $\bullet$ & $\texttt{pc} + 1$\\ |
|
361 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\ |
|
362 \end{tabular}\\\hline |
|
363 \hfill\texttt{'*'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
364 $\bullet$ & $\texttt{pc} + 1$\\ |
|
365 $\bullet$ & $\texttt{mp}$ unchanged\\ |
|
366 $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) * mem(mp - 1)}\\ |
|
367 \multicolumn{2}{@{}l}{this multiplies the content of the memory cells at |
|
368 \texttt{mp} and \texttt{mp - 1}}\\ |
|
369 \multicolumn{2}{@{}l}{and stores the result at \texttt{mp}} |
|
370 \end{tabular}\\\hline |
|
371 \hfill\texttt{'@'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
372 $\bullet$ & $\texttt{pc} + 1$\\ |
|
373 $\bullet$ & $\texttt{mp}$ unchanged\\ |
|
374 $\bullet$ & \texttt{mem} updated with |
|
375 \texttt{mem(mp) -> mem(mp - 1)}\\ |
|
376 \multicolumn{2}{@{}l}{this updates the memory cell having the index stored at \texttt{mem(mp)},}\\ |
|
377 \multicolumn{2}{@{}l}{with the value stored at \texttt{mem(mp - 1)},} |
|
378 \end{tabular}\\\hline |
|
379 \hfill\texttt{'\#'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
380 $\bullet$ & $\texttt{pc} + 1$\\ |
|
381 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\ |
|
382 $\bullet$ & print out \,\texttt{mem(mp)} as a number\\ |
|
383 \end{tabular}\\\hline |
|
384 any other char & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
385 $\bullet$ & $\texttt{pc} + 1$\\ |
|
386 $\bullet$ & \texttt{mp} and \texttt{mem} unchanged |
|
387 \end{tabular}\\ |
|
388 \hline |
|
389 \end{tabular} |
|
390 \\\mbox{}\\[-10mm]\mbox{} |
|
391 \end{center} |
|
392 \caption{The rules for how commands in the brainf***++ language update the |
|
393 program counter \texttt{pc}, |
|
394 the memory pointer \texttt{mp} and the memory \texttt{mem}.\label{comms}} |
|
395 \end{figure} |
|
396 \end{itemize}\bigskip |
|
397 |
|
398 %%\newpage |
|
399 |
|
400 \subsection*{Part B (4 Marks)} |
|
401 |
|
402 I am sure you agree while it is fun to marvel at bf++-programs, like the |
|
403 Sierpinski triangle or the Mandelbrot program, being interpreted, it |
|
404 is much more fun to write a compiler for the bf++-language. |
|
405 |
|
406 |
|
407 \subsubsection*{Tasks (file bfc.scala)} |
|
408 |
|
409 \begin{itemize} |
|
410 \item[(5)] Compilers, in general, attempt to make programs run |
|
411 faster by precomputing as much information as possible |
|
412 before running the program. In our case we can precompute the |
|
413 addresses where we need to jump at the beginning and end of |
|
414 loops. |
|
415 |
|
416 For this write a function \texttt{jtable} that precomputes the ``jump |
|
417 table'' for a bf++-program. This function takes a bf++-program |
|
418 as an argument and returns a \texttt{Map[Int, Int]}. The |
|
419 purpose of this Map is to record the information, in cases |
|
420 a pc-position points to a '\texttt{[}' or a '\texttt{]}', |
|
421 to which pc-position do we need to jump next? |
|
422 |
|
423 For example for the program |
|
424 |
|
425 \begin{center} |
|
426 \texttt{+++++[->++++++++++<]>--<+++[->>++++++++++} |
|
427 \texttt{<<]>>++<<----------[+>.>.<+<]} |
|
428 \end{center} |
|
429 |
|
430 we obtain the Map (note the precise numbers might differ depending on white |
|
431 spaces etc.~in the bf-program): |
|
432 |
|
433 \begin{center} |
|
434 \texttt{Map(69 -> 61, 5 -> 20, 60 -> 70, 27 -> 44, 43 -> 28, 19 -> 6)} |
|
435 \end{center} |
|
436 |
|
437 This Map states that for the '\texttt{[}' on position 5, we need to |
|
438 jump to position 20, which is just after the corresponding '\texttt{]}'. |
|
439 Similarly, for the '\texttt{]}' on position 19, we need to jump to |
|
440 position 6, which is just after the '\texttt{[}' on position 5, and so |
|
441 on. The idea is to not calculate this information each time |
|
442 we hit a bracket, but just look up this information in the |
|
443 \texttt{jtable}. |
|
444 |
|
445 Then adapt the \texttt{compute} and \texttt{run} functions |
|
446 from Part 1 in order to take advantage of the information |
|
447 stored in the \texttt{jtable}. This means whenever \texttt{jumpLeft} |
|
448 and \texttt{jumpRight} was called previously, you should look |
|
449 up the jump address in the \texttt{jtable}. Feel free to reuse |
|
450 the function \texttt{jumpLeft} and \texttt{jumpRight} for |
|
451 calculating the \texttt{jtable}.\hfill{[1 Mark]} |
|
452 |
|
453 \item[(6)] Compilers try to eliminate any ``dead'' code that could |
|
454 slow down programs and also perform what is often called |
|
455 \emph{peephole |
|
456 optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}} |
|
457 For the latter consider that it is difficult for compilers to |
|
458 comprehend what is intended with whole programs, but they are very good |
|
459 at finding out what small snippets of code do, and then try to |
|
460 generate faster code for such snippets. |
|
461 |
|
462 In our case, dead code is everything that is not a bf++-command. |
|
463 Therefore write a function \texttt{optimise} which deletes such |
|
464 dead code from a bf++-program. Moreover this function should replace every substring |
|
465 of the form \pcode{[-]} by a new command \texttt{0}. |
|
466 The idea is that the loop \pcode{[-]} just resets the |
|
467 memory at the current location to 0. It is more efficient |
|
468 to do this in a single step, rather than stepwise in a loop as in |
|
469 the original bf++-programs. |
|
470 |
|
471 In the extended \texttt{compute3} and \texttt{run3} functions you should |
|
472 implement this command by writing 0 to \pcode{mem(mp)}, that is use |
|
473 \pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}. |
|
474 The easiest way to modify a string in this way is to use the regular |
|
475 expression \pcode{"""[^<>+-.\\[\\]@\#*]"""}, which recognises everything that is |
|
476 not a bf++-command. Similarly, the |
|
477 regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}. By using the Scala method \pcode{.replaceAll} you can replace substrings |
|
478 with new strings.\\ |
|
479 \mbox{}\hfill{[1 Mark]} |
|
480 |
|
481 \item[(7)] Finally, real compilers try to take advantage of modern |
|
482 CPUs which often provide complex operations in hardware that can |
|
483 combine many smaller instructions into a single faster instruction. |
|
484 |
|
485 In our case we can optimise the several single increments performed at a |
|
486 memory cell, for example \pcode{++++}, by a single ``increment by |
|
487 4''. For this optimisation we just have to make sure these single |
|
488 increments are all next to each other. Similar optimisations should apply |
|
489 for the bf-commands \pcode{-}, \pcode{<} and |
|
490 \pcode{>}, which can all be replaced by extended versions that take |
|
491 the amount of the increment (decrement) into account. We will do |
|
492 this by introducing two-character bf++-commands. For example |
|
493 |
|
494 \begin{center} |
|
495 \begin{tabular}{l|l} |
|
496 original bf-cmds & replacement\\ |
|
497 \hline |
|
498 \pcode{+} & \pcode{+A}\\ |
|
499 \pcode{++} & \pcode{+B}\\ |
|
500 \pcode{+++} & \pcode{+C}\\ |
|
501 \ldots{} & \ldots{}\\ |
|
502 \pcode{+++....++} & \pcode{+Z}\\ |
|
503 \hspace{5mm}(these are 26 \pcode{+}'s)\\ |
|
504 \end{tabular} |
|
505 \end{center} |
|
506 |
|
507 |
|
508 If there are more |
|
509 than 26 \pcode{+}'s in a row, then more than one ``two-character'' |
|
510 bf-commands need to be generated (the idea is that more than |
|
511 26 copies of a single bf++-command in a row is a rare occurrence in |
|
512 actual bf++-programs). Similar replacements apply |
|
513 for \pcode{-}, \pcode{<} and \pcode{>}, but |
|
514 all other bf++-commands should be unaffected by this |
|
515 change. |
|
516 |
|
517 For this write a function \texttt{combine} which replaces sequences |
|
518 of repeated increment and decrement commands by appropriate |
|
519 two-character commands. In the functions \pcode{compute4} and |
|
520 \pcode{run4}, the ``combine'' and the optimisation from (6) should |
|
521 be performed. Make sure that when a two-character bf++-command is |
|
522 encountered you need to increase the \pcode{pc}-counter by two in |
|
523 order to progress to the next command. For example |
|
524 |
|
525 \begin{center} |
|
526 \pcode{combine(optimise(load_bff("benchmark.bf")))} |
|
527 \end{center} |
|
528 |
|
529 generates the improved program |
|
530 |
|
531 \begin{center} |
|
532 \pcode{>A+B[<A+M>A-A]<A[[}\hspace{3mm}\ldots{} |
|
533 \end{center} |
|
534 |
|
535 for the original benchmark program |
|
536 |
|
537 \begin{center} |
|
538 \pcode{>++[<+++++++++++++>-]<[[}\hspace{3mm}\ldots |
|
539 \end{center} |
|
540 |
|
541 As you can see, the compiler bets on saving a lot of time on the |
|
542 \pcode{+B} and \pcode{+M} steps so that the optimisations is |
|
543 worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a |
|
544 penalty). Luckily, after you have performed all |
|
545 optimisations in (5) - (7), you can expect that the |
|
546 \pcode{benchmark.bf} program runs four to five times faster. |
|
547 You can also test whether your compiler produces the correct result |
|
548 by testing for example |
|
549 |
|
550 \begin{center} |
|
551 \pcode{run(load_bff("sierpinski.bf")) == run4(load_bff("sierpinski.bf"))} |
|
552 \end{center} |
|
553 |
|
554 which should return true for all the different compiler stages. \\ |
|
555 \mbox{}\hfill{[2 Marks]} |
|
556 \end{itemize} |
|
557 |
|
558 \end{document} |
|
559 |
|
560 |
|
561 %%% Local Variables: |
|
562 %%% mode: latex |
|
563 %%% TeX-master: t |
|
564 %%% End: |