128 and \texttt{']'} from the original, and in addition the commands |
128 and \texttt{']'} from the original, and in addition the commands |
129 \texttt{'@'}, \texttt{'*'} and \texttt{'\#'}. Every other character |
129 \texttt{'@'}, \texttt{'*'} and \texttt{'\#'}. Every other character |
130 is considered a comment. |
130 is considered a comment. |
131 |
131 |
132 Our interpreter for bf++ operates on memory cells containing |
132 Our interpreter for bf++ operates on memory cells containing |
133 integers. For this it uses a single memory pointer that points at each |
133 integers. For this it uses a single memory pointer, called |
134 stage to one memory cell. This pointer can be moved forward by one |
134 \texttt{mp}, that points at each stage to one memory cell. |
135 memory cell by using the command \texttt{'>'}, and backward by using |
135 |
136 \texttt{'<'}. The commands \texttt{'+'} and \texttt{'-'} increase, |
136 \begin{center} |
137 respectively decrease, by 1 the content of the memory cell to which |
137 \begin{tikzpicture} |
138 the memory pointer currently points to. The command for output is |
138 \draw [line width=1mm, rounded corners] (0,0) rectangle (5, 0.5); |
139 \texttt{'.'} whereby output works by reading the content of the memory |
139 \draw (0.5, 0) -- (0.5, 0.5); |
140 cell to which the memory pointer points to and printing it out as an |
140 \draw (1.0, 0) -- (1.0, 0.5); |
141 ASCII character.\footnote{In the originial version of bf, there is also a |
141 |
142 command for input, but we omit it here. All our programs will be |
142 \draw (2.5, 0) -- (2.5, 0.5); |
143 ``autonomous''.} The |
143 \draw (2.0, 0) -- (2.0, 0.5); |
|
144 |
|
145 \draw (4.5, 0) -- (4.5, 0.5); |
|
146 \draw (4.0, 0) -- (4.0, 0.5); |
|
147 |
|
148 \draw (1.5,0.25) node {$\cdots$}; |
|
149 \draw (3.0,0.25) node {$\cdots$}; |
|
150 |
|
151 \draw [->, thick] (2.25, -0.5) -- (2.25, -0.15); |
|
152 \draw (2.25,-0.8) node {\texttt{mp}}; |
|
153 |
|
154 \draw (0.7,0.7) node {\sf\footnotesize memory}; |
|
155 \end{tikzpicture} |
|
156 \end{center} |
|
157 |
|
158 \noindent |
|
159 This pointer can be moved forward by one memory cell by using the |
|
160 command \texttt{'>'}, and backward by using \texttt{'<'}. The commands |
|
161 \texttt{'+'} and \texttt{'-'} increase, respectively decrease, by 1 |
|
162 the content of the memory cell to which the memory pointer currently |
|
163 points to. The command for output in bf++ is \texttt{'.'} whereby output works |
|
164 by reading the content of the memory cell to which the memory pointer |
|
165 points to and printing it out as an ASCII character.\footnote{In the |
|
166 original version of bf, there is also a command for input, but we |
|
167 omit it here. All our programs will be ``autonomous''.} The |
144 commands \texttt{'['} and \texttt{']'} are looping |
168 commands \texttt{'['} and \texttt{']'} are looping |
145 constructs. Everything in between \texttt{'['} and \texttt{']'} is |
169 constructs. Everything in between \texttt{'['} and \texttt{']'} is |
146 repeated until a counter (memory cell) reaches zero. A typical |
170 repeated until a counter (memory cell) reaches zero. A typical |
147 program in brainf*** looks as follows: |
171 program in brainf*** looks as follows: |
148 |
172 |
153 \end{verbatim} |
177 \end{verbatim} |
154 \end{center} |
178 \end{center} |
155 |
179 |
156 \noindent |
180 \noindent |
157 This one prints out Hello World\ldots{}obviously \texttt{;o)} We also |
181 This one prints out Hello World\ldots{}obviously \texttt{;o)} We also |
158 add 3 new commands to the bf++ langauge, whose purpose we explain later. |
182 add 3 new commands in the bf++-version of the bf-language. The purpose |
|
183 of these commands we explain later. |
159 |
184 |
160 |
185 |
161 \subsubsection*{Tasks (file bf.scala)} |
186 \subsubsection*{Tasks (file bf.scala)} |
162 |
187 |
163 \begin{itemize} |
188 \begin{itemize} |
164 \item[(1)] Write a function that takes a filename (a string) as an argument |
189 \item[(1)] Write a function that takes a filename (a string) as an argument |
165 and requests the corresponding file from disk. It returns the |
190 and requests the corresponding file from disk. It returns the |
166 content of the file as a string. If the file does not exists, |
191 content of the file as a string. If the file does not exists, |
167 the function should return the empty string.\\ |
192 the function should return the empty string. |
168 \mbox{}\hfill[1 Mark] |
193 \mbox{}\hfill[1 Mark] |
169 |
194 |
170 \item[(2)] Brainf**k++ memory is represented by a \texttt{Map} from |
195 \item[(2)] Brainf**k++ memory is represented by a \texttt{Map} from |
171 integers to integers. The empty memory is represented by |
196 integers to integers. The empty memory is represented by |
172 \texttt{Map()}, that is nothing is stored in the |
197 \texttt{Map()}, that is nothing is stored in the |
330 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\ |
358 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\ |
331 \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) == 0} then}\\ |
359 \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) == 0} then}\\ |
332 $\bullet$ & $\texttt{pc} + 1$\\ |
360 $\bullet$ & $\texttt{pc} + 1$\\ |
333 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\ |
361 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\ |
334 \end{tabular}\\\hline |
362 \end{tabular}\\\hline |
|
363 \hfill\texttt{'*'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
|
364 $\bullet$ & $\texttt{pc} + 1$\\ |
|
365 $\bullet$ & $\texttt{mp}$ unchanged\\ |
|
366 $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) * mem(mp - 1)}\\ |
|
367 \multicolumn{2}{@{}l}{this multiplies the content of the memory cells at |
|
368 \texttt{mp} and \texttt{mp - 1}}\\ |
|
369 \multicolumn{2}{@{}l}{and stores the result at \texttt{mp}} |
|
370 \end{tabular}\\\hline |
335 \hfill\texttt{'@'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
371 \hfill\texttt{'@'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
336 $\bullet$ & $\texttt{pc} + 1$\\ |
372 $\bullet$ & $\texttt{pc} + 1$\\ |
337 $\bullet$ & $\texttt{mp}$ unchanged\\ |
373 $\bullet$ & $\texttt{mp}$ unchanged\\ |
338 $\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\ |
374 $\bullet$ & \texttt{mem} updated with |
339 \multicolumn{2}{@{}l}{the input is given by \texttt{Console.in.read().toByte}} |
375 \texttt{mem(mp) -> mem(mp - 1)}\\ |
340 \end{tabular}\\\hline |
376 \multicolumn{2}{@{}l}{this updates the memory cell having the index stored at \texttt{mem(mp)},}\\ |
341 \hfill\texttt{'*'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
377 \multicolumn{2}{@{}l}{with the value stored at \texttt{mem(mp - 1)},} |
342 $\bullet$ & $\texttt{pc} + 1$\\ |
|
343 $\bullet$ & $\texttt{mp}$ unchanged\\ |
|
344 $\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\ |
|
345 \multicolumn{2}{@{}l}{the input is given by \texttt{Console.in.read().toByte}} |
|
346 \end{tabular}\\\hline |
378 \end{tabular}\\\hline |
347 \hfill\texttt{'\#'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
379 \hfill\texttt{'\#'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
348 $\bullet$ & $\texttt{pc} + 1$\\ |
380 $\bullet$ & $\texttt{pc} + 1$\\ |
349 $\bullet$ & $\texttt{mp}$ unchanged\\ |
381 $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\ |
350 $\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\ |
382 $\bullet$ & print out \,\texttt{mem(mp)} as a number\\ |
351 \multicolumn{2}{@{}l}{the input is given by \texttt{Console.in.read().toByte}} |
|
352 \end{tabular}\\\hline |
383 \end{tabular}\\\hline |
353 any other char & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
384 any other char & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}} |
354 $\bullet$ & $\texttt{pc} + 1$\\ |
385 $\bullet$ & $\texttt{pc} + 1$\\ |
355 $\bullet$ & \texttt{mp} and \texttt{mem} unchanged |
386 $\bullet$ & \texttt{mp} and \texttt{mem} unchanged |
356 \end{tabular}\\ |
387 \end{tabular}\\ |
357 \hline |
388 \hline |
358 \end{tabular} |
389 \end{tabular} |
359 \end{center} |
390 \\\mbox{}\\[-10mm]\mbox{} |
360 \caption{The rules for how commands in the brainf*** language update the program counter \texttt{pc}, |
391 \end{center} |
|
392 \caption{The rules for how commands in the brainf***++ language update the |
|
393 program counter \texttt{pc}, |
361 the memory pointer \texttt{mp} and the memory \texttt{mem}.\label{comms}} |
394 the memory pointer \texttt{mp} and the memory \texttt{mem}.\label{comms}} |
362 \end{figure} |
395 \end{figure} |
363 \end{itemize}\bigskip |
396 \end{itemize}\bigskip |
364 |
397 |
365 %%\newpage |
398 %%\newpage |
366 |
399 |
367 \subsection*{Part B (4 Marks)} |
400 \subsection*{Part B (4 Marks)} |
368 |
401 |
369 I am sure you agree while it is fun to look at bf-programs, like the |
402 I am sure you agree while it is fun to marvel at bf++-programs, like the |
370 Sierpinski triangle or the Mandelbrot program, being interpreted, it |
403 Sierpinski triangle or the Mandelbrot program, being interpreted, it |
371 is much more fun to write a compiler for the bf-language. |
404 is much more fun to write a compiler for the bf++-language. |
372 |
405 |
373 |
406 |
374 \subsubsection*{Tasks (file bfc.scala)} |
407 \subsubsection*{Tasks (file bfc.scala)} |
375 |
408 |
376 \begin{itemize} |
409 \begin{itemize} |
377 \item[(5)] Compilers in general attempt to make programs run |
410 \item[(5)] Compilers, in general, attempt to make programs run |
378 faster by precomputing as much information as possible |
411 faster by precomputing as much information as possible |
379 before running the program. In our case we can precompute the |
412 before running the program. In our case we can precompute the |
380 addresses where we need to jump at the beginning and end of |
413 addresses where we need to jump at the beginning and end of |
381 loops. |
414 loops. |
382 |
415 |
383 For this write a function \texttt{jtable} that precomputes the ``jump |
416 For this write a function \texttt{jtable} that precomputes the ``jump |
384 table'' for a bf-program. This function takes a bf-program |
417 table'' for a bf++-program. This function takes a bf++-program |
385 as an argument and returns a \texttt{Map[Int, Int]}. The |
418 as an argument and returns a \texttt{Map[Int, Int]}. The |
386 purpose of this Map is to record the information, in cases |
419 purpose of this Map is to record the information, in cases |
387 a pc-position points to a '\texttt{[}' or a '\texttt{]}', |
420 a pc-position points to a '\texttt{[}' or a '\texttt{]}', |
388 to which pc-position do we need to jump next? |
421 to which pc-position do we need to jump next? |
389 |
422 |
424 For the latter consider that it is difficult for compilers to |
457 For the latter consider that it is difficult for compilers to |
425 comprehend what is intended with whole programs, but they are very good |
458 comprehend what is intended with whole programs, but they are very good |
426 at finding out what small snippets of code do, and then try to |
459 at finding out what small snippets of code do, and then try to |
427 generate faster code for such snippets. |
460 generate faster code for such snippets. |
428 |
461 |
429 In our case, dead code is everything that is not a bf-command. |
462 In our case, dead code is everything that is not a bf++-command. |
430 Therefore write a function \texttt{optimise} which deletes such |
463 Therefore write a function \texttt{optimise} which deletes such |
431 dead code from a bf-program. Moreover this function should replace every substring |
464 dead code from a bf++-program. Moreover this function should replace every substring |
432 of the form \pcode{[-]} by a new command \texttt{0}. |
465 of the form \pcode{[-]} by a new command \texttt{0}. |
433 The idea is that the loop \pcode{[-]} just resets the |
466 The idea is that the loop \pcode{[-]} just resets the |
434 memory at the current location to 0. It is more efficient |
467 memory at the current location to 0. It is more efficient |
435 to do this in a single step, rather than stepwise in a loop as in |
468 to do this in a single step, rather than stepwise in a loop as in |
436 the original bf-programs. |
469 the original bf++-programs. |
437 |
470 |
438 In the extended \texttt{compute3} and \texttt{run3} functions you should |
471 In the extended \texttt{compute3} and \texttt{run3} functions you should |
439 implement this command by writing 0 to \pcode{mem(mp)}, that is use |
472 implement this command by writing 0 to \pcode{mem(mp)}, that is use |
440 \pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}. |
473 \pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}. |
441 The easiest way to modify a string in this way is to use the regular |
474 The easiest way to modify a string in this way is to use the regular |
442 expression \pcode{"""[^<>+-.,\\[\\]]"""}, which recognises everything that is |
475 expression \pcode{"""[^<>+-.\\[\\]@\#*]"""}, which recognises everything that is |
443 not a bf-command. Similarly, the |
476 not a bf++-command. Similarly, the |
444 regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}. By using the Scala method \pcode{.replaceAll} you can replace substrings |
477 regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}. By using the Scala method \pcode{.replaceAll} you can replace substrings |
445 with new strings.\\ |
478 with new strings.\\ |
446 \mbox{}\hfill{[1 Mark]} |
479 \mbox{}\hfill{[1 Mark]} |
447 |
480 |
448 \item[(7)] Finally, real compilers try to take advantage of CPUs which often |
481 \item[(7)] Finally, real compilers try to take advantage of modern |
449 provide complex operations in hardware that can combine many smaller |
482 CPUs which often provide complex operations in hardware that can |
450 instructions into a single faster instruction. |
483 combine many smaller instructions into a single faster instruction. |
451 |
484 |
452 In our case we can optimise the several single increments performed at a |
485 In our case we can optimise the several single increments performed at a |
453 memory cell, for example \pcode{++++}, by a single ``increment by |
486 memory cell, for example \pcode{++++}, by a single ``increment by |
454 4''. For this optimisation we just have to make sure these single |
487 4''. For this optimisation we just have to make sure these single |
455 increments are all next to each other. Similar optimisations should apply |
488 increments are all next to each other. Similar optimisations should apply |
456 for the bf-commands \pcode{-}, \pcode{<} and |
489 for the bf-commands \pcode{-}, \pcode{<} and |
457 \pcode{>}, which can all be replaced by extended versions that take |
490 \pcode{>}, which can all be replaced by extended versions that take |
458 the amount of the increment (decrement) into account. We will do |
491 the amount of the increment (decrement) into account. We will do |
459 this by introducing two-character bf-commands. For example |
492 this by introducing two-character bf++-commands. For example |
460 |
493 |
461 \begin{center} |
494 \begin{center} |
462 \begin{tabular}{l|l} |
495 \begin{tabular}{l|l} |
463 original bf-cmds & replacement\\ |
496 original bf-cmds & replacement\\ |
464 \hline |
497 \hline |
473 |
506 |
474 |
507 |
475 If there are more |
508 If there are more |
476 than 26 \pcode{+}'s in a row, then more than one ``two-character'' |
509 than 26 \pcode{+}'s in a row, then more than one ``two-character'' |
477 bf-commands need to be generated (the idea is that more than |
510 bf-commands need to be generated (the idea is that more than |
478 26 copies of a single bf-command in a row is a rare occurrence in |
511 26 copies of a single bf++-command in a row is a rare occurrence in |
479 actual bf-programs). Similar replacements apply |
512 actual bf++-programs). Similar replacements apply |
480 for \pcode{-}, \pcode{<} and \pcode{>}, but |
513 for \pcode{-}, \pcode{<} and \pcode{>}, but |
481 all other bf-commands should be unaffected by this |
514 all other bf++-commands should be unaffected by this |
482 change. |
515 change. |
483 |
516 |
484 For this write a function \texttt{combine} which replaces sequences |
517 For this write a function \texttt{combine} which replaces sequences |
485 of repeated increment and decrement commands by appropriate |
518 of repeated increment and decrement commands by appropriate |
486 two-character commands. In the functions \pcode{compute4} and |
519 two-character commands. In the functions \pcode{compute4} and |
487 \pcode{run4}, the ``combine'' and the optimisation from (6) should |
520 \pcode{run4}, the ``combine'' and the optimisation from (6) should |
488 be performed. Make sure that when a two-character bf-command is |
521 be performed. Make sure that when a two-character bf++-command is |
489 encountered you need to increase the \pcode{pc}-counter by two in |
522 encountered you need to increase the \pcode{pc}-counter by two in |
490 order to progress to the next command. For example |
523 order to progress to the next command. For example |
491 |
524 |
492 \begin{center} |
525 \begin{center} |
493 \pcode{combine(optimise(load_bff("benchmark.bf")))} |
526 \pcode{combine(optimise(load_bff("benchmark.bf")))} |