95 language was already introduced in 1964 by Corado B\"ohm, an Italian |
95 language was already introduced in 1964 by Corado B\"ohm, an Italian |
96 computer pioneer. The main feature of brainf*** is its minimalistic |
96 computer pioneer. The main feature of brainf*** is its minimalistic |
97 set of instructions---just 8 instructions in total and all of which |
97 set of instructions---just 8 instructions in total and all of which |
98 are single characters. Despite the minimalism, this language has been |
98 are single characters. Despite the minimalism, this language has been |
99 shown to be Turing complete\ldots{}if this doesn't ring any bell with |
99 shown to be Turing complete\ldots{}if this doesn't ring any bell with |
100 you: it roughly means that every algorithm we know can, in principle, |
100 you: it roughly means that every(!) algorithm can, in principle, |
101 be implemented in brainf***. It just takes a lot of determination and |
101 be implemented in brainf***. It just takes a lot of determination and |
102 quite a lot of memory resources. Some relatively sophisticated sample |
102 quite a lot of memory resources. Some relatively sophisticated sample |
103 programs in brainf*** are given in the file \texttt{bf.scala}, including |
103 programs in brainf*** are given in the file \texttt{bf.scala}, including |
104 a brainf*** program for the Sierpinski triangle and the Mandelbrot set.\bigskip |
104 a brainf*** program for the Sierpinski triangle and the Mandelbrot set.\bigskip |
105 |
105 |
134 This one prints out Hello World\ldots{}obviously ;o) |
134 This one prints out Hello World\ldots{}obviously ;o) |
135 |
135 |
136 \subsubsection*{Tasks (file bf.scala)} |
136 \subsubsection*{Tasks (file bf.scala)} |
137 |
137 |
138 \begin{itemize} |
138 \begin{itemize} |
139 \item[(1)] Write a function that takes a file name as an argument |
139 \item[(1)] Write a function that takes a filename (a string) as an argument |
140 and requests the corresponding file from disk. It returns the |
140 and requests the corresponding file from disk. It returns the |
141 content of the file as a String. If the file does not exists, |
141 content of the file as a string. If the file does not exists, |
142 the function should return the empty string.\\ |
142 the function should return the empty string.\\ |
143 \mbox{}\hfill[1 Mark] |
143 \mbox{}\hfill[1 Mark] |
144 |
144 |
145 \item[(2)] Brainf*** memory is represented by a \texttt{Map} from |
145 \item[(2)] Brainf*** memory is represented by a \texttt{Map} from |
146 integers to integers. The empty memory is represented by |
146 integers to integers. The empty memory is represented by |
148 memory; \texttt{Map(0 -> 1, 2 -> 3)} stores \texttt{1} at |
148 memory; \texttt{Map(0 -> 1, 2 -> 3)} stores \texttt{1} at |
149 memory location \texttt{0}, and at \texttt{2} it stores \texttt{3}. The |
149 memory location \texttt{0}, and at \texttt{2} it stores \texttt{3}. The |
150 convention is that if we query the memory at a location that is |
150 convention is that if we query the memory at a location that is |
151 \emph{not} defined in the \texttt{Map}, we return \texttt{0}. Write |
151 \emph{not} defined in the \texttt{Map}, we return \texttt{0}. Write |
152 a function, \texttt{sread}, that takes a memory (a \texttt{Map}) and |
152 a function, \texttt{sread}, that takes a memory (a \texttt{Map}) and |
153 a memory pointer (an \texttt{Int}) as argument, and `safely' reads the |
153 a memory pointer (an \texttt{Int}) as arguments, and `safely' reads the |
154 corresponding memory location. If the \texttt{Map} is not defined at |
154 corresponding memory location. If the \texttt{Map} is not defined at |
155 the memory pointer, \texttt{sread} returns \texttt{0}. |
155 the memory pointer, \texttt{sread} returns \texttt{0}. |
156 |
156 |
157 Write another function \texttt{write}, which takes a memory, a |
157 Write another function \texttt{write}, which takes a memory, a |
158 memory pointer and an integer value as argument and updates the |
158 memory pointer and an integer value as arguments and updates the |
159 \texttt{Map} with the value at the given memory location. As usual |
159 \texttt{Map} with the value at the given memory location. As usual, |
160 the \texttt{Map} is not updated `in-place' but a new map is created |
160 the \texttt{Map} is not updated `in-place' but a new map is created |
161 with the same data, except the value is stored at the given memory |
161 with the same data, except the new value is stored at the given memory |
162 pointer.\hfill[1 Mark] |
162 pointer.\hfill[1 Mark] |
163 |
163 |
164 \item[(3)] Write two functions, \texttt{jumpRight} and |
164 \item[(3)] Write two functions, \texttt{jumpRight} and |
165 \texttt{jumpLeft} that are needed to implement the loop constructs |
165 \texttt{jumpLeft}, that are needed to implement the loop constructs |
166 of brainf***. They take a program (a \texttt{String}) and a program |
166 in brainf***. They take a program (a \texttt{String}) and a program |
167 counter (an \texttt{Int}) as argument and move right (respectively |
167 counter (an \texttt{Int}) as arguments and move right (respectively |
168 left) in the string in order to find the \textbf{matching} |
168 left) in the string in order to find the \textbf{matching} |
169 opening/closing bracket. For example, given the following program |
169 opening/closing bracket. For example, given the following program |
170 with the program counter indicated by an arrow: |
170 with the program counter indicated by an arrow: |
171 |
171 |
172 \begin{center} |
172 \begin{center} |
372 and \texttt{jumpRight} was called previously, you should look |
372 and \texttt{jumpRight} was called previously, you should look |
373 up the jump address in the \texttt{jtable}. Feel free to reuse |
373 up the jump address in the \texttt{jtable}. Feel free to reuse |
374 the function \texttt{jumpLeft} and \texttt{jumpRight} for |
374 the function \texttt{jumpLeft} and \texttt{jumpRight} for |
375 calculating the \texttt{jtable}.\hfill{[1 Mark]} |
375 calculating the \texttt{jtable}.\hfill{[1 Mark]} |
376 |
376 |
377 \item[(6)] Compilers try to eliminate any ``dead'' code that could slow |
377 \item[(6)] Compilers try to eliminate any ``dead'' code that could |
378 down programs and also perform what is often called |
378 slow down programs and also perform what is often called |
379 \emph{peephole optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}} While it is difficult for compilers to comprehend what |
379 \emph{peephole |
380 is intended with whole programs, they are very good at finding out |
380 optimisations}.\footnote{\url{https://en.wikipedia.org/wiki/Peephole_optimization}} |
381 what small snippets of code do, and then try to generate |
381 For the latter consider that it is difficult for compilers to |
382 faster code for such snippets. |
382 comprehend what is intended with whole programs, they are very good |
|
383 at finding out what small snippets of code do, and then try to |
|
384 generate faster code for such snippets. |
383 |
385 |
384 In our case, dead code is everything that is not a bf-command. |
386 In our case, dead code is everything that is not a bf-command. |
385 Therefore write a function \texttt{optimise} which deletes such |
387 Therefore write a function \texttt{optimise} which deletes such |
386 dead code. Moreover this function should replace every substring |
388 dead code from a bf-program. Moreover this function should replace every substring |
387 of the form \pcode{[-]} by a new command \texttt{0}. |
389 of the form \pcode{[-]} by a new command \texttt{0}. |
388 The idea is that the loop \pcode{[-]} just resets the |
390 The idea is that the loop \pcode{[-]} just resets the |
389 memory at the current location to 0. It is more efficient |
391 memory at the current location to 0. It is more efficient |
390 to do this in a single step, rather than incrementally as in |
392 to do this in a single step, rather than stepwise in a loop as in |
391 the original bf-programs. |
393 the original bf-programs. |
392 |
394 |
393 In the extended \texttt{compute3} and \texttt{run3} functions you should |
395 In the extended \texttt{compute3} and \texttt{run3} functions you should |
394 implement this command by writing 0 to \pcode{mem(mp)}, that is use |
396 implement this command by writing 0 to \pcode{mem(mp)}, that is use |
395 \pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}. |
397 \pcode{write(mem, mp, 0)} as the rule for the command \texttt{0}. |
396 The easiest way to modify a string in this way is to use the regular |
398 The easiest way to modify a string in this way is to use the regular |
397 expression \pcode{"""[^<>+-.,\\[\\]]"""}, which recognises everything that is |
399 expression \pcode{"""[^<>+-.,\\[\\]]"""}, which recognises everything that is |
398 not a bf-command. Similarly, the |
400 not a bf-command. Similarly, the |
399 regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]} and |
401 regular expression \pcode{"""\\[-\\]"""} finds all occurrences of \pcode{[-]}. By using the Scala method \pcode{.replaceAll} you can replace substrings |
400 by using the Scala method \pcode{.replaceAll} you can replace it with the |
402 with new strings.\\ |
401 string \pcode{"0"} standing for the new bf-command.\hfill{[1 Mark]} |
403 \mbox{}\hfill{[1 Mark]} |
402 |
404 |
403 \item[(7)] Finally, real compilers try to take advantage of CPUs which often |
405 \item[(7)] Finally, real compilers try to take advantage of CPUs which often |
404 provide complex operations in hardware that can combine many smaller |
406 provide complex operations in hardware that can combine many smaller |
405 instructions into a single faster instruction. |
407 instructions into a single faster instruction. |
406 |
408 |
407 In our case we can optimise the several single increments of a |
409 In our case we can optimise the several single increments performed at a |
408 memory cell, for example \pcode{++++}, by a single ``increment by |
410 memory cell, for example \pcode{++++}, by a single ``increment by |
409 4''. For this optimisation we just have to make sure these single |
411 4''. For this optimisation we just have to make sure these single |
410 increments are all next to each other. Similarly optimisations should apply |
412 increments are all next to each other. Similar optimisations should apply |
411 for the bf-commands \pcode{-}, \pcode{<} and |
413 for the bf-commands \pcode{-}, \pcode{<} and |
412 \pcode{>}, which can all be replaced by extended versions that take |
414 \pcode{>}, which can all be replaced by extended versions that take |
413 the amount of the increment (decrement) into account. We will do |
415 the amount of the increment (decrement) into account. We will do |
414 this by introducing two-character bf-commands. For example |
416 this by introducing two-character bf-commands. For example |
415 |
417 |
425 \hspace{5mm}(these are 26 \pcode{+}'s)\\ |
427 \hspace{5mm}(these are 26 \pcode{+}'s)\\ |
426 \end{tabular} |
428 \end{tabular} |
427 \end{center} |
429 \end{center} |
428 |
430 |
429 |
431 |
430 Similarly for \pcode{-}, \pcode{<} and \pcode{>}. If there are more |
432 If there are more |
431 than 26 \pcode{+}'s in a row, then more than on ``two-character'' |
433 than 26 \pcode{+}'s in a row, then more than one ``two-character'' |
432 bf-commands need to be generated (the idea is that more than |
434 bf-commands need to be generated (the idea is that more than |
433 26 copies of a single bf-command in a row is a rare occurrence in |
435 26 copies of a single bf-command in a row is a rare occurrence in |
434 actual bf-programs). All other bf-commands should be unaffected by this |
436 actual bf-programs). Similar replacements apply |
|
437 for \pcode{-}, \pcode{<} and \pcode{>}, but |
|
438 all other bf-commands should be unaffected by this |
435 change. |
439 change. |
436 |
440 |
437 For this write a function \texttt{combine} which replaces sequences |
441 For this write a function \texttt{combine} which replaces sequences |
438 of repeated increment and decrement commands by appropriate |
442 of repeated increment and decrement commands by appropriate |
439 two-character commands. In the functions \pcode{compute4} and |
443 two-character commands. In the functions \pcode{compute4} and |
440 \pcode{run4}, the ``combine'' and the optimisation from (6) should |
444 \pcode{run4}, the ``combine'' and the optimisation from (6) should |
441 be performed. Make sure that when a two-character bf-command is |
445 be performed. Make sure that when a two-character bf-command is |
442 encountered you need to increase the \pcode{pc}-counter by two in |
446 encountered you need to increase the \pcode{pc}-counter by two in |
443 order to process the next command. For example |
447 order to progress to the next command. For example |
444 |
448 |
445 \begin{center} |
449 \begin{center} |
446 \pcode{combine(optimise(load_bff("benchmark.bf")))} |
450 \pcode{combine(optimise(load_bff("benchmark.bf")))} |
447 \end{center} |
451 \end{center} |
448 |
452 |
457 \begin{center} |
461 \begin{center} |
458 \pcode{>++[<+++++++++++++>-]<[[}\hspace{3mm}\ldots |
462 \pcode{>++[<+++++++++++++>-]<[[}\hspace{3mm}\ldots |
459 \end{center} |
463 \end{center} |
460 |
464 |
461 As you can see, the compiler bets on saving so much time on the |
465 As you can see, the compiler bets on saving so much time on the |
462 \pcode{+B} and \pcode{+M} steps so that the optimisations is |
466 \pcode{+B} and \pcode{+M} steps such that the optimisations is |
463 worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a |
467 worthwhile overall (of course for the \pcode{>A}'s and so on, the compiler incurs a |
464 penalty). Luckily, after you have performed all |
468 penalty). Luckily, after you have performed all |
465 optimisations in (5) - (7), you can expect that the |
469 optimisations in (5) - (7), you can expect that the |
466 \pcode{benchmark.bf} program runs four to five times faster. |
470 \pcode{benchmark.bf} program runs four to five times faster. |
467 You can also test whether your compiler produces the correct result |
471 You can also test whether your compiler produces the correct result |