cws/cw03.tex
changeset 154 39c6b93718f0
parent 153 4383809c176a
child 155 371acb50643d
equal deleted inserted replaced
153:4383809c176a 154:39c6b93718f0
     2 \usepackage{../style}
     2 \usepackage{../style}
     3 \usepackage{../langs}
     3 \usepackage{../langs}
     4 \usepackage{tikz}
     4 \usepackage{tikz}
     5 \usepackage{pgf}
     5 \usepackage{pgf}
     6 \usepackage{pgfplots}
     6 \usepackage{pgfplots}
       
     7 \usepackage{stackengine}
       
     8 %%\usepackage{accents}
       
     9 \newcommand\barbelow[1]{\stackunder[1.2pt]{#1}{\raisebox{-4mm}{\boldmath$\uparrow$}}}
     7 
    10 
     8 \begin{filecontents}{re-python2.data}
    11 \begin{filecontents}{re-python2.data}
     9 1 0.033
    12 1 0.033
    10 5 0.036
    13 5 0.036
    11 10 0.034
    14 10 0.034
    43 \end{filecontents}
    46 \end{filecontents}
    44 
    47 
    45 
    48 
    46 \begin{document}
    49 \begin{document}
    47 
    50 
    48 \section*{Coursework 8 (Scala, Regular Expressions, Brainf***)}
    51   
       
    52 \section*{Coursework 8 (Regular Expressions and Brainf***)}
    49 
    53 
    50 This coursework is worth 10\%. It is about regular expressions,
    54 This coursework is worth 10\%. It is about regular expressions,
    51 pattern matching and an interpreter. The first part is due on 30
    55 pattern matching and an interpreter. The first part is due on 30
    52 November at 11pm; the second, more advanced part, is due on 21
    56 November at 11pm; the second, more advanced part, is due on 21
    53 December at 11pm. In the first part, you are asked to implement a
    57 December at 11pm. In the first part, you are asked to implement a
   326 If you want to see how badly the regular expression matchers do in
   330 If you want to see how badly the regular expression matchers do in
   327 Java and Python with the `evil' regular expression, then have a look
   331 Java and Python with the `evil' regular expression, then have a look
   328 at the graphs below (you can try it out for yourself: have a look at
   332 at the graphs below (you can try it out for yourself: have a look at
   329 the file \texttt{catastrophic.java} on KEATS). Compare this with the
   333 the file \texttt{catastrophic.java} on KEATS). Compare this with the
   330 matcher you have implemented. How long can the string of $a$'s be
   334 matcher you have implemented. How long can the string of $a$'s be
   331 in your matcher and stay within the 30 seconds time limit?
   335 in your matcher and still stay within the 30 seconds time limit?
   332 
   336 
   333 \begin{center}
   337 \begin{center}
   334 \begin{tikzpicture}
   338 \begin{tikzpicture}
   335 \begin{axis}[
   339 \begin{axis}[
   336     title={Graph: $(a^*)^*\cdot b$ and strings 
   340     title={Graph: $(a^*)^*\cdot b$ and strings 
   356 \end{center}
   360 \end{center}
   357 \newpage
   361 \newpage
   358 
   362 
   359 \subsection*{Part 2 (4 Marks)}
   363 \subsection*{Part 2 (4 Marks)}
   360 
   364 
   361 Comming from Java or C++, you might think Scala is a quite
   365 Coming from Java or C++, you might think Scala is a quite esoteric
   362 esotheric programming language.  But remember, some serious companies
   366 programming language.  But remember, some serious companies have built
   363 have built their business on Scala. And there are far more esotheric
   367 their business on
   364 languages out there. One is called \emph{brainf***}. Urban M\"uller
   368 Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}}
   365 developed this language in 1993.  A close relative was already
   369 And there are far more esoteric languages out there. One is called
   366 introduced in ... by Corado B\"ohm, an Italian computer pionier, who
   370 \emph{brainf***}. You are asked in this part to implement an
   367 unfortunately died a few months ago. One feature of brainf*** is its
   371 interpreter for this language.
   368 minimalistic set of instructions. It has just 8 instructions, all of
   372 
   369 which are single characters. Despite this minimalism, this language,
   373 Urban M\"uller developed brainf*** in 1993.  A close relative of this
   370 given enough memory, has been shown to be Turing complete. In this
   374 language was already introduced in 1964 by Corado B\"ohm, an Italian
   371 part you will implement an interpreter for this language.
   375 computer pioneer, who unfortunately died a few months ago. The main
   372 
   376 feature of brainf*** is its minimalistic set of instructions---just 8
   373 
   377 instructions in total and all of which are single characters. Despite
       
   378 the minimalism, this language has been shown to be Turing
       
   379 complete\ldots{}if this doesn't ring any bell with you: it roughly
       
   380 means every algorithm we know can, in principle, be implemented in
       
   381 brainf***. It just takes a lot of determination and quite a lot of
       
   382 memory resources. Some relatively sophisticated example programs in
       
   383 brainf*** are given in the file \texttt{bf.scala}.\bigskip
       
   384 
       
   385 \noindent
       
   386 As mentioned above, brainf*** has 8 single-character commands, namely
       
   387 \texttt{'>'}, \texttt{'<'}, \texttt{'+'}, \texttt{'-'}, \texttt{'.'},
       
   388 \texttt{','}, \texttt{'['} and \texttt{']'}. Every other character is
       
   389 considered a comment.  Brainf*** operates on memory cells containing
       
   390 integers. For this it uses a single memory pointer that points at each
       
   391 stage to one memory cell. This pointer can be moved forward by one
       
   392 memory cell by using the command \texttt{'>'}, and backward by using
       
   393 \texttt{'<'}. The commands \texttt{'+'} and \texttt{'-'} increase,
       
   394 respectively decrease, by 1 the content of the memory cell to which
       
   395 the memory pointer currently points to. The commands for input/output
       
   396 are \texttt{','} and \texttt{'.'}. Output works by reading the content
       
   397 of the memory cell to which the memory pointer points to and printing
       
   398 it out as an ASCII character. Input works the other way, taking some
       
   399 user input and storing it in the cell to which the memory pointer
       
   400 points to. The commands \texttt{'['} and \texttt{']'} are looping
       
   401 constructs. Everything in between \texttt{'['} and \texttt{']'} is
       
   402 repeated until a counter (memory cell) reaches zero.  A typical
       
   403 program in brainf*** looks as follows:
       
   404 
       
   405 \begin{center}
       
   406 \begin{verbatim}
       
   407  ++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++
       
   408  ..+++.>>.<-.<.+++.------.--------.>>+.>++.
       
   409 \end{verbatim}
       
   410 \end{center}  
       
   411 
       
   412 \noindent
       
   413 This one prints out Hello World\ldots{}obviously. 
   374 
   414 
   375 \subsubsection*{Tasks (file bf.scala)}
   415 \subsubsection*{Tasks (file bf.scala)}
   376 
   416 
   377 \begin{itemize}
   417 \begin{itemize}
   378 \item[(2a)] 
   418 \item[(2a)] Brainf*** memory is represented by a \texttt{Map} from
   379 
   419   integers to integers. The empty memory is represented by
   380 \item[(2b)]   
   420   \texttt{Map()}, that is nothing is stored in the
   381 
   421   memory. \texttt{Map(0 -> 1, 2 -> 3)} clearly has stored \texttt{1}
   382 \item[(2c)]
   422   at memory location \texttt{0}, at \texttt{2} it stores
   383 
   423   \texttt{3}. The convention is that if we query the memory at a
       
   424   location that is not defined in the \texttt{Map} we return
       
   425   \texttt{0}. Write a function, \texttt{sread}, that takes a memory (a
       
   426   \texttt{Map}) and a memory pointer (an \texttt{Int}) as argument,
       
   427   and safely reads the corresponding memory location. If the map is not
       
   428   defined at the memory pointer it returns \texttt{0}.
       
   429 
       
   430   Write another function \texttt{write}, which takes a memory, a
       
   431   memory pointer and a integer value as argument and updates the map
       
   432   with the value at the given memory location. As usual the map is not
       
   433   updated `in-place' but a new map is created with the same data,
       
   434   except the value is stored at the given memory pointer.\hfill[1 Mark]
       
   435 
       
   436 \item[(2b)] Write two functions, \texttt{jumpRight} and
       
   437   \texttt{jumpLeft} that are needed to implement the loop constructs
       
   438   of brainf***. They take a program (a \texttt{String}) and a program
       
   439   counter (an \texttt{Int}) as argument and move right (respectively
       
   440   left) in the string in order to find the \textbf{matching}
       
   441   opening/closing bracket. For example, given the following program
       
   442   with the program counter indicated by an arrow:
       
   443 
       
   444   \begin{center}
       
   445   \texttt{--[\barbelow{.}.+>--],>,++}
       
   446   \end{center}
       
   447 
       
   448   then the matching closing bracket is in 9th position (counting from 0) and
       
   449   \texttt{jumpRight} is supposed to return the position just after this
       
   450   
       
   451   \begin{center}
       
   452   \texttt{--[..+>--]\barbelow{,}>,++}
       
   453   \end{center}
       
   454 
       
   455   meaning it jumps after the loop. Similarly, if you in 8th position
       
   456   then \texttt{jumpLeft} is supposed to jump to just after the opening
       
   457   bracket (that is jumping to the beginning of the loop):
       
   458 
       
   459   \begin{center}
       
   460     \texttt{--[..+>-\barbelow{-}],>,++}
       
   461     \qquad$\stackrel{\texttt{jumpLeft}}{\longrightarrow}$\qquad
       
   462     \texttt{--[\barbelow{.}.+>--],>,++}
       
   463   \end{center}
       
   464 
       
   465   Unfortunately we have to take into account that there might be
       
   466   another opening and closing bracket on the `way' to find the
       
   467   matching bracket. For example in the brainf*** program
       
   468 
       
   469   \begin{center}
       
   470   \texttt{--[\barbelow{.}.[+>]--],>,++}
       
   471   \end{center}
       
   472 
       
   473   we do not want to return the index for the \texttt{'-'} in the 9th
       
   474   position, but the program counter for \texttt{','} in 12th
       
   475   position. The easiest to find out whether a bracket is matched is to
       
   476   use levels (which are the third argument in \texttt{jumpLeft} and
       
   477   \texttt{jumpLeft}). In case of \texttt{jumpRight} you increase the
       
   478   level by one whenever you find an opening bracket and decrease by
       
   479   one for a closing bracket. Then in \texttt{jumpRight} you are looking
       
   480   for the closing bracket on level \texttt{0}. For \texttt{jumpLeft} you
       
   481   do the opposite. In this way you can find \textbf{matching} brackets
       
   482   in strings such as
       
   483 
       
   484   \begin{center}
       
   485   \texttt{--[\barbelow{.}.[[-]+>[.]]--],>,++}
       
   486   \end{center}
       
   487 
       
   488   for which \texttt{jumpRight} should produce the position:
       
   489 
       
   490   \begin{center}
       
   491   \texttt{--[..[[-]+>[.]]--]\barbelow{,}>,++}
       
   492   \end{center}
       
   493 
       
   494   It is also possible that the position returned by \texttt{jumpRight} or
       
   495   \texttt{jumpLeft} is outside the string in cases where there are
       
   496   no matching brackets. For example
       
   497 
       
   498   \begin{center}
       
   499   \texttt{--[\barbelow{.}.[[-]+>[.]]--,>,++}
       
   500   \qquad$\stackrel{\texttt{jumpRight}}{\longrightarrow}$\qquad
       
   501   \texttt{--[..[[-]+>[.]]-->,++\barbelow{\;\phantom{+}}}
       
   502   \end{center}
       
   503   \hfill[1 Mark]
       
   504 
       
   505 
       
   506 \item[(2c)] Write a recursive function \texttt{run} that executes a
       
   507   brainf*** program. It takes a program, a program counter, a memory
       
   508   counter and a memory as arguments. If the program counter is outside
       
   509   the program string, the execution stops and \texttt{run} returns the
       
   510   memory. If the program counter is inside the string, it reads the
       
   511   corresponding character and updates the program counter \texttt{pc}, memory
       
   512   pointer \texttt{mp} and memory \texttt{mem} according to the rules shown
       
   513   in Figure~\ref{comms}. It the calls recursively \texttt{run} with the updated
       
   514   data.
       
   515 
       
   516   Write another function \texttt{start} that calls \texttt{run} with a
       
   517   given brainfu** program and memory, and the program counter and memory counter
       
   518   set to~$0$. Like \texttt{run} it returns the memory after the execution
       
   519   of the program finishes. You can test your brainf**k interpreter with the
       
   520   Sierpinski triangle or the Hello world programs.\\\mbox{}\hfill[2 Marks]
       
   521   
       
   522   \begin{figure}[p]
       
   523   \begin{center}
       
   524     \begin{tabular}{|@{}p{0.8cm}|l|}
       
   525       \hline
       
   526       \hfill\texttt{'>'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   527                        $\bullet$ & $\texttt{pc} + 1$\\
       
   528                        $\bullet$ & $\texttt{mp} + 1$\\
       
   529                        $\bullet$ & \texttt{mem} unchanged
       
   530                      \end{tabular}\\\hline   
       
   531       \hfill\texttt{'<'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   532                        $\bullet$ & $\texttt{pc} + 1$\\
       
   533                        $\bullet$ & $\texttt{mp} - 1$\\
       
   534                        $\bullet$ & \texttt{mem} unchanged
       
   535                      \end{tabular}\\\hline   
       
   536       \hfill\texttt{'+'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   537                        $\bullet$ & $\texttt{pc} + 1$\\
       
   538                        $\bullet$ & $\texttt{mp}$ unchanged\\
       
   539                        $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) + 1}\\
       
   540                      \end{tabular}\\\hline   
       
   541       \hfill\texttt{'-'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   542                        $\bullet$ & $\texttt{pc} + 1$\\
       
   543                        $\bullet$ & $\texttt{mp}$ unchanged\\
       
   544                        $\bullet$ & \texttt{mem} updated with \texttt{mp -> mem(mp) - 1}\\
       
   545                      \end{tabular}\\\hline   
       
   546       \hfill\texttt{'.'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   547                        $\bullet$ & $\texttt{pc} + 1$\\
       
   548                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
       
   549                        $\bullet$ & print out\texttt{mem(mp)} as a character\\
       
   550                      \end{tabular}\\\hline   
       
   551       \hfill\texttt{','} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   552                        $\bullet$ & $\texttt{pc} + 1$\\
       
   553                        $\bullet$ & $\texttt{mp}$ unchanged\\
       
   554                        $\bullet$ & \texttt{mem} updated with \texttt{mp -> \textrm{input}}\\
       
   555                        \multicolumn{2}{@{}l}{input given by \texttt{Console.in.read().toByte}}
       
   556                      \end{tabular}\\\hline   
       
   557       \hfill\texttt{'['} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   558                        \multicolumn{2}{@{}l}{if \texttt{mem(mp) == 0} then}\\
       
   559                        $\bullet$ & $\texttt{pc = jumpRight(prog, pc + 1, 0)}$\\
       
   560                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
       
   561                        \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) != 0} then}\\
       
   562                        $\bullet$ & $\texttt{pc} + 1$\\
       
   563                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
       
   564                      \end{tabular}
       
   565                      \\\hline   
       
   566       \hfill\texttt{']'} & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   567                        \multicolumn{2}{@{}l}{if \texttt{mem(mp) != 0} then}\\
       
   568                        $\bullet$ & $\texttt{pc = jumpLeft(prog, pc - 1 1, 0)}$\\
       
   569                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\medskip\\
       
   570                        \multicolumn{2}{@{}l}{otherwise if \texttt{mem(mp) == 0} then}\\
       
   571                        $\bullet$ & $\texttt{pc} + 1$\\
       
   572                        $\bullet$ & $\texttt{mp}$ and \texttt{mem} unchanged\\
       
   573                      \end{tabular}\\\hline   
       
   574       any other char & \begin{tabular}[t]{@{}l@{\hspace{2mm}}l@{}}
       
   575                          $\bullet$ & $\texttt{pc} + 1$\\
       
   576                          $\bullet$ & \texttt{mp} and \texttt{mem} unchanged
       
   577                        \end{tabular}\\
       
   578       \hline                 
       
   579     \end{tabular}
       
   580   \end{center}
       
   581   \caption{The rules for how commands in the brainf*** language update the program counter,
       
   582     memory counter and memory.\label{comms}}
       
   583   \end{figure}
   384 \end{itemize}\bigskip  
   584 \end{itemize}\bigskip  
   385 
   585 
   386 
   586 
   387 
   587 
   388 
   588