55 pattern matching and an interpreter. The first part is due on 30 |
55 pattern matching and an interpreter. The first part is due on 30 |
56 November at 11pm; the second, more advanced part, is due on 21 |
56 November at 11pm; the second, more advanced part, is due on 21 |
57 December at 11pm. In the first part, you are asked to implement a |
57 December at 11pm. In the first part, you are asked to implement a |
58 regular expression matcher based on derivatives of regular |
58 regular expression matcher based on derivatives of regular |
59 expressions. The reason is that regular expression matching in Java |
59 expressions. The reason is that regular expression matching in Java |
60 can sometimes be extremely slow. The advanced part is about an |
60 and Python can sometimes be extremely slow. The advanced part is about |
61 interpreter for a very simple programming language.\bigskip |
61 an interpreter for a very simple programming language.\bigskip |
62 |
62 |
63 \noindent |
63 \noindent |
64 \textbf{Important:} |
64 \textbf{Important:} |
65 |
65 |
66 \begin{itemize} |
66 \begin{itemize} |
326 \url{https://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)} |
326 \url{https://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)} |
327 \end{center} |
327 \end{center} |
328 |
328 |
329 |
329 |
330 If you want to see how badly the regular expression matchers do in |
330 If you want to see how badly the regular expression matchers do in |
331 Java and Python with the `evil' regular expression, then have a look |
331 Java\footnote{Version 8 and below} and in Python with the `evil' regular |
332 at the graphs below (you can try it out for yourself: have a look at |
332 expression $(a^*)^*\cdot b$, then have a look at the graphs below (you |
333 the file \texttt{catastrophic.java} on KEATS). Compare this with the |
333 can try it out for yourself: have a look at the file |
334 matcher you have implemented. How long can the string of $a$'s be |
334 \texttt{catastrophic.java} and \texttt{catastrophic.py} on |
335 in your matcher and still stay within the 30 seconds time limit? |
335 KEATS). Compare this with the matcher you have implemented. How long |
|
336 can the string of $a$'s be in your matcher and still stay within the |
|
337 30 seconds time limit? |
336 |
338 |
337 \begin{center} |
339 \begin{center} |
338 \begin{tikzpicture} |
340 \begin{tikzpicture} |
339 \begin{axis}[ |
341 \begin{axis}[ |
340 title={Graph: $(a^*)^*\cdot b$ and strings |
342 title={Graph: $(a^*)^*\cdot b$ and strings |
349 ytick={0,5,...,30}, |
351 ytick={0,5,...,30}, |
350 scaled ticks=false, |
352 scaled ticks=false, |
351 axis lines=left, |
353 axis lines=left, |
352 width=6cm, |
354 width=6cm, |
353 height=5.0cm, |
355 height=5.0cm, |
354 legend entries={Python, Java}, |
356 legend entries={Python, Java 8}, |
355 legend pos=outer north east] |
357 legend pos=outer north east] |
356 \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data}; |
358 \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data}; |
357 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data}; |
359 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data}; |
358 \end{axis} |
360 \end{axis} |
359 \end{tikzpicture} |
361 \end{tikzpicture} |
364 |
366 |
365 Coming from Java or C++, you might think Scala is a quite esoteric |
367 Coming from Java or C++, you might think Scala is a quite esoteric |
366 programming language. But remember, some serious companies have built |
368 programming language. But remember, some serious companies have built |
367 their business on |
369 their business on |
368 Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}} |
370 Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}} |
369 And there are far more esoteric languages out there. One is called |
371 And there are far, far more esoteric languages out there. One is |
370 \emph{brainf***}. You are asked in this part to implement an |
372 called \emph{brainf***}. You are asked in this part to implement an |
371 interpreter for this language. |
373 interpreter for this language. |
372 |
374 |
373 Urban M\"uller developed brainf*** in 1993. A close relative of this |
375 Urban M\"uller developed brainf*** in 1993. A close relative of this |
374 language was already introduced in 1964 by Corado B\"ohm, an Italian |
376 language was already introduced in 1964 by Corado B\"ohm, an Italian |
375 computer pioneer, who unfortunately died a few months ago. The main |
377 computer pioneer, who unfortunately died a few months ago. The main |
376 feature of brainf*** is its minimalistic set of instructions---just 8 |
378 feature of brainf*** is its minimalistic set of instructions---just 8 |
377 instructions in total and all of which are single characters. Despite |
379 instructions in total and all of which are single characters. Despite |
378 the minimalism, this language has been shown to be Turing |
380 the minimalism, this language has been shown to be Turing |
379 complete\ldots{}if this doesn't ring any bell with you: it roughly |
381 complete\ldots{}if this doesn't ring any bell with you: it roughly |
380 means every algorithm we know can, in principle, be implemented in |
382 that means every algorithm we know can, in principle, be implemented in |
381 brainf***. It just takes a lot of determination and quite a lot of |
383 brainf***. It just takes a lot of determination and quite a lot of |
382 memory resources. Some relatively sophisticated example programs in |
384 memory resources. Some relatively sophisticated sample programs in |
383 brainf*** are given in the file \texttt{bf.scala}.\bigskip |
385 brainf*** are given in the file \texttt{bf.scala}.\bigskip |
384 |
386 |
385 \noindent |
387 \noindent |
386 As mentioned above, brainf*** has 8 single-character commands, namely |
388 As mentioned above, brainf*** has 8 single-character commands, namely |
387 \texttt{'>'}, \texttt{'<'}, \texttt{'+'}, \texttt{'-'}, \texttt{'.'}, |
389 \texttt{'>'}, \texttt{'<'}, \texttt{'+'}, \texttt{'-'}, \texttt{'.'}, |