# HG changeset patch # User Christian Urban # Date 1511513891 0 # Node ID cc6d036401f4ab97b7534d0acb55bbf052140b07 # Parent 371acb50643d10d47a87244a862ff932b7266d55 updated diff -r 371acb50643d -r cc6d036401f4 cws/cw03.pdf Binary file cws/cw03.pdf has changed diff -r 371acb50643d -r cc6d036401f4 cws/cw03.tex --- a/cws/cw03.tex Fri Nov 24 03:10:23 2017 +0000 +++ b/cws/cw03.tex Fri Nov 24 08:58:11 2017 +0000 @@ -57,8 +57,8 @@ December at 11pm. In the first part, you are asked to implement a regular expression matcher based on derivatives of regular expressions. The reason is that regular expression matching in Java -can sometimes be extremely slow. The advanced part is about an -interpreter for a very simple programming language.\bigskip +and Python can sometimes be extremely slow. The advanced part is about +an interpreter for a very simple programming language.\bigskip \noindent \textbf{Important:} @@ -328,11 +328,13 @@ If you want to see how badly the regular expression matchers do in -Java and Python with the `evil' regular expression, then have a look -at the graphs below (you can try it out for yourself: have a look at -the file \texttt{catastrophic.java} on KEATS). Compare this with the -matcher you have implemented. How long can the string of $a$'s be -in your matcher and still stay within the 30 seconds time limit? +Java\footnote{Version 8 and below} and in Python with the `evil' regular +expression $(a^*)^*\cdot b$, then have a look at the graphs below (you +can try it out for yourself: have a look at the file +\texttt{catastrophic.java} and \texttt{catastrophic.py} on +KEATS). Compare this with the matcher you have implemented. How long +can the string of $a$'s be in your matcher and still stay within the +30 seconds time limit? \begin{center} \begin{tikzpicture} @@ -351,7 +353,7 @@ axis lines=left, width=6cm, height=5.0cm, - legend entries={Python, Java}, + legend entries={Python, Java 8}, legend pos=outer north east] \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data}; \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data}; @@ -366,8 +368,8 @@ programming language. But remember, some serious companies have built their business on Scala.\footnote{\url{https://en.wikipedia.org/wiki/Scala_(programming_language)\#Companies}} -And there are far more esoteric languages out there. One is called -\emph{brainf***}. You are asked in this part to implement an +And there are far, far more esoteric languages out there. One is +called \emph{brainf***}. You are asked in this part to implement an interpreter for this language. Urban M\"uller developed brainf*** in 1993. A close relative of this @@ -377,9 +379,9 @@ instructions in total and all of which are single characters. Despite the minimalism, this language has been shown to be Turing complete\ldots{}if this doesn't ring any bell with you: it roughly -means every algorithm we know can, in principle, be implemented in +that means every algorithm we know can, in principle, be implemented in brainf***. It just takes a lot of determination and quite a lot of -memory resources. Some relatively sophisticated example programs in +memory resources. Some relatively sophisticated sample programs in brainf*** are given in the file \texttt{bf.scala}.\bigskip \noindent diff -r 371acb50643d -r cc6d036401f4 progs/catastrophic.py --- a/progs/catastrophic.py Fri Nov 24 03:10:23 2017 +0000 +++ b/progs/catastrophic.py Fri Nov 24 08:58:11 2017 +0000 @@ -4,8 +4,8 @@ # case of catastrophic backtracking in Python # -# regex: (a?){n} a{n} -# strings: aa... +# regex: (a*)*b +# strings: aa...a # # call with timing as: # @@ -14,11 +14,8 @@ # counter n given on the command line cn = sys.argv[1] -# constructing the regex -r1 = '((a?){%s})' % cn -r2 = 'a{%s}' % cn +# calling the matching function +s = ("a" * int(cn)) +m = re.match('(a*)*b' , s) -# calling the matching function -m = re.match(r1 + r2 , "a" * int(cn)) - -print m.group(0) +print s