# HG changeset patch
# User Christian Urban <christian dot urban at kcl dot ac dot uk>
# Date 1448107492 0
# Node ID 27b7af6a00e532690b1fe665a4030c0d0c77f198
# Parent 8787b77c94725a2c9c0a44d864aa1995c866d87f
updatad

diff -r 8787b77c9472 -r 27b7af6a00e5 msc-projects-15.html
--- a/msc-projects-15.html Sat Nov 21 11:48:23 2015 +0000
+++ b/msc-projects-15.html Sat Nov 21 12:04:52 2015 +0000
@@ -68,8 +68,7 @@
  and can “stab you in the back” according to this
  <A HREF="http://peterscott.github.io/2013/01/17/regular-expressions-will-stab-you-in-the-back/">blog post</A>.
  For example, if you use in <A HREF="http://www.python.org">Python</A> or
- in <A HREF="http://www.ruby-lang.org/en/">Ruby</A> (or also in a number of other mainstream programming languages according to this
- <A HREF="http://www.computerbytesman.com/redos/">blog</A>) the
+ in <A HREF="http://www.ruby-lang.org/en/">Ruby</A> (or also in a number of other mainstream programming languages) the
  innocently looking regular expression <code>a?{28}a{28}</code> and match it, say, against the string
  <code>aaaaaaaaaaaaaaaaaaaaaaaaaaaa</code> (that is 28 <code>a</code>s), you will soon notice that your CPU usage goes to 100%. In fact,
  Python and Ruby need approximately 30 seconds of hard work for matching this string. You can try it for yourself:
@@ -83,7 +82,7 @@
  the regular expression and string further to, say, 4,600 <code>a</code>s, then you get a <code>StackOverflowError</code>
  potentially crashing your program. Moreover (beside the "minor" problem of being painfully slow) according to this
  <A HREF="http://www.haskell.org/haskellwiki/Regex_Posix">report</A>
- nearly all POSIX regular expression matchers are actually buggy.
+ nearly all regular expression matchers using the POSIX rules are actually buggy.
  </p>

  <p>
@@ -124,7 +123,8 @@
  <p>
  <B>Literature:</B>
  The place to start with this project is obviously this
- <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">paper</A>.
+ <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">paper</A>
+ and this <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">one</A>.
  Traditional methods for regular expression matching are explained
  in the Wikipedia articles
  <A HREF="http://en.wikipedia.org/wiki/DFA_minimization">here</A> and
@@ -621,7 +621,7 @@
 </TABLE>

 <P>
-<!-- hhmts start --> Last modified: Sat Nov 21 11:45:43 GMT 2015 <!-- hhmts end -->
+<!-- hhmts start --> Last modified: Sat Nov 21 11:58:49 GMT 2015 <!-- hhmts end -->
 <a href="http://validator.w3.org/check/referer">[Validate this page.]</a>
 </BODY>
 </HTML>