47 <li> <H4>[CU1] Regular Expression Matching and Derivatives</H4>  | 
    47 <li> <H4>[CU1] Regular Expression Matching and Derivatives</H4>  | 
    48   | 
    48   | 
    49   <p>  | 
    49   <p>  | 
    50   <B>Description:</b>    | 
    50   <B>Description:</b>    | 
    51   <A HREF="http://en.wikipedia.org/wiki/Regular_expression">Regular expressions</A>   | 
    51   <A HREF="http://en.wikipedia.org/wiki/Regular_expression">Regular expressions</A>   | 
    52   are extremely useful for many text-processing tasks such as finding patterns in texts,  | 
    52   are extremely useful for many text-processing tasks, such as finding patterns in texts,  | 
    53   lexing programs, syntax highlighting and so on. Given that regular expressions were  | 
    53   lexing programs, syntax highlighting and so on. Given that regular expressions were  | 
    54   introduced in 1951 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>,  | 
    54   introduced in 1951 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>,  | 
    55   you might think regular expressions have since been studied and implemented to death. But you would definitely be  | 
    55   you might think regular expressions have since been studied and implemented to death. But you would definitely be  | 
    56   mistaken: in fact, they are still an active research area. For example,  | 
    56   mistaken: in fact, they are still an active research area. For example,  | 
    57   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">this paper</A>   | 
    57   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">this paper</A>   | 
    58   about regular expression matching and derivatives was presented just last summer at the international   | 
    58   about regular expression matching and derivatives was presented just last summer at the international   | 
    59   FLOPS'14 conference. The task in this project is to implement their results.</p>  | 
    59   FLOPS'14 conference. The task in this project is to implement their results and use them for lexing.</p>  | 
    60   | 
    60   | 
    61   <p>The background for this project is that some regular expressions are   | 
    61   <p>The background for this project is that some regular expressions are   | 
    62   “<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>”  | 
    62   “<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>”  | 
    63   and can “stab you in the back” according to  | 
    63   and can “stab you in the back” according to  | 
    64   this <A HREF="http://peterscott.github.io/2013/01/17/regular-expressions-will-stab-you-in-the-back/">blog post</A>.  | 
    64   this <A HREF="http://peterscott.github.io/2013/01/17/regular-expressions-will-stab-you-in-the-back/">blog post</A>.  | 
   104   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">FLOPS'14-paper</A> mentioned   | 
   104   <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">FLOPS'14-paper</A> mentioned   | 
   105   above claim they are even faster than me and can deal with even more features of regular expressions  | 
   105   above claim they are even faster than me and can deal with even more features of regular expressions  | 
   106   (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought  | 
   106   (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought  | 
   107   about the problem much longer than a single afternoon. The task   | 
   107   about the problem much longer than a single afternoon. The task   | 
   108   in this project is to find out how good they actually are by implementing the results from their paper.   | 
   108   in this project is to find out how good they actually are by implementing the results from their paper.   | 
   109   Their approach is based on the concept of derivatives.  | 
   109   Their approach to regular expression matching is also based on the concept of derivatives.  | 
   110   I used them once myself in a <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Publications/rexp.pdf">paper</A>   | 
   110   I used derivatives very successfully once for something completely different in a  | 
   111   in order to prove the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>.  | 
   111   <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Publications/rexp.pdf">paper</A>   | 
         | 
   112   about the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>.  | 
   112   So I know they are worth their money. Still, it would be interesting to actually compare their results  | 
   113   So I know they are worth their money. Still, it would be interesting to actually compare their results  | 
   113   with my simple rainy-afternoon matcher and potentially “blow away” the regular expression matchers   | 
   114   with my simple rainy-afternoon matcher and potentially “blow away” the regular expression matchers   | 
   114   in Python and Ruby (and possibly in Scala too). The application would be to implement a fast lexer for  | 
   115   in Python and Ruby (and possibly in Scala too). The application would be to implement a fast lexer for  | 
   115   programming languages.   | 
   116   programming languages.   | 
   116   </p>  | 
   117   </p>  |