52 are extremely useful for many text-processing tasks such as finding patterns in texts, |
52 are extremely useful for many text-processing tasks such as finding patterns in texts, |
53 lexing programs, syntax highlighting and so on. Given that regular expressions were |
53 lexing programs, syntax highlighting and so on. Given that regular expressions were |
54 introduced in 1951 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>, |
54 introduced in 1951 by <A HREF="http://en.wikipedia.org/wiki/Stephen_Cole_Kleene">Stephen Kleene</A>, |
55 you might think regular expressions have since been studied and implemented to death. But you would definitely be |
55 you might think regular expressions have since been studied and implemented to death. But you would definitely be |
56 mistaken: in fact they are still an active research area. For example |
56 mistaken: in fact they are still an active research area. For example |
57 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">this paper</A> |
57 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">this paper</A> |
58 about regular expression matching and partial derivatives was presented last summer at the international |
58 about regular expression matching and partial derivatives was presented last summer at the international |
59 FLOPS'14 conference. The task in this project is to implement their results.</p> |
59 FLOPS'14 conference. The task in this project is to implement their results.</p> |
60 |
60 |
61 <p>The background for this project is that some regular expressions are |
61 <p>The background for this project is that some regular expressions are |
62 “<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>” |
62 “<A HREF="http://en.wikipedia.org/wiki/ReDoS#Examples">evil</A>” |
99 expression matchers in Python and Ruby. |
99 expression matchers in Python and Ruby. |
100 </p> |
100 </p> |
101 |
101 |
102 <p> |
102 <p> |
103 Now the authors from the |
103 Now the authors from the |
104 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp12-part-deriv-sub-match.pdf">FLOPS'14-paper</A> mentioned |
104 <A HREF="http://www.home.hs-karlsruhe.de/~suma0002/publications/regex-parsing-derivatives.pdf">FLOPS'14-paper</A> mentioned |
105 above claim they are even faster than me and can deal with even more features of regular expressions |
105 above claim they are even faster than me and can deal with even more features of regular expressions |
106 (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought |
106 (for example subexpression matching, which my rainy-afternoon matcher cannot). I am sure they thought |
107 about the problem much longer than a single afternoon. The task |
107 about the problem much longer than a single afternoon. The task |
108 in this project is to find out how good they actually are by implementing the results from their paper. |
108 in this project is to find out how good they actually are by implementing the results from their paper. |
109 Their approach is based on the concept of derivatives introduced in 1994 by |
109 Their approach is based on the concept of derivatives introduced in 1994 by |