83 it can match up to 11,000 <code>a</code>s in less than 5 seconds without raising any exception |
83 it can match up to 11,000 <code>a</code>s in less than 5 seconds without raising any exception |
84 (remember Python and Ruby both need nearly 30 seconds to process 28(!) <code>a</code>s, and Scala's |
84 (remember Python and Ruby both need nearly 30 seconds to process 28(!) <code>a</code>s, and Scala's |
85 official matcher maxes out at 4,600 <code>a</code>s). My matcher is approximately |
85 official matcher maxes out at 4,600 <code>a</code>s). My matcher is approximately |
86 85 lines of code and based on the concept of |
86 85 lines of code and based on the concept of |
87 <A HREF="http://lambda-the-ultimate.org/node/2293">derivatives of regular expressions</A>. |
87 <A HREF="http://lambda-the-ultimate.org/node/2293">derivatives of regular expressions</A>. |
88 Derivatives were introduced in 1964 by <A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)"> |
88 These derivatives were introduced in 1964 by <A HREF="http://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)"> |
89 Janusz Brzozowski</A>, but according to this |
89 Janusz Brzozowski</A>, but according to this |
90 <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A> had been lost in the "sands of time". |
90 <A HREF="http://www.cl.cam.ac.uk/~so294/documents/jfp09.pdf">paper</A> had been lost in the "sands of time". |
91 The advantage of derivatives is that they completely side-step the usual |
91 The advantage of derivatives is that they completely side-step the usual |
92 <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">translations</A> of regular expressions |
92 <A HREF="http://hackingoff.com/compilers/regular-expression-to-nfa-dfa">translations</A> of regular expressions |
93 into NFAs or DFAs, which can introduce the exponential behaviour exhibited by the regular |
93 into NFAs or DFAs, which can introduce the exponential behaviour exhibited by the regular |
101 (for example subexpression matching, which my rainy-afternoon matcher lacks). I am sure they thought |
101 (for example subexpression matching, which my rainy-afternoon matcher lacks). I am sure they thought |
102 about the problem much longer than a single afternoon. The task |
102 about the problem much longer than a single afternoon. The task |
103 in this project is to find out how good they actually are by implementing the results from their paper. |
103 in this project is to find out how good they actually are by implementing the results from their paper. |
104 Their approach is based on the concept of partial derivatives introduced in 1994 by |
104 Their approach is based on the concept of partial derivatives introduced in 1994 by |
105 <A HREF="http://reference.kfupm.edu.sa/content/p/a/partial_derivatives_of_regular_expressio_1319383.pdf">Valentin Antimirov</A>. |
105 <A HREF="http://reference.kfupm.edu.sa/content/p/a/partial_derivatives_of_regular_expressio_1319383.pdf">Valentin Antimirov</A>. |
106 I used them <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Publications/rexp.pdf">once</A> |
106 I used them once myself in a <A HREF="http://www.inf.kcl.ac.uk/staff/urbanc/Publications/rexp.pdf">paper</A> |
107 in order to prove the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A> |
107 in order to prove the <A HREF="http://en.wikipedia.org/wiki/Myhill–Nerode_theorem">Myhill-Nerode theorem</A>. |
108 by using only regular expressions. |
108 So I know they are worth their money. Still, it would be interesting to actually compare their results |
|
109 with my simple rainy-afternoon matcher and "blow away" the matchers in Python and Ruby (and possibly |
|
110 Scala too). |
109 </p> |
111 </p> |
110 |
112 |
111 <p> |
113 <p> |
112 <B>Literature:</B> |
114 <B>Literature:</B> |
113 The place to start with this project is obviously this |
115 The place to start with this project is obviously this |