12 \section*{Handout 2 (Regular Expression Matching)} |
12 \section*{Handout 2 (Regular Expression Matching)} |
13 |
13 |
14 This lecture is about implementing a more efficient regular expression |
14 This lecture is about implementing a more efficient regular expression |
15 matcher (the plots on the right below)---more efficient than the |
15 matcher (the plots on the right below)---more efficient than the |
16 matchers from regular expression libraries in Ruby, Python and Java |
16 matchers from regular expression libraries in Ruby, Python and Java |
17 (the plots on the left). The first pair of plots show the running time |
17 (the plots on the left). The first pair of plots shows the running time |
18 for the regular expression $(a^*)^*\cdot b$ and strings composed of |
18 for the regular expression $(a^*)^*\cdot b$ and strings composed of |
19 $n$ \pcode{a}s (meaning this regular expression actually does not |
19 $n$ \pcode{a}s (meaning this regular expression actually does not |
20 match the strings). The second pair of plots show the running time for |
20 match the strings). The second pair of plots shows the running time for |
21 the regular expressions $a^?{}^{\{n\}}\cdot a^{\{n\}}$ and strings |
21 the regular expressions $a^?{}^{\{n\}}\cdot a^{\{n\}}$ and strings |
22 also composed of $n$ \pcode{a}s (this time the regular expressions |
22 also composed of $n$ \pcode{a}s (this time the regular expressions |
23 match the strings). To see the substantial differences in the left |
23 match the strings). To see the substantial differences in the left |
24 and right plots below, note the different scales of the $x$-axes. |
24 and right plots below, note the different scales of the $x$-axes. |
25 |
25 |
276 $(r_1 \cdot r_2)^*$ & $\equiv$ & $1 + r_1\cdot (r_2 \cdot r_1)^* \cdot r_2$\\ |
276 $(r_1 \cdot r_2)^*$ & $\equiv$ & $1 + r_1\cdot (r_2 \cdot r_1)^* \cdot r_2$\\ |
277 \end{tabular} |
277 \end{tabular} |
278 \end{center} |
278 \end{center} |
279 |
279 |
280 \noindent |
280 \noindent |
281 We will not use them in our algorithm, but feel free to make sure they |
281 We will not use them in our algorithm, but feel free to convince you |
282 hold. As an aside, there has been a lot of research about questions |
282 that they hold. As an aside, there has been a lot of research about |
283 like: Can one always decide when two regular expressions are |
283 questions like: Can one always decide when two regular expressions are |
284 equivalent or not? What does an algorithm look like to decide this |
284 equivalent or not? What does an algorithm look like to decide this |
285 efficiently? |
285 efficiently? |
286 |
286 |
287 \subsection*{The Matching Algorithm} |
287 \subsection*{The Matching Algorithm} |
288 |
288 |