462 the values $v_n, \ldots, v_0$. We first call the function |
462 the values $v_n, \ldots, v_0$. We first call the function |
463 $\textit{mkeps}$, which builds the parse tree for how the empty string |
463 $\textit{mkeps}$, which builds the parse tree for how the empty string |
464 is matched the empty regular expression $r_n$. This function is defined |
464 is matched the empty regular expression $r_n$. This function is defined |
465 as |
465 as |
466 |
466 |
467 $mkeps $ $1 \,[] $ $= Empty$ |
467 \begin{center} |
468 ...... |
468 \begin{tabular}{lcl} |
469 |
469 $\mkeps(\ONE)$ & $\dn$ & $\Empty$ \\ |
|
470 $\mkeps(r_{1}+r_{2})$ & $\dn$ |
|
471 & \textit{if} $\nullable(r_{1})$\\ |
|
472 & & \textit{then} $\Left(\mkeps(r_{1}))$\\ |
|
473 & & \textit{else} $\Right(\mkeps(r_{2}))$\\ |
|
474 $\mkeps(r_1\cdot r_2)$ & $\dn$ & $\Seq\,(\mkeps\,r_1)\,(\mkeps\,r_2)$\\ |
|
475 $mkeps(r^*)$ & $\dn$ & $\Stars\,[]$ |
|
476 \end{tabular} |
|
477 \end{center} |
470 |
478 |
471 After this, we inject back the characters one by one in order to build |
479 After this, we inject back the characters one by one in order to build |
472 the parse tree $v_i$ for how the regex $r_i$ matches the string |
480 the parse tree $v_i$ for how the regex $r_i$ matches the string |
473 $s_i$ ($s_i$ means the string s with the first $i$ characters being |
481 $s_i$ ($s_i$ means the string s with the first $i$ characters being |
474 chopped off) from the previous parse tree. After $n$ transformations, we |
482 chopped off) from the previous parse tree. After $n$ transformations, we |
539 by the number of characters contained in the initial regular |
547 by the number of characters contained in the initial regular |
540 expression \cite{Antimirov95}. |
548 expression \cite{Antimirov95}. |
541 |
549 |
542 Antimirov defined the "partial derivatives" of regular expressions to be this: |
550 Antimirov defined the "partial derivatives" of regular expressions to be this: |
543 %TODO definition of partial derivatives |
551 %TODO definition of partial derivatives |
|
552 \[ pder c 0 = emtpy set\] |
|
553 \[pder c 1 = emptry set\] |
|
554 \[pder c c = if c is equal to d then set containing lambda only else empty set\] |
|
555 \[pder c r_1+r_2 = pder c r_1 \cup pder c r_2\] |
|
556 \[pder c r_1 \cdot r_2 = if r_1 nullable then set of r \cdot r_2' where r is in pder c r_1 \cup pder c r_2\] |
|
557 \[else set of r \cdot r_2' where r is in pder c r_1\] |
|
558 \[pder c r^* = set of r \cdot r^* where r \in pder c r\] |
|
559 |
544 |
560 |
545 it is essentially a set of regular expressions that come from the sub-structure of the original regular expression. |
561 it is essentially a set of regular expressions that come from the sub-structure of the original regular expression. |
546 Antimirov has proved a nice size bound of the size of partial derivatives. Roughly speaking the size will not exceed the fourth power of the number of nodes in that regular expression. Interestingly, we observed from experiment that after the simplification step, our regular expression has the same size or is smaller than the partial derivatives. This allows us to prove a tight bound on the size of regular expression during the running time of the algorithm if we can establish the connection between our simplification rules and partial derivatives. |
562 Antimirov has proved a nice size bound of the size of partial derivatives. Roughly speaking the size will not exceed the fourth power of the number of nodes in that regular expression. Interestingly, we observed from experiment that after the simplification step, our regular expression has the same size or is smaller than the partial derivatives. This allows us to prove a tight bound on the size of regular expression during the running time of the algorithm if we can establish the connection between our simplification rules and partial derivatives. |
547 |
563 |
548 %We believe, and have generated test |
564 %We believe, and have generated test |