394 \end{tabular} |
394 \end{tabular} |
395 \end{center} |
395 \end{center} |
396 |
396 |
397 \begin{center} |
397 \begin{center} |
398 \begin{tabular}{@{}lcl@{}} |
398 \begin{tabular}{@{}lcl@{}} |
399 $r \backslash_{rsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{rsimp}\, c) \backslash_{rsimps}\, s$ \\ |
399 $r \backslash_{rsimps} \; \; c\!::\!s $ & $\dn$ & $(r \backslash_{rsimp}\, c) \backslash_{rsimps}\, s$ \\ |
400 $r \backslash_{rsimps} [\,] $ & $\dn$ & $r$ |
400 $r \backslash_{rsimps} [\,] $ & $\dn$ & $r$ |
401 \end{tabular} |
401 \end{tabular} |
402 \end{center} |
402 \end{center} |
403 \noindent |
403 \noindent |
404 With $\rrexp$ the size caclulation of annotated regular expressions' |
404 We do not define an r-regular expression version of $\blexersimp$, |
405 simplification and derivatives can be done by the size of their unlifted |
405 as our proof does not involve its use. |
406 counterpart with the unlifted version of simplification and derivatives applied. |
406 Everything about the size of annotated regular expressions |
|
407 can be calculated via the size of r-regular expressions: |
407 \begin{lemma}\label{sizeRelations} |
408 \begin{lemma}\label{sizeRelations} |
408 The following equalities hold: |
409 The following equalities hold: |
409 \begin{itemize} |
410 \begin{itemize} |
410 \item |
411 \item |
411 $\asize{\bsimp{a}} = \rsize{\rsimp{\rerase{a}}}$ |
412 $\asize{\bsimps \; a} = \rsize{\rsimp{ \rerase{a}}}$ |
412 \item |
413 \item |
413 $\asize{\bderssimp{a}{s}} = \rsize{\rderssimp{\rerase{a}}{s}}$ |
414 $\asize{\bderssimp{a}{s}} = \rsize{\rderssimp{\rerase{a}}{s}}$ |
414 \end{itemize} |
415 \end{itemize} |
415 \end{lemma} |
416 \end{lemma} |
|
417 \begin{proof} |
|
418 The first part is by induction on the inductive cases |
|
419 of $\textit{bsimp}$. |
|
420 The second part is by induction on the string $s$, |
|
421 where the inductive step follows from part one. |
|
422 \end{proof} |
416 \noindent |
423 \noindent |
417 With lemma \ref{sizeRelations}, |
424 With lemma \ref{sizeRelations}, |
418 a bound on $\rsize{\rderssimp{\rerase{a}}{s}}$ |
425 we will be able to focus on |
419 \[ |
426 estimating only |
420 \llbracket \rderssimp{a}{s} \rrbracket_r \leq N_r |
427 $\rsize{\rderssimp{\rerase{a}}{s}}$ |
421 \] |
428 in later parts because |
422 \noindent |
429 \begin{center} |
423 would apply to $\asize{\bderssimp{a}{s}}$ as well. |
430 $\rsize{\rderssimp{\rerase{a}}{s}} \leq N_r \quad$ |
424 \[ |
431 implies |
425 \llbracket a \backslash_{bsimps} s \rrbracket \leq N_r. |
432 $\quad \llbracket a \backslash_{bsimps} s \rrbracket \leq N_r$. |
426 \] |
433 \end{center} |
427 In the following content, we will focus on $\rrexp$'s size bound. |
434 Unless stated otherwise in the rest of this |
428 Whatever bounds we proved for $ \rsize{\rderssimp{\rerase{r}}{s}}$ |
435 chapter all regular expressions without |
429 will automatically apply to $\asize{\bderssimp{r}{s}}$.\\ |
|
430 Unless stated otherwise in this chapter all regular expressions without |
|
431 bitcodes are seen as $\rrexp$s. |
436 bitcodes are seen as $\rrexp$s. |
432 We also use $r_1 + r_2$ and $\RALTS{[r_1, r_2]}$ interchageably |
437 For the binary alternative r-regular expression $\RALTS{[r_1, r_2]}$, |
433 as the former suits people's intuitive way of stating a binary alternative |
438 we use the notation $r_1 + r_2$ |
434 regular expression. |
439 for brevity. |
435 |
|
436 |
440 |
437 |
441 |
438 %----------------------------------- |
442 %----------------------------------- |
439 % SUB SECTION ROADMAP RREXP BOUND |
443 % SUB SECTION ROADMAP RREXP BOUND |
440 %----------------------------------- |
444 %----------------------------------- |
471 % \end{center} |
475 % \end{center} |
472 %\end{quote} |
476 %\end{quote} |
473 %\noindent |
477 %\noindent |
474 %We explain in detail how we reached those claims. |
478 %We explain in detail how we reached those claims. |
475 \subsection{Basic Properties needed for Closed Forms} |
479 \subsection{Basic Properties needed for Closed Forms} |
|
480 If we attempt to prove |
|
481 \begin{center} |
|
482 $\forall r. \; \exists N_r.\;\; s.t. \llbracket r\backslash_{bsimps} s \rrbracket_r \leq N_r$ |
|
483 \end{center} |
|
484 using a naive induction on the structure of $r$, |
|
485 then we are stuck at the inductive cases such as |
|
486 $r_1\cdot r_2$. |
|
487 The inductive hypotheses are: |
|
488 \begin{center} |
|
489 1: $\text{for } r_1, \text{there exists } N_{r_1}.\;\; s.t. |
|
490 \;\;\forall s. \llbracket r_1 \backslash_{bsimps} s \rrbracket_r \leq N_{r_1}. $\\ |
|
491 2: $\text{for } r_2, \text{there exists } N_{r_2}.\;\; s.t. |
|
492 \;\; \forall s. \llbracket r_2 \backslash_{bsimps} s \rrbracket_r \leq N_{r_2}. $ |
|
493 \end{center} |
|
494 The inductive step to prove would be |
|
495 \begin{center} |
|
496 $\text{there exists } N_{r_1\cdot r_2}. \;\; s.t. \forall s. |
|
497 \llbracket (r_1 \cdot r_2) \backslash_{bsimps} s \rrbracket_r \leq N_{r_1\cdot r_2}.$ |
|
498 \end{center} |
|
499 The problem is that it is not clear what |
|
500 $(r_1\cdot r_2) \backslash_{bsimps} s$ looks like, |
|
501 and therefore $N_{r_1}$ and $N_{r_2}$ in the |
|
502 inductive hypotheses cannot be directly used. |
|
503 |
|
504 |
476 The next steps involves getting a closed form for |
505 The next steps involves getting a closed form for |
477 $\rderssimp{r}{s}$ and then obtaining |
506 $\rderssimp{r}{s}$ and then obtaining |
478 an estimate for the closed form. |
507 an estimate for the closed form. |
479 \subsubsection{$\textit{rdistinct}$'s Deduplicates Successfully} |
508 \subsubsection{$\textit{rdistinct}$'s Deduplicates Successfully} |
480 The $\textit{rdistinct}$ function, as its name suggests, will |
509 The $\textit{rdistinct}$ function, as its name suggests, will |