diff -r 5ce3bd8e5696 -r 69ad05398894 ChengsongTanPhdThesis/Chapters/Chapter2.tex --- a/ChengsongTanPhdThesis/Chapters/Chapter2.tex Sat May 07 13:38:44 2022 +0100 +++ b/ChengsongTanPhdThesis/Chapters/Chapter2.tex Sun May 08 13:26:31 2022 +0100 @@ -368,19 +368,92 @@ We give a predicate for such "star-created" regular expressions: \begin{center} \begin{tabular}{lcr} - & & $\createdByStar{\RSEQ{ra}{\RSTAR{rb}}}$\\ + & & $\createdByStar{(\RSEQ{ra}{\RSTAR{rb}}) }$\\ $\createdByStar{r_1} \land \createdByStar{r_2} $ & $ \Longrightarrow$ & $\createdByStar{(r_1 + r_2)}$ \end{tabular} \end{center} + These definitions allows us the flexibility to talk about + regular expressions in their most convenient format, + for example, flattened out $\RALTS{[r_1, r_2, \ldots, r_n]} $ + instead of binary-nested: $((r_1 + r_2) + (r_3 + r_4)) + \ldots$. + These definitions help express that certain classes of syntatically + distinct regular expressions are actually the same under simplification. + This is not entirely true for annotated regular expressions: + %TODO: bsimp bders \neq bderssimp + \begin{center} + $(1+ (c\cdot \ASEQ{bs}{c^*}{c} ))$ + \end{center} + For bit-codes, the order in which simplification is applied + might cause a difference in the location they are placed. + If we want something like + \begin{center} + $\bderssimp{r}{s} \myequiv \bsimp{\bders{r}{s}}$ + \end{center} + Some "canonicalization" procedure is required, + which either pushes all the common bitcodes to nodes + as senior as possible: + \begin{center} + $_{bs}(_{bs_1 @ bs'}r_1 + _{bs_1 @ bs''}r_2) \rightarrow _{bs @ bs_1}(_{bs'}r_1 + _{bs''}r_2) $ + \end{center} + or does the reverse. However bitcodes are not of interest if we are talking about + the $\llbracket r \rrbracket$ size of a regex. + Therefore for the ease and simplicity of producing a + proof for a size bound, we are happy to restrict ourselves to + unannotated regular expressions, and obtain such equalities as + \begin{lemma} + $\rsimp{(r_1 + r_2)} = \rsimp{\RALTS{\hflataux{r_1} @ \hflataux{r_2}}}$ + \end{lemma} + + \begin{proof} + By using the rewriting relation $\rightsquigarrow$ + \end{proof} + %TODO: rsimp sflat +And from this we obtain a proof that a star's derivative will be the same +as if it had all its nested alternatives created during deriving being flattened out: + For example, + \begin{lemma} + $\createdByStar{r} \implies \rsimp{r} = \rsimp{\RALTS{\hflataux{r}}}$ + \end{lemma} + \begin{proof} + By structural induction on $r$, where the induction rules are these of $\createdByStar{_}$. + \end{proof} +% The simplification of a flattened out regular expression, provided it comes +%from the derivative of a star, is the same as the one nested. + + + + + + + + + One might wonder the actual bound rather than the loose bound we gave for the convenience of a easier proof. How much can the regex $r^* \backslash s$ grow? As hinted earlier, they can grow at a maximum speed -of exponential $w.r.t$ the number of characters. -But they will eventually level off when the string $s$ is long enough, + exponential $w.r.t$ the number of characters, +but will eventually level off when the string $s$ is long enough, as suggested by the finiteness bound proof. - +And unfortunately we have concrete examples +where such regexes grew exponentially large before levelling off: +$(a ^ * + (a + aa) ^ * + (a + aa + aaa) ^ * + \ldots + +(a+ aa + aaa+\ldots \underbrace{a \ldots a}_{\text{n a's}})^*$ will already have a maximum +size that is exponential on the number $n$. %TODO: give out a graph showing how the size of the regex grows and levels off at a size exponential to the original regex +While the tight upper bound will definitely be a lot lower than +the one we gave for the finiteness proof, +it will still be at least expoential, under our current simplification rules. + +This suggests stronger simplifications that keep the tight bound polynomial. + +%---------------------------------------------------------------------------------------- +% SECTION 5 +%---------------------------------------------------------------------------------------- +\section{a stronger version of simplification} +%TODO: search for isabelle proofs of algorithms that check equivalence + +