ChengsongTanPhdThesis/Chapters/Chapter2.tex
changeset 506 69ad05398894
parent 505 5ce3bd8e5696
child 507 213220f54a6e
--- a/ChengsongTanPhdThesis/Chapters/Chapter2.tex	Sat May 07 13:38:44 2022 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Chapter2.tex	Sun May 08 13:26:31 2022 +0100
@@ -368,19 +368,92 @@
 We give a predicate for such "star-created" regular expressions:
 \begin{center}
 \begin{tabular}{lcr}
-         &    &       $\createdByStar{\RSEQ{ra}{\RSTAR{rb}}}$\\
+         &    &       $\createdByStar{(\RSEQ{ra}{\RSTAR{rb}}) }$\\
  $\createdByStar{r_1} \land \createdByStar{r_2} $ & $ \Longrightarrow$ & $\createdByStar{(r_1 + r_2)}$
  \end{tabular}
  \end{center}
  
+ These definitions allows us the flexibility to talk about 
+ regular expressions in their most convenient format,
+ for example, flattened out $\RALTS{[r_1, r_2, \ldots, r_n]} $
+ instead of binary-nested: $((r_1 + r_2) + (r_3 + r_4)) + \ldots$.
+ These definitions help express that certain classes of syntatically 
+ distinct regular expressions are actually the same under simplification.
+ This is not entirely true for annotated regular expressions: 
+ %TODO: bsimp bders \neq bderssimp
+ \begin{center}
+ $(1+ (c\cdot \ASEQ{bs}{c^*}{c} ))$
+ \end{center}
+ For bit-codes, the order in which simplification is applied
+ might cause a difference in the location they are placed.
+ If we want something like
+ \begin{center}
+ $\bderssimp{r}{s} \myequiv \bsimp{\bders{r}{s}}$
+ \end{center}
+ Some "canonicalization" procedure is required,
+ which either pushes all the common bitcodes to nodes
+  as senior as possible:
+  \begin{center}
+  $_{bs}(_{bs_1 @ bs'}r_1 + _{bs_1 @ bs''}r_2) \rightarrow _{bs @ bs_1}(_{bs'}r_1 + _{bs''}r_2) $
+  \end{center}
+ or does the reverse. However bitcodes are not of interest if we are talking about
+ the $\llbracket r \rrbracket$ size of a regex.
+ Therefore for the ease and simplicity of producing a
+ proof for a size bound, we are happy to restrict ourselves to 
+ unannotated regular expressions, and obtain such equalities as
+ \begin{lemma}
+ $\rsimp{(r_1 + r_2)} = \rsimp{\RALTS{\hflataux{r_1} @ \hflataux{r_2}}}$
+ \end{lemma}
+ 
+ \begin{proof}
+ By using the rewriting relation $\rightsquigarrow$
+ \end{proof}
+ %TODO: rsimp sflat
+And from this we obtain a proof that a star's derivative will be the same
+as if it had all its nested alternatives created during deriving being flattened out:
+ For example,
+ \begin{lemma}
+ $\createdByStar{r} \implies \rsimp{r} = \rsimp{\RALTS{\hflataux{r}}}$
+ \end{lemma}
+ \begin{proof}
+ By structural induction on $r$, where the induction rules are these of $\createdByStar{_}$.
+ \end{proof}
+% The simplification of a flattened out regular expression, provided it comes
+%from the derivative of a star, is the same as the one nested.
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
 One might wonder the actual bound rather than the loose bound we gave
 for the convenience of a easier proof.
 How much can the regex $r^* \backslash s$ grow? As hinted earlier,
  they can grow at a maximum speed
-of exponential $w.r.t$ the number of characters.
-But they will eventually level off when the string $s$ is long enough,
+  exponential $w.r.t$ the number of characters, 
+but will eventually level off when the string $s$ is long enough,
 as suggested by the finiteness bound proof.
-
+And unfortunately we have concrete examples
+where such regexes grew exponentially large before levelling off:
+$(a ^ * + (a + aa) ^ * + (a + aa + aaa) ^ * + \ldots + 
+(a+ aa + aaa+\ldots \underbrace{a \ldots a}_{\text{n a's}})^*$ will already have a maximum
+size that is  exponential on the number $n$.
 %TODO: give out a graph showing how the size of the regex grows and levels off at a size exponential to the original regex
 
 
+While the tight upper bound will definitely be a lot lower than 
+the one we gave for the finiteness proof, 
+it will still be at least expoential, under our current simplification rules.
+
+This suggests stronger simplifications that keep the tight bound polynomial.
+
+%----------------------------------------------------------------------------------------
+%	SECTION 5
+%----------------------------------------------------------------------------------------
+\section{a stronger version of simplification}
+%TODO: search for isabelle proofs of algorithms that check equivalence 
+
+