--- a/ChengsongTanPhdThesis/Chapters/Chapter2.tex Sat May 07 13:38:44 2022 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Chapter2.tex Sun May 08 13:26:31 2022 +0100
@@ -368,19 +368,92 @@
We give a predicate for such "star-created" regular expressions:
\begin{center}
\begin{tabular}{lcr}
- & & $\createdByStar{\RSEQ{ra}{\RSTAR{rb}}}$\\
+ & & $\createdByStar{(\RSEQ{ra}{\RSTAR{rb}}) }$\\
$\createdByStar{r_1} \land \createdByStar{r_2} $ & $ \Longrightarrow$ & $\createdByStar{(r_1 + r_2)}$
\end{tabular}
\end{center}
+ These definitions allows us the flexibility to talk about
+ regular expressions in their most convenient format,
+ for example, flattened out $\RALTS{[r_1, r_2, \ldots, r_n]} $
+ instead of binary-nested: $((r_1 + r_2) + (r_3 + r_4)) + \ldots$.
+ These definitions help express that certain classes of syntatically
+ distinct regular expressions are actually the same under simplification.
+ This is not entirely true for annotated regular expressions:
+ %TODO: bsimp bders \neq bderssimp
+ \begin{center}
+ $(1+ (c\cdot \ASEQ{bs}{c^*}{c} ))$
+ \end{center}
+ For bit-codes, the order in which simplification is applied
+ might cause a difference in the location they are placed.
+ If we want something like
+ \begin{center}
+ $\bderssimp{r}{s} \myequiv \bsimp{\bders{r}{s}}$
+ \end{center}
+ Some "canonicalization" procedure is required,
+ which either pushes all the common bitcodes to nodes
+ as senior as possible:
+ \begin{center}
+ $_{bs}(_{bs_1 @ bs'}r_1 + _{bs_1 @ bs''}r_2) \rightarrow _{bs @ bs_1}(_{bs'}r_1 + _{bs''}r_2) $
+ \end{center}
+ or does the reverse. However bitcodes are not of interest if we are talking about
+ the $\llbracket r \rrbracket$ size of a regex.
+ Therefore for the ease and simplicity of producing a
+ proof for a size bound, we are happy to restrict ourselves to
+ unannotated regular expressions, and obtain such equalities as
+ \begin{lemma}
+ $\rsimp{(r_1 + r_2)} = \rsimp{\RALTS{\hflataux{r_1} @ \hflataux{r_2}}}$
+ \end{lemma}
+
+ \begin{proof}
+ By using the rewriting relation $\rightsquigarrow$
+ \end{proof}
+ %TODO: rsimp sflat
+And from this we obtain a proof that a star's derivative will be the same
+as if it had all its nested alternatives created during deriving being flattened out:
+ For example,
+ \begin{lemma}
+ $\createdByStar{r} \implies \rsimp{r} = \rsimp{\RALTS{\hflataux{r}}}$
+ \end{lemma}
+ \begin{proof}
+ By structural induction on $r$, where the induction rules are these of $\createdByStar{_}$.
+ \end{proof}
+% The simplification of a flattened out regular expression, provided it comes
+%from the derivative of a star, is the same as the one nested.
+
+
+
+
+
+
+
+
+
One might wonder the actual bound rather than the loose bound we gave
for the convenience of a easier proof.
How much can the regex $r^* \backslash s$ grow? As hinted earlier,
they can grow at a maximum speed
-of exponential $w.r.t$ the number of characters.
-But they will eventually level off when the string $s$ is long enough,
+ exponential $w.r.t$ the number of characters,
+but will eventually level off when the string $s$ is long enough,
as suggested by the finiteness bound proof.
-
+And unfortunately we have concrete examples
+where such regexes grew exponentially large before levelling off:
+$(a ^ * + (a + aa) ^ * + (a + aa + aaa) ^ * + \ldots +
+(a+ aa + aaa+\ldots \underbrace{a \ldots a}_{\text{n a's}})^*$ will already have a maximum
+size that is exponential on the number $n$.
%TODO: give out a graph showing how the size of the regex grows and levels off at a size exponential to the original regex
+While the tight upper bound will definitely be a lot lower than
+the one we gave for the finiteness proof,
+it will still be at least expoential, under our current simplification rules.
+
+This suggests stronger simplifications that keep the tight bound polynomial.
+
+%----------------------------------------------------------------------------------------
+% SECTION 5
+%----------------------------------------------------------------------------------------
+\section{a stronger version of simplification}
+%TODO: search for isabelle proofs of algorithms that check equivalence
+
+