overview of finiteness proof Gerog comment "not helpful", adding more intuitions of "closed forms"
authorChengsong
Mon, 10 Jul 2023 01:51:46 +0100
changeset 661 71502e4d8691
parent 660 eddc4eaba7c4
child 662 fa7552396c71
overview of finiteness proof Gerog comment "not helpful", adding more intuitions of "closed forms"
ChengsongTanPhdThesis/Chapters/Finite.tex
--- a/ChengsongTanPhdThesis/Chapters/Finite.tex	Mon Jul 10 01:33:45 2023 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Finite.tex	Mon Jul 10 01:51:46 2023 +0100
@@ -14,7 +14,7 @@
 our algorithm $\blexersimp$'s derivatives
 are finitely bounded
 by a constant that only depends on $a$.
-Formally we show that there exists an integer $N_a$ such that
+Formally we show that there exists a constant integer $N_a$ such that
 \begin{center}
 	$\llbracket \bderssimp{a}{s} \rrbracket \leq N_a$
 \end{center}
@@ -51,7 +51,8 @@
 		We observe that if $S_{r,n}$ continously grows with $n$ (for example
 		growing exponentially fast), then this
 		is equally bad as catastrophic backtracking.
-		Our finiteness bound seeks to find a constant upper bound $C$ of $\S_{r,n}$,
+		Our finiteness bound seeks to find a constant integer
+		upper bound $C$ (which in our case is $N_a$ where $a = r^\uparrow$) of $\S_{r,n}$,
 		so that the complexity of the algorithm can be seen as linear ($O(C * n)$).
 		Even if $C$ is still large in our current work, it is still a constant
 		rather than ever-increasing number with respect to input length $n$.
@@ -201,6 +202,31 @@
 
 
 \subsection{Overview of the Proof}
+\marginpar{trying to make it more intuitive
+and provide more insights into proof}
+The most important intuition is what we call the "closed forms" of
+regular expression derivatives with respect to strings.
+Assume we have a regular expression $r$, be it an alternative,
+a sequence or a star, the idea is if we try to take several derivatives
+of it on paper, we end up getting a list of subexpressions,
+something like
+%omitting certain
+%nested structures of those expressions:
+\[
+	r\backslash s = r_1 + r_2 + r_3 + \ldots + r_n,
+\]
+if we omit the way these regular expressions need to be nested.
+where each $r_i$ ($i \in \{1, \ldots, n\}$) is related to some fragments
+of $r$ and $s$.
+The second important observation is that the list %of regular expressions
+$[r_1, \ldots, r_n]$ %is not
+cannot grow indefinitely because they all come from $r$, and derivatives
+of the same regular expression are finite up to some isomorphisms.
+We prove that the simplifications of $\blexersimp$ %make use of 
+is powerful enough to counteract the effect of nested structure of alternatives
+and eliminate duplicates
+such that indeed the list in $a\backslash s$ does not grow unbounded.
+
 A high-level overview of the main components of the finiteness proof
 is as follows:
 \begin{figure}[H]