# HG changeset patch # User Chengsong # Date 1688950306 -3600 # Node ID 71502e4d8691fed1fd5c2e6fe64a0003a7457359 # Parent eddc4eaba7c4a9b148c9d21b9e3e68f8211b8e6d overview of finiteness proof Gerog comment "not helpful", adding more intuitions of "closed forms" diff -r eddc4eaba7c4 -r 71502e4d8691 ChengsongTanPhdThesis/Chapters/Finite.tex --- a/ChengsongTanPhdThesis/Chapters/Finite.tex Mon Jul 10 01:33:45 2023 +0100 +++ b/ChengsongTanPhdThesis/Chapters/Finite.tex Mon Jul 10 01:51:46 2023 +0100 @@ -14,7 +14,7 @@ our algorithm $\blexersimp$'s derivatives are finitely bounded by a constant that only depends on $a$. -Formally we show that there exists an integer $N_a$ such that +Formally we show that there exists a constant integer $N_a$ such that \begin{center} $\llbracket \bderssimp{a}{s} \rrbracket \leq N_a$ \end{center} @@ -51,7 +51,8 @@ We observe that if $S_{r,n}$ continously grows with $n$ (for example growing exponentially fast), then this is equally bad as catastrophic backtracking. - Our finiteness bound seeks to find a constant upper bound $C$ of $\S_{r,n}$, + Our finiteness bound seeks to find a constant integer + upper bound $C$ (which in our case is $N_a$ where $a = r^\uparrow$) of $\S_{r,n}$, so that the complexity of the algorithm can be seen as linear ($O(C * n)$). Even if $C$ is still large in our current work, it is still a constant rather than ever-increasing number with respect to input length $n$. @@ -201,6 +202,31 @@ \subsection{Overview of the Proof} +\marginpar{trying to make it more intuitive +and provide more insights into proof} +The most important intuition is what we call the "closed forms" of +regular expression derivatives with respect to strings. +Assume we have a regular expression $r$, be it an alternative, +a sequence or a star, the idea is if we try to take several derivatives +of it on paper, we end up getting a list of subexpressions, +something like +%omitting certain +%nested structures of those expressions: +\[ + r\backslash s = r_1 + r_2 + r_3 + \ldots + r_n, +\] +if we omit the way these regular expressions need to be nested. +where each $r_i$ ($i \in \{1, \ldots, n\}$) is related to some fragments +of $r$ and $s$. +The second important observation is that the list %of regular expressions +$[r_1, \ldots, r_n]$ %is not +cannot grow indefinitely because they all come from $r$, and derivatives +of the same regular expression are finite up to some isomorphisms. +We prove that the simplifications of $\blexersimp$ %make use of +is powerful enough to counteract the effect of nested structure of alternatives +and eliminate duplicates +such that indeed the list in $a\backslash s$ does not grow unbounded. + A high-level overview of the main components of the finiteness proof is as follows: \begin{figure}[H]