diff -r 96e93df60954 -r 823d9b19d21c ChengsongTanPhdThesis/Chapters/ChapterFinite.tex --- a/ChengsongTanPhdThesis/Chapters/ChapterFinite.tex Mon May 30 17:24:52 2022 +0100 +++ b/ChengsongTanPhdThesis/Chapters/ChapterFinite.tex Mon May 30 20:36:15 2022 +0100 @@ -7,67 +7,29 @@ %of our bitcoded algorithm, that is a finite bound on the size of any %regex's derivatives. - -%----------------------------------- -% SECTION ? -%----------------------------------- -\section{preparatory properties for getting a finiteness bound} -Before we get to the proof that says the intermediate result of our lexer will -remain finitely bounded, which is an important efficiency/liveness guarantee, -we shall first develop a few preparatory properties and definitions to -make the process of proving that a breeze. - -We define rewriting relations for $\rrexp$s, which allows us to do the -same trick as we did for the correctness proof, -but this time we will have stronger equalities established. -\subsection{"hrewrite" relation} -List of 1-step rewrite rules for regular expressions simplification without bitcodes: -\begin{center} -\begin{tabular}{lcl} -$r_1 \cdot \ZERO$ & $\rightarrow$ & $\ZERO$ -\end{tabular} -\end{center} - -And we define an "grewrite" relation that works on lists: -\begin{center} -\begin{tabular}{lcl} -$ \ZERO :: rs$ & $\rightarrow_g$ & $rs$ -\end{tabular} -\end{center} - - +In this chapter we give a guarantee in terms of time complexity: +given a regular expression $r$, for any string $s$ +our algorithm's internal data structure is finitely bounded. +To obtain such a proof, we need to +\begin{itemize} +\item +Define an new datatype for regular expressions that makes it easy +to reason about the size of an annotated regular expression. +\item +A set of equalities for this new datatype that enables one to +rewrite $\bderssimp{r_1 \cdot r_2}{s}$ and $\bderssimp{r^*}{s}$ etc. +by their children regexes $r_1$, $r_2$, and $r$. +\item +Using those equalities to actually get those rewriting equations, which we call +"closed forms". +\item +Bound the closed forms, thereby bounding the original +$\blexersimp$'s internal data structures. +\end{itemize} -With these relations we prove -\begin{lemma} -$rs \rightarrow rs' \implies \RALTS{rs} \rightarrow \RALTS{rs'}$ -\end{lemma} -which enables us to prove "closed forms" equalities such as -\begin{lemma} -$\sflat{(r_1 \cdot r_2) \backslash s} = \RALTS{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\vsuf{s}{r_1}))}$ -\end{lemma} +\section{the $\mathbf{r}$-rexp datatype and the size functions} -But the most involved part of the above lemma is proving the following: -\begin{lemma} -$\bsimp{\RALTS{rs @ \RALTS{rs_1} @ rs'}} = \bsimp{\RALTS{rs @rs_1 @ rs'}} $ -\end{lemma} -which is needed in proving things like -\begin{lemma} -$r \rightarrow_f r' \implies \rsimp{r} \rightarrow \rsimp{r'}$ -\end{lemma} - -Fortunately this is provable by induction on the list $rs$ - - - -%----------------------------------- -% SECTION 2 -%----------------------------------- - -\section{Finiteness Property} -In this section let us describe our argument for why the size of the simplified -derivatives with the aggressive simplification function is finitely bounded. - Suppose -we have a size function for bitcoded regular expressions, written +We have a size function for bitcoded regular expressions, written $\llbracket r\rrbracket$, which counts the number of nodes if we regard $r$ as a tree \begin{center} @@ -170,6 +132,65 @@ regular expression. + +%----------------------------------- +% SECTION ? +%----------------------------------- +\section{preparatory properties for getting a finiteness bound} +Before we get to the proof that says the intermediate result of our lexer will +remain finitely bounded, which is an important efficiency/liveness guarantee, +we shall first develop a few preparatory properties and definitions to +make the process of proving that a breeze. + +We define rewriting relations for $\rrexp$s, which allows us to do the +same trick as we did for the correctness proof, +but this time we will have stronger equalities established. +\subsection{"hrewrite" relation} +List of 1-step rewrite rules for regular expressions simplification without bitcodes: +\begin{figure} +\caption{the "h-rewrite" rules} +\[ +r_1 \cdot \ZERO \rightarrow_h \ZERO \] + +\[ +\ZERO \cdot r_2 \rightarrow \ZERO +\] +\end{figure} +And we define an "grewrite" relation that works on lists: +\begin{center} +\begin{tabular}{lcl} +$ \ZERO :: rs$ & $\rightarrow_g$ & $rs$ +\end{tabular} +\end{center} + + + +With these relations we prove +\begin{lemma} +$rs \rightarrow rs' \implies \RALTS{rs} \rightarrow \RALTS{rs'}$ +\end{lemma} +which enables us to prove "closed forms" equalities such as +\begin{lemma} +$\sflat{(r_1 \cdot r_2) \backslash s} = \RALTS{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\suffix \; s \; r_1 ))}$ +\end{lemma} + +But the most involved part of the above lemma is proving the following: +\begin{lemma} +$\bsimp{\RALTS{rs @ \RALTS{rs_1} @ rs'}} = \bsimp{\RALTS{rs @rs_1 @ rs'}} $ +\end{lemma} +which is needed in proving things like +\begin{lemma} +$r \rightarrow_f r' \implies \rsimp{r} \rightarrow \rsimp{r'}$ +\end{lemma} + +Fortunately this is provable by induction on the list $rs$ + + + +%----------------------------------- +% SECTION 2 +%----------------------------------- + \begin{theorem} For any regex $r$, $\exists N_r. \forall s. \; \llbracket{\bderssimp{r}{s}}\rrbracket \leq N_r$ \end{theorem}