lexing: comparison ChengsongTanPhdThesis/Chapters/ChapterFinite.tex

equal deleted inserted replaced

-:96e93df60954
+:823d9b19d21c
 \label{ChapterFinite}
 %  In Chapter 4 \ref{Chapter4} we give the second guarantee
 %of our bitcoded algorithm, that is a finite bound on the size of any
 %regex's derivatives.
+In this chapter we give a guarantee in terms of time complexity:
-%-----------------------------------
+given a regular expression $r$, for any string $s$
-%	SECTION ?
+our algorithm's internal data structure is finitely bounded.
-%-----------------------------------
+To obtain such a proof, we need to
-\section{preparatory properties for getting a finiteness bound}
+\begin{itemize}
-Before we get to the proof that says the intermediate result of our lexer will
+\item
-remain finitely bounded, which is an important efficiency/liveness guarantee,
+Define an new datatype for regular expressions that makes it easy
-we shall first develop a few preparatory properties and definitions to
+to reason about the size of an annotated regular expression.
-make the process of proving that a breeze.
+\item
+A set of equalities for this new datatype that enables one to
-We define rewriting relations for $\rrexp$s, which allows us to do the
+rewrite $\bderssimp{r_1 \cdot r_2}{s}$ and $\bderssimp{r^*}{s}$ etc.
-same trick as we did for the correctness proof,
+by their children regexes $r_1$, $r_2$, and $r$.
-but this time we will have stronger equalities established.
+\item
-\subsection{"hrewrite" relation}
+Using those equalities to actually get those rewriting equations, which we call
-List of 1-step rewrite rules for regular expressions simplification without bitcodes:
+"closed forms".
-\begin{center}
+\item
-\begin{tabular}{lcl}
+Bound the closed forms, thereby bounding the original
-$r_1 \cdot \ZERO$ & $\rightarrow$ & $\ZERO$
+$\blexersimp$'s internal data structures.
-\end{tabular}
+\end{itemize}
-\end{center}
+\section{the $\mathbf{r}$-rexp datatype and the size functions}
-And we define an "grewrite" relation that works on lists:
-\begin{center}
+We have a size function for bitcoded regular expressions, written
-\begin{tabular}{lcl}
-$ \ZERO :: rs$ & $\rightarrow_g$ & $rs$
-\end{tabular}
-\end{center}
-With these relations we prove
-\begin{lemma}
-$rs \rightarrow rs'  \implies \RALTS{rs} \rightarrow \RALTS{rs'}$
-\end{lemma}
-which enables us to prove "closed forms" equalities such as
-\begin{lemma}
-$\sflat{(r_1 \cdot r_2) \backslash s} = \RALTS{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\vsuf{s}{r_1}))}$
-\end{lemma}
-But the most involved part of the above lemma is proving the following:
-\begin{lemma}
-$\bsimp{\RALTS{rs @ \RALTS{rs_1} @ rs'}} = \bsimp{\RALTS{rs @rs_1 @ rs'}} $
-\end{lemma}
-which is needed in proving things like
-\begin{lemma}
-$r \rightarrow_f r'  \implies \rsimp{r} \rightarrow \rsimp{r'}$
-\end{lemma}
-Fortunately this is provable by induction on the list $rs$
-%-----------------------------------
-%	SECTION 2
-%-----------------------------------
-\section{Finiteness Property}
-In this section let us describe our argument for why the size of the simplified
-derivatives with the aggressive simplification function is finitely bounded.
-Suppose
-we have a size function for bitcoded regular expressions, written
 $\llbracket r\rrbracket$, which counts the number of nodes if we regard $r$ as a tree
 \begin{center}
 \begin{tabular}{ccc}
 $\llbracket \ACHAR{bs}{c} \rrbracket $ & $\dn$ & $1$\\
 bitcodes are seen as $\rrexp$s.
 We also use $r_1 + r_2$ and $\RALTS{[r_1, r_2]}$ interchageably
 as the former suits people's intuitive way of stating a binary alternative
 regular expression.
+%-----------------------------------
+%	SECTION ?
+%-----------------------------------
+\section{preparatory properties for getting a finiteness bound}
+Before we get to the proof that says the intermediate result of our lexer will
+remain finitely bounded, which is an important efficiency/liveness guarantee,
+we shall first develop a few preparatory properties and definitions to
+make the process of proving that a breeze.
+We define rewriting relations for $\rrexp$s, which allows us to do the
+same trick as we did for the correctness proof,
+but this time we will have stronger equalities established.
+\subsection{"hrewrite" relation}
+List of 1-step rewrite rules for regular expressions simplification without bitcodes:
+\begin{figure}
+\caption{the "h-rewrite" rules}
+\[
+r_1 \cdot \ZERO \rightarrow_h \ZERO \]
+\[
+\ZERO \cdot r_2 \rightarrow \ZERO
+\]
+\end{figure}
+And we define an "grewrite" relation that works on lists:
+\begin{center}
+\begin{tabular}{lcl}
+$ \ZERO :: rs$ & $\rightarrow_g$ & $rs$
+\end{tabular}
+\end{center}
+With these relations we prove
+\begin{lemma}
+$rs \rightarrow rs'  \implies \RALTS{rs} \rightarrow \RALTS{rs'}$
+\end{lemma}
+which enables us to prove "closed forms" equalities such as
+\begin{lemma}
+$\sflat{(r_1 \cdot r_2) \backslash s} = \RALTS{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\suffix \; s \; r_1 ))}$
+\end{lemma}
+But the most involved part of the above lemma is proving the following:
+\begin{lemma}
+$\bsimp{\RALTS{rs @ \RALTS{rs_1} @ rs'}} = \bsimp{\RALTS{rs @rs_1 @ rs'}} $
+\end{lemma}
+which is needed in proving things like
+\begin{lemma}
+$r \rightarrow_f r'  \implies \rsimp{r} \rightarrow \rsimp{r'}$
+\end{lemma}
+Fortunately this is provable by induction on the list $rs$
+%-----------------------------------
+%	SECTION 2
+%-----------------------------------
 \begin{theorem}
 For any regex $r$, $\exists N_r. \forall s. \; \llbracket{\bderssimp{r}{s}}\rrbracket \leq N_r$
 \end{theorem}

changeset 530	823d9b19d21c
parent 528	28751de4b4ba