ChengsongTanPhdThesis/Chapters/Bitcoded2.tex
changeset 624 8ffa28fce271
parent 601 ce4e5151a836
child 639 80cc6dc4c98b
--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex	Sat Nov 12 00:37:23 2022 +0000
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex	Sat Nov 12 21:34:40 2022 +0000
@@ -12,32 +12,37 @@
 
 
 In this chapter we introduce simplifications
-on annotated regular expressions that can be applied to 
+for annotated regular expressions that can be applied to 
 each intermediate derivative result. This allows
 us to make $\blexer$ much more efficient.
-Sulzmann and Lu already had some bit-coded simplifications,
-but their simplification functions  were inefficient.
-We contrast our simplification function 
-with Sulzmann and Lu's, indicating the simplicity of our algorithm.
-This is another case for the usefulness 
-and reliability of formal proofs on algorithms.
-These ``aggressive'' simplifications would not be possible in the injection-based 
-lexing we introduced in chapter \ref{Inj}.
-We then prove the correctness with the improved version of 
-$\blexer$, called $\blexersimp$, by establishing 
-$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
-
+Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
+but their simplification functions  were inefficient and in some cases needs fixing.
+%We contrast our simplification function 
+%with Sulzmann and Lu's, indicating the simplicity of our algorithm.
+%This is another case for the usefulness 
+%and reliability of formal proofs on algorithms.
+%These ``aggressive'' simplifications would not be possible in the injection-based 
+%lexing we introduced in chapter \ref{Inj}.
+%We then prove the correctness with the improved version of 
+%$\blexer$, called $\blexersimp$, by establishing 
+%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
+%
 \section{Simplifications by Sulzmann and Lu}
-Consider the derivatives of examples such as $(a^*a^*)^*$
-and $(a^* + (aa)^*)^*$:
+Consider the derivatives of the following example $(a^*a^*)^*$:
+%and $(a^* + (aa)^*)^*$:
 \begin{center}
-	$(a^*a^*)^* \stackrel{\backslash a}{\longrightarrow} (a^*a^* + a^*)\cdot(a^*a^*)^* \stackrel{\backslash a}{\longrightarrow} $\\
-	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^* \stackrel{\backslash a}{\longrightarrow} \ldots$
+	\begin{tabular}{lcl}
+		$(a^*a^*)^*$ & $ \stackrel{\backslash a}{\longrightarrow}$ & 
+		$ (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
+			     & 
+		$ \stackrel{\backslash a}{\longrightarrow} $ & 
+	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
+							     & $\stackrel{\backslash a}{
+	\longrightarrow} $ & $\ldots$\\
+	\end{tabular}
 \end{center}
 \noindent
-As can be seen, there is a lot of duplication 
-in the example we have already mentioned in 
-\ref{eqn:growth2}.
+As can be seen, there are serveral duplications.
 A simple-minded simplification function cannot simplify
 the third regular expression in the above chain of derivative
 regular expressions, namely
@@ -52,52 +57,51 @@
 \begin{gather*}
 	((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* + 
 	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this}.\\
-	\bigg\downarrow \\
+	\bigg\downarrow (1) \\
 	(a^*a^* + a^* 
 	\color{gray} + a^* \color{black})\cdot(a^*a^*)^* + 
 	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this} \\
-	\bigg\downarrow \\
+	\bigg\downarrow (2) \\
 	(a^*a^* + a^* 
 	)\cdot(a^*a^*)^*  
 	\color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^*\\
-	\bigg\downarrow \\
+	\bigg\downarrow (3) \\
 	(a^*a^* + a^* 
 	)\cdot(a^*a^*)^*  
 \end{gather*}
 \noindent
 In the first step, the nested alternative regular expression
 $(a^*a^* + a^*) + a^*$ is flattened into $a^*a^* + a^* + a^*$.
-Now the third term $a^*$ is clearly identified as a duplicate
-and therefore removed in the second step. This causes the two
+Now the third term $a^*$ can clearly be identified as a duplicate
+and therefore removed in the second step. 
+This causes the two
 top-level terms to become the same and the second $(a^*a^*+a^*)\cdot(a^*a^*)^*$ 
-removed in the final step.\\
-This motivating example is from testing Sulzmann and Lu's 
-algorithm: their simplification does 
-not work!
-Consider their simplification (using our notations):
+removed in the final step.
+Sulzmann and Lu's simplification function (using our notations) can achieve this
+simplification:
 \begin{center}
 	\begin{tabular}{lcl}
-		$\simpsulz \; _{bs}(_{bs'}\ONE \cdot r)$ & $\dn$ & 
+		$\textit{simp}\_{SL} \; _{bs}(_{bs'}\ONE \cdot r)$ & $\dn$ & 
 		$\textit{if} \; (\textit{zeroable} \; r)\; \textit{then} \;\; \ZERO$\\
 						   & &$\textit{else}\;\; \fuse \; (bs@ bs') \; r$\\
-		$\simpsulz \;(_{bs}r_1\cdot r_2)$ & $\dn$ & $\textit{if} 
+		$\textit{simp}\_{SL} \;(_{bs}r_1\cdot r_2)$ & $\dn$ & $\textit{if} 
 		\; (\textit{zeroable} \; r_1 \; \textit{or} \; \textit{zeroable}\; r_2)\;
 		\textit{then} \;\; \ZERO$\\
-					     & & $\textit{else}\;\;_{bs}((\simpsulz \;r_1)\cdot
-					     (\simpsulz \; r_2))$\\
-		$\simpsulz  \; _{bs}\sum []$ & $\dn$ & $\ZERO$\\
-		$\simpsulz  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
+							    & & $\textit{else}\;\;_{bs}((\textit{simp}\_{SL} \;r_1)\cdot
+							    (\textit{simp}\_{SL} \; r_2))$\\
+		$\textit{simp}\_{SL}  \; _{bs}\sum []$ & $\dn$ & $\ZERO$\\
+		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
 		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
-		$\simpsulz  \; _{bs}\sum[r]$ & $\dn$ & $\fuse \; bs \; (\simpsulz  \; r)$\\
-		$\simpsulz  \; _{bs}\sum(r::rs)$ & $\dn$ & $_{bs}\sum 
-		(\nub \; (\filter \; (\not \circ \zeroable)\;((\simpsulz  \; r) :: \map \; \simpsulz  \; rs)))$\\ 
+		$\textit{simp}\_{SL}  \; _{bs}\sum[r]$ & $\dn$ & $\fuse \; bs \; (\textit{simp}\_{SL}  \; r)$\\
+		$\textit{simp}\_{SL}  \; _{bs}\sum(r::rs)$ & $\dn$ & $_{bs}\sum 
+		(\nub \; (\filter \; (\not \circ \zeroable)\;((\textit{simp}\_{SL}  \; r) :: \map \; \textit{simp}\_{SL}  \; rs)))$\\ 
 		
 	\end{tabular}
 \end{center}
 \noindent
-where the $\textit{zeroable}$ predicate 
+The $\textit{zeroable}$ predicate 
 tests whether the regular expression
-is equivalent to $\ZERO$,
+is equivalent to $\ZERO$, and
 can be defined as:
 \begin{center}
 	\begin{tabular}{lcl}
@@ -111,25 +115,36 @@
 	\end{tabular}
 \end{center}
 \noindent
-They suggested that the $\simpsulz $ function should be
-applied repeatedly until a fixpoint is reached.
-We call this construction $\textit{sulzSimp}$:
+The 
 \begin{center}
 	\begin{tabular}{lcl}
-		$\textit{sulzSimp} \; r$ & $\dn$ & 
-		$\textit{while}((\simpsulz  \; r)\; \cancel{=} \; r)$ \\
-		& & $\quad r := \simpsulz  \; r$\\
+		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
+		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
+	\end{tabular}
+\end{center}
+\noindent
+clause does flatten the alternative as required in step (1),
+but $\textit{simp}\_{SL}$ is insufficient if we want to do steps (2) and (3),
+as these ``identical'' terms have different bit-annotations.
+They also suggested that the $\textit{simp}\_{SL} $ function should be
+applied repeatedly until a fixpoint is reached.
+We call this construction $\textit{SLSimp}$:
+\begin{center}
+	\begin{tabular}{lcl}
+		$\textit{SLSimp} \; r$ & $\dn$ & 
+		$\textit{while}((\textit{simp}\_{SL}  \; r)\; \cancel{=} \; r)$ \\
+					 & & $\quad r := \textit{simp}\_{SL}  \; r$\\
 		& & $\textit{return} \; r$
 	\end{tabular}
 \end{center}
 We call the operation of alternatingly 
 applying derivatives and simplifications
 (until the string is exhausted) Sulz-simp-derivative,
-written $\backslash_{sulzSimp}$:
+written $\backslash_{SLSimp}$:
 \begin{center}
 \begin{tabular}{lcl}
-	$r \backslash_{sulzSimp} (c\!::\!s) $ & $\dn$ & $(\textit{sulzSimp} \; (r \backslash c)) \backslash_{sulzSimp}\, s$ \\
-$r \backslash_{sulzSimp} [\,] $ & $\dn$ & $r$
+	$r \backslash_{SLSimp} (c\!::\!s) $ & $\dn$ & $(\textit{SLSimp} \; (r \backslash c)) \backslash_{SLSimp}\, s$ \\
+$r \backslash_{SLSimp} [\,] $ & $\dn$ & $r$
 \end{tabular}
 \end{center}
 \noindent
@@ -138,8 +153,8 @@
 as $\blexer$:
 \begin{center}
 \begin{tabular}{lcl}
-  $\textit{blexer\_sulzSimp}\;r\,s$ & $\dn$ &
-      $\textit{let}\;a = (r^\uparrow)\backslash_{sulzSimp}\, s\;\textit{in}$\\                
+  $\textit{blexer\_SLSimp}\;r\,s$ & $\dn$ &
+      $\textit{let}\;a = (r^\uparrow)\backslash_{SLSimp}\, s\;\textit{in}$\\                
   & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
   & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
   & & $\;\;\textit{else}\;\textit{None}$
@@ -148,7 +163,7 @@
 \noindent
 We implemented this lexing algorithm in Scala, 
 and found that the final derivative regular expression
-size grows exponentially fast:
+size grows exponentially fast (note the logarithmic scale):
 \begin{figure}[H]
 	\centering
 \begin{tikzpicture}
@@ -169,7 +184,7 @@
 \noindent
 At $n= 20$ we already get an out of memory error with Scala's normal 
 JVM heap size settings.
-In fact their simplification does not improve over
+In fact their simplification does not improve much over
 the simple-minded simplifications we have shown in \ref{fig:BetterWaterloo}.
 The time required also grows exponentially:
 \begin{figure}[H]
@@ -199,46 +214,47 @@
 \noindent
 The assumption that the size of the regular expressions
 in the algorithm
-would stay below a finite constant is not ture.
-The main reason behind this is that (i) The $\textit{nub}$
+would stay below a finite constant is not true, not in the
+examples we considered.
+The main reason behind this is that (i) the $\textit{nub}$
 function requires identical annotations between two 
 annotated regular expressions to qualify as duplicates,
-and cannot simplify the cases like $_{SZZ}a^*+_{SZS}a^*$
-even if both $a^*$ denote the same language.
-(ii) The ``flattening'' only applies to the head of the list
+and therefore cannot simplify cases like $_{SZZ}a^*+_{SZS}a^*$
+even if both $a^*$ denote the same language, and
+(ii) the ``flattening'' only applies to the head of the list
 in the 
 \begin{center}
 	\begin{tabular}{lcl}
 
-		$\simpsulz  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
+		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
 		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
 	\end{tabular}
 \end{center}
 \noindent
-clause, and therefore is not thorough enough to simplify all
-needed parts of the regular expression.\\
-In addition to that, even if the regular expressions size
+clause, and therefore is not strong enough to simplify all
+needed parts of the regular expression. Moreover, even if the regular expressions size
 do stay finite, one has to take into account that
-the $\simpsulz$ function is applied many times
+the $\textit{simp}\_{SL}$ function is applied many times
 in each derivative step, and that number is not necessarily
 a constant with respect to the size of the regular expression.
-To not get ``caught off guard'' by
-these counterexamples,
-one needs to be more careful when designing the
-simplification function and making claims about them.
+%To not get ``caught off guard'' by
+%these counterexamples,
+%one needs to be more careful when designing the
+%simplification function and making claims about them.
 
 \section{Our $\textit{Simp}$ Function}
-We will now introduce our simplification function,
-by making a contrast with $\simpsulz$.
-We describe
-the ideas behind components in their algorithm 
-and why they fail to achieve the desired effect, followed
-by our solution. These solutions come with correctness
-statements that are backed up by formal proofs.
+We will now introduce our simplification function.
+%by making a contrast with $\textit{simp}\_{SL}$.
+We also describe
+the ideas behind Sulzmann and Lu's $\textit{simp}\_{SL}$
+algorithm 
+and why it fails to achieve the desired effect. 
+Our simplification function comes with a formal
+correctness proof.
 \subsection{Flattening Nested Alternatives}
 The idea behind the clause
 \begin{center}
-$\simpsulz  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2) \quad \dn \quad
+	$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2) \quad \dn \quad
 	       _{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$
 \end{center}
 is that it allows
@@ -250,25 +266,27 @@
 \begin{center}
 $(a+r)+r \longrightarrow a+r$
 \end{center}
-The problem here is that only the head element
-is ``spilled out'',
-whereas we would want to flatten
-an entire list to open up possibilities for further simplifications.
+The problem is that only the head element
+is ``spilled out''.
+It is more desirable
+to flatten
+an entire list to open up possibilities for further simplifications
+with later regular expressions.
 Not flattening the rest of the elements also means that
 the later de-duplication processs 
 does not fully remove further duplicates.
 For example,
-using $\simpsulz$ we could not 
+using $\textit{simp}\_{SL}$ we cannot
 simplify
 \begin{center}
 	$((a^* a^*)+\underline{(a^* + a^*)})\cdot (a^*a^*)^*+
 ((a^*a^*)+a^*)\cdot (a^*a^*)^*$
 \end{center}
-due to the underlined part not in the first element
-of the alternative.\\
-We define a flatten operation that flattens not only 
-the first regular expression of an alternative,
-but the entire list: 
+due to the underlined part not being in the head 
+of the alternative.
+
+We define our flatten operation so that it flattens 
+the entire list: 
  \begin{center}
   \begin{tabular}{@{}lcl@{}}
   $\textit{flts} \; (_{bs}\sum \textit{as}) :: \textit{as'}$ & $\dn$ & $(\textit{map} \;
@@ -282,10 +300,10 @@
 also throws away $\ZERO$s
 as they do not contribute to a lexing result.
 \subsection{Duplicate Removal}
-After flattening is done, we are ready to deduplicate.
+After flattening is done, we can deduplicate.
 The de-duplicate function is called $\distinctBy$,
 and that is where we make our second improvement over
-Sulzmann and Lu's.
+Sulzmann and Lu's simplification method.
 The process goes as follows:
 \begin{center}
 $rs \stackrel{\textit{flts}}{\longrightarrow} 
@@ -306,7 +324,7 @@
 \noindent
 The reason we define a distinct function under a mapping $f$ is because
 we want to eliminate regular expressions that are syntactically the same,
-but with different bit-codes.
+but have different bit-codes.
 For example, we can remove the second $a^*a^*$ from
 $_{ZSZ}a^*a^* + _{SZZ}a^*a^*$, because it
 represents a match with shorter initial sub-match 
@@ -327,21 +345,23 @@
 even if they are attached with different bitcodes.
 These duplicates therefore need to be removed.
 To achieve this, we call $\rerases$ as the function $f$ during the distinction
-operation.\\
+operation. The function
 $\rerases$ is very similar to $\erase$, except that it preserves the structure
 when erasing an alternative regular expression.
 The reason why we use $\rerases$ instead of $\erase$ is that
 it keeps the structures of alternative 
 annotated regular expressions
-whereas $\erase$ would turn it back into a binary structure.
+whereas $\erase$ would turn it back into a binary  tree structure.
 Not having to mess with the structure 
 greatly simplifies the finiteness proof in chapter 
-\ref{Finite} (we will follow up with more details there).
+\ref{Finite}.
 We give the definitions of $\rerases$ here together with
 the new datatype used by $\rerases$ (as our plain
 regular expression datatype does not allow non-binary alternatives).
-For the moment the reader can just think of 
-$\rerases$ as $\erase$ and $\rrexp$ as plain regular expressions.
+For now we can think of 
+$\rerases$ as the function $(\_)_\downarrow$ defined in chapter \ref{Bitcoded1}
+and $\rrexp$ as plain regular expressions, but having a general list constructor
+for alternatives:
 \begin{figure}[H]
 \begin{center}	
 	$\rrexp ::=   \RZERO \mid  \RONE
@@ -353,9 +373,7 @@
 \caption{$\rrexp$: plain regular expressions, but with $\sum$ alternative 
 constructor}\label{rrexpDef}
 \end{figure}
-The notation of $\rerases$ also follows that of $\erase$,
-which is a postfix operator written as a subscript,
-except that it has an \emph{r} attached to it to distinguish against $\erase$:
+The function $\rerases$ we define as follows:
 \begin{center}
 \begin{tabular}{lcl}
 $\rerase{\ZERO}$ & $\dn$ & $\RZERO$\\
@@ -368,8 +386,7 @@
 \end{center}
 
 \subsection{Putting Things Together}
-A recursive definition of our  simplification function 
-is given below:
+We can now give the definition of our  simplification function:
 %that looks somewhat similar to our Scala code is 
 \begin{center}
   \begin{tabular}{@{}lcl@{}}
@@ -404,21 +421,20 @@
 and then call $\distinctBy$ on that list, the predicate determining whether two 
 elements are the same is $\rerases \; r_1 = \rerases\; r_2$.
 Finally, depending on whether the regular expression list $as'$ has turned into a
-singleton or empty list after $\flts$ and $\distinctBy$, $\textit{bsimp}_{AALTS}$
+singleton or empty list after $\flts$ and $\distinctBy$, $\textit{bsimp}_{ALTS}$
 decides whether to keep the current level constructor $\sum$ as it is, and 
 removes it when there are less than two elements:
 \begin{center}
 	\begin{tabular}{lcl}
-		$\textit{bsimp}_{AALTS} \; bs \; as'$ & $ \dn$ & $ as' \; \textit{match}$\\		
+		$\textit{bsimp}_{ALTS} \; bs \; as'$ & $ \dn$ & $ as' \; \textit{match}$\\		
   &&$\quad\textit{case} \; [] \Rightarrow  \ZERO$ \\
    &&$\quad\textit{case} \; a :: [] \Rightarrow  \textit{fuse bs a}$ \\
    &&$\quad\textit{case} \;  as' \Rightarrow _{bs}\sum \textit{as'}$\\ 
 	\end{tabular}
 	
 \end{center}
-Having defined the $\bsimp$ function,
-we add it as a phase after a derivative is taken,
-so it stays small:
+Having defined the $\textit{bsimp}$ function,
+we add it as a phase after a derivative is taken.
 \begin{center}
 	\begin{tabular}{lcl}
 		$r \backslash_{bsimp} s$ & $\dn$ & $\textit{bsimp}(r \backslash s)$
@@ -428,7 +444,7 @@
 %when extending from derivatives w.r.t.~character to derivative
 %w.r.t.~string, we define the derivative that nests simplifications 
 %with derivatives:%\comment{simp in  the [] case?}
-We extend this from character to string:
+We extend this from characters to strings:
 \begin{center}
 \begin{tabular}{lcl}
 $r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
@@ -458,7 +474,7 @@
 After Simplification}
 Recall the
 previous $(a^*a^*)^*$ example
-where $\simpsulz$ could not
+where $\textit{simp}\_{SL}$ could not
 prevent the fast growth (over
 3 million nodes just below $20$ input length)
 will be reduced to just 15 and stays constant no matter how long the
@@ -486,7 +502,7 @@
     ylabel={derivative size},
     width = 7cm,
     height = 4cm,
-    legend entries={Lexer with $\simpsulz$},  
+    legend entries={Lexer with $\textit{simp}\_{SL}$},  
     legend pos=  north west,
     legend cell align=left]
 \addplot[red,mark=*, mark options={fill=white}] table {BetterWaterloo.data};
@@ -499,36 +515,38 @@
 \noindent
 Given the size difference, it is not
 surprising that our $\blexersimp$ significantly outperforms
-$\textit{blexer\_sulzSimp}$.
-In the next section we are going to establish the
-first important property of our lexer--the correctness.
+$\textit{blexer\_SLSimp}$.
+In the next section we are going to establish that our
+simplification preserves the correctness of the algorithm.
 %----------------------------------------------------------------------------------------
 %	SECTION rewrite relation
 %----------------------------------------------------------------------------------------
 \section{Correctness of $\blexersimp$}
-In this section we give details
-of the correctness proof of $\blexersimp$,
-one of the contributions of this thesis.\\
 We first introduce the rewriting relation \emph{rrewrite}
 ($\rrewrite$) between two regular expressions,
-which expresses an atomic
+which stands for an atomic
 simplification.
 We then prove properties about
 this rewriting relation and its reflexive transitive closure.
 Finally we leverage these properties to show
-an equivalence between the internal data structures of 
+an equivalence between the results generated by
 $\blexer$ and $\blexersimp$.
 
 \subsection{The Rewriting Relation $\rrewrite$($\rightsquigarrow$)}
 In the $\blexer$'s correctness proof, we
 did not directly derive the fact that $\blexer$ generates the POSIX value,
-but first proved that $\blexer$ is linked with $\lexer$.
+but first proved that $\blexer$ generates the same result as $\lexer$.
 Then we re-use
 the correctness of $\lexer$
-to obtain
+to obtain 
 \begin{center}
-	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \;s = v$.
+	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \;s = v$\\
+	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer\;
+	r\;s = \None$.
 \end{center}
+%\begin{center}
+%	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \;s = v$.
+%\end{center}
 Here we apply this
 modularised technique again
 by first proving that
@@ -536,22 +554,25 @@
 produces the same output as $\blexer \; r\; s$,
 and then piecing it together with 
 $\blexer$'s correctness to achieve our main
-theorem:\footnote{ The case when 
-$s$ is not in $L \; r$, is routine to establish.}
+theorem:
 \begin{center}
-	$(r, s) \rightarrow v \; \;   \textit{iff} \;\;  \blexersimp \; r \; s = v$
+	$(r, s) \rightarrow v \; \;   \textit{iff} \;\;  \blexersimp \; r \; s = \Some \;v$
+	\\
+	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp\;
+	r\;s = \None$
 \end{center}
 \noindent
 The overall idea for the proof
 of $\blexer \;r \;s = \blexersimp \; r \;s$ 
 is that the transition from $r$ to $\textit{bsimp}\; r$ can be
-broken down into finitely many rewrite steps:
+broken down into smaller rewrite steps of the form:
 \begin{center}
 	$r \rightsquigarrow^* \textit{bsimp} \; r$
 \end{center}
 where each rewrite step, written $\rightsquigarrow$,
 is an ``atomic'' simplification that
-is similar to a small-step reduction in operational semantics:
+is similar to a small-step reduction in operational semantics (
+see figure \ref{rrewriteRules} for the rules):
 \begin{figure}[H]
 \begin{mathpar}
 	\inferrule * [Right = $S\ZERO_l$]{\vspace{0em}}{_{bs} \ZERO \cdot r_2 \rightsquigarrow \ZERO\\}
@@ -596,11 +617,11 @@
 }\label{rrewriteRules}
 \end{figure}
 \noindent
-The rules such as $LT$ and $LH$ are for rewriting between two regular expression lists
+The rules $LT$ and $LH$ are for rewriting two regular expression lists
 such that one regular expression
 in the left-hand-side list is rewritable in one step
 to the right-hand-side's regular expression at the same position.
-This helps with defining the ``context rules'' such as $AL$.\\
+This helps with defining the ``context rule'' $AL$.\\
 The reflexive transitive closure of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$
 are defined in the usual way:
 \begin{figure}[H]
@@ -619,20 +640,21 @@
 \end{figure}
 %Two rewritable terms will remain rewritable to each other
 %even after a derivative is taken:
-Rewriting is preserved under derivatives,
+The main point of our rewriting relation
+is that it is preserved under derivatives,
 namely
 \begin{center}
 	$r_1 \rightsquigarrow r_2 \implies (r_1 \backslash c) \rightsquigarrow^* (r_2 \backslash c)$
 \end{center}
-And finally, if two terms are rewritable to each other,
+And also, if two terms are rewritable to each other,
 then they produce the same bitcodes:
 \begin{center}
 	$r \rightsquigarrow^* r' \;\; \textit{then} \; \; \bmkeps \; r = \bmkeps \; r'$
 \end{center}
 The decoding phase of both $\blexer$ and $\blexersimp$
-are the same, which means that if they get the same
+are the same, which means that if they receive the same
 bitcodes before the decoding phase,
-they get the same value after decoding is done.
+they generate the same value after decoding is done.
 We will prove the three properties 
 we mentioned above in the next sub-section.
 \subsection{Important Properties of $\rightsquigarrow$}
@@ -685,7 +707,7 @@
 \end{proof}
 \noindent
 The inference rules of $\stackrel{s}{\rightsquigarrow}$
-are defined in terms of list cons operation, here
+are defined in terms of the list cons operation, where
 we establish that the 
 $\stackrel{s}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$ 
 relation is also preserved w.r.t appending and prepending of a list.
@@ -717,9 +739,9 @@
 			$rs_3 \stackrel{s*}{\rightsquigarrow} rs_4 \land r_1 \rightsquigarrow^* r_2 \implies
 			r_2 :: rs_3 \stackrel{s*}{\rightsquigarrow} r_2 :: rs_4$
 		\item			
-			If we could rewrite a regular expression 
+			If we can rewrite a regular expression 
 			in many steps to $\ZERO$, then 
-			we could also rewrite any sequence containing it to $\ZERO$:\\
+			we can also rewrite any sequence containing it to $\ZERO$:\\
 			$r_1 \rightsquigarrow^* \ZERO 
 			\implies _{bs}r_1\cdot r_2 \rightsquigarrow^* \ZERO$
 	\end{itemize}
@@ -736,18 +758,18 @@
 	The last part is proven by rule induction again on $\rightsquigarrow^*$.
 \end{proof}
 \noindent
-Now we are ready to give the proofs of the below properties:
+Now we are ready to give the proofs of the following properties:
 \begin{itemize}
 	\item
-		$(r \rightsquigarrow^* r'\land \bnullable \; r_1) 
+		$r \rightsquigarrow^* r'\land \bnullable \; r_1 
 		\implies \bmkeps \; r = \bmkeps \; r'$. \\
 	\item
 		$r \rightsquigarrow^* \textit{bsimp} \;r$.\\
 	\item
 		$r \rightsquigarrow r' \implies r \backslash c \rightsquigarrow^* r'\backslash c$.\\
 \end{itemize}
-These properties would work together towards the correctness theorem.
-\subsubsection{Property 1: $(r \rightsquigarrow^* r'\land \bnullable \; r_1) 
+
+\subsubsection{Property 1: $r \rightsquigarrow^* r'\land \bnullable \; r_1 
 		\implies \bmkeps \; r = \bmkeps \; r'$}
 Intuitively, this property says we can 
 extract the same bitcodes using $\bmkeps$ from the nullable
@@ -761,7 +783,7 @@
 	$\textit{bnullables} \; rs \quad \dn \quad \exists r \in rs. \;\; \bnullable \; r$
 \end{center}
 \noindent
-The rewriting relation $\rightsquigarrow$ preserves nullability:
+The rewriting relation $\rightsquigarrow$ preserves (b)nullability:
 \begin{lemma}\label{rewritesBnullable}
 	\hspace{0em}
 	\begin{itemize}
@@ -812,23 +834,23 @@
 	By rule induction over the cases that lead to $r_1 \rightsquigarrow r_2$.
 \end{proof}
 \noindent
-With lemma \ref{rewriteBmkepsAux} we are ready to prove its
+With lemma \ref{rewriteBmkepsAux} in place we are ready to prove its
 many-step version: 
 \begin{lemma}
 	$\text{If} \;\; r \stackrel{*}{\rightsquigarrow} r' \;\; \text{and} \;\; \bnullable \; r, \;\;\; \text{then} \;\; \bmkeps \; r = \bmkeps \; r'$
 \end{lemma}
 \begin{proof}
-	By rule induction of $\stackrel{*}{\rightsquigarrow} $.
-	$\ref{rewritesBnullable}$ tells us both $r$ and $r'$ are nullable.
-	\ref{rewriteBmkepsAux} solves the inductive case.
+	By rule induction of $\stackrel{*}{\rightsquigarrow} $. Lemma 
+	$\ref{rewritesBnullable}$ gives us both $r$ and $r'$ are nullable.
+	The lemma \ref{rewriteBmkepsAux} solves the inductive case.
 \end{proof}
 
-\subsubsection{Property 2: $r \stackrel{*}{\rightsquigarrow} \bsimp{r}$}
-Now we get to the ``meaty'' part of the proof, 
+\subsubsection{Property 2: $r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$}
+Now we get to the key part of the proof, 
 which says that our simplification's helper functions 
-such as $\distinctBy$ and $\flts$ conform to 
-the $\stackrel{s*}{\rightsquigarrow}$ and 
-$\rightsquigarrow^* $ rewriting relations.\\
+such as $\distinctBy$ and $\flts$ describe
+reducts of $\stackrel{s*}{\rightsquigarrow}$ and 
+$\rightsquigarrow^* $.\\
 The first lemma to prove is a more general version of 
 $rs_ 1 \rightsquigarrow^* \distinctBy \; rs_1 \; \phi$:
 \begin{lemma}
@@ -849,7 +871,7 @@
 	$rs_1 \stackrel{s*}{\rightsquigarrow} \distinctBy \; rs_1 \; \phi$.
 \end{corollary}
 \noindent
-The flatten function $\flts$ conforms to
+Similarly the flatten function $\flts$ describes a reduct of
 $\stackrel{s*}{\rightsquigarrow}$ as well:
 
 \begin{lemma}\label{fltsPreserves}
@@ -865,14 +887,14 @@
 \end{lemma}
 \noindent
 The simplification function
-$\textit{bsimp}$ only transforms the regex $r$ using steps specified by 
-$\rightsquigarrow^*$ and nothing else.
+$\textit{bsimp}$ only transforms the regular expression  using steps specified by 
+$\rightsquigarrow^*$ and nothing else:
 \begin{lemma}
-	$r \stackrel{*}{\rightsquigarrow} \bsimp{r}$
+	$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$
 \end{lemma}
 \begin{proof}
 	By an induction on $r$.
-	The most involved case would be the alternative, 
+	The most involved case is the alternative, 
 	where we use lemmas \ref{bsimpaltsPreserves},
 	\ref{fltsPreserves} and \ref{dBPreserves} to do a series of rewriting:\\
 	\begin{center}
@@ -883,7 +905,7 @@
 			(\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerases \; \phi$\\
 		\end{tabular}
 	\end{center}
-	Using this we derive the following rewrite relation:\\
+	Using this we can derive the following rewrite sequence:\\
 	\begin{center}
 		\begin{tabular}{lcl}
 			$r$ & $=$ & $_{bs}\sum rs$\\[1.5ex]
@@ -900,10 +922,9 @@
 	\end{center}	
 \end{proof}
 \subsubsection{Property 3: $r_1 \stackrel{*}{\rightsquigarrow}  r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow} r_2 \backslash c$}
-The rewritability relation 
-$\rightsquigarrow$ is preserved under derivatives--
-it is just that we might need multiple steps 
-where originally only one step was needed:
+The rewrite relation 
+$\rightsquigarrow$ changes into $\stackrel{*}{\rightsquigarrow}$
+after derivatives are taken on both sides:
 \begin{lemma}\label{rewriteBder}
 	\hspace{0em}
 	\begin{itemize}
@@ -927,11 +948,11 @@
 	r_2 \backslash c$
 \end{corollary}
 \begin{proof}
-	By rule induction of $\stackrel{*}{\rightsquigarrow} $ and using the previous lemma \ref{rewriteBder}.
+	By rule induction of $\stackrel{*}{\rightsquigarrow} $ and   lemma \ref{rewriteBder}.
 \end{proof}
 \noindent
 This can be extended and combined with $r \rightsquigarrow^* \textit{bsimp} \; r$
-to obtain the rewritability between
+to obtain the correspondence between
 $\blexer$ and $\blexersimp$'s intermediate
 derivative regular expressions 
 \begin{lemma}\label{bderBderssimp}
@@ -941,52 +962,56 @@
 	By an induction on $s$.
 \end{proof}
 \subsection{Main Theorem}
-Now with \ref{bderBderssimp} we are ready for the main theorem.
+Now with \ref{bderBderssimp} in place we are ready for the main theorem.
 \begin{theorem}
 	$\blexer \; r \; s = \blexersimp{r}{s}$
 \end{theorem}
 \noindent
 \begin{proof}
-	One can rewrite in many steps from the original lexer's 
+	We can rewrite in many steps from the original lexer's 
 	derivative regular expressions to the 
 	lexer with simplification applied (by lemma \ref{bderBderssimp}):
 	\begin{center}
 		$a \backslash s \stackrel{*}{\rightsquigarrow} \bderssimp{a}{s} $.
 	\end{center}
-	we know that they give out the same bits, if the lexing result is a match:
+	We know that they generate the same bits, if the lexing result is a match:
 	\begin{center}
 		$\bnullable \; (a \backslash s) 
 		\implies \bmkeps \; (a \backslash s) = \bmkeps \; (\bderssimp{a}{s})$
 	\end{center}
-	Now that they give out the same bits, we know that they give the same value after decoding.
+	Now that they generate the same bits, we know that they give the same value after decoding.
 	\begin{center}
 		$\bnullable \; (a \backslash s) 
 		\implies \decode \; r \; (\bmkeps \; (a \backslash s)) = 
 		\decode \; r \; (\bmkeps \; (\bderssimp{a}{s}))$
 	\end{center}
-	Which is equivalent to our proof goal:
+	Which is required by our proof goal:
 	\begin{center}
 		$\blexer \; r \; s = \blexersimp \; r \; s$.
 	\end{center}	
 \end{proof}
 \noindent
 As a corollary,
-we link this result with the lemma we proved earlier that 
+we can link this result with the lemma we proved earlier that 
 \begin{center}
-	$(r, s) \rightarrow v \;\; \textit{iff}\;\; \blexer \; r \; s = v$
+	$(r, s) \rightarrow v \;\; \textit{iff}\;\; \blexer \; r \; s = \Some \;v$\\
+	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer\;
+	r\;s = \None$.
 \end{center}
 and obtain the corollary that the bit-coded lexer with simplification is
 indeed correctly outputting POSIX lexing result, if such a result exists.
 \begin{corollary}
-	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp \; r\; s $
+	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp \; r\; s = \Some \; v$\\
+	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp\;
+	r\;s = \None$.
 \end{corollary}
 
 \subsection{Comments on the Proof Techniques Used}
 Straightforward and simple as the proof may seem,
-the efforts we spent obtaining it were far from trivial.\\
+the efforts we spent obtaining it were far from trivial.
 We initially attempted to re-use the argument 
 in \cref{flex_retrieve}. 
-The problem was that both functions $\inj$ and $\retrieve$ require 
+The problem is that both functions $\inj$ and $\retrieve$ require 
 that the annotated regular expressions stay unsimplified, 
 so that one can 
 correctly compare $v_{i+1}$ and $r_i$  and $v_i$ 
@@ -1041,16 +1066,11 @@
 such as 
 $\SEQ r_1 \cdot (\SEQ r_1 \cdot r_3) \rightarrow
 \SEQs [r_1, r_2, r_3]$.
-This does not fit with the proof technique
+However this does not fit with the proof technique
 of our main theorem, but seem to not violate the POSIX
-property.\\
-Having correctness property is good. 
-But we would also a guarantee that the lexer is not slow in 
-some sense, for exampe, not grinding to a halt regardless of the input.
-As we have already seen, Sulzmann and Lu's simplification function
-$\simpsulz$ cannot achieve this, because their claim that
-the regular expression size does not grow arbitrary large
-was not true. 
-In the next chapter we shall prove that with our $\simp$, 
-for a given $r$, the internal derivative size is always
+property.
+
+Having established the correctness of our
+$\blexersimp$, in the next chapter we shall prove that with our $\simp$ function,
+for a given $r$, the derivative size is always
 finitely bounded by a constant.