ChengsongTanPhdThesis/Chapters/Bitcoded2.tex
changeset 600 fd068f39ac23
parent 591 b2d0de6aee18
child 601 ce4e5151a836
--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex	Sat Sep 10 22:30:22 2022 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex	Mon Sep 12 23:32:18 2022 +0200
@@ -11,39 +11,43 @@
 
 
 
-In this chapter we introduce the simplifications
+In this chapter we introduce simplifications
 on annotated regular expressions that can be applied to 
 each intermediate derivative result. This allows
 us to make $\blexer$ much more efficient.
-We contrast this simplification function 
-with Sulzmann and Lu's original
-simplifications, indicating the simplicity of our algorithm and
-improvements we made, demostrating
-the usefulness and reliability of formal proofs on algorithms.
+Sulzmann and Lu already had some bit-coded simplifications,
+but their simplification functions  were inefficient.
+We contrast our simplification function 
+with Sulzmann and Lu's, indicating the simplicity of our algorithm.
+This is another case demonstrating the usefulness 
+and reliability of formal proofs on algorithms.
 These ``aggressive'' simplifications would not be possible in the injection-based 
 lexing we introduced in chapter \ref{Inj}.
-We then go on to prove the correctness with the improved version of 
+We then prove the correctness of the improved version of 
 $\blexer$, called $\blexersimp$, by establishing 
 $\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
 
 \section{Simplifications by Sulzmann and Lu}
-The first thing we notice in the fast growth of examples such as $(a^*a^*)^*$'s
-and $(a^* + (aa)^*)^*$'s derivatives is that a lot of duplicated sub-patterns
-are scattered around different levels, and therefore requires 
-de-duplication at different levels:
+Consider the derivatives of examples such as $(a^*a^*)^*$
+and $(a^* + (aa)^*)^*$:
 \begin{center}
 	$(a^*a^*)^* \stackrel{\backslash a}{\longrightarrow} (a^*a^* + a^*)\cdot(a^*a^*)^* \stackrel{\backslash a}{\longrightarrow} $\\
 	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^* \stackrel{\backslash a}{\longrightarrow} \ldots$
 \end{center}
 \noindent
-As we have already mentioned in \ref{eqn:growth2},
-a simple-minded simplification function cannot simplify
+As can be seen, there is a lot of duplication 
+in the example we have already mentioned in 
+\ref{eqn:growth2}.
+A simple-minded simplification function cannot simplify
 the third regular expression in the above chain of derivative
-regular expressions:
+regular expressions, namely
 \begin{center}
 $((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
 \end{center}
-one would expect a better simplification function to work in the 
+because the duplicates are
+not next to each other and therefore the rule
+$r+ r \rightarrow r$ does not fire.
+One would expect a better simplification function to work in the 
 following way:
 \begin{gather*}
 	((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* + 
@@ -61,10 +65,16 @@
 	)\cdot(a^*a^*)^*  
 \end{gather*}
 \noindent
-This motivating example came from testing Sulzmann and Lu's 
+In the first step, the nested alternative regular expression
+$(a^*a^* + a^*) + a^*$ is flattened into $a^*a^* + a^* + a^*$.
+Now the third term $a^*$ is clearly identified as a duplicate
+and therefore removed in the second step. This causes the two
+top-level terms to become the same, and the second $(a^*a^*+a^*)\cdot(a^*a^*)^*$ 
+is removed in the final step.\\
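+Such a combination of flattening and de-duplication can be sketched in
+code. The following Scala fragment is only an illustration on plain,
+unannotated regular expressions with a hypothetical datatype; it is
+\emph{not} the simplification function defined later in this chapter:
+\begin{verbatim}
+// a minimal, hypothetical datatype for plain regular expressions
+abstract class Rexp
+case object ZERO extends Rexp
+case object ONE extends Rexp
+case class CHAR(c: Char) extends Rexp
+case class ALTS(rs: List[Rexp]) extends Rexp   // n-ary alternative
+case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
+case class STAR(r: Rexp) extends Rexp
+
+// step 1: spill out *all* nested alternatives, not just the head
+def flatten(rs: List[Rexp]): List[Rexp] = rs.flatMap {
+  case ALTS(rs1) => flatten(rs1)
+  case r         => List(r)
+}
+
+// step 2: remove duplicates; together the two steps turn
+// (r1 + r2) + r2 into r1 + r2
+def simpAlts(rs: List[Rexp]): Rexp = flatten(rs).distinct match {
+  case Nil     => ZERO
+  case List(r) => r
+  case rs1     => ALTS(rs1)
+}
+\end{verbatim}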
+This motivating example is from testing Sulzmann and Lu's 
 algorithm: their simplification does 
 not work!
-We quote their $\textit{simp}$ function verbatim here:
+Consider their simplification (using our notation):
 \begin{center}
 	\begin{tabular}{lcl}
 		$\simpsulz \; _{bs}(_{bs'}\ONE \cdot r)$ & $\dn$ & 
@@ -85,10 +95,10 @@
 	\end{tabular}
 \end{center}
 \noindent
-the $\textit{zeroable}$ predicate 
-which tests whether the regular expression
+where the $\textit{zeroable}$ predicate, 
+which tests whether the regular expression
 is equivalent to $\ZERO$,
-is defined as:
+can be defined as:
 \begin{center}
 	\begin{tabular}{lcl}
 		$\zeroable \; _{bs}\sum (r::rs)$ & $\dn$ & $\zeroable \; r\;\; \land \;\;
@@ -190,6 +200,23 @@
 The assumption that the size of the regular expressions
 in the algorithm
 would stay below a finite constant is not true.
+The main reason behind this is that (i) the $\textit{nub}$
+function requires identical annotations between two 
+annotated regular expressions for them to qualify as duplicates,
+and therefore cannot simplify cases like $_{SZZ}a^*+_{SZS}a^*$
+even though both $a^*$ denote the same language.
+(ii) The ``flattening'' only applies to the head of the list
+in the 
+\begin{center}
+	\begin{tabular}{lcl}
+
+		$\simpsulz  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
+		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
+	\end{tabular}
+\end{center}
+\noindent
+clause, and is therefore not thorough enough to simplify all
+the needed parts of the regular expression.\\
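+A small sketch can make point (i) concrete: de-duplication should
+compare regular expressions \emph{after} erasing the bitcodes, whereas
+comparing the annotated terms directly, as $\textit{nub}$ does, misses
+duplicates that differ only in their annotations. The following Scala
+fragment (hypothetical names, for illustration only; it uses Scala
+2.13's \texttt{distinctBy}) shows the difference:
+\begin{verbatim}
+// an annotated term paired with its bitcode annotation;
+// the erased regular expression is represented here by a string
+case class ARexp(bs: List[Boolean], r: String)
+
+val r1 = ARexp(List(true, false, false), "a*")   // "SZZ" a*
+val r2 = ARexp(List(true, false, true),  "a*")   // "SZS" a*
+
+// comparing annotated terms: no duplicate is found
+List(r1, r2).distinct            // List(r1, r2) -- both kept
+
+// comparing the erased terms: the duplicate is removed
+List(r1, r2).distinctBy(_.r)     // List(r1)
+\end{verbatim}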
 In addition to that, even if the sizes of the regular expressions
 do stay finite, one has to take into account that
 the $\simpsulz$ function is applied many times
@@ -209,14 +236,14 @@
 by our solution. These solutions come with correctness
 statements that are backed up by formal proofs.
 \subsection{Flattening Nested Alternatives}
-The idea behind the 
+The idea behind the clause
 \begin{center}
 $\simpsulz  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2) \quad \dn \quad
 	       _{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$
 \end{center}
-clause is that it allows
+is that it allows
 duplicate removal of regular expressions at different
-levels.
+``levels'' of alternatives.
 For example, this would help with the
 following simplification:
 
@@ -226,15 +253,15 @@
 The problem here is that only the head element
 is ``spilled out'',
 whereas we would want to flatten
-an entire list to open up possibilities for further simplifications.\\
+an entire list to open up possibilities for further simplifications.
 Not flattening the rest of the elements also means that
 the later de-duplication process 
-does not fully remove apparent duplicates.
+does not fully remove further duplicates.
 For example,
 using $\simpsulz$ we could not 
 simplify
 \begin{center}
-$((a^* a^*)+ (a^* + a^*))\cdot (a^*a^*)^*+
+	$((a^* a^*)+\underline{(a^* + a^*)})\cdot (a^*a^*)^*+
 ((a^*a^*)+a^*)\cdot (a^*a^*)^*$
 \end{center}
 due to the underlined part not being in the first element
@@ -427,17 +454,16 @@
 This algorithm keeps the regular expression size small.
 
 
-\subsection{$(a+aa)^*$ and $(a^*\cdot a^*)^*$  against 
-$\protect\underbrace{aa\ldots a}_\text{n \textit{a}s}$ After Simplification}
-For example,
-with our simplification the
+\subsection{Examples $(a+aa)^*$ and $(a^*\cdot a^*)^*$
+After Simplification}
+Recall the
 previous $(a^*a^*)^*$ example
 where $\simpsulz$ could not
-stop the fast growth (over
+prevent the fast growth (over
 3 million nodes just below $20$ input length);
-will be reduced to just 15 and stays constant, no matter how long the
+with our simplification the size is reduced to just 15 and stays constant no matter how long the
 input string is.
-This is demonstrated in the graphs below.
+This is shown in the graphs below.
 \begin{figure}[H]
 \begin{center}
 \begin{tabular}{ll}
@@ -482,12 +508,11 @@
 \section{Correctness of $\blexersimp$}
 In this section we give details
 of the correctness proof of $\blexersimp$,
-an important contribution of this thesis.\\
+one of the contributions of this thesis.\\
 We first introduce the rewriting relation \emph{rrewrite}
 ($\rrewrite$) between two regular expressions,
 which expresses an atomic
-simplification step from the left-hand-side
-to the right-hand-side.
+simplification.
 We then prove properties about
 this rewriting relation and its reflexive transitive closure.
 Finally we leverage these properties to show
@@ -496,7 +521,7 @@
 
 \subsection{The Rewriting Relation $\rrewrite$($\rightsquigarrow$)}
 In $\blexer$'s correctness proof, we
-did not directly derive the fact that $\blexer$ gives out the POSIX value,
+did not directly derive the fact that $\blexer$ generates the POSIX value,
 but first proved that $\blexer$ is linked with $\lexer$.
 Then we re-use
 the correctness of $\lexer$
@@ -511,8 +536,8 @@
 produces the same output as $\blexer \; r\; s$,
 and then piecing it together with 
 $\blexer$'s correctness to achieve our main
-theorem:\footnote{ the case when 
-$s$ is not in $L \; r$, is routine to establish }
+theorem:\footnote{The case when 
+$s$ is not in $L \; r$ is routine to establish.}
 \begin{center}
 	$(r, s) \rightarrow v \; \;   \textit{iff} \;\;  \blexersimp \; r \; s = v$
 \end{center}
@@ -526,7 +551,7 @@
 \end{center}
 where each rewrite step, written $\rightsquigarrow$,
 is an ``atomic'' simplification that
-cannot be broken down any further:
+is similar to a small-step reduction in operational semantics:
 \begin{figure}[H]
 \begin{mathpar}
 	\inferrule * [Right = $S\ZERO_l$]{\vspace{0em}}{_{bs} \ZERO \cdot r_2 \rightsquigarrow \ZERO\\}
@@ -592,8 +617,10 @@
 \caption{The Reflexive Transitive Closure of 
 $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$}\label{transClosure}
 \end{figure}
-Two rewritable terms will remain rewritable to each other
-even after a derivative is taken:
+%Two rewritable terms will remain rewritable to each other
+%even after a derivative is taken:
+Rewriting is preserved under derivatives,
+namely
 \begin{center}
 	$r_1 \rightsquigarrow r_2 \implies (r_1 \backslash c) \rightsquigarrow^* (r_2 \backslash c)$
 \end{center}
@@ -956,7 +983,7 @@
 
 \subsection{Comments on the Proof Techniques Used}
 Straightforward and simple as the proof may seem,
-the efforts we spent obtaining it was far from trivial.\\
+the efforts we spent obtaining it were far from trivial.\\
 We initially attempted to re-use the argument 
 in \cref{flex_retrieve}. 
 The problem was that both functions $\inj$ and $\retrieve$ require 
@@ -1007,6 +1034,16 @@
 \end{center}
 as equal, because they were both re-written
 from the same expression.\\
+The simplification rewriting rules
+given in \ref{rrewriteRules} are by no means
+final.
+One could come up with new rules
+such as 
+$\SEQ r_1 \cdot (\SEQ r_2 \cdot r_3) \rightarrow
+\SEQs [r_1, r_2, r_3]$.
+This rule does not fit with the proof technique
+of our main theorem, but seems not to violate the POSIX
+property.\\
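+To illustrate what such a sequence-flattening step might look like,
+here is a short Scala sketch, reusing the hypothetical \texttt{Rexp}
+datatype from the earlier sketch and a list-based sequence constructor
+\texttt{SEQS} that is not part of our formalisation:
+\begin{verbatim}
+// hypothetical list-based sequence constructor
+case class SEQS(rs: List[Rexp]) extends Rexp
+
+// spill out the right spine of nested sequences:
+// SEQ(r1, SEQ(r2, r3))  becomes  List(r1, r2, r3)
+def flattenSeq(r: Rexp): List[Rexp] = r match {
+  case SEQ(r1, r2) => r1 :: flattenSeq(r2)
+  case _           => List(r)
+}
+
+def simpSeq(r: Rexp): Rexp = SEQS(flattenSeq(r))
+\end{verbatim}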
 Having a correctness property is good. 
 But we would also like a guarantee that the lexer is not slow in 
 some sense, for example, not grinding to a halt regardless of the input.