diff -r 80cc6dc4c98b -r bd1354127574 ChengsongTanPhdThesis/Chapters/Bitcoded2.tex
--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Fri Dec 30 17:37:51 2022 +0000
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Fri Dec 30 23:41:44 2022 +0000
@@ -16,7 +16,7 @@
 each intermediate derivative result. This allows
 us to make $\blexer$ much more efficient.
 Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
-but their simplification functions were inefficient and in some cases needed fixing.
+but their simplification functions could have been more efficient and in some cases needed fixing.
 %We contrast our simplification function
 %with Sulzmann and Lu's, indicating the simplicity of our algorithm.
 %This is another case for the usefulness
@@ -42,7 +42,7 @@
 \end{tabular}
 \end{center}
 \noindent
-As can be seen, there are serveral duplications.
+As can be seen, there are several duplications.
 A simple-minded simplification function cannot simplify
 the third regular expression in the above chain of derivative
 regular expressions, namely
@@ -50,7 +50,7 @@
 $((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
 \end{center}
 because the duplicates are
-not next to each other and therefore the rule
+not next to each other, and therefore the rule
 $r+ r \rightarrow r$ from $\textit{simp}$ does not fire.
 One would expect a better simplification function to work in the
 following way:
@@ -182,7 +182,7 @@
 $ using Sulzmann and Lu's lexer}\label{SulzmannLuLexer}
 \end{figure}
 \noindent
-At $n= 20$ we already get an out of memory error with Scala's normal
+At $n= 20$ we already get an out-of-memory error with Scala's normal
 JVM heap size settings.
 In fact their simplification does not improve much over
 the simple-minded simplifications we have shown in \ref{fig:BetterWaterloo}.
@@ -274,7 +274,7 @@
 an entire list to open up possibilities for further simplifications
 with later regular expressions.
 Not flattening the rest of the elements also means that
-the later de-duplication processs
+the later de-duplication process
 does not fully remove further duplicates.
 For example,
 using $\textit{simp}\_{SL}$ we cannot
@@ -418,13 +418,13 @@
 \end{center}
 \noindent
 The most involved part is the $\sum$ clause, where we first call $\flts$ on
-the simplified children regular expression list $\textit{map}\; \textit{bsimp}\; \textit{as}$.
-and then call $\distinctBy$ on that list, the predicate determining whether two
+the simplified children regular expression list $\textit{map}\; \textit{bsimp}\; \textit{as}$,
+and then call $\distinctBy$ on that list. The predicate used in $\distinctBy$ for determining whether two
 elements are the same is $\rerases \; r_1 = \rerases\; r_2$.
 Finally, depending on whether the regular expression list $as'$ has turned into a
 singleton or empty list after $\flts$ and $\distinctBy$,
 $\textit{bsimp}_{ALTS}$ decides whether to keep the current level constructor $\sum$ as it is, and
-removes it when there are less than two elements:
+removes it when there are fewer than two elements:
 \begin{center}
 \begin{tabular}{lcl}
 $\textit{bsimp}_{ALTS} \; bs \; as'$ & $ \dn$ & $ as' \; \textit{match}$\\
@@ -621,7 +621,7 @@
 The rules $LT$ and $LH$ are for rewriting two regular expression lists
 such that one regular expression
 in the left-hand-side list is rewritable in one step
-to the right-hand-side's regular expression at the same position.
+to the right-hand side's regular expression at the same position.
 This helps with defining the ``context rule'' $AL$.
 The reflexive transitive closure of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$
@@ -677,7 +677,7 @@
 $r \rightsquigarrow^* r' \implies _{bs} \sum (r :: rs)\; \rightsquigarrow^*\; _{bs} \sum (r' :: rs)$
 \item
-The rewriting in many steps property is composible
+The rewriting in many steps property is composable
 in terms of the sequence constructor:\\
 $r_1 \rightsquigarrow^* r_2
 \implies _{bs} r_1 \cdot r_3 \rightsquigarrow^* \;
@@ -704,7 +704,7 @@
 by the properties $r_1 \rightsquigarrow r_2 \implies \fuse \; bs \; r_1 \implies \fuse \; bs \; r_2$
 and
 $rs_2 \stackrel{s}{\rightsquigarrow} rs_3 \implies \map \; (\fuse \; bs) rs_2 \stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs)\; rs_3$,
-which can be indutively proven by the inductive cases of $\rightsquigarrow$ and
+which can be inductively proven by the inductive cases of $\rightsquigarrow$ and
 $\stackrel{s}{\rightsquigarrow}$.
 \end{proof}
 \noindent
@@ -781,7 +781,7 @@
 For convenience,
 we define a predicate for a list of regular expressions
-having at least one nullable regular expressions:
+having at least one nullable regular expression:
 \begin{center}
 $\textit{bnullables} \; rs \quad \dn \quad \exists r \in rs. \;\; \bnullable \; r$
 \end{center}
@@ -803,7 +803,7 @@
 \end{lemma}
 \begin{proof}
 By rule induction of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$.
-The third point is a corollary of the second.
+The third point is a result of the second.
 \end{proof}
 \noindent
 For convenience again,
@@ -862,7 +862,7 @@
 (rs_1 @ (\distinctBy \; rs_2 \; \; \rerases \;\; (\map\;\; \rerases \; \; rs_1)))$
 \end{lemma}
 \noindent
-It says that that for a list made of two parts $rs_1 @ rs_2$,
+It says that for a list made of two parts $rs_1 @ rs_2$,
 one can throw away the duplicate elements in $rs_2$,
 as well as those that have appeared in $rs_1$.
 \begin{proof}
@@ -946,7 +946,7 @@
 and $\stackrel{s}{\rightsquigarrow}$, using a number of the previous lemmas.
 \end{proof}
 \noindent
-Now we can prove property 3, as an immediate corollary:
+Now we can prove property 3 as an immediate corollary:
 \begin{corollary}\label{rewritesBder}
 $r_1 \rightsquigarrow^* r_2 \implies
 r_1 \backslash c \rightsquigarrow^* r_2 \backslash c$
@@ -983,7 +983,7 @@
 $\bnullable \; (a \backslash s) \implies
 \bmkeps \; (a \backslash s) = \bmkeps \; (\bderssimp{a}{s})$
 \end{center}
-Now that they generate the same bits, we know that they give the same value after decoding.
+Now that they generate the same bits, we know they also give the same value after decoding.
 \begin{center}
 $\bnullable \; (a \backslash s) \implies
 \decode \; r \; (\bmkeps \; (a \backslash s)) =
@@ -1011,7 +1011,7 @@
 \end{corollary}

 \subsection{Comments on the Proof Techniques Used}
-Straightforward and simple as the proof may seem,
+Straightforward as the proof may seem,
 the efforts we spent obtaining it were far from trivial.
 We initially attempted to re-use the argument
 in \cref{flex_retrieve}.
@@ -1060,13 +1060,13 @@
 $_{Z}(_{Z} \ONE + _{S} c)$
 \end{center}
-as equal, because they were both re-written
+as equal because they were both re-written
 from the same expression.

 The simplification rewriting rules
 given in \ref{rrewriteRules} are by no means
-final,
-one could come up new rules
+final;
+one could come up with new rules
 such as
-$\SEQ r_1 \cdot (\SEQ r_1 \cdot r_3) \rightarrow
+$\SEQ r_1 \cdot (\SEQ r_2 \cdot r_3) \rightarrow
 \SEQs [r_1, r_2, r_3]$.
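As an illustration of the simplification pipeline touched by the hunk at lines 418ff — the $\sum$ clause of $\textit{bsimp}$ simplifies the children, flattens nested alternatives with $\flts$, de-duplicates with $\distinctBy$ keyed on the erased (bit-free) form, and finally applies the smart constructor $\textit{bsimp}_{ALTS}$, which removes the alternative constructor when fewer than two children remain — the following is a minimal, self-contained Scala sketch. It is not the thesis's actual code: the ADT and helper names (ARexp, AALTS, fuse, flts, erase, bsimpALTS) are modelled on the thesis's definitions, but bits are represented here as plain Booleans rather than Z/S codes, erase produces a string key for distinctBy rather than an unannotated regular expression, and the sequence clause is abbreviated.

// a cut-down ADT for bitcoded (annotated) regular expressions;
// bits are Booleans here purely for illustration
sealed abstract class ARexp
case object AZERO extends ARexp
case class AONE(bs: List[Boolean]) extends ARexp
case class ACHAR(bs: List[Boolean], c: Char) extends ARexp
case class AALTS(bs: List[Boolean], rs: List[ARexp]) extends ARexp
case class ASEQ(bs: List[Boolean], r1: ARexp, r2: ARexp) extends ARexp

// fuse: attach bits to the front of an annotated regular expression
def fuse(bs: List[Boolean], r: ARexp): ARexp = r match {
  case AZERO            => AZERO
  case AONE(cs)         => AONE(bs ++ cs)
  case ACHAR(cs, c)     => ACHAR(bs ++ cs, c)
  case AALTS(cs, rs)    => AALTS(bs ++ cs, rs)
  case ASEQ(cs, r1, r2) => ASEQ(bs ++ cs, r1, r2)
}

// erase: a string key standing in for the thesis's bit-free erasure,
// used as the de-duplication predicate
def erase(r: ARexp): String = r match {
  case AZERO           => "0"
  case AONE(_)         => "1"
  case ACHAR(_, c)     => c.toString
  case AALTS(_, rs)    => rs.map(erase).mkString("(", "+", ")")
  case ASEQ(_, r1, r2) => "(" + erase(r1) + "~" + erase(r2) + ")"
}

// flts: drop AZEROs and flatten nested alternatives,
// fusing the inner bits onto the spilled-out children
def flts(rs: List[ARexp]): List[ARexp] = rs match {
  case Nil                  => Nil
  case AZERO :: tl          => flts(tl)
  case AALTS(bs, rs1) :: tl => rs1.map(fuse(bs, _)) ::: flts(tl)
  case r :: tl              => r :: flts(tl)
}

// bsimpALTS: the smart constructor that removes the alternative
// constructor when fewer than two children remain
def bsimpALTS(bs: List[Boolean], rs: List[ARexp]): ARexp = rs match {
  case Nil      => AZERO
  case r :: Nil => fuse(bs, r)
  case _        => AALTS(bs, rs)
}

// bsimp: the Sum clause simplifies children, flattens, de-duplicates
// by erased form, then applies the smart constructor; the sequence
// clause is an abbreviated stand-in for the thesis's bsimp_ASEQ
def bsimp(r: ARexp): ARexp = r match {
  case AALTS(bs, rs)    => bsimpALTS(bs, flts(rs.map(bsimp)).distinctBy(erase))
  case ASEQ(bs, r1, r2) => (bsimp(r1), bsimp(r2)) match {
    case (AZERO, _)     => AZERO
    case (_, AZERO)     => AZERO
    case (AONE(cs), s2) => fuse(bs ++ cs, s2)
    case (s1, s2)       => ASEQ(bs, s1, s2)
  }
  case _ => r
}

For instance, bsimp(AALTS(Nil, List(ACHAR(Nil, 'a'), ACHAR(Nil, 'a')))) returns ACHAR(Nil, 'a'): distinctBy removes the duplicate alternative and bsimpALTS dissolves the now-singleton $\sum$, mirroring the $r + r \rightarrow r$ behaviour the chapter asks of a better simplification function; note that distinctBy keeps the first occurrence, so the bits of later duplicates are discarded, as in the thesis.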