lexing: comparison ChengsongTanPhdThesis/Chapters/Bitcoded2.tex

equal deleted inserted replaced

-:3e19073e91f4
+:4aabb0629e4b
 %its correctness proof in
 %Chapter 3\ref{Chapter3}.
-Now we introduce the simplifications, which is why we introduce the
+In this chapter we introduce the simplifications
-bitcodes in the first place.
+on annotated regular expressions that can be applied to
+each intermediate derivative result. This allows
+us to make $\blexer$ much more efficient.
+We contrast this simplification function
+with Sulzmann and Lu's original
+simplifications, indicating the simplicity of our algorithm and
+improvements we made, demostrating
+the usefulness and reliability of formal proofs on algorithms.
+These ``aggressive'' simplifications would not be possible in the injection-based
+lexing we introduced in chapter \ref{Inj}.
+We then go on to prove the correctness with the improved version of
+$\blexer$, called $\blexersimp$, by establishing
+$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
 \section{Simplification for Annotated Regular Expressions}
 The first thing we notice in the fast growth of examples such as $(a^*a^*)^*$'s
 and $(a^* + (aa)^*)^*$'s derivatives is that a lot of duplicated sub-patterns
 are scattered around different levels, and therefore requires
 de-duplication at different levels:
 \begin{center}
 	$(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^* \rightarrow $\\
-	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
+	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^* \rightarrow \ldots$
 \end{center}
 \noindent
 As we have already mentioned in \ref{eqn:growth2},
 a simple-minded simplification function cannot simplify
+\begin{center}
 $((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
-any further.
+\end{center}
-we would expect a better simplification function to work in the
+any further, one would expect a better simplification function to work in the
 following way:
 \begin{gather*}
 	((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* +
 	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this}.\\
 	\bigg\downarrow \\
 	\color{gray} + a^* \color{black})\cdot(a^*a^*)^* +
 	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this} \\
 	\bigg\downarrow \\
 	(a^*a^* + a^*
 	)\cdot(a^*a^*)^*
-	\color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^*
+	\color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^*\\
+	\bigg\downarrow \\
+	(a^*a^* + a^*
+	)\cdot(a^*a^*)^*
 \end{gather*}
 \noindent
 This motivating example came from testing Sulzmann and Lu's
 algorithm: their simplification does
 not work!
 		$\zeroable\;_{bs}c$ & $\dn$ & $\textit{false}$\\
 		$\zeroable\;_{bs}\ONE$ & $\dn$ & $\textit{false}$\\
 		$\zeroable\;_{bs}\ZERO$ & $\dn$ & $\textit{true}$
 	\end{tabular}
 \end{center}
-%\[
-%\begin{aligned}
-%&\textit{isPhi} \left(b s @ r i^{*}\right)=\text { False } \\
-%&\textit{isPhi}\left(b s @ r i_{1} r i_{2}\right)=\textit{isPhi}  r i_{1} \vee \textit{isPhi}  r i_{2} \\
-%&\textit{isPhi} \left(b s @ r i_{1} \oplus r i_{2}\right)=\textit{isPhi}  r i_{1} \wedge \textit{isPhi}  r i_{2} \\
-%&\textit{isPhi}(b s @ l)= False \\
-%&\textit{isPhi}(b s @ \epsilon)= False \\
-%&\textit{isPhi} \; \ZERO = True
-%\end{aligned}
-%\]
 \noindent
 Despite that we have already implemented the simple-minded simplifications
 such as throwing away useless $\ONE$s and $\ZERO$s.
 The simplification rule $r + r \rightarrow $ cannot make a difference either
 since it only removes duplicates on the same level, not something like

changeset 583	4aabb0629e4b
parent 582	3e19073e91f4
child 584	1734bd5975a3