diff -r 3e19073e91f4 -r 4aabb0629e4b ChengsongTanPhdThesis/Chapters/Bitcoded2.tex --- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Mon Aug 22 17:58:29 2022 +0100 +++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Tue Aug 23 14:21:13 2022 +0100 @@ -11,8 +11,20 @@ -Now we introduce the simplifications, which is why we introduce the -bitcodes in the first place. +In this chapter we introduce the simplifications +on annotated regular expressions that can be applied to +each intermediate derivative result. This allows +us to make $\blexer$ much more efficient. +We contrast this simplification function +with Sulzmann and Lu's original +simplifications, indicating the simplicity of our algorithm and +improvements we made, demostrating +the usefulness and reliability of formal proofs on algorithms. +These ``aggressive'' simplifications would not be possible in the injection-based +lexing we introduced in chapter \ref{Inj}. +We then go on to prove the correctness with the improved version of +$\blexer$, called $\blexersimp$, by establishing +$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system. \section{Simplification for Annotated Regular Expressions} The first thing we notice in the fast growth of examples such as $(a^*a^*)^*$'s @@ -21,14 +33,15 @@ de-duplication at different levels: \begin{center} $(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^* \rightarrow $\\ - $((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$ + $((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^* \rightarrow \ldots$ \end{center} \noindent As we have already mentioned in \ref{eqn:growth2}, a simple-minded simplification function cannot simplify +\begin{center} $((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$ -any further. -we would expect a better simplification function to work in the +\end{center} +any further, one would expect a better simplification function to work in the following way: \begin{gather*} ((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* + @@ -40,7 +53,10 @@ \bigg\downarrow \\ (a^*a^* + a^* )\cdot(a^*a^*)^* - \color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^* + \color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^*\\ + \bigg\downarrow \\ + (a^*a^* + a^* + )\cdot(a^*a^*)^* \end{gather*} \noindent This motivating example came from testing Sulzmann and Lu's @@ -82,16 +98,6 @@ $\zeroable\;_{bs}\ZERO$ & $\dn$ & $\textit{true}$ \end{tabular} \end{center} -%\[ -%\begin{aligned} -%&\textit{isPhi} \left(b s @ r i^{*}\right)=\text { False } \\ -%&\textit{isPhi}\left(b s @ r i_{1} r i_{2}\right)=\textit{isPhi} r i_{1} \vee \textit{isPhi} r i_{2} \\ -%&\textit{isPhi} \left(b s @ r i_{1} \oplus r i_{2}\right)=\textit{isPhi} r i_{1} \wedge \textit{isPhi} r i_{2} \\ -%&\textit{isPhi}(b s @ l)= False \\ -%&\textit{isPhi}(b s @ \epsilon)= False \\ -%&\textit{isPhi} \; \ZERO = True -%\end{aligned} -%\] \noindent Despite that we have already implemented the simple-minded simplifications such as throwing away useless $\ONE$s and $\ZERO$s.