ChengsongTanPhdThesis/Chapters/Bitcoded2.tex
changeset 543 b2bea5968b89
parent 539 7cf9f17aa179
child 576 3e1b699696b6


Now we introduce the simplifications, which are the reason why we introduced
the bitcodes in the first place.

\section{Simplification for Annotated Regular Expressions}

The first thing we notice in the fast growth of the derivatives of examples such as
$(a^*a^*)^*$ and $(a^* + (aa)^*)^*$ is that a lot of duplicated sub-patterns
are scattered around different levels:

\begin{center}
	$(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^* \rightarrow$\\
	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
\end{center}
\noindent
This happens despite the fact that we have already implemented simple-minded simplifications
such as throwing away useless $\ONE$s and $\ZERO$s.
The simplification rule $r + r \rightarrow r$ cannot make a difference either,
since it only removes duplicates on the same level, not duplicates in nested
alternatives as in $(r+a)+r \rightarrow r+a$.
This requires us to break up nested alternatives into lists, for example
using a flatten operation similar to the one defined for the list datatype
of functional languages:

 \begin{center}
  \begin{tabular}{@{}lcl@{}}
  $\textit{flts} \; ((_{bs}\sum \textit{as}) :: \textit{as'})$ & $\dn$ & $(\textit{map} \;
     (\textit{fuse}\;bs)\; \textit{as}) \; @ \; \textit{flts} \; \textit{as'} $ \\
  $\textit{flts} \; (\ZERO :: \textit{as'})$ & $\dn$ & $ \textit{flts} \;  \textit{as'} $ \\
  $\textit{flts} \; (a :: \textit{as'})$ & $\dn$ & $a :: \textit{flts} \; \textit{as'}$ \quad(otherwise)
\end{tabular}
\end{center}

\noindent
There is a minor difference though: unlike the usual flatten function, our $\flts$
also throws away $\ZERO$s.
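
\noindent
To make the definition more concrete, below is a small Scala sketch of $\flts$. The
constructor names (\texttt{AZERO}, \texttt{AALTS}) and the function \texttt{fuse} are
assumptions about our implementation of annotated regular expressions and are only
illustrative here:
\begin{verbatim}
// a sketch of flts: opens up nested alternatives and drops ZEROs;
// fuse(bs, a) is assumed to prepend the bit-sequence bs to the
// top-most annotation of the annotated regular expression a
def flts(as: List[ARexp]): List[ARexp] = as match {
  case Nil => Nil
  case AZERO :: rest => flts(rest)
  case AALTS(bs, as1) :: rest => as1.map(fuse(bs, _)) ::: flts(rest)
  case a :: rest => a :: flts(rest)
}
\end{verbatim}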
       
For a flattened list of regular expressions, a de-duplication can be done efficiently:

\begin{center}
	\begin{tabular}{@{}lcl@{}}
		$\distinctBy \; [] \; f\; acc $ & $\dn$ & $ []$\\
		$\distinctBy \; (x :: xs) \; f \; acc$ & $\dn$ & \\
						       & & $\quad \textit{if} \; (f \; x) \in acc \; \textit{then} \; \distinctBy \; xs \; f \; acc$\\
						       & & $\quad \textit{else} \; x :: (\distinctBy \; xs \; f \; (\{f \; x\} \cup acc))$
	\end{tabular}
\end{center}
\noindent
The reason we define a de-duplication function parameterised by a mapping $f$ is
that we want to eliminate regular expressions that are syntactically the same
but carry different bit-codes (more on why we can do this later).
To achieve this, we pass $\erase$ as the function $f$ to the de-duplication
operation.
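
\noindent
For reference, a Scala sketch of $\distinctBy$ is given below; the names are ours and
only illustrative, and we use a \texttt{Set} for the accumulator as in the mathematical
definition above:
\begin{verbatim}
// keeps the first copy of each element, where two elements count as
// equal when they are mapped to the same value by f (for us: erase)
def distinctBy[B, C](xs: List[B], f: B => C, acc: Set[C] = Set()): List[B] =
  xs match {
    case Nil => Nil
    case x :: rest =>
      if (acc.contains(f(x))) distinctBy(rest, f, acc)
      else x :: distinctBy(rest, f, acc + f(x))
  }
\end{verbatim}
\noindent
A typical call is then \texttt{distinctBy(rs, erase)}, which removes all but the first
of any regular expressions in \texttt{rs} that are equal after erasing their bit-codes.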
       
A recursive definition of our simplification function
that looks somewhat similar to our Scala code is given below:

\begin{center}
  \begin{tabular}{@{}lcl@{}}
	  $\textit{bsimp} \; (_{bs}a_1\cdot a_2)$ & $\dn$ & $ \textit{bsimp}_{ASEQ} \; bs \;(\textit{bsimp} \; a_1) \; (\textit{bsimp}  \; a_2)  $ \\
	  $\textit{bsimp} \; (_{bs}\sum \textit{as})$ & $\dn$ & $\textit{bsimp}_{ALTS} \; \textit{bs} \; (\textit{distinctBy} \; ( \flts ( \textit{map} \; \textit{bsimp} \; as)) \; \erase \; \phi) $ \\
	  $\textit{bsimp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$
\end{tabular}
\end{center}

\noindent
The simplification function (named $\bsimp$ for \emph{b}it-coded) performs a pattern match on the regular expression.
When it detects that the regular expression is an alternative or a
sequence, it first simplifies its children regular expressions
recursively and then checks whether one of the children has turned into $\ZERO$ or
$\ONE$, which might trigger further simplification at the current level.
Current-level simplifications for sequences are handled by the function $\textit{bsimp}_{ASEQ}$,
using rules such as $\ZERO \cdot r \rightarrow \ZERO$ and $\ONE \cdot r \rightarrow r$:
\begin{center}
	\begin{tabular}{@{}lcl@{}}
		$\textit{bsimp}_{ASEQ} \; bs\; a \; b$ & $\dn$ & $ (a,\; b) \;\textit{match}$\\
   &&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \;  (_{bs_1}\ONE, a_2') \Rightarrow  \textit{fuse} \; (bs@bs_1) \;  a_2'$ \\
   &&$\quad\textit{case} \; (a_1', a_2') \Rightarrow   _{bs}a_1' \cdot a_2'$
	\end{tabular}
\end{center}
\noindent
The most involved part is the $\sum$ clause, where we first call $\flts$ on
the simplified children regular expression list $\textit{map}\; \textit{bsimp}\; \textit{as}$
and then call $\distinctBy$ on that list; the predicate determining whether two
elements are the same is $\erase \; r_1 = \erase\; r_2$.
Finally, depending on whether the regular expression list $as'$ has turned into a
singleton or empty list after $\flts$ and $\distinctBy$, $\textit{bsimp}_{ALTS}$
decides whether to keep the current level constructor $\sum$ as it is, and
removes it when there are fewer than two elements:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{bsimp}_{ALTS} \; bs \; as'$ & $ \dn$ & $ as' \; \textit{match}$\\
   &&$\quad\textit{case} \; [] \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \; a :: [] \Rightarrow  \textit{fuse} \; bs \; a$ \\
   &&$\quad\textit{case} \;  as' \Rightarrow _{bs}\sum \textit{as'}$\\
	\end{tabular}
\end{center}
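
\noindent
The three functions above correspond to the following Scala sketch (again, constructor
names such as \texttt{ASEQ}, \texttt{AALTS}, \texttt{AONE}, \texttt{AZERO}, the type of
bit-codes \texttt{Bits} and the function \texttt{fuse} are assumptions about our
implementation and may differ in detail):
\begin{verbatim}
def bsimpASEQ(bs: Bits, a1: ARexp, a2: ARexp): ARexp = (a1, a2) match {
  case (AZERO, _)       => AZERO                 // 0 . r  ->  0
  case (_, AZERO)       => AZERO                 // r . 0  ->  0
  case (AONE(bs1), a2s) => fuse(bs ++ bs1, a2s)  // 1 . r  ->  r (bits fused)
  case (a1s, a2s)       => ASEQ(bs, a1s, a2s)
}

def bsimpALTS(bs: Bits, as: List[ARexp]): ARexp = as match {
  case Nil      => AZERO                         // empty alternative
  case a :: Nil => fuse(bs, a)                   // singleton alternative
  case _        => AALTS(bs, as)
}

def bsimp(a: ARexp): ARexp = a match {
  case ASEQ(bs, a1, a2) => bsimpASEQ(bs, bsimp(a1), bsimp(a2))
  case AALTS(bs, as)    => bsimpALTS(bs, distinctBy(flts(as.map(bsimp)), erase))
  case _                => a
}
\end{verbatim}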

Having defined the $\bsimp$ function,
we add it as a phase after a derivative is taken;
in this way the derivative stays small:
\begin{center}
	\begin{tabular}{lcl}
		$r \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(r \backslash c)$
	\end{tabular}
\end{center}
Following the previous notation of the natural
extension from the derivative w.r.t.~a character to the derivative
w.r.t.~a string, we define the derivative that interleaves simplifications with derivatives:

\begin{center}
\begin{tabular}{lcl}
$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
\end{tabular}
\end{center}

\noindent
Extracting bit-codes from the derivatives that have been simplified repeatedly after
each derivative step, the simplified $\blexer$, called $\blexersimp$,
is defined as
 \begin{center}
\begin{tabular}{lcl}
  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
      $\textit{let}\;a = (r^\uparrow)\backslash_{bsimps}\, s\;\textit{in}$\\
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
  & & $\;\;\textit{else}\;\textit{None}$
\end{tabular}
\end{center}
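
\noindent
As a rough Scala sketch, the simplified lexer then looks as follows; \texttt{bder},
\texttt{bnullable}, \texttt{bmkeps}, \texttt{decode} and \texttt{internalise} are assumed
to be the functions of the unsimplified bit-coded lexer from the previous chapter, and the
branches are assumed to mirror those of $\blexer$:
\begin{verbatim}
// derivative w.r.t. a string, simplifying after every character
def bdersSimp(a: ARexp, s: List[Char]): ARexp = s match {
  case Nil     => a
  case c :: cs => bdersSimp(bsimp(bder(c, a)), cs)
}

def blexerSimp(r: Rexp, s: String): Option[Val] = {
  val a = bdersSimp(internalise(r), s.toList)
  if (bnullable(a)) Some(decode(bmkeps(a), r)) else None
}
\end{verbatim}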

\noindent
This algorithm keeps the regular expression size small: for example,
with this simplification our previous $(a + aa)^*$ example's fast growth (over
$10^5$ nodes at an input length of around $20$)
is reduced to just 17 nodes, and the size stays constant no matter how long the
input string is.
We show some graphs to better demonstrate this improvement.


\section{$(a+aa)^*$ and $(a^*\cdot a^*)^*$ against $\protect\underbrace{aa\ldots a}_\text{n \textit{a}s}$ After Simplification}
For $(a+aa)^*$, the derivatives used to grow to over $9000$ nodes with the simple-minded simplification
at an input length of only around $15$ characters:
\begin{center}
\begin{tabular}{ll}
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={derivative size},
        width=7cm,
    height=4cm, 
    legend entries={Lexer with $\textit{bsimp}$},  
    legend pos=  south east,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {a_aa_star_bsimp.data};
\end{axis}
\end{tikzpicture} %\label{fig:BitcodedLexer}
&
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={derivative size},
    width = 7cm,
    height = 4cm,
    legend entries={Lexer without $\textit{bsimp}$},  
    legend pos=  north west,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {a_aa_star_easysimp.data};
\end{axis}
\end{tikzpicture} 
\end{tabular}
\end{center}
And for $(a^*\cdot a^*)^*$, unlike in Figure~\ref{fig:BetterWaterloo}, the size with simplification now stays nicely constant with the current
simplification rules (we put the graphs together to show the contrast):
\begin{center}
\begin{tabular}{ll}
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={derivative size},
        width=7cm,
    height=4cm, 
    legend entries={Lexer with $\textit{bsimp}$},  
    legend pos=  south east,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {BitcodedLexer.data};
\end{axis}
\end{tikzpicture} %\label{fig:BitcodedLexer}
&
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={derivative size},
    width = 7cm,
    height = 4cm,
    legend entries={Lexer without $\textit{bsimp}$},  
    legend pos=  north west,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {BetterWaterloo.data};
\end{axis}
\end{tikzpicture} 
\end{tabular}
\end{center}

%----------------------------------------------------------------------------------------
%	SECTION rewrite relation
%----------------------------------------------------------------------------------------
\section{The Rewriting Relation $\rrewrite$ ($\rightsquigarrow$)}
The overall idea for proving the correctness conjecture
\begin{conjecture}
	$\blexersimp \; r \; s = \blexer \; r\; s$
\end{conjecture}
\noindent
is that $\textit{bsimp}$ does not change the regular expressions so much that
it becomes impossible to extract the $\POSIX$ values.
To capture this ``similarity'' between unsimplified regular expressions and simplified ones,
we devise the rewriting relation $\rrewrite$, written infix as $\rightsquigarrow$:

   170 \end{equation}
   231 \begin{center}
   171 This property means we do not have to repeatedly
   232 \begin{mathpar}
   172 apply simplification in each step, which justifies
   233 	\inferrule{}{_{bs} \ZERO \cdot r_2 \rightsquigarrow \ZERO\\}
   173 our definition of $\blexersimp$.
   234 
   174 It will also be useful in future proofs where properties such as 
   235 	\inferrule{}{_{bs} r_1 \cdot \ZERO \rightsquigarrow \ZERO\\}
   175 closed forms are needed.
   236 
   176 The proof is by structural induction on $r$.
   237 	\inferrule{}{_{bs1} ((_{bs2} \ONE) \cdot r) \rightsquigarrow \fuse \; (bs_1 @ bs_2) \; r\\}\\
   177 
   238 
   178 %-----------------------------------
   239 	
   179 %	SUBSECTION 
   240 	
   180 %-----------------------------------
   241 	\inferrule{\\ r_1 \rightsquigarrow r_2}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_2 \cdot r_3\\}
   181 \subsection{Syntactic Equivalence Under $\simp$}
   242 
   182 We prove that minor differences can be annhilated
   243 	\inferrule{\\ r_3 \rightsquigarrow r_4}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_1 \cdot r_4\\}\\
   183 by $\simp$.
   244 
   184 For example,
   245 	\inferrule{}{ _{bs}\sum [] \rightsquigarrow \ZERO\\}
   185 \begin{center}
   246 
   186 $\simp \;(\simpALTs\; (\map \;(\_\backslash \; x)\; (\distinct \; \mathit{rs}\; \phi))) = 
   247 	\inferrule{}{ _{bs}\sum [a] \rightsquigarrow \fuse \; bs \; a\\}
   187  \simp \;(\simpALTs \;(\distinct \;(\map \;(\_ \backslash\; x) \; \mathit{rs}) \; \phi))$
   248 
   188 \end{center}
   249 	\inferrule{\\ rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{_{bs}\sum rs_1 \rightsquigarrow rs_2\\}
   189 
   250 
   190 
   251 	\inferrule{}{\\ [] \stackrel{s}{\rightsquigarrow} []}
   191 
   252 
   192 
   253 	\inferrule{rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{\\ r :: rs_1 \rightsquigarrow r :: rs_2 \rightsquigarrow}
   193 
   254 
   194 
   255 	\inferrule{r_1 \rightsquigarrow r_2}{ r_1 :: rs \stackrel{s}{\rightsquigarrow} r_2 :: rs\\}
   195 
   256 
   196 %----------------------------------------------------------------------------------------
   257 	\inferrule{}{\ZERO :: rs \stackrel{s}{\rightsquigarrow} rs}
   197 %	SECTION corretness proof
   258 
   198 %----------------------------------------------------------------------------------------
   259 	\inferrule{}{_{bs} \sum (rs_1 :: rs_b) \stackrel{s}{\rightsquigarrow} ((\map \; (\fuse \; bs_1) \; rs_1) @ rsb) }
   199 \section{Proof Technique of Correctness of Bit-coded Algorithm with Simplification}
   260 
       
   261 	\inferrule{\\ \rerase{a_1} = \rerase{a_2}}{rs_a @ [a_1] @ rs_b @ [a_2] @ rsc \stackrel{s}{\rightsquigarrow} rs_a @ [a_1] @ rs_b @ rs_c}
       
   262 
       
   263 \end{mathpar}
       
   264 \end{center}
       
These ``rewrite'' steps define the atomic simplifications we can impose on regular expressions
under our simplification algorithm.
For convenience, we define a relation $\stackrel{s}{\rightsquigarrow}$ for rewriting
a list of regular expressions to another list.
The $\rerase{}$ function is used instead of $_\downarrow$ for the finiteness bound proof of the next chapter,
which we will discuss later. For the moment the reader can assume it does basically the same thing as $\erase$.

Usually more than one rewrite step takes place during the simplification of a regular expression;
therefore we define the reflexive transitive closures of the $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$
relations:

\begin{center}
\begin{mathpar}
	\inferrule{}{ r \stackrel{*}{\rightsquigarrow} r \\}

	\inferrule{}{\\ rs \stackrel{s*}{\rightsquigarrow} rs \\}

	\inferrule{\\r_1 \stackrel{*}{\rightsquigarrow}  r_2 \quad r_2 \rightsquigarrow r_3}{r_1 \stackrel{*}{\rightsquigarrow} r_3\\}

	\inferrule{\\rs_1 \stackrel{s*}{\rightsquigarrow}  rs_2 \quad rs_2 \stackrel{s}{\rightsquigarrow} rs_3}{rs_1 \stackrel{s*}{\rightsquigarrow} rs_3}
\end{mathpar}
\end{center}
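
\noindent
To see how these rules interact, consider the following small, made-up example, where
$_{bs_1}c$ stands for a character regular expression annotated with the bit-code $bs_1$:
\begin{center}
	$_{bs}\sum \, [\ZERO,\; _{bs_1}c,\; _{bs_2}c]
	\;\rightsquigarrow\; _{bs}\sum \, [_{bs_1}c,\; _{bs_2}c]
	\;\rightsquigarrow\; _{bs}\sum \, [_{bs_1}c]
	\;\rightsquigarrow\; \fuse \; bs \; (_{bs_1}c)$
\end{center}
\noindent
The first two steps remove the $\ZERO$ and the later duplicate $_{bs_2}c$ (whose erased
form equals that of $_{bs_1}c$) with the list rules, lifted to the alternative by the
congruence rule for $\sum$; the last step removes the now-redundant $\sum$ constructor.
Chaining these steps gives an instance of $\stackrel{*}{\rightsquigarrow}$.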
       
Now that we have modelled the rewriting behaviour of our simplifier $\bsimp$, we prove mainly
three properties about how these relations connect to $\blexersimp$:
\begin{itemize}
	\item
		$r \stackrel{*}{\rightsquigarrow} \bsimp{r}$:
		the algorithm $\bsimp$ only transforms the regular expression $r$ using steps specified by
		$\stackrel{*}{\rightsquigarrow}$ and nothing else.
	\item
		$r \stackrel{*}{\rightsquigarrow} r' \implies r \backslash c \stackrel{*}{\rightsquigarrow} r'\backslash c$:
		the relation $\stackrel{*}{\rightsquigarrow}$ is preserved under derivatives.
	\item
		$(r \stackrel{*}{\rightsquigarrow} r'\land \bnullable \; r) \implies \bmkeps \; r = \bmkeps \; r'$: if we can reach another
		expression in finitely many atomic simplification steps, then the two regular expressions
		produce the same bit-codes under the bit collection function $\bmkeps$ used by our $\blexer$.

\end{itemize}
       
\section{Three Important Properties}
These properties work together towards the correctness theorem.
We start proving each of these lemmas below.
\subsection{$(r \stackrel{*}{\rightsquigarrow} r'\land \bnullable \; r) \implies \bmkeps \; r = \bmkeps \; r'$ and $r \stackrel{*}{\rightsquigarrow} \bsimp{r}$}
The first few properties we establish are that the inference rules we gave for $\rightsquigarrow$
and $\stackrel{s}{\rightsquigarrow}$ also hold as implications for $\stackrel{*}{\rightsquigarrow}$ and
$\stackrel{s*}{\rightsquigarrow}$.
       
\begin{lemma}
	$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies _{bs} \sum rs_1 \stackrel{*}{\rightsquigarrow} _{bs} \sum rs_2$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{s*}{\rightsquigarrow}$.
\end{proof}
\begin{lemma}
	$r \stackrel{*}{\rightsquigarrow} r' \implies _{bs} \sum (r :: rs) \stackrel{*}{\rightsquigarrow}  _{bs} \sum (r' :: rs)$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{*}{\rightsquigarrow}$.
\end{proof}
\noindent
Then we establish that the $\stackrel{s}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$ relations are preserved w.r.t.\ appending and prepending of a list:
\begin{lemma}
	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies rs @ rs_1 \stackrel{s}{\rightsquigarrow} rs @ rs_2$
\end{lemma}
\begin{proof}
	By induction on the list $rs$.
\end{proof}

\begin{lemma}
	$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies rs @ rs_1 \stackrel{s*}{\rightsquigarrow} rs @ rs_2$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{s*}{\rightsquigarrow}$.
\end{proof}

\noindent
The $\stackrel{s}{\rightsquigarrow}$ relation, after appending a list, becomes $\stackrel{s*}{\rightsquigarrow}$:
\begin{lemma}\label{ssgqTossgs}
	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{s}{\rightsquigarrow}$.
\end{proof}
\begin{lemma}
	$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{s*}{\rightsquigarrow}$ and using \ref{ssgqTossgs}.
\end{proof}
       
Here are two lemmas relating $\stackrel{*}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$:
\begin{lemma}\label{singleton}
	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies [r_1] \stackrel{s*}{\rightsquigarrow} [r_2]$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{*}{\rightsquigarrow}$.
\end{proof}
\begin{lemma}
	$rs_3 \stackrel{s*}{\rightsquigarrow} rs_4 \land r_1 \stackrel{*}{\rightsquigarrow}  r_2 \implies
	r_1 :: rs_3 \stackrel{s*}{\rightsquigarrow} r_2 :: rs_4$
\end{lemma}
\begin{proof}
	By using \ref{singleton}.
\end{proof}
Now we get to the ``meaty'' part of the proof, which relates the relations $\stackrel{s*}{\rightsquigarrow}$ and
$\stackrel{*}{\rightsquigarrow}$ with our simplification components such as $\distinctBy$ and $\flts$.
The first lemma below says that for a list made of two parts $rs_1 @ rs_2$, one can throw away the duplicate
elements in $rs_2$, as well as those that have appeared in $rs_1$:
\begin{lemma}
	$rs_1 @ rs_2 \stackrel{s*}{\rightsquigarrow} rs_1 @ (\distinctBy \; rs_2 \; \rerase{\_} \; (\map\; \rerase{\_}\; rs_1))$
\end{lemma}
\begin{proof}
	By induction on $rs_2$, where $rs_1$ is allowed to be arbitrary.
\end{proof}
The above has a corollary that matches the way $\distinctBy$ is actually called in $\bsimp$:
\begin{lemma}\label{dBPreserves}
	$rs_1 \stackrel{s*}{\rightsquigarrow} \distinctBy \; rs_1 \; \rerase{\_} \; \phi$
\end{lemma}

       
The flatten function $\flts$ also stays within the $\stackrel{s*}{\rightsquigarrow}$ relation:
\begin{lemma}\label{fltsPreserves}
	$rs \stackrel{s*}{\rightsquigarrow} \flts \; rs$
\end{lemma}

The rewriting-in-many-steps property is composable in terms of the regular expression constructors:
\begin{lemma}
	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies _{bs} r_1 \cdot r_3 \stackrel{*}{\rightsquigarrow} \;  _{bs} r_2 \cdot r_3 \quad $ and
	$r_3 \stackrel{*}{\rightsquigarrow} r_4 \implies _{bs} r_1 \cdot r_3 \stackrel{*}{\rightsquigarrow} \; _{bs} r_1 \cdot r_4$
\end{lemma}

The rewriting-in-many-steps properties $\stackrel{*}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$ are preserved under the function $\fuse$:
\begin{lemma}
	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies \fuse \; bs \; r_1 \stackrel{*}{\rightsquigarrow} \; \fuse \; bs \; r_2 \quad $ and
	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies \map \; (\fuse \; bs) \; rs_1 \stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs) \; rs_2$
\end{lemma}
\begin{proof}
	By the properties $r_1 \rightsquigarrow r_2 \implies \fuse \; bs \; r_1 \rightsquigarrow \fuse \; bs \; r_2$ and
	$rs_2 \stackrel{s}{\rightsquigarrow} rs_3 \implies \map \; (\fuse \; bs) \; rs_2 \stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs)\; rs_3$.
\end{proof}
\noindent
If we can rewrite a regular expression in many steps to $\ZERO$, then
we can also rewrite any sequence containing it to $\ZERO$:
\begin{lemma}
	$r_1 \stackrel{*}{\rightsquigarrow} \ZERO \implies _{bs}r_1\cdot r_2 \stackrel{*}{\rightsquigarrow} \ZERO$
\end{lemma}
       
\begin{lemma}
	$\bnullable \; (r \backslash s) \implies \bmkeps \; (r \backslash s) = \bmkeps \; (\bderssimp{r}{s})$
\end{lemma}
       
\noindent
The function $\bsimpalts$ preserves rewritability:
\begin{lemma}\label{bsimpaltsPreserves}
	$_{bs} \sum rs \stackrel{*}{\rightsquigarrow} \bsimpalts \; bs \; rs$
\end{lemma}
\noindent
Before we give the next lemmas, we define a predicate for a list of regular expressions
containing at least one nullable regular expression:
\begin{definition}
	$\textit{bnullables} \; rs \dn  \exists r \in rs.\; \bnullable\; r$
\end{definition}
The rewriting relation $\rightsquigarrow$ preserves nullability:
\begin{lemma}
	$r_1 \rightsquigarrow r_2 \implies  \bnullable \; r_1 = \bnullable \; r_2$ and
	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies \textit{bnullables} \; rs_1 = \textit{bnullables} \; rs_2$
\end{lemma}
\begin{proof}
	By rule induction of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$.
\end{proof}
So does the many-steps rewriting:
\begin{lemma}\label{rewritesBnullable}
	$r_1 \stackrel{*}{\rightsquigarrow}  r_2 \implies \bnullable \; r_1 = \bnullable \; r_2$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{*}{\rightsquigarrow}$.
\end{proof}
\noindent
And if both regular expressions in a rewriting relation are nullable, then they
produce the same bit-codes:

\begin{lemma}\label{rewriteBmkepsAux}
	$r_1 \rightsquigarrow r_2 \implies (\bnullable \; r_1 \land \bnullable \; r_2 \implies \bmkeps \; r_1 =
	\bmkeps \; r_2)$ and
	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies (\bnullables \; rs_1 \land \bnullables \; rs_2 \implies
	\bmkepss \; rs_1 = \bmkepss \; rs_2)$
\end{lemma}
\noindent
The definition of $\bmkepss$ on a list $rs$ just extracts the bit-codes of the first element in $rs$ that
is $\bnullable$:
\begin{center}
	\begin{tabular}{lcl}
		$\bmkepss \; [] $ & $\dn$ & $[]$\\
		$\bmkepss \; (r :: rs)$ & $\dn$ & $\textit{if} \; \bnullable \; r \; \textit{then} \; \bmkeps \; r \; \textit{else} \; \bmkepss \; rs$
	\end{tabular}
\end{center}
       
\noindent
And now we are ready to prove the key property that if we
have two regular expressions, one rewritable in many steps to the other,
and one of them is $\bnullable$, then they both yield the same bits under $\bmkeps$:
\begin{lemma}
	$(r \stackrel{*}{\rightsquigarrow} r'\land \bnullable \; r) \implies \bmkeps \; r = \bmkeps \; r'$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{*}{\rightsquigarrow}$, using \ref{rewriteBmkepsAux} and \ref{rewritesBnullable}.
\end{proof}
\noindent
The other property can now also be established:
\begin{lemma}
	$r \stackrel{*}{\rightsquigarrow} \bsimp{r}$
\end{lemma}
\begin{proof}
	By an induction on $r$.
	The most difficult case is the alternative case, where, using properties such as \ref{bsimpaltsPreserves}, \ref{fltsPreserves} and \ref{dBPreserves}, we can continuously rewrite a list as follows:\\
	$rs \stackrel{s*}{\rightsquigarrow} \map \; \textit{bsimp} \; rs$\\
	$\ldots \stackrel{s*}{\rightsquigarrow} \flts \; (\map \; \textit{bsimp} \; rs)$\\
	$\ldots \;\stackrel{s*}{\rightsquigarrow} \distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerase{\_} \; \phi$\\
	Then we can do the following many-steps rewrite on the regular expression level:\\
	$ _{bs} \sum (\distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerase{\_} \; \phi) \stackrel{*}{\rightsquigarrow} \bsimpalts \; bs \; (\distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerase{\_} \; \phi)$
\end{proof}
       
\section{Proof for the Property $r_1 \stackrel{*}{\rightsquigarrow}  r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow} r_2 \backslash c$}
The first thing we prove is that if we can rewrite in one step, then after taking a derivative
we can rewrite in many steps:
\begin{lemma}\label{rewriteBder}
	$r_1 \rightsquigarrow r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow}  r_2 \backslash c$ and
	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies \map \; (\_\backslash c) \; rs_1 \stackrel{s*}{\rightsquigarrow} \map \; (\_ \backslash c) \; rs_2$
\end{lemma}
\begin{proof}
	By induction on $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$, using a number of the previous lemmas.
\end{proof}
Now we can prove that once we can rewrite from one expression to another in many steps,
then after a derivative on both sides we can still rewrite one to the other in many steps:
\begin{lemma}
	$r_1 \stackrel{*}{\rightsquigarrow}  r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow}  r_2 \backslash c$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{*}{\rightsquigarrow}$ and using the previous lemma \ref{rewriteBder}.
\end{proof}
\noindent
This can be extended and combined with the previous two important properties
so that a regular expression's successive derivatives can be rewritten in many steps
to its simplified counterpart:
\begin{lemma}\label{bderBderssimp}
	$a \backslash s \stackrel{*}{\rightsquigarrow} \bderssimp{a}{s} $
\end{lemma}
       
\subsection{Main Theorem}
Now with \ref{bderBderssimp} we are ready for the main theorem.
To link $\blexersimp$ and $\blexer$,
we first show that they give out the same bits, if the lexing result is a match:
\begin{lemma}
	$\bnullable \; (a \backslash s) \implies \bmkeps \; (a \backslash s) = \bmkeps \; (\bderssimp{a}{s})$
\end{lemma}
\noindent
Since they give out the same bits, we know that they give the same value after decoding,
which we know is the correct value because $\blexer$ is correct:
\begin{theorem}
	$\blexer \; r \; s = \blexersimp \; r \; s$
\end{theorem}

       
\subsection{Comments on the Proof Techniques Used}
The non-trivial part of proving the correctness of the algorithm with simplification,
compared with the one without simplification, is that we can no longer use the argument
in \cref{flex_retrieve}.
The function \retrieve needs the cumbersome structure of the (unsimplified)
annotated regular expression to
agree with the structure of the value, but simplification will always mess with that
structure.

We also tried to prove $\bsimp{\bderssimp{a}{s}} = \bsimp{a\backslash s}$,
but this turns out not to be true. A counterexample is
\[ a = [(\ONE+c)\cdot [aa \cdot (\ONE+c)]] \quad\text{and}\quad s = aa.
\]

Then we would have $\bsimp{a \backslash s}$ being
$_{[]}(_{ZZ}\ONE +  _{ZS}c ) $
whereas $\bsimp{\bderssimp{a}{s}}$ would be
$_{Z}(_{Z} \ONE + _{S} c)$.
Unfortunately, if we apply $\textit{bsimp}$ at different
stages we will always have this discrepancy, because
whether the $\map \; (\fuse\; bs) \; as$ operation in $\textit{bsimp}$
is applied at a given point depends entirely on when the simplification
takes place and on whether there is a larger alternative structure surrounding the
alternative being simplified.
The good thing about $\stackrel{*}{\rightsquigarrow}$ is that it allows
us not to specify how exactly the ``atomic'' simplification steps $\rightsquigarrow$
are taken, but simply to say that they can be taken to make two similar
regular expressions equal, and that they can be interleaved with derivatives.


Having the correctness property is good, but we would also like the lexer to be efficient in
some sense, for example, not grinding to a halt on certain cases.
In the next chapter we shall prove that for a given $r$, the size of its internal derivatives is always
bounded by a constant.