# HG changeset patch # User Chengsong # Date 1582327052 0 # Node ID 0ed8053dcf212abbc11facbe911034d78fc78f20 # Parent 5ca3b84e0eba029801d9d21c5a352e7a158680c5 flow diff -r 5ca3b84e0eba -r 0ed8053dcf21 etnms/etnms.tex --- a/etnms/etnms.tex Tue Feb 18 22:44:30 2020 +0000 +++ b/etnms/etnms.tex Fri Feb 21 23:17:32 2020 +0000 @@ -915,46 +915,12 @@ With $\flex$ we can write $\lexer$ this way: \begin{center} -$\lexer \;r\; s = \flex \;id \; r\;s \;(\mkeps r\backslash s)$ -\end{center} - -%\noindent -%$\flex$ focuses on the injections instead of the derivatives , -%compared to the original definition of $\lexer$, which puts equal -%amount of emphasis on injection and derivative with respect to each -%character: - -%\begin{center} -%\begin{tabular}{lcl} -%$\textit{lexer} \; r\; (c\!::\!s) $ & $\dn$ & $\textit{case} \; \lexer \; (r\backslash c) \;s \; %\textit{of}$ \\ -% & & $\textit{None} \; \Longrightarrow \; \textit{None}$\\ -% & & $\textbar \; v \; \Longrightarrow \; \inj \; r\;c\;v$\\ -%$\textit{lexer} \; r\; [\,] $ & $\dn$ & $\textit{if} \; \nullable (r) \; \textit{then} \; \mkeps% (r) \; \textit{else} \;None$ -%\end{tabular} -%\end{center} - -\noindent -The crux in the existing proof is how $\flex$ relates to injection, namely - -\begin{center} -$\flex \; r \; id \; (s@[c]) \; v = \flex \; r \; id \; s \; (inj \; (r\backslash s) \; c\; v)$. +$\lexer \;r\; s = \flex \;id \; r\;s \;(\mkeps (r\backslash s))$ \end{center} \noindent -This property allows one to rewrite an induction hypothesis like - -\begin{center} -$ \flex \; r\; id\; (s@[c])\; v = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\;) r$ -\end{center} - - -\subsubsection{Retrieve Function} -The crucial point is to find the -$\textit{POSIX}$ information of a regular expression and how it is modified, -augmented and propagated -during simplification in parallel with the regular expression that -has not been simplified in the subsequent derivative operations. To aid this, -we use the helper function retrieve described by Sulzmann and Lu: +The other trick, which is the crux in the existing proof, +is the use of $\retrieve$-function by Sulzmann and Lu: \begin{center} \begin{tabular}{@{}l@{\hspace{2mm}}c@{\hspace{2mm}}l@{}} $\textit{retrieve}\,(_{bs}\ONE)\,\Empty$ & $\dn$ & $bs$\\ @@ -973,27 +939,61 @@ \textit{retrieve}\,(_{[]}a^*)\,(\Stars\,vs)$}\\ \end{tabular} \end{center} -%\comment{Did not read further}\\ + +\noindent This function assembles the bitcode -%that corresponds to a lexical value for how -%the current derivative matches the suffix of the string(the characters that -%have not yet appeared, but will appear as the successive derivatives go on. -%How do we get this "future" information? By the value $v$, which is -%computed by a pass of the algorithm that uses -%$inj$ as described in the previous section). using information from both the derivative regular expression and the -value. Sulzmann and Lu poroposed this function, but did not prove -anything about it. Ausaf and Urban used it to connect the bitcoded -algorithm to the older algorithm by the following equation: +value. Sulzmann and Lu proposed this function, but did not prove +anything about it. Ausaf and Urban made this important use of the +fact about $\retrieve$ in the proof: - \begin{center} $inj \;a\; c \; v = \textit{decode} \; (\textit{retrieve}\; - (r^\uparrow)\backslash_{simp} \,c)\,v)$ + \begin{center} + $\retrieve \rup \backslash c \; v = \retrieve \rup (\inj \;r \;c \; v)$ \end{center} \noindent -whereby $r^\uparrow$ stands for the internalised version of $r$. Ausaf -and Urban also used this fact to prove the correctness of bitcoded -algorithm without simplification. Our purpose of using this, however, +whereby $r^\uparrow$ stands for the internalised version of $r$. +This fact, together with the fact of how $\flex$ relates to injection: + +\begin{equation}\label{flex} +\flex \; r \; id \; (s@[c]) \; v = \flex \; r \; id \; s \; (inj \; (r\backslash s) \; c\; v). +\end{equation} + +\noindent +can be used to prove what we want: +\begin{center} +$ \flex \; r\; id\; s\; v = \textit{decode} \;( \textit{bmkeps}\; (\rup \backslash s) ) r$ +\end{center} +\noindent +If we state the inductive hypothesis to be +\begin{center} +$ \flex \; r\; id\; s\; v = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; v \;) r$ +\end{center} +\noindent +where $\mkeps(r\backslash s) $ is denoted using $v$, +then a reverse induction technique is +best suited here to make use of +the fact~\eqref{flex} in proving the $n+1$ step: +\begin{center} +$ \flex \; r\; id\; (s@[c])\; v = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\;) r$ +\end{center} +and using a lemma that +\begin{center} +$\textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\; = \textit{retrieve}\; (\rup \backslash s@[c]) \; v\; $ +\end{center} +we get +\begin{center} +$ \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\;) r = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s@[c]) \; v\;) r $ +\end{center} +and the inductive step is done because +\begin{center} +$ \textit{retrieve}\; (\rup \backslash s@[c]) \; \mkeps(r\backslash s) = \bmkeps \;(\rup \backslash s@[c])$ +\end{center} + +\subsubsection{Using Retrieve Function In a New Setting} + Ausaf +and Urban used $\retrieve$ to prove the correctness of bitcoded +algorithm without simplification. Our purpose of using it, however, is to establish \begin{center} @@ -1410,14 +1410,16 @@ Why? +We shall illustrate in detail again of what happened in each recursive call of +$\backslash$ and $\backslash_{simp}$. During the first derivative operation, \begin{center} -$\rup\backslash a=( _0[ \ONE\cdot {\bf b}] + _1( _0[ _1\ONE \cdot {\bf a}^*] + [ \ONE \cdot {\bf a}]) )$, +$\rup\backslash a=( _0[ \ONE\cdot {\bf b}] + _1( _0[ _1\ONE \cdot {\bf a}^*] + _1[ \ONE \cdot {\bf a}]) )$, \end{center} \noindent the second derivative gives us \begin{center} -$\rup\backslash a=(_0( [\ZERO\cdot {\bf b}] + 0) + _1( _0( [\ZERO\cdot {\bf a}^*] + _1[ _1\ONE \cdot {\bf a}^*]) + _1( [\ZERO \cdot {\bf a}] + \ONE) ))$, +$(\rup\backslash a)\backslash a=(_0( [\ZERO\cdot {\bf b}] + 0) + _1( _0( [\ZERO\cdot {\bf a}^*] + _1[ _1\ONE \cdot {\bf a}^*]) + _1( [\ZERO \cdot {\bf a}] + \ONE) ))$, \end{center} \noindent @@ -1425,7 +1427,20 @@ \begin{center} $ _1(_{011}{\bf a}^* + _1\ONE) $ \end{center} - +because when $(\rup\backslash a)\backslash a$ goes +through simplification, according to our new $\simp$ +clause, +each component of the list +$[_0( [\ZERO\cdot {\bf b}] + 0) , _1( _0( [\ZERO\cdot {\bf a}^*] + _1[ _1\ONE \cdot {\bf a}^*]) + _1( [\ZERO \cdot {\bf a}] + \ONE) )]$ +is simplified into +$[\ZERO, _1(_{011}{\bf a}^* + _1\ONE) ]$, +this fits into the definition of $\textit{hollowAlternatives}$, +so the structure of the annotated regular expression +\begin{center} +$_1(_{011}{\bf a}^* + _1\ONE) $ +\end{center} +is preserved, in the sense that the outside bit $_1$ +is not fused inside. If, after the first derivative we apply simplification we get $(_0{\bf b} + _{101}{\bf a}^* + _{11}{\bf a} )$, and we do another derivative, getting