cst_tests: comparison etnms/etnms.tex

equal deleted inserted replaced

-:5ca3b84e0eba
+:0ed8053dcf21
 explicitly showing the order of characters being
 injected back in each step.
 With $\flex$ we can write $\lexer$ this way:
 \begin{center}
-$\lexer \;r\; s = \flex \;id \; r\;s \;(\mkeps r\backslash s)$
+$\lexer \;r\; s = \flex \;id \; r\;s \;(\mkeps (r\backslash s))$
 \end{center}
-%\noindent
+\noindent
-%$\flex$ focuses on the injections instead of the derivatives ,
+The other trick, which is the crux in the existing proof,
-%compared to the original definition of $\lexer$, which puts equal
+is the use of $\retrieve$-function by Sulzmann and Lu:
-%amount of emphasis on injection and derivative with respect to each
-%character:
-%\begin{center}
-%\begin{tabular}{lcl}
-%$\textit{lexer} \; r\; (c\!::\!s) $ & $\dn$ & $\textit{case} \; \lexer \; (r\backslash c) \;s \; %\textit{of}$ \\
-% & & $\textit{None} \; \Longrightarrow \; \textit{None}$\\
-%  & & $\textbar \; v \; \Longrightarrow \; \inj \; r\;c\;v$\\
-%$\textit{lexer} \; r\;  [\,] $ & $\dn$ & $\textit{if} \; \nullable (r) \; \textit{then} \; \mkeps% (r) \; \textit{else} \;None$
-%\end{tabular}
-%\end{center}
-\noindent
-The crux in the existing proof is how $\flex$ relates to injection, namely
-\begin{center}
-$\flex \; r \; id \; (s@[c]) \; v = \flex \;  r \; id \; s \; (inj \; (r\backslash s) \; c\; v)$.
-\end{center}
-\noindent
-This property allows one to rewrite an induction hypothesis like
-\begin{center}
-$ \flex \; r\; id\; (s@[c])\; v = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\;) r$
-\end{center}
-\subsubsection{Retrieve Function}
-The crucial point is to find the
-$\textit{POSIX}$  information of a regular expression and how it is modified,
-augmented and propagated
-during simplification in parallel with the regular expression that
-has not been simplified in the subsequent derivative operations.  To aid this,
-we use the helper function retrieve described by Sulzmann and Lu:
 \begin{center}
 \begin{tabular}{@{}l@{\hspace{2mm}}c@{\hspace{2mm}}l@{}}
 $\textit{retrieve}\,(_{bs}\ONE)\,\Empty$ & $\dn$ & $bs$\\
 $\textit{retrieve}\,(_{bs}{\bf c})\,(\Char\,d)$ & $\dn$ & $bs$\\
 $\textit{retrieve}\,(_{bs}\sum a::as)\,(\Left\,v)$ & $\dn$ &
 \multicolumn{3}{l}{
 \hspace{3cm}$bs \,@\, [1] \,@\, \textit{retrieve}\,a\,v\,@\,
 \textit{retrieve}\,(_{[]}a^*)\,(\Stars\,vs)$}\\
 \end{tabular}
 \end{center}
-%\comment{Did not read further}\\
+\noindent
 This function assembles the bitcode
-%that corresponds to a lexical value for how
-%the current derivative matches the suffix of the string(the characters that
-%have not yet appeared, but will appear as the successive derivatives go on.
-%How do we get this "future" information? By the value $v$, which is
-%computed by a pass of the algorithm that uses
-%$inj$ as described in the previous section).
 using information from both the derivative regular expression and the
-value. Sulzmann and Lu poroposed this function, but did not prove
+value. Sulzmann and Lu proposed this function, but did not prove
-anything about it. Ausaf and Urban used it to connect the bitcoded
+anything about it. Ausaf and Urban made this important use of the
-algorithm to the older algorithm by the following equation:
+fact about $\retrieve$ in the proof:
-\begin{center} $inj \;a\; c \; v = \textit{decode} \; (\textit{retrieve}\;
+\begin{center}
-	 (r^\uparrow)\backslash_{simp} \,c)\,v)$
+$\retrieve \rup \backslash c \; v = \retrieve  \rup (\inj \;r \;c \; v)$
 \end{center}
 \noindent
-whereby $r^\uparrow$ stands for the internalised version of $r$. Ausaf
+whereby $r^\uparrow$ stands for the internalised version of $r$.
-and Urban also used this fact to prove  the correctness of bitcoded
+This fact, together with the fact of how $\flex$ relates to injection:
-algorithm without simplification.  Our purpose of using this, however,
+\begin{equation}\label{flex}
+\flex \; r \; id \; (s@[c]) \; v = \flex \;  r \; id \; s \; (inj \; (r\backslash s) \; c\; v).
+\end{equation}
+\noindent
+can be used to prove what we want:
+\begin{center}
+$ \flex \; r\; id\; s\; v = \textit{decode} \;( \textit{bmkeps}\; (\rup \backslash s) ) r$
+\end{center}
+\noindent
+If we state the inductive hypothesis to be
+\begin{center}
+$ \flex \; r\; id\; s\; v = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; v \;) r$
+\end{center}
+\noindent
+where $\mkeps(r\backslash s) $ is denoted using $v$,
+then a reverse induction technique is
+best suited here to make use of
+the fact~\eqref{flex} in proving the $n+1$ step:
+\begin{center}
+$ \flex \; r\; id\; (s@[c])\; v = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\;) r$
+\end{center}
+and using a lemma that
+\begin{center}
+$\textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\; = \textit{retrieve}\; (\rup \backslash s@[c])  \; v\; $
+\end{center}
+we get
+\begin{center}
+$  \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\;) r = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s@[c]) \; v\;) r $
+\end{center}
+and the inductive step is done because
+\begin{center}
+$ \textit{retrieve}\; (\rup \backslash s@[c]) \; \mkeps(r\backslash s) =  \bmkeps \;(\rup \backslash s@[c])$
+\end{center}
+\subsubsection{Using Retrieve Function In a New Setting}
+Ausaf
+and Urban used $\retrieve$  to prove  the correctness of bitcoded
+algorithm without simplification.  Our purpose of using it, however,
 is to establish
 \begin{center}
 $ \textit{retrieve} \;
 a \; v \;=\; \textit{retrieve}  \; (\textit{simp}\,a) \; v'.$
 This discrepancy does not appear for the old
 version of $\simp$.
 Why?
+We shall illustrate in detail  again of what happened in each recursive call of
+$\backslash$ and $\backslash_{simp}$.
 During the first derivative operation,
 \begin{center}
-$\rup\backslash a=(    _0[ \ONE\cdot {\bf b}] + _1( _0[ _1\ONE \cdot {\bf a}^*] + [ \ONE \cdot {\bf a}])      )$,
+$\rup\backslash a=(    _0[ \ONE\cdot {\bf b}] + _1( _0[ _1\ONE \cdot {\bf a}^*] + _1[ \ONE \cdot {\bf a}])      )$,
 \end{center}
 \noindent
 the second derivative gives us
 \begin{center}
-$\rup\backslash a=(_0( [\ZERO\cdot {\bf b}] + 0) + _1( _0( [\ZERO\cdot {\bf a}^*] + _1[ _1\ONE \cdot {\bf a}^*]) + _1( [\ZERO \cdot {\bf a}] + \ONE)  ))$,
+$(\rup\backslash a)\backslash a=(_0( [\ZERO\cdot {\bf b}] + 0) + _1( _0( [\ZERO\cdot {\bf a}^*] + _1[ _1\ONE \cdot {\bf a}^*]) + _1( [\ZERO \cdot {\bf a}] + \ONE)  ))$,
 \end{center}
 \noindent
 and this simplifies to
 \begin{center}
 $ _1(_{011}{\bf a}^* +  _1\ONE)  $
 \end{center}
+because when  $(\rup\backslash a)\backslash a$ goes
+through simplification, according to our new $\simp$
+clause,
+each component of the list
+$[_0( [\ZERO\cdot {\bf b}] + 0) , _1( _0( [\ZERO\cdot {\bf a}^*] + _1[ _1\ONE \cdot {\bf a}^*]) + _1( [\ZERO \cdot {\bf a}] + \ONE)  )]$
+is simplified into
+$[\ZERO, _1(_{011}{\bf a}^* +  _1\ONE) ]$,
+this fits into the definition of $\textit{hollowAlternatives}$,
+so the structure of the annotated regular expression
+\begin{center}
+$_1(_{011}{\bf a}^* +  _1\ONE) $
+\end{center}
+is preserved, in the sense that the outside bit $_1$
+is not fused inside.
 If, after the first derivative we apply simplification we get
 $(_0{\bf b}  + _{101}{\bf  a}^* + _{11}{\bf a}  )$,
 and we do another derivative, getting
 $(\ZERO  + (_{101}(\ONE \cdot _1{\bf a}^*)+_{11}\ONE)$,
 which simplifies to

changeset 135	0ed8053dcf21
parent 134	5ca3b84e0eba
child 136	8c404da3cb49