diff -r 80e1114d6421 -r 86e0203db2da ChengsongTanPhdThesis/Chapters/Bitcoded2.tex --- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Tue Aug 30 12:41:52 2022 +0100 +++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Wed Aug 31 12:51:53 2022 +0100 @@ -942,47 +942,74 @@ As a corollary, we link this result with the lemma we proved earlier that \begin{center} - $(r, s) \rightarrow v \implies \blexer \; r \; s = v$ + $(r, s) \rightarrow v \;\; \textit{iff}\;\; \blexer \; r \; s = v$ \end{center} and obtain the corollary that the bit-coded lexer with simplification is indeed correctly outputting POSIX lexing result, if such a result exists. \begin{corollary} - $(r, s) \rightarrow v \implies \blexersimp{r}{s}$ + $(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp \; r\; s $ \end{corollary} \subsection{Comments on the Proof Techniques Used} -The non-trivial part of proving the correctness of the algorithm with simplification -compared with not having simplification is that we can no longer use the argument -in \cref{flex_retrieve}. -The function \retrieve needs the cumbersome structure of the (umsimplified) -annotated regular expression to -agree with the structure of the value, but simplification will always mess with the -structure. - -We also tried to prove $\bsimp{\bderssimp{a}{s}} = \bsimp{a\backslash s}$, -but this turns out to be not true, A counterexample of this being -\[ r = [(1+c)\cdot [aa \cdot (1+c)]] \land s = aa -\] +Straightforward and simple as the proof may seem, +the efforts we spent obtaining it was far from trivial.\\ +We initially attempted to re-use the argument +in \cref{flex_retrieve}. +The problem was that both functions $\inj$ and $\retrieve$ require +that the annotated regular expressions stay unsimplified, +so that one can +correctly compare $v_{i+1}$ and $r_i$ and $v_i$ +in diagram \ref{graph:inj} and +``fit the key into the lock hole''. -Then we would have $\bsimp{a \backslash s}$ being -$_{[]}(_{ZZ}\ONE + _{ZS}c ) $ -whereas $\bsimp{\bderssimp{a}{s}}$ would be -$_{Z}(_{Z} \ONE + _{S} c)$. -Unfortunately if we apply $\textit{bsimp}$ at different -stages we will always have this discrepancy, due to -whether the $\map \; (\fuse\; bs) \; as$ operation in $\textit{bsimp}$ -is taken at some points will be entirely dependant on when the simplification -take place whether there is a larger alternative structure surrounding the -alternative being simplified. -The good thing about $\stackrel{*}{\rightsquigarrow} $ is that it allows -us not specify how exactly the "atomic" simplification steps $\rightsquigarrow$ -are taken, but simply say that they can be taken to make two similar -regular expressions equal, and can be done after interleaving derivatives -and simplifications. +\noindent +We also tried to prove +\begin{center} +$\textit{bsimp} \;\; (\bderssimp{a}{s}) = +\textit{bsimp} \;\; (a\backslash s)$, +\end{center} +but this turns out to be not true. +A counterexample would be +\[ a = [(_{Z}1+_{S}c)\cdot [bb \cdot (_{Z}1+_{S}c)]] \;\; + \text{and} \;\; s = bb. +\] +\noindent +Then we would have +\begin{center} + $\textit{bsimp}\;\; ( a \backslash s )$ = + $_{[]}(_{ZZ}\ONE + _{ZS}c ) $ +\end{center} +\noindent +whereas +\begin{center} + $\textit{bsimp} \;\;( \bderssimp{a}{s} )$ = + $_{Z}(_{Z} \ONE + _{S} c)$. +\end{center} +Unfortunately, +if we apply $\textit{bsimp}$ differently +we will always have this discrepancy. +This is due to +the $\map \; (\fuse\; bs) \; as$ operation +happening at different locations in the regular expression.\\ +The rewriting relation +$\rightsquigarrow^*$ +allows us to ignore this discrepancy +and view the expressions +\begin{center} + $_{[]}(_{ZZ}\ONE + _{ZS}c ) $\\ + and\\ + $_{Z}(_{Z} \ONE + _{S} c)$ - -Having correctness property is good. But we would also like the lexer to be efficient in -some sense, for exampe, not grinding to a halt at certain cases. -In the next chapter we shall prove that for a given $r$, the internal derivative size is always +\end{center} +as equal, because they were both re-written +from the same expression.\\ +Having correctness property is good. +But we would also a guarantee that the lexer is not slow in +some sense, for exampe, not grinding to a halt regardless of the input. +As we have already seen, Sulzmann and Lu's simplification function +$\simpsulz$ cannot achieve this, because their claim that +the regular expression size does not grow arbitrary large +was not true. +In the next chapter we shall prove that with our $\simp$, +for a given $r$, the internal derivative size is always finitely bounded by a constant. -we would expect in the