# HG changeset patch # User Chengsong # Date 1688851102 -3600 # Node ID 753a3b0ee02b0df682eaee95af3f6b0280c2362d # Parent d8f82c690b32294a1d6e332f1c6d07cfb6579f5f reordered sections to make chapter 4 more coherent diff -r d8f82c690b32 -r 753a3b0ee02b ChengsongTanPhdThesis/Chapters/Bitcoded2.tex --- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Sat Jul 08 01:36:08 2023 +0100 +++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Sat Jul 08 22:18:22 2023 +0100 @@ -8,8 +8,8 @@ %simplifications and therefore introduce our version of the bitcoded algorithm and %its correctness proof in %Chapter 3\ref{Chapter3}. -\section{Overview} -\marginpar{\em Added a completely new \\overview section, \\highlighting contributions.} +%\section{Overview} +\marginpar{\em Added a completely new \\overview section, \\highlighting\\ contributions.} This chapter is the point from which novel contributions of this PhD project are introduced @@ -19,240 +19,21 @@ chapters is necessary for this thesis, because it provides the context for why we need a new framework for the proof of $\blexersimp$. + +We will first introduce why aggressive simplifications are needed, after which we +provide our algorithm, contrasting with Sulzmann and Lu's simplifications. +We then explain how our simplifications make +reusing $\blexer$'s correctness proof impossible. +%with some minor modifications +We discuss possible fixes such as rectification functions and then introduce our proof, +which involves a weaker inductive +invariant than that used in the correctness proof of $\blexer$. + +\marginpar{Shortened overview.} %material for setting the scene of the formal proof we %are about to describe. -The fundamental reason is we cannot extend the correctness proof of theorem 4 -because lemma 13 does not hold anymore when simplifications are involved. -\marginpar{\em rephrased things \\so why new \\proof makes sense.} -%The proof details are necessary materials for this thesis -%because it provides necessary context to explain why we need a -%new framework for the proof of $\blexersimp$, which involves -%simplifications that cause structural changes to the regular expression. -%A new formal proof of the correctness of $\blexersimp$, where the -%proof of $\blexer$ -%is not applicatble in the sense that we cannot straightforwardly extend the -%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does -%not hold anymore. -%This is because the structural induction on the stepwise correctness -%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described -%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to -%each other. -%In this chapter we introduce simplifications -%for annotated regular expressions that can be applied to -%each intermediate derivative result. This allows -%us to make $\blexer$ much more efficient. -%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions, -%but their simplification functions could have been more efficient and in some cases needed fixing. - - -In particular, the correctness theorem -of the un-optimised bit-coded lexer $\blexer$ in -chapter \ref{Bitcoded1} formalised by Ausaf et al. -relies crucially on lemma \ref{retrieveStepwise} that says -any value can be retrieved in a stepwise manner, namely: -\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred - \vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c) \; v= \retrieve \; r \; (\inj \; r\; c\; v) -\end{equation} -%This no longer holds once we introduce simplifications. -Simplifications are necessary to control the size of derivatives, -but they also destroy the structures of the regular expressions -such that \ref{eq:stepwise} does not hold. - - -We want to prove the correctness of $\blexersimp$ which integrates -$\textit{bsimp}$ by applying it after each call to the derivative: -\begin{center} -\begin{tabular}{lcl} - $r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\ -$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$ -\end{tabular} -\begin{tabular}{lcl} - $\textit{blexer\_simp}\;r\,s$ & $\dn$ & - $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\ - & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\ - & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\ - & & $\;\;\textit{else}\;\textit{None}$ -\end{tabular} -\end{center} -\noindent -Previously without $\textit{bsimp}$ the exact structure of each intermediate -regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and -$\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if -we use the convenient notation $r_{c} \dn r\backslash c$ -and $v_{r}^{c} \dn \inj \;r \; c \; v$), -but $\blexersimp$ introduces simplification after the derivative, -making it difficult to align the pairs: -\begin{center} - $\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; (\textit{bsimp} \; r_c) \; v =\retrieve \; r \;(\mathord{?} v_{r}^{c}) $ -\end{center} -\noindent -It is clear that once we made -$v$ to align with $\textit{bsimp} \; r_{c}$ -in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged -in for the above statement to hold. -Ausaf et al. \cite{AusafUrbanDyckhoff2016} -made some initial attempts with this idea, see \cite{FahadThesis} -for details. - -They added -and then rectify it to - - -this works fine, however that limits the kind of simplifications you can introduce. -We cannot use their idea for our very strong simplification rules. -Therefore we take our route -a wea - -The other route is to dispose of lemma \ref{retrieveStepwise}, -and prove a weakened inductive invariant instead. -We adopt this approach in this thesis. - -Let us first explain why the requirement in $\blexer$'s proof -is too strong, and suggest a few possible fixes, which leads to -our proof which we believe was the most natural and effective method. - - - -\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong} - -%From this chapter we start with the main contribution of this thesis, which - -The $\blexer$ proof relies on a lockstep POSIX -correspondence between the lexical value and the -regular expression in each derivative and injection. -If we zoom into the diagram \ref{graph:inj} and look specifically at -the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating -the invariant that the same bitcodes can be extracted from the pairs: -\tikzset{three sided/.style={ - draw=none, - append after command={ - [-,shorten <= -0.5\pgflinewidth] - ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) - edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) - ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) - edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) - ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) - edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) - } - } -} - -\tikzset{three sided1/.style={ - draw=none, - append after command={ - [-,shorten <= -0.5\pgflinewidth] - ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) - edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) - ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) - edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) - ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) - edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) - } - } -} - -\begin{center} - \begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick] - \node [rectangle ] (1) at (-7, 2) {$\ldots$}; - \node [rectangle, draw] (2) at (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$}; - \node [rectangle, draw] (3) at (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_0a+_1aa)^*$}; - \node [rectangle] (4) at (9, 2) {$\ldots$}; - \node [rectangle] (5) at (-7, -2) {$\ldots$}; - \node [rectangle, draw] (6) at (-4, -2) {$v_i = \Stars \; [\Left (a)]$}; - \node [rectangle, draw] (7) at ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$}; - \node [rectangle] (8) at ( 9, -2) {$\ldots$}; - \node [rectangle] (9) at (-7, -6) {$\ldots$}; - \node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = \retrieve \; r_i\;v_i$}; - \node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = \retrieve \; r_{i+1}\;v_{i+1}$}; - \node [rectangle] (12) at (9, -6) {$\ldots$}; - - - \path (1) edge [] node {} (2); - \path (5) edge [] node {} (6); - \path (9) edge [] node {} (10); - - \path (11) edge [] node {} (12); - \path (7) edge [] node {} (8); - \path (3) edge [] node {} (4); - - - \path (2) edge [] node {$\backslash a$} (3); - \path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6); - \path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7); - %\path (6) edge [] node {$\vdash v_i : r_i$} (10); - %\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11); - - \path (10) edge [dashed, <->] node {$=$} (11); - - \path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6); - -% \node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$}; -% \node [rectangle, draw] (a) at (-6, 4) {$(aa)^*(_{Z}b + _{S}c)$}; -% \path (r) -% edge [] node {$\internalise$} (a); -% \node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$}; -% \path (a) -% edge [] node {$\backslash a$} (a1); -% -% \node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$}; -% \node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$}; -% \path (a1) -% edge [] node {$\backslash a$} (a21); -% \node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$}; -% \path (a22) -% edge [] node {$\backslash b$} (a3); -% \path (a21) -% edge [dashed, bend right] node {} (a3); -% \node [rectangle, draw] (bs) at (2, 4) {$ZSZ$}; -% \path (a3) -% edge [below] node {$\bmkeps$} (bs); -% \node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$}; -% \path (bs) -% edge [] node {$\decode$} (v); - - - \end{tikzpicture} - %\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$} -\end{center} - -When simplifications are added, the inhabitation relation no longer holds, -causing the above diagram to break. - -Ausaf addressed this with an augmented lexer he called $\textit{slexer}$. - - - -we note that the invariant -$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong -to maintain because the precondition $\vdash v_i : r_i$ is too weak. -It does not require $v_i$ to be a POSIX value - - -{\color{red} \rule{\linewidth}{0.5mm}} -New content ends -{\color{red} \rule{\linewidth}{0.5mm}} - - - -% -% -%which is essential for getting an understanding this thesis -%in chapter \ref{Bitcoded1}, which is necessary for understanding why -%the proof -% -%In this chapter, - -%We contrast our simplification function -%with Sulzmann and Lu's, indicating the simplicity of our algorithm. -%This is another case for the usefulness -%and reliability of formal proofs on algorithms. -%These ``aggressive'' simplifications would not be possible in the injection-based -%lexing we introduced in chapter \ref{Inj}. -%We then prove the correctness with the improved version of -%$\blexer$, called $\blexersimp$, by establishing -%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system. -% \section{Simplifications by Sulzmann and Lu} +\marginpar{moved \\simplification \\section to front \\to make coherent\\ sense.} The algorithms $\lexer$ and $\blexer$ work beautifully as functional programs, but not as practical code. One main reason for the slowness is due to the size of intermediate representations--the derivative regular expressions @@ -271,7 +52,10 @@ \end{tabular} \end{center} \noindent -As can be seen, there are several duplications. +From the second derivative several duplicate sub-expressions +already needs to be eliminated (possible +bitcodes are omitted to make the presentation more concise +because they are not the key part of the simplifications). A simple-minded simplification function cannot simplify the third regular expression in the above chain of derivative regular expressions, namely @@ -472,6 +256,8 @@ %one needs to be more careful when designing the %simplification function and making claims about them. + + \section{Our $\textit{Simp}$ Function} We will now introduce our own simplification function. %by making a contrast with $\textit{simp}\_{SL}$. @@ -748,6 +534,294 @@ $\textit{blexer\_SLSimp}$ by Sulzmann and Lu. In the next section we are going to establish that our simplification preserves the correctness of the algorithm. + + + + +\section{Why $\textit{Blexer}$'s Proof Does Not Work} +The fundamental reason is we cannot extend the correctness proof of theorem 4 +because lemma 13 does not hold anymore when simplifications are involved. +\marginpar{\em rephrased things \\so why new \\proof makes sense.} +%The proof details are necessary materials for this thesis +%because it provides necessary context to explain why we need a +%new framework for the proof of $\blexersimp$, which involves +%simplifications that cause structural changes to the regular expression. +%A new formal proof of the correctness of $\blexersimp$, where the +%proof of $\blexer$ +%is not applicatble in the sense that we cannot straightforwardly extend the +%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does +%not hold anymore. +%This is because the structural induction on the stepwise correctness +%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described +%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to +%each other. +%In this chapter we introduce simplifications +%for annotated regular expressions that can be applied to +%each intermediate derivative result. This allows +%us to make $\blexer$ much more efficient. +%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions, +%but their simplification functions could have been more efficient and in some cases needed fixing. + + +In particular, the correctness theorem +of the un-optimised bit-coded lexer $\blexer$ in +chapter \ref{Bitcoded1} formalised by Ausaf et al. +relies crucially on lemma \ref{retrieveStepwise} that says +any value can be retrieved in a stepwise manner, namely: +\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred + \vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c) \; v= \retrieve \; r \; (\inj \; r\; c\; v) +\end{equation} +%This no longer holds once we introduce simplifications. +Simplifications are necessary to control the size of derivatives, +but they also destroy the structures of the regular expressions +such that \ref{eq:stepwise} does not hold. + + +We want to prove the correctness of $\blexersimp$ which integrates +$\textit{bsimp}$ by applying it after each call to the derivative: +\begin{center} +\begin{tabular}{lcl} + $r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\ +$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$ +\end{tabular} +\begin{tabular}{lcl} + $\textit{blexer\_simp}\;r\,s$ & $\dn$ & + $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\ + & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\ + & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\ + & & $\;\;\textit{else}\;\textit{None}$ +\end{tabular} +\end{center} +\noindent +Previously without $\textit{bsimp}$ the exact structure of each intermediate +regular expression is preserved, allowing pairs of inhabitation relations +in the form $\vdash v : r \backslash c $ and +$\vdash \inj \; r\; c \; v : r $ to hold in \ref{eq:stepwise}. +But $\blexersimp$ introduces simplification after the derivative, +making it difficult to align the structures of values and regular expressions. +If we change the form of property \ref{eq:stepwise} to +adapt to the needs of $\blexersimp$ the precondition of becomes +\[ + \vdash v : (\textit{bsimp} \; (r\backslash c)) +\] +The inhabitation relation of the other pair no longer holds, +because $\inj$ does not work on the simplified value $v$ +and the unsimplified regular expression $r$. +The retrieve function will not work either. +\[ + \vdash \inj \; r \; c \; v : r +\] +It seems unclear what procedures needs to be +used to create a new value $v_?$ such that +\[ + \vdash v_? : r \; \text{and} \; \retrieve \; r \; v_? = \retrieve \; (\textit{bsimp} \; (r\backslash c)) \; v +\] +hold. +%It is clear that once we made +%$v$ to align with $\textit{bsimp} \; r_{c}$ +%in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged +%in for the above statement to hold. +Ausaf et al. \cite{AusafDyckhoffUrban2016} +used something they call rectification functions to restore the original value from the simplified value. +The idea is that simplification functions not only returns a regular expression, +but also a rectification function +\[ + \textit{simp}^{rect} : Regex \Rightarrow (Value \Rightarrow Value, Regex) +%\textit{frect} : Value \Rightarrow Value +\] +that is recorded recursively, +and then applied to the previous value +to obtain the correct value for $\inj$ to work on. +The recursive case of the lexer is defined as something like +\[ + \textit{slexer} \; r \; (c\!::\!s) \dn let \;(\textit{frect}, r_c) = \textit{simp}^{rect} \;(r \backslash c) \;\; \textit{in}\;\; + \inj \; r \; c \; (\textit{frect} \; (\textit{slexer} \; r_c\; s)) + %\textit{match} \; s \; \textit{case} [ +\] +However this approach (including $\textit{slexer}$'s correctness proof) only +works without bitcodes, and it limits the kind of simplifications one can introduce. +%and they have not yet extended their relatively simple simplifications +%to more aggressive ones. +See the thesis by Ausaf \cite{Ausaf} +for details. + +%\begin{center} +% $\vdash v: (r\backslash c) \implies \retrieve \; (\mathord{?}(\textit{bsimp} \; r_c)) \; v =\retrieve \; r \;(\mathord{?} v_{r}^{c}) $ +%\end{center} +%\noindent + +We were not able to use their idea for our very strong simplification rules. +Therefore we are taking another route that completely +disposes of lemma \ref{retrieveStepwise}, +and prove a weakened inductive invariant instead. + +Let us first explain why lemma \ref{retrieveStepwise}'s requirement +is too strong, and suggest a few possible fixes, which leads to +our proof which we believe was the most natural and effective method. + + + +\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong} + +%From this chapter we start with the main contribution of this thesis, which + +The $\blexer$ proof relies on a lockstep POSIX +correspondence between the lexical value and the +regular expression in each derivative and injection. +If we zoom into the diagram \ref{graph:inj} and look specifically at +the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating +the invariant that the same bitcodes can be extracted from the pairs: +\tikzset{three sided/.style={ + draw=none, + append after command={ + [-,shorten <= -0.5\pgflinewidth] + ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) + edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) + ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) + edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) + ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) + edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) + } + } +} + +\tikzset{three sided1/.style={ + draw=none, + append after command={ + [-,shorten <= -0.5\pgflinewidth] + ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) + edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) + ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) + edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) + ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) + edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) + } + } +} + +\begin{center} + \begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick] + %\node [rectangle ] (1) at (-7, 2) {$\ldots$}; + %\node [rectangle, draw] (2) at (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$}; + %\node [rectangle, draw] (3) at (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$}; + %\node [rectangle] (4) at (9, 2) {$\ldots$}; + %\node [rectangle] (5) at (-7, -2) {$\ldots$}; + %\node [rectangle, draw] (6) at (-4, -2) {$v_i = \Stars \; [\Left (a)]$}; + %\node [rectangle, draw] (7) at ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$}; + %\node [rectangle] (8) at ( 9, -2) {$\ldots$}; + %\node [rectangle] (9) at (-7, -6) {$\ldots$}; + %\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$}; + %\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$}; + %\node [rectangle] (12) at (9, -6) {$\ldots$}; + + \node [rectangle ] (1) at (-8, 2) {$\ldots$}; + \node [rectangle, draw] (2) at (-5, 2) {$r_i = _{bs'}(_Za+_Saa)^*$}; + \node [rectangle, draw] (3) at (3, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$}; + \node [rectangle] (4) at (8, 2) {$\ldots$}; + \node [rectangle] (5) at (-8, -2) {$\ldots$}; + \node [rectangle, draw] (6) at (-5, -2) {$v_i = \Stars \; [\Left (a)]$}; + \node [rectangle, draw] (7) at ( 3, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$}; + \node [rectangle] (8) at ( 8, -2) {$\ldots$}; + \node [rectangle] (9) at (-8, -6) {$\ldots$}; + \node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} = bs' @ ZZS$}; + \node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$}; + \node [rectangle] (12) at (8, -6) {$\ldots$}; + + + + \path (1) edge [] node {} (2); + \path (5) edge [] node {} (6); + \path (9) edge [] node {} (10); + + \path (11) edge [] node {} (12); + \path (7) edge [] node {} (8); + \path (3) edge [] node {} (4); + + + \path (6) edge [dashed,bend right = 30] node {$\retrieve \; r_i \; v_i$} (10); + \path (2) edge [dashed,bend left = 48] node {} (10); + + \path (7) edge [dashed,bend right = 30] node {$\retrieve \; r_{i+1} \; v_{i+1}$} (11); + \path (3) edge [dashed,bend left = 45] node {} (11); + + \path (2) edge [] node {$\backslash a$} (3); + \path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6); + \path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7); + %\path (6) edge [] node {$\vdash v_i : r_i$} (10); + %\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11); + + \path (10) edge [dashed, <->] node {$=$} (11); + + \path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6); + +% \node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$}; +% \node [rectangle, draw] (a) at (-6, 4) {$(aa)^*(_{Z}b + _{S}c)$}; +% \path (r) +% edge [] node {$\internalise$} (a); +% \node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$}; +% \path (a) +% edge [] node {$\backslash a$} (a1); +% +% \node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$}; +% \node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$}; +% \path (a1) +% edge [] node {$\backslash a$} (a21); +% \node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$}; +% \path (a22) +% edge [] node {$\backslash b$} (a3); +% \path (a21) +% edge [dashed, bend right] node {} (a3); +% \node [rectangle, draw] (bs) at (2, 4) {$ZSZ$}; +% \path (a3) +% edge [below] node {$\bmkeps$} (bs); +% \node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$}; +% \path (bs) +% edge [] node {$\decode$} (v); + + + \end{tikzpicture} + %\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$} +\end{center} + +When simplifications are added, the inhabitation relation no longer holds, +causing the above diagram to break. + +Ausaf addressed this with an augmented lexer he called $\textit{slexer}$. + + + +we note that the invariant +$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong +to maintain because the precondition $\vdash v_i : r_i$ is too weak. +It does not require $v_i$ to be a POSIX value + + +{\color{red} \rule{\linewidth}{0.5mm}} +New content ends +{\color{red} \rule{\linewidth}{0.5mm}} + + + +% +% +%which is essential for getting an understanding this thesis +%in chapter \ref{Bitcoded1}, which is necessary for understanding why +%the proof +% +%In this chapter, + +%We contrast our simplification function +%with Sulzmann and Lu's, indicating the simplicity of our algorithm. +%This is another case for the usefulness +%and reliability of formal proofs on algorithms. +%These ``aggressive'' simplifications would not be possible in the injection-based +%lexing we introduced in chapter \ref{Inj}. +%We then prove the correctness with the improved version of +%$\blexer$, called $\blexersimp$, by establishing +%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system. +% + + %---------------------------------------------------------------------------------------- % SECTION rewrite relation %----------------------------------------------------------------------------------------