--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Sat Jul 08 01:36:08 2023 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Sat Jul 08 22:18:22 2023 +0100
@@ -8,8 +8,8 @@
%simplifications and therefore introduce our version of the bitcoded algorithm and
%its correctness proof in
%Chapter 3\ref{Chapter3}.
-\section{Overview}
-\marginpar{\em Added a completely new \\overview section, \\highlighting contributions.}
+%\section{Overview}
+\marginpar{\em Added a completely new \\overview section, \\highlighting\\ contributions.}
This chapter
is the point from which novel contributions of this PhD project are introduced
@@ -19,240 +19,21 @@
chapters is necessary for this thesis,
because it provides the context for why we need a new framework for
the proof of $\blexersimp$.
+
+We will first introduce why aggressive simplifications are needed, after which we
+provide our algorithm, contrasting with Sulzmann and Lu's simplifications.
+We then explain how our simplifications make
+reusing $\blexer$'s correctness proof impossible.
+%with some minor modifications
+We discuss possible fixes such as rectification functions and then introduce our proof,
+which involves a weaker inductive
+invariant than that used in the correctness proof of $\blexer$.
+
+\marginpar{Shortened overview.}
%material for setting the scene of the formal proof we
%are about to describe.
-The fundamental reason is we cannot extend the correctness proof of theorem 4
-because lemma 13 does not hold anymore when simplifications are involved.
-\marginpar{\em rephrased things \\so why new \\proof makes sense.}
-%The proof details are necessary materials for this thesis
-%because it provides necessary context to explain why we need a
-%new framework for the proof of $\blexersimp$, which involves
-%simplifications that cause structural changes to the regular expression.
-%A new formal proof of the correctness of $\blexersimp$, where the
-%proof of $\blexer$
-%is not applicatble in the sense that we cannot straightforwardly extend the
-%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
-%not hold anymore.
-%This is because the structural induction on the stepwise correctness
-%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
-%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
-%each other.
-%In this chapter we introduce simplifications
-%for annotated regular expressions that can be applied to
-%each intermediate derivative result. This allows
-%us to make $\blexer$ much more efficient.
-%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
-%but their simplification functions could have been more efficient and in some cases needed fixing.
-
-
-In particular, the correctness theorem
-of the un-optimised bit-coded lexer $\blexer$ in
-chapter \ref{Bitcoded1} formalised by Ausaf et al.
-relies crucially on lemma \ref{retrieveStepwise} that says
-any value can be retrieved in a stepwise manner, namely:
-\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred
- \vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c) \; v= \retrieve \; r \; (\inj \; r\; c\; v)
-\end{equation}
-%This no longer holds once we introduce simplifications.
-Simplifications are necessary to control the size of derivatives,
-but they also destroy the structures of the regular expressions
-such that \ref{eq:stepwise} does not hold.
-
-
-We want to prove the correctness of $\blexersimp$ which integrates
-$\textit{bsimp}$ by applying it after each call to the derivative:
-\begin{center}
-\begin{tabular}{lcl}
- $r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
-$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
-\end{tabular}
-\begin{tabular}{lcl}
- $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
- $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\
- & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
- & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
- & & $\;\;\textit{else}\;\textit{None}$
-\end{tabular}
-\end{center}
-\noindent
-Previously without $\textit{bsimp}$ the exact structure of each intermediate
-regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
-$\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if
-we use the convenient notation $r_{c} \dn r\backslash c$
-and $v_{r}^{c} \dn \inj \;r \; c \; v$),
-but $\blexersimp$ introduces simplification after the derivative,
-making it difficult to align the pairs:
-\begin{center}
- $\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; (\textit{bsimp} \; r_c) \; v =\retrieve \; r \;(\mathord{?} v_{r}^{c}) $
-\end{center}
-\noindent
-It is clear that once we made
-$v$ to align with $\textit{bsimp} \; r_{c}$
-in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
-in for the above statement to hold.
-Ausaf et al. \cite{AusafUrbanDyckhoff2016}
-made some initial attempts with this idea, see \cite{FahadThesis}
-for details.
-
-They added
-and then rectify it to
-
-
-this works fine, however that limits the kind of simplifications you can introduce.
-We cannot use their idea for our very strong simplification rules.
-Therefore we take our route
-a wea
-
-The other route is to dispose of lemma \ref{retrieveStepwise},
-and prove a weakened inductive invariant instead.
-We adopt this approach in this thesis.
-
-Let us first explain why the requirement in $\blexer$'s proof
-is too strong, and suggest a few possible fixes, which leads to
-our proof which we believe was the most natural and effective method.
-
-
-
-\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}
-
-%From this chapter we start with the main contribution of this thesis, which
-
-The $\blexer$ proof relies on a lockstep POSIX
-correspondence between the lexical value and the
-regular expression in each derivative and injection.
-If we zoom into the diagram \ref{graph:inj} and look specifically at
-the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
-the invariant that the same bitcodes can be extracted from the pairs:
-\tikzset{three sided/.style={
- draw=none,
- append after command={
- [-,shorten <= -0.5\pgflinewidth]
- ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
- edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
- ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
- edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
- ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
- edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
- }
- }
-}
-
-\tikzset{three sided1/.style={
- draw=none,
- append after command={
- [-,shorten <= -0.5\pgflinewidth]
- ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
- edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
- ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
- edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
- ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
- edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
- }
- }
-}
-
-\begin{center}
- \begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
- \node [rectangle ] (1) at (-7, 2) {$\ldots$};
- \node [rectangle, draw] (2) at (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
- \node [rectangle, draw] (3) at (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_0a+_1aa)^*$};
- \node [rectangle] (4) at (9, 2) {$\ldots$};
- \node [rectangle] (5) at (-7, -2) {$\ldots$};
- \node [rectangle, draw] (6) at (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
- \node [rectangle, draw] (7) at ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
- \node [rectangle] (8) at ( 9, -2) {$\ldots$};
- \node [rectangle] (9) at (-7, -6) {$\ldots$};
- \node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = \retrieve \; r_i\;v_i$};
- \node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = \retrieve \; r_{i+1}\;v_{i+1}$};
- \node [rectangle] (12) at (9, -6) {$\ldots$};
-
-
- \path (1) edge [] node {} (2);
- \path (5) edge [] node {} (6);
- \path (9) edge [] node {} (10);
-
- \path (11) edge [] node {} (12);
- \path (7) edge [] node {} (8);
- \path (3) edge [] node {} (4);
-
-
- \path (2) edge [] node {$\backslash a$} (3);
- \path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
- \path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
- %\path (6) edge [] node {$\vdash v_i : r_i$} (10);
- %\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);
-
- \path (10) edge [dashed, <->] node {$=$} (11);
-
- \path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6);
-
-% \node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
-% \node [rectangle, draw] (a) at (-6, 4) {$(aa)^*(_{Z}b + _{S}c)$};
-% \path (r)
-% edge [] node {$\internalise$} (a);
-% \node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
-% \path (a)
-% edge [] node {$\backslash a$} (a1);
-%
-% \node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
-% \node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
-% \path (a1)
-% edge [] node {$\backslash a$} (a21);
-% \node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
-% \path (a22)
-% edge [] node {$\backslash b$} (a3);
-% \path (a21)
-% edge [dashed, bend right] node {} (a3);
-% \node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
-% \path (a3)
-% edge [below] node {$\bmkeps$} (bs);
-% \node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
-% \path (bs)
-% edge [] node {$\decode$} (v);
-
-
- \end{tikzpicture}
- %\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
-\end{center}
-
-When simplifications are added, the inhabitation relation no longer holds,
-causing the above diagram to break.
-
-Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.
-
-
-
-we note that the invariant
-$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong
-to maintain because the precondition $\vdash v_i : r_i$ is too weak.
-It does not require $v_i$ to be a POSIX value
-
-
-{\color{red} \rule{\linewidth}{0.5mm}}
-New content ends
-{\color{red} \rule{\linewidth}{0.5mm}}
-
-
-
-%
-%
-%which is essential for getting an understanding this thesis
-%in chapter \ref{Bitcoded1}, which is necessary for understanding why
-%the proof
-%
-%In this chapter,
-
-%We contrast our simplification function
-%with Sulzmann and Lu's, indicating the simplicity of our algorithm.
-%This is another case for the usefulness
-%and reliability of formal proofs on algorithms.
-%These ``aggressive'' simplifications would not be possible in the injection-based
-%lexing we introduced in chapter \ref{Inj}.
-%We then prove the correctness with the improved version of
-%$\blexer$, called $\blexersimp$, by establishing
-%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
-%
\section{Simplifications by Sulzmann and Lu}
+\marginpar{moved \\simplification \\section to front \\to make coherent\\ sense.}
The algorithms $\lexer$ and $\blexer$ work beautifully as functional
programs, but not as practical code. One main reason for the slowness is due
to the size of intermediate representations--the derivative regular expressions
@@ -271,7 +52,10 @@
\end{tabular}
\end{center}
\noindent
-As can be seen, there are several duplications.
+From the second derivative several duplicate sub-expressions
+already needs to be eliminated (possible
+bitcodes are omitted to make the presentation more concise
+because they are not the key part of the simplifications).
A simple-minded simplification function cannot simplify
the third regular expression in the above chain of derivative
regular expressions, namely
@@ -472,6 +256,8 @@
%one needs to be more careful when designing the
%simplification function and making claims about them.
+
+
\section{Our $\textit{Simp}$ Function}
We will now introduce our own simplification function.
%by making a contrast with $\textit{simp}\_{SL}$.
@@ -748,6 +534,294 @@
$\textit{blexer\_SLSimp}$ by Sulzmann and Lu.
In the next section we are going to establish that our
simplification preserves the correctness of the algorithm.
+
+
+
+
+\section{Why $\textit{Blexer}$'s Proof Does Not Work}
+The fundamental reason is we cannot extend the correctness proof of theorem 4
+because lemma 13 does not hold anymore when simplifications are involved.
+\marginpar{\em rephrased things \\so why new \\proof makes sense.}
+%The proof details are necessary materials for this thesis
+%because it provides necessary context to explain why we need a
+%new framework for the proof of $\blexersimp$, which involves
+%simplifications that cause structural changes to the regular expression.
+%A new formal proof of the correctness of $\blexersimp$, where the
+%proof of $\blexer$
+%is not applicatble in the sense that we cannot straightforwardly extend the
+%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
+%not hold anymore.
+%This is because the structural induction on the stepwise correctness
+%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
+%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
+%each other.
+%In this chapter we introduce simplifications
+%for annotated regular expressions that can be applied to
+%each intermediate derivative result. This allows
+%us to make $\blexer$ much more efficient.
+%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
+%but their simplification functions could have been more efficient and in some cases needed fixing.
+
+
+In particular, the correctness theorem
+of the un-optimised bit-coded lexer $\blexer$ in
+chapter \ref{Bitcoded1} formalised by Ausaf et al.
+relies crucially on lemma \ref{retrieveStepwise} that says
+any value can be retrieved in a stepwise manner, namely:
+\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred
+ \vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c) \; v= \retrieve \; r \; (\inj \; r\; c\; v)
+\end{equation}
+%This no longer holds once we introduce simplifications.
+Simplifications are necessary to control the size of derivatives,
+but they also destroy the structures of the regular expressions
+such that \ref{eq:stepwise} does not hold.
+
+
+We want to prove the correctness of $\blexersimp$ which integrates
+$\textit{bsimp}$ by applying it after each call to the derivative:
+\begin{center}
+\begin{tabular}{lcl}
+ $r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
+$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
+\end{tabular}
+\begin{tabular}{lcl}
+ $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
+ $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\
+ & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
+ & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
+ & & $\;\;\textit{else}\;\textit{None}$
+\end{tabular}
+\end{center}
+\noindent
+Previously without $\textit{bsimp}$ the exact structure of each intermediate
+regular expression is preserved, allowing pairs of inhabitation relations
+in the form $\vdash v : r \backslash c $ and
+$\vdash \inj \; r\; c \; v : r $ to hold in \ref{eq:stepwise}.
+But $\blexersimp$ introduces simplification after the derivative,
+making it difficult to align the structures of values and regular expressions.
+If we change the form of property \ref{eq:stepwise} to
+adapt to the needs of $\blexersimp$ the precondition of becomes
+\[
+ \vdash v : (\textit{bsimp} \; (r\backslash c))
+\]
+The inhabitation relation of the other pair no longer holds,
+because $\inj$ does not work on the simplified value $v$
+and the unsimplified regular expression $r$.
+The retrieve function will not work either.
+\[
+ \vdash \inj \; r \; c \; v : r
+\]
+It seems unclear what procedures needs to be
+used to create a new value $v_?$ such that
+\[
+ \vdash v_? : r \; \text{and} \; \retrieve \; r \; v_? = \retrieve \; (\textit{bsimp} \; (r\backslash c)) \; v
+\]
+hold.
+%It is clear that once we made
+%$v$ to align with $\textit{bsimp} \; r_{c}$
+%in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
+%in for the above statement to hold.
+Ausaf et al. \cite{AusafDyckhoffUrban2016}
+used something they call rectification functions to restore the original value from the simplified value.
+The idea is that simplification functions not only returns a regular expression,
+but also a rectification function
+\[
+ \textit{simp}^{rect} : Regex \Rightarrow (Value \Rightarrow Value, Regex)
+%\textit{frect} : Value \Rightarrow Value
+\]
+that is recorded recursively,
+and then applied to the previous value
+to obtain the correct value for $\inj$ to work on.
+The recursive case of the lexer is defined as something like
+\[
+ \textit{slexer} \; r \; (c\!::\!s) \dn let \;(\textit{frect}, r_c) = \textit{simp}^{rect} \;(r \backslash c) \;\; \textit{in}\;\;
+ \inj \; r \; c \; (\textit{frect} \; (\textit{slexer} \; r_c\; s))
+ %\textit{match} \; s \; \textit{case} [
+\]
+However this approach (including $\textit{slexer}$'s correctness proof) only
+works without bitcodes, and it limits the kind of simplifications one can introduce.
+%and they have not yet extended their relatively simple simplifications
+%to more aggressive ones.
+See the thesis by Ausaf \cite{Ausaf}
+for details.
+
+%\begin{center}
+% $\vdash v: (r\backslash c) \implies \retrieve \; (\mathord{?}(\textit{bsimp} \; r_c)) \; v =\retrieve \; r \;(\mathord{?} v_{r}^{c}) $
+%\end{center}
+%\noindent
+
+We were not able to use their idea for our very strong simplification rules.
+Therefore we are taking another route that completely
+disposes of lemma \ref{retrieveStepwise},
+and prove a weakened inductive invariant instead.
+
+Let us first explain why lemma \ref{retrieveStepwise}'s requirement
+is too strong, and suggest a few possible fixes, which leads to
+our proof which we believe was the most natural and effective method.
+
+
+
+\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}
+
+%From this chapter we start with the main contribution of this thesis, which
+
+The $\blexer$ proof relies on a lockstep POSIX
+correspondence between the lexical value and the
+regular expression in each derivative and injection.
+If we zoom into the diagram \ref{graph:inj} and look specifically at
+the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
+the invariant that the same bitcodes can be extracted from the pairs:
+\tikzset{three sided/.style={
+ draw=none,
+ append after command={
+ [-,shorten <= -0.5\pgflinewidth]
+ ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+ edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+ ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+ edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+ ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+ edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+ }
+ }
+}
+
+\tikzset{three sided1/.style={
+ draw=none,
+ append after command={
+ [-,shorten <= -0.5\pgflinewidth]
+ ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+ edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+ ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+ edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+ ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+ edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+ }
+ }
+}
+
+\begin{center}
+ \begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
+ %\node [rectangle ] (1) at (-7, 2) {$\ldots$};
+ %\node [rectangle, draw] (2) at (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
+ %\node [rectangle, draw] (3) at (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
+ %\node [rectangle] (4) at (9, 2) {$\ldots$};
+ %\node [rectangle] (5) at (-7, -2) {$\ldots$};
+ %\node [rectangle, draw] (6) at (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
+ %\node [rectangle, draw] (7) at ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
+ %\node [rectangle] (8) at ( 9, -2) {$\ldots$};
+ %\node [rectangle] (9) at (-7, -6) {$\ldots$};
+ %\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
+ %\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
+ %\node [rectangle] (12) at (9, -6) {$\ldots$};
+
+ \node [rectangle ] (1) at (-8, 2) {$\ldots$};
+ \node [rectangle, draw] (2) at (-5, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
+ \node [rectangle, draw] (3) at (3, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
+ \node [rectangle] (4) at (8, 2) {$\ldots$};
+ \node [rectangle] (5) at (-8, -2) {$\ldots$};
+ \node [rectangle, draw] (6) at (-5, -2) {$v_i = \Stars \; [\Left (a)]$};
+ \node [rectangle, draw] (7) at ( 3, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
+ \node [rectangle] (8) at ( 8, -2) {$\ldots$};
+ \node [rectangle] (9) at (-8, -6) {$\ldots$};
+ \node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
+ \node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
+ \node [rectangle] (12) at (8, -6) {$\ldots$};
+
+
+
+ \path (1) edge [] node {} (2);
+ \path (5) edge [] node {} (6);
+ \path (9) edge [] node {} (10);
+
+ \path (11) edge [] node {} (12);
+ \path (7) edge [] node {} (8);
+ \path (3) edge [] node {} (4);
+
+
+ \path (6) edge [dashed,bend right = 30] node {$\retrieve \; r_i \; v_i$} (10);
+ \path (2) edge [dashed,bend left = 48] node {} (10);
+
+ \path (7) edge [dashed,bend right = 30] node {$\retrieve \; r_{i+1} \; v_{i+1}$} (11);
+ \path (3) edge [dashed,bend left = 45] node {} (11);
+
+ \path (2) edge [] node {$\backslash a$} (3);
+ \path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
+ \path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
+ %\path (6) edge [] node {$\vdash v_i : r_i$} (10);
+ %\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);
+
+ \path (10) edge [dashed, <->] node {$=$} (11);
+
+ \path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6);
+
+% \node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
+% \node [rectangle, draw] (a) at (-6, 4) {$(aa)^*(_{Z}b + _{S}c)$};
+% \path (r)
+% edge [] node {$\internalise$} (a);
+% \node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
+% \path (a)
+% edge [] node {$\backslash a$} (a1);
+%
+% \node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
+% \node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
+% \path (a1)
+% edge [] node {$\backslash a$} (a21);
+% \node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
+% \path (a22)
+% edge [] node {$\backslash b$} (a3);
+% \path (a21)
+% edge [dashed, bend right] node {} (a3);
+% \node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
+% \path (a3)
+% edge [below] node {$\bmkeps$} (bs);
+% \node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
+% \path (bs)
+% edge [] node {$\decode$} (v);
+
+
+ \end{tikzpicture}
+ %\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
+\end{center}
+
+When simplifications are added, the inhabitation relation no longer holds,
+causing the above diagram to break.
+
+Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.
+
+
+
+we note that the invariant
+$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong
+to maintain because the precondition $\vdash v_i : r_i$ is too weak.
+It does not require $v_i$ to be a POSIX value
+
+
+{\color{red} \rule{\linewidth}{0.5mm}}
+New content ends
+{\color{red} \rule{\linewidth}{0.5mm}}
+
+
+
+%
+%
+%which is essential for getting an understanding this thesis
+%in chapter \ref{Bitcoded1}, which is necessary for understanding why
+%the proof
+%
+%In this chapter,
+
+%We contrast our simplification function
+%with Sulzmann and Lu's, indicating the simplicity of our algorithm.
+%This is another case for the usefulness
+%and reliability of formal proofs on algorithms.
+%These ``aggressive'' simplifications would not be possible in the injection-based
+%lexing we introduced in chapter \ref{Inj}.
+%We then prove the correctness with the improved version of
+%$\blexer$, called $\blexersimp$, by establishing
+%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
+%
+
+
%----------------------------------------------------------------------------------------
% SECTION rewrite relation
%----------------------------------------------------------------------------------------