lexing: changeset 656:753a3b0ee02b

--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex	Sat Jul 08 01:36:08 2023 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex	Sat Jul 08 22:18:22 2023 +0100
@@ -8,8 +8,8 @@
 %simplifications and therefore introduce our version of the bitcoded algorithm and 
 %its correctness proof in 
 %Chapter 3\ref{Chapter3}. 
-\section{Overview}
-\marginpar{\em Added a completely new \\overview section, \\highlighting contributions.}
+%\section{Overview}
+\marginpar{\em Added a completely new \\overview section, \\highlighting\\ contributions.}
 
 This chapter
 is the point from which novel contributions of this PhD project are introduced
@@ -19,240 +19,21 @@
 chapters is necessary for this thesis,
 because it provides the context for why we need a new framework for
 the proof of $\blexersimp$.
+
+We will first introduce why aggressive simplifications are needed, after which we
+provide our algorithm, contrasting with Sulzmann and Lu's simplifications.
+We then explain how our simplifications make
+reusing $\blexer$'s correctness proof impossible.
+%with some minor modifications
+We discuss possible fixes such as rectification functions and then introduce our proof, 
+which involves a weaker inductive
+invariant than that used in the correctness proof of $\blexer$.
+
+\marginpar{Shortened overview.}
 %material for setting the scene of the formal proof we
 %are about to describe.
-The fundamental reason is we cannot extend the correctness proof of theorem 4
-because lemma 13 does not hold anymore when simplifications are involved.
-\marginpar{\em rephrased things \\so why new \\proof makes sense.}
-%The proof details are necessary materials for this thesis
-%because it provides necessary context to explain why we need a
-%new framework for the proof of $\blexersimp$, which involves
-%simplifications that cause structural changes to the regular expression.
-%A new formal proof of the correctness of $\blexersimp$, where the 
-%proof of $\blexer$
-%is not applicatble in the sense that we cannot straightforwardly extend the
-%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
-%not hold anymore.
-%This is because the structural induction on the stepwise correctness
-%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
-%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
-%each other.
-%In this chapter we introduce simplifications
-%for annotated regular expressions that can be applied to 
-%each intermediate derivative result. This allows
-%us to make $\blexer$ much more efficient.
-%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
-%but their simplification functions could have been more efficient and in some cases needed fixing.
-
-
-In particular, the correctness theorem 
-of the un-optimised bit-coded lexer $\blexer$ in 
-chapter \ref{Bitcoded1} formalised by Ausaf et al.
-relies crucially on lemma \ref{retrieveStepwise} that says
-any value can be retrieved in a stepwise manner, namely:
-\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred	
-	\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)
-\end{equation}
-%This no longer holds once we introduce simplifications.
-Simplifications are necessary to control the size of derivatives,
-but they also destroy the structures of the regular expressions
-such that \ref{eq:stepwise} does not hold.
-
-
-We want to prove the correctness of $\blexersimp$ which integrates
-$\textit{bsimp}$ by applying it after each call to the derivative:
-\begin{center}
-\begin{tabular}{lcl}
-	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
-$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
-\end{tabular}
-\begin{tabular}{lcl}
-  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
-      $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\                
-  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
-  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
-  & & $\;\;\textit{else}\;\textit{None}$
-\end{tabular}
-\end{center}
-\noindent
-Previously without $\textit{bsimp}$ the exact structure of each intermediate 
-regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
-$\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if 
-we use the convenient notation $r_{c} \dn r\backslash c$
-and $v_{r}^{c} \dn \inj \;r \; c \; v$),
-but $\blexersimp$ introduces simplification after the derivative,
-making it difficult to align the pairs:
-\begin{center}
-	$\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; (\textit{bsimp} \; r_c) \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
-\end{center}
-\noindent
-It is clear that once we made 
-$v$ to align with $\textit{bsimp} \; r_{c}$
-in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
-in for the above statement to hold.
-Ausaf et al. \cite{AusafUrbanDyckhoff2016}
-made some initial attempts with this idea, see \cite{FahadThesis}
-for details.
-
-They added
-and then rectify it to
-
-
-this works fine, however that limits the kind of simplifications you can introduce.
-We cannot use their idea for our very strong simplification rules.
-Therefore we take our route
-a wea
-
-The other route is to dispose of lemma \ref{retrieveStepwise},
-and prove a weakened inductive invariant instead.
-We adopt this approach in this thesis.
-
-Let us first explain why the requirement in $\blexer$'s proof
-is too strong, and suggest a few possible fixes, which leads to
-our proof which we believe was the most natural and effective method.
-
-
-
-\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}
-
-%From this chapter we start with the main contribution of this thesis, which
-
-The $\blexer$ proof relies on a lockstep POSIX
-correspondence between the lexical value and the
-regular expression in each derivative and injection.
-If we zoom into the diagram \ref{graph:inj} and look specifically at
-the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
-the invariant that the same bitcodes can be extracted from the pairs:
-\tikzset{three sided/.style={
-        draw=none,
-        append after command={
-            [-,shorten <= -0.5\pgflinewidth]
-            ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
-        edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) 
-            ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
-        edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)            
-            ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
-        edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
-        }
-    }
-}
-
-\tikzset{three sided1/.style={
-        draw=none,
-        append after command={
-            [-,shorten <= -0.5\pgflinewidth]
-            ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
-        edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) 
-            ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
-        edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)            
-            ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
-        edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
-        }
-    }
-}
-
-\begin{center}
-	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
-		\node [rectangle ] (1)  at (-7, 2) {$\ldots$};
-		\node [rectangle, draw] (2) at  (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
-		\node [rectangle, draw] (3) at  (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_0a+_1aa)^*$};
-		\node [rectangle] (4) at  (9, 2) {$\ldots$};
-		\node [rectangle] (5) at  (-7, -2) {$\ldots$};
-		\node [rectangle, draw] (6) at  (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
-		\node [rectangle, draw] (7) at  ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
-		\node [rectangle] (8) at  ( 9, -2) {$\ldots$};
-		\node [rectangle] (9) at  (-7, -6) {$\ldots$};
-		\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = \retrieve \; r_i\;v_i$};
-		\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = \retrieve \; r_{i+1}\;v_{i+1}$};
-		\node [rectangle] (12) at  (9, -6) {$\ldots$};
-
-
-		\path (1) edge [] node {} (2);
-		\path (5) edge [] node {} (6);
-		\path (9) edge [] node {} (10);
-
-		\path (11) edge [] node {} (12);
-		\path (7) edge [] node {} (8);
-		\path (3) edge [] node {} (4);
-
-	
-		\path (2) edge [] node {$\backslash a$} (3);
-		\path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
-		\path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
-		%\path (6) edge [] node {$\vdash v_i : r_i$} (10);
-		%\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);
-
-		\path (10) edge [dashed, <->] node {$=$} (11);
-
-		\path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6);
-
-%		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
-%		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
-%		\path	(r)
-%			edge [] node {$\internalise$} (a);
-%		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
-%		\path	(a)
-%			edge [] node {$\backslash a$} (a1);
-%
-%		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
-%		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
-%		\path	(a1)
-%			edge [] node {$\backslash a$} (a21);
-%		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
-%		\path	(a22)
-%			edge [] node {$\backslash b$} (a3);
-%		\path	(a21)
-%			edge [dashed, bend right] node {} (a3);
-%		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
-%		\path	(a3)
-%			edge [below] node {$\bmkeps$} (bs);
-%		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
-%		\path 	(bs)
-%			edge [] node {$\decode$} (v);
-
-
-	\end{tikzpicture}
-	%\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
-\end{center}
-
-When simplifications are added, the inhabitation relation no longer holds,
-causing the above diagram to break.
-
-Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.
-
-
-
-we note that the invariant
-$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong
-to maintain because the precondition $\vdash v_i : r_i$ is too weak.
-It does not require $v_i$ to be a POSIX value 
-
-
-{\color{red} \rule{\linewidth}{0.5mm}}
-New content ends
-{\color{red} \rule{\linewidth}{0.5mm}}
-
-
-
-%
-%
-%which is essential for getting an understanding this thesis
-%in chapter \ref{Bitcoded1}, which is necessary for understanding why
-%the proof 
-%
-%In this chapter,
-
-%We contrast our simplification function 
-%with Sulzmann and Lu's, indicating the simplicity of our algorithm.
-%This is another case for the usefulness 
-%and reliability of formal proofs on algorithms.
-%These ``aggressive'' simplifications would not be possible in the injection-based 
-%lexing we introduced in chapter \ref{Inj}.
-%We then prove the correctness with the improved version of 
-%$\blexer$, called $\blexersimp$, by establishing 
-%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
-%
 \section{Simplifications by Sulzmann and Lu}
+\marginpar{moved \\simplification \\section to front \\to make coherent\\ sense.}
 The algorithms $\lexer$ and $\blexer$ work beautifully as functional 
 programs, but not as practical code. One main reason for the slowness is due
 to the size of intermediate representations--the derivative regular expressions
@@ -271,7 +52,10 @@
 	\end{tabular}
 \end{center}
 \noindent
-As can be seen, there are several duplications.
+From the second derivative several duplicate sub-expressions 
+already needs to be eliminated (possible
+bitcodes are omitted to make the presentation more concise
+because they are not the key part of the simplifications).
 A simple-minded simplification function cannot simplify
 the third regular expression in the above chain of derivative
 regular expressions, namely
@@ -472,6 +256,8 @@
 %one needs to be more careful when designing the
 %simplification function and making claims about them.
 
+
+
 \section{Our $\textit{Simp}$ Function}
 We will now introduce our own simplification function.
 %by making a contrast with $\textit{simp}\_{SL}$.
@@ -748,6 +534,294 @@
 $\textit{blexer\_SLSimp}$ by Sulzmann and Lu.
 In the next section we are going to establish that our
 simplification preserves the correctness of the algorithm.
+
+
+
+
+\section{Why $\textit{Blexer}$'s Proof Does Not Work}
+The fundamental reason is we cannot extend the correctness proof of theorem 4
+because lemma 13 does not hold anymore when simplifications are involved.
+\marginpar{\em rephrased things \\so why new \\proof makes sense.}
+%The proof details are necessary materials for this thesis
+%because it provides necessary context to explain why we need a
+%new framework for the proof of $\blexersimp$, which involves
+%simplifications that cause structural changes to the regular expression.
+%A new formal proof of the correctness of $\blexersimp$, where the 
+%proof of $\blexer$
+%is not applicatble in the sense that we cannot straightforwardly extend the
+%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
+%not hold anymore.
+%This is because the structural induction on the stepwise correctness
+%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
+%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
+%each other.
+%In this chapter we introduce simplifications
+%for annotated regular expressions that can be applied to 
+%each intermediate derivative result. This allows
+%us to make $\blexer$ much more efficient.
+%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
+%but their simplification functions could have been more efficient and in some cases needed fixing.
+
+
+In particular, the correctness theorem 
+of the un-optimised bit-coded lexer $\blexer$ in 
+chapter \ref{Bitcoded1} formalised by Ausaf et al.
+relies crucially on lemma \ref{retrieveStepwise} that says
+any value can be retrieved in a stepwise manner, namely:
+\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred	
+	\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)
+\end{equation}
+%This no longer holds once we introduce simplifications.
+Simplifications are necessary to control the size of derivatives,
+but they also destroy the structures of the regular expressions
+such that \ref{eq:stepwise} does not hold.
+
+
+We want to prove the correctness of $\blexersimp$ which integrates
+$\textit{bsimp}$ by applying it after each call to the derivative:
+\begin{center}
+\begin{tabular}{lcl}
+	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
+$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
+\end{tabular}
+\begin{tabular}{lcl}
+  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
+      $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\                
+  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
+  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
+  & & $\;\;\textit{else}\;\textit{None}$
+\end{tabular}
+\end{center}
+\noindent
+Previously without $\textit{bsimp}$ the exact structure of each intermediate 
+regular expression is preserved, allowing pairs of inhabitation relations 
+in the form $\vdash v : r \backslash c $ and
+$\vdash \inj \; r\; c \; v : r $ to hold in \ref{eq:stepwise}.
+But $\blexersimp$ introduces simplification after the derivative,
+making it difficult to align the structures of values and regular expressions.
+If we change the form of property \ref{eq:stepwise} to 
+adapt to the needs of $\blexersimp$ the precondition of becomes
+\[
+	\vdash v : (\textit{bsimp} \; (r\backslash c))
+\]
+The inhabitation relation of the other pair no longer holds,
+because $\inj$ does not work on the simplified value $v$ 
+and the unsimplified regular expression $r$.
+The retrieve function will not work either.
+\[
+	\vdash \inj \; r \; c \; v : r
+\]
+It seems unclear what procedures needs to be
+used to create a new value $v_?$ such that
+\[
+	\vdash v_? : r \; \text{and} \; \retrieve \; r \; v_?   = \retrieve \; (\textit{bsimp} \; (r\backslash c)) \; v
+\]
+hold.
+%It is clear that once we made 
+%$v$ to align with $\textit{bsimp} \; r_{c}$
+%in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
+%in for the above statement to hold.
+Ausaf et al. \cite{AusafDyckhoffUrban2016}
+used something they call rectification functions to restore the original value from the simplified value.
+The idea is that simplification functions not only returns a regular expression,
+but also a rectification function 
+\[
+	\textit{simp}^{rect} : Regex \Rightarrow (Value \Rightarrow Value, Regex)
+%\textit{frect} : Value \Rightarrow Value
+\]
+that is recorded recursively,
+and then applied to the previous value 
+to obtain the correct value for $\inj$ to work on. 
+The recursive case of the lexer is defined as something like
+\[
+	\textit{slexer} \; r \; (c\!::\!s) \dn let \;(\textit{frect}, r_c) = \textit{simp}^{rect} \;(r \backslash c) \;\; \textit{in}\;\;
+	\inj \; r \; c \; (\textit{frect} \; (\textit{slexer} \; r_c\; s))
+	%\textit{match} \; s \; \textit{case} [
+\]
+However this approach (including $\textit{slexer}$'s correctness proof) only 
+works without bitcodes, and it limits the kind of simplifications one can introduce.
+%and they have not yet extended their relatively simple simplifications
+%to more aggressive ones.
+See the thesis by Ausaf \cite{Ausaf}
+for details.
+
+%\begin{center}
+%	$\vdash v:  (r\backslash c) \implies \retrieve \; (\mathord{?}(\textit{bsimp} \; r_c)) \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
+%\end{center}
+%\noindent
+
+We were not able to use their idea for our very strong simplification rules.
+Therefore we are taking another route that completely
+disposes of lemma \ref{retrieveStepwise},
+and prove a weakened inductive invariant instead.
+
+Let us first explain why lemma \ref{retrieveStepwise}'s requirement 
+is too strong, and suggest a few possible fixes, which leads to
+our proof which we believe was the most natural and effective method.
+
+
+
+\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}
+
+%From this chapter we start with the main contribution of this thesis, which
+
+The $\blexer$ proof relies on a lockstep POSIX
+correspondence between the lexical value and the
+regular expression in each derivative and injection.
+If we zoom into the diagram \ref{graph:inj} and look specifically at
+the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
+the invariant that the same bitcodes can be extracted from the pairs:
+\tikzset{three sided/.style={
+        draw=none,
+        append after command={
+            [-,shorten <= -0.5\pgflinewidth]
+            ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+        edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) 
+            ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+        edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)            
+            ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+        edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+        }
+    }
+}
+
+\tikzset{three sided1/.style={
+        draw=none,
+        append after command={
+            [-,shorten <= -0.5\pgflinewidth]
+            ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+        edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) 
+            ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+        edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)            
+            ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+        edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+        }
+    }
+}
+
+\begin{center}
+	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
+		%\node [rectangle ] (1)  at (-7, 2) {$\ldots$};
+		%\node [rectangle, draw] (2) at  (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
+		%\node [rectangle, draw] (3) at  (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
+		%\node [rectangle] (4) at  (9, 2) {$\ldots$};
+		%\node [rectangle] (5) at  (-7, -2) {$\ldots$};
+		%\node [rectangle, draw] (6) at  (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
+		%\node [rectangle, draw] (7) at  ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
+		%\node [rectangle] (8) at  ( 9, -2) {$\ldots$};
+		%\node [rectangle] (9) at  (-7, -6) {$\ldots$};
+		%\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
+		%\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
+		%\node [rectangle] (12) at  (9, -6) {$\ldots$};
+		
+		\node [rectangle ] (1)  at (-8, 2) {$\ldots$};
+		\node [rectangle, draw] (2) at  (-5, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
+		\node [rectangle, draw] (3) at  (3, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
+		\node [rectangle] (4) at  (8, 2) {$\ldots$};
+		\node [rectangle] (5) at  (-8, -2) {$\ldots$};
+		\node [rectangle, draw] (6) at  (-5, -2) {$v_i = \Stars \; [\Left (a)]$};
+		\node [rectangle, draw] (7) at  ( 3, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
+		\node [rectangle] (8) at  ( 8, -2) {$\ldots$};
+		\node [rectangle] (9) at  (-8, -6) {$\ldots$};
+		\node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
+		\node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
+		\node [rectangle] (12) at  (8, -6) {$\ldots$};
+
+
+
+		\path (1) edge [] node {} (2);
+		\path (5) edge [] node {} (6);
+		\path (9) edge [] node {} (10);
+
+		\path (11) edge [] node {} (12);
+		\path (7) edge [] node {} (8);
+		\path (3) edge [] node {} (4);
+
+
+		\path (6) edge [dashed,bend right = 30] node {$\retrieve \; r_i \; v_i$} (10);
+		\path (2) edge [dashed,bend left = 48] node {} (10);
+
+		\path (7) edge [dashed,bend right = 30] node {$\retrieve \; r_{i+1} \; v_{i+1}$} (11);
+		\path (3) edge [dashed,bend left = 45] node {} (11);
+	
+		\path (2) edge [] node {$\backslash a$} (3);
+		\path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
+		\path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
+		%\path (6) edge [] node {$\vdash v_i : r_i$} (10);
+		%\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);
+
+		\path (10) edge [dashed, <->] node {$=$} (11);
+
+		\path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6);
+
+%		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
+%		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
+%		\path	(r)
+%			edge [] node {$\internalise$} (a);
+%		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
+%		\path	(a)
+%			edge [] node {$\backslash a$} (a1);
+%
+%		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
+%		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
+%		\path	(a1)
+%			edge [] node {$\backslash a$} (a21);
+%		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
+%		\path	(a22)
+%			edge [] node {$\backslash b$} (a3);
+%		\path	(a21)
+%			edge [dashed, bend right] node {} (a3);
+%		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
+%		\path	(a3)
+%			edge [below] node {$\bmkeps$} (bs);
+%		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
+%		\path 	(bs)
+%			edge [] node {$\decode$} (v);
+
+
+	\end{tikzpicture}
+	%\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
+\end{center}
+
+When simplifications are added, the inhabitation relation no longer holds,
+causing the above diagram to break.
+
+Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.
+
+
+
+we note that the invariant
+$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong
+to maintain because the precondition $\vdash v_i : r_i$ is too weak.
+It does not require $v_i$ to be a POSIX value 
+
+
+{\color{red} \rule{\linewidth}{0.5mm}}
+New content ends
+{\color{red} \rule{\linewidth}{0.5mm}}
+
+
+
+%
+%
+%which is essential for getting an understanding this thesis
+%in chapter \ref{Bitcoded1}, which is necessary for understanding why
+%the proof 
+%
+%In this chapter,
+
+%We contrast our simplification function 
+%with Sulzmann and Lu's, indicating the simplicity of our algorithm.
+%This is another case for the usefulness 
+%and reliability of formal proofs on algorithms.
+%These ``aggressive'' simplifications would not be possible in the injection-based 
+%lexing we introduced in chapter \ref{Inj}.
+%We then prove the correctness with the improved version of 
+%$\blexer$, called $\blexersimp$, by establishing 
+%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
+%
+
+
 %----------------------------------------------------------------------------------------
 %	SECTION rewrite relation
 %----------------------------------------------------------------------------------------
author	Chengsong
	Sat, 08 Jul 2023 22:18:22 +0100
changeset 656	753a3b0ee02b
parent 655	d8f82c690b32
child 657	00171b627b8d