lexing: comparison ChengsongTanPhdThesis/Chapters/Bitcoded2.tex

equal deleted inserted replaced

-:a4d692a9a289
+:bc5571c38d1f
 %its correctness proof in
 %Chapter 3\ref{Chapter3}.
 {\color{red} \rule{\linewidth}{0.5mm}}
 New content starts.
 {\color{red} \rule{\linewidth}{0.5mm}}
+\marginpar{\em Added a completely new overview section, highlighting the contributions.}
 \section{Overview}
 This chapter
 is the point from which novel contributions of this PhD project are introduced
-in detail,
+in detail.
-and previous
+The material in the
-chapters are essential background work for setting the scene of the formal proof we
+previous
+chapters is necessary
+material for setting the scene of the formal proof we
 are about to describe.
+The fundamental reason is we cannot extend the correctness proof of theorem 4,
+because lemma 13 does not hold anymore when simplifications are involved.
 The proof details are necessary materials for this thesis
 because it provides necessary context to explain why we need a
 new framework for the proof of $\blexersimp$, which involves
 simplifications that cause structural changes to the regular expression.
-a new formal proof of the correctness of $\blexersimp$, where the
+A new formal proof of the correctness of $\blexersimp$, where the
 proof of $\blexer$
 is not applicatble in the sense that we cannot straightforwardly extend the
 proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
 not hold anymore.
 %This is because the structural induction on the stepwise correctness
 In particular, the correctness theorem
 of the un-optimised bit-coded lexer $\blexer$ in
 chapter \ref{Bitcoded1} formalised by Ausaf et al.
-relies on lemma \ref{retrieveStepwise} that says
+relies crucially on lemma \ref{retrieveStepwise} that says
-any value can be retrieved in a stepwise manner:
+any value can be retrieved in a stepwise manner, namely:
-\begin{center}
+\begin{center}%eqref: this proposition needs to be referred
 	$\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)$
 \end{center}
 This no longer holds once we introduce simplifications.
-Simplifications are necessary to control the size of regular expressions
+Simplifications are necessary to control the size of derivatives
-during derivatives by eliminating redundant
-sub-expression with some procedure we call $\textit{bsimp}$.
 We want to prove the correctness of $\blexersimp$ which integrates
 $\textit{bsimp}$ by applying it after each call to the derivative:
 \begin{center}
 \begin{tabular}{lcl}
 	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
 regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
 $\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if
 we use the convenient notation $r_{c} \dn r\backslash c$
 and $v_{r}^{c} \dn \inj \;r \; c \; v$),
 but $\blexersimp$ introduces simplification after the derivative,
-getting us trouble in aligning the pairs:
+making it difficult to align the pairs:
 \begin{center}
-	$\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; \textit{bsimp} \; r_c \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
+	$\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; (\textit{bsimp} \; r_c) \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
 \end{center}
 \noindent
-It is quite clear that once we made
+It is clear that once we made
 $v$ to align with $\textit{bsimp} \; r_{c}$
 in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
 in for the above statement to hold.
 Ausaf et al. \cite{AusafUrbanDyckhoff2016}
 made some initial attempts with this idea, see \cite{FahadThesis}
 for details.
+They added
+and then rectify it to
+this works fine, however that limits the kind of simplifications you can introduce.
+We cannot use their idea for our very strong simplification rules.
+Therefore we take our route
+a wea
 The other route is to dispose of lemma \ref{retrieveStepwise},
-and prove a slightly weakened inductive invariant instead.
+and prove a weakened inductive invariant instead.
 We adopt this approach in this thesis.
-We first introduce why the inductive invariant in $\blexer$'s proof
+Let us first explain why the requirement in $\blexer$'s proof
 is too strong, and suggest a few possible fixes, which leads to
 our proof which we believe was the most natural and effective method.
 %From this chapter we start with the main contribution of this thesis, which
 The $\blexer$ proof relies on a lockstep POSIX
 correspondence between the lexical value and the
 regular expression in each derivative and injection.
-If we zoom into the diagram \ref{graph:in} and look specifically at
+If we zoom into the diagram \ref{graph:inj} and look specifically at
 the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
 the invariant that the same bitcodes can be extracted from the pairs:
+\tikzset{three sided/.style={
+draw=none,
+append after command={
+[-,shorten <= -0.5\pgflinewidth]
+([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+}
+}
+}
+\tikzset{three sided1/.style={
+draw=none,
+append after command={
+[-,shorten <= -0.5\pgflinewidth]
+([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
+edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
+edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
+edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
+}
+}
+}
+\begin{figure}[H]
+	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
+		\node [rectangle, draw] (1)  at (-7, 2) {$\ldots$};
+		\node [rectangle, draw] (2) at  (-4, 2) {$_{bs'}(_Za+_Saa)^*$};
+		\node [rectangle, draw] (3) at  (4, 2) {$_{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_0a+_1aa)^*$};
+		\node [rectangle, draw] (4) at  (7, 2) {$\ldots$};
+		\node [rectangle, draw] (5) at  (-7, -2) {$\ldots$};
+		\node [rectangle, draw] (6) at  (-4, -2) {$\Stars \; [\Left (a)]$};
+		\node [rectangle, draw] (7) at  ( 4, -2) {$\Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
+		\node [rectangle, draw] (8) at  ( 7, -2) {$\ldots$};
+		\node [rectangle, draw] (9) at  (-7, -6) {$\ldots$};
+		\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits} = \retrieve \; r_i\;v_i$};
+		\node [rectangle, draw] (11) at (4, -6) {$\textit{bits} = \retrieve \; r_{i+1}\;v_{i+1}$};
+		\node [rectangle, draw] (12) at  (7, -6) {$\ldots$};
+		\path (1) edge [] node {} (2);
+		\path (2) edge [] node {$\backslash a$} (3);
+%		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
+%		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
+%		\path	(r)
+%			edge [] node {$\internalise$} (a);
+%		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
+%		\path	(a)
+%			edge [] node {$\backslash a$} (a1);
+%
+%		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
+%		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
+%		\path	(a1)
+%			edge [] node {$\backslash a$} (a21);
+%		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
+%		\path	(a22)
+%			edge [] node {$\backslash b$} (a3);
+%		\path	(a21)
+%			edge [dashed, bend right] node {} (a3);
+%		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
+%		\path	(a3)
+%			edge [below] node {$\bmkeps$} (bs);
+%		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
+%		\path 	(bs)
+%			edge [] node {$\decode$} (v);
+	\end{tikzpicture}
+	\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
+\end{figure}
 When simplifications are added, the inhabitation relation no longer holds,
 causing the above diagram to break.
 Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.

changeset 653	bc5571c38d1f
parent 652	a4d692a9a289
child 654	2ad20ba5b178