ChengsongTanPhdThesis/Chapters/Bitcoded2.tex
changeset 653 bc5571c38d1f
parent 652 a4d692a9a289
child 654 2ad20ba5b178
equal deleted inserted replaced
652:a4d692a9a289 653:bc5571c38d1f
     9 %its correctness proof in 
     9 %its correctness proof in 
    10 %Chapter 3\ref{Chapter3}. 
    10 %Chapter 3\ref{Chapter3}. 
    11 {\color{red} \rule{\linewidth}{0.5mm}}
    11 {\color{red} \rule{\linewidth}{0.5mm}}
    12 New content starts.
    12 New content starts.
    13 {\color{red} \rule{\linewidth}{0.5mm}}
    13 {\color{red} \rule{\linewidth}{0.5mm}}
       
    14 \marginpar{\em Added a completely new overview section, highlighting the contributions.}
    14 
    15 
    15 \section{Overview}
    16 \section{Overview}
    16 
    17 
    17 This chapter
    18 This chapter
    18 is the point from which novel contributions of this PhD project are introduced
    19 is the point from which novel contributions of this PhD project are introduced
    19 in detail, 
    20 in detail. 
    20 and previous
    21 The material in the
    21 chapters are essential background work for setting the scene of the formal proof we
    22 previous
       
    23 chapters is necessary 
       
    24 material for setting the scene of the formal proof we
    22 are about to describe.
    25 are about to describe.
       
    26 
       
    27 The fundamental reason is we cannot extend the correctness proof of theorem 4,
       
    28 because lemma 13 does not hold anymore when simplifications are involved.
       
    29 
    23 The proof details are necessary materials for this thesis
    30 The proof details are necessary materials for this thesis
    24 because it provides necessary context to explain why we need a
    31 because it provides necessary context to explain why we need a
    25 new framework for the proof of $\blexersimp$, which involves
    32 new framework for the proof of $\blexersimp$, which involves
    26 simplifications that cause structural changes to the regular expression.
    33 simplifications that cause structural changes to the regular expression.
    27 a new formal proof of the correctness of $\blexersimp$, where the 
    34 A new formal proof of the correctness of $\blexersimp$, where the 
    28 proof of $\blexer$
    35 proof of $\blexer$
    29 is not applicatble in the sense that we cannot straightforwardly extend the
    36 is not applicatble in the sense that we cannot straightforwardly extend the
    30 proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
    37 proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
    31 not hold anymore.
    38 not hold anymore.
    32 %This is because the structural induction on the stepwise correctness
    39 %This is because the structural induction on the stepwise correctness
    42 
    49 
    43 
    50 
    44 In particular, the correctness theorem 
    51 In particular, the correctness theorem 
    45 of the un-optimised bit-coded lexer $\blexer$ in 
    52 of the un-optimised bit-coded lexer $\blexer$ in 
    46 chapter \ref{Bitcoded1} formalised by Ausaf et al.
    53 chapter \ref{Bitcoded1} formalised by Ausaf et al.
    47 relies on lemma \ref{retrieveStepwise} that says
    54 relies crucially on lemma \ref{retrieveStepwise} that says
    48 any value can be retrieved in a stepwise manner:
    55 any value can be retrieved in a stepwise manner, namely:
    49 \begin{center}	
    56 \begin{center}%eqref: this proposition needs to be referred	
    50 	$\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)$
    57 	$\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)$
    51 \end{center}
    58 \end{center}
    52 This no longer holds once we introduce simplifications.
    59 This no longer holds once we introduce simplifications.
    53 Simplifications are necessary to control the size of regular expressions 
    60 Simplifications are necessary to control the size of derivatives 
    54 during derivatives by eliminating redundant 
    61 
    55 sub-expression with some procedure we call $\textit{bsimp}$.
    62 
    56 We want to prove the correctness of $\blexersimp$ which integrates
    63 We want to prove the correctness of $\blexersimp$ which integrates
    57 $\textit{bsimp}$ by applying it after each call to the derivative:
    64 $\textit{bsimp}$ by applying it after each call to the derivative:
    58 \begin{center}
    65 \begin{center}
    59 \begin{tabular}{lcl}
    66 \begin{tabular}{lcl}
    60 	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
    67 	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
    73 regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
    80 regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
    74 $\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if 
    81 $\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if 
    75 we use the convenient notation $r_{c} \dn r\backslash c$
    82 we use the convenient notation $r_{c} \dn r\backslash c$
    76 and $v_{r}^{c} \dn \inj \;r \; c \; v$),
    83 and $v_{r}^{c} \dn \inj \;r \; c \; v$),
    77 but $\blexersimp$ introduces simplification after the derivative,
    84 but $\blexersimp$ introduces simplification after the derivative,
    78 getting us trouble in aligning the pairs:
    85 making it difficult to align the pairs:
    79 \begin{center}
    86 \begin{center}
    80 	$\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; \textit{bsimp} \; r_c \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
    87 	$\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; (\textit{bsimp} \; r_c) \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
    81 \end{center}
    88 \end{center}
    82 \noindent
    89 \noindent
    83 It is quite clear that once we made 
    90 It is clear that once we made 
    84 $v$ to align with $\textit{bsimp} \; r_{c}$
    91 $v$ to align with $\textit{bsimp} \; r_{c}$
    85 in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
    92 in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
    86 in for the above statement to hold.
    93 in for the above statement to hold.
    87 Ausaf et al. \cite{AusafUrbanDyckhoff2016}
    94 Ausaf et al. \cite{AusafUrbanDyckhoff2016}
    88 made some initial attempts with this idea, see \cite{FahadThesis}
    95 made some initial attempts with this idea, see \cite{FahadThesis}
    89 for details.
    96 for details.
       
    97 
       
    98 They added
       
    99 and then rectify it to
       
   100 
       
   101 
       
   102 this works fine, however that limits the kind of simplifications you can introduce.
       
   103 We cannot use their idea for our very strong simplification rules.
       
   104 Therefore we take our route
       
   105 a wea
       
   106 
    90 The other route is to dispose of lemma \ref{retrieveStepwise},
   107 The other route is to dispose of lemma \ref{retrieveStepwise},
    91 and prove a slightly weakened inductive invariant instead.
   108 and prove a weakened inductive invariant instead.
    92 We adopt this approach in this thesis.
   109 We adopt this approach in this thesis.
    93 
   110 
    94 We first introduce why the inductive invariant in $\blexer$'s proof
   111 Let us first explain why the requirement in $\blexer$'s proof
    95 is too strong, and suggest a few possible fixes, which leads to
   112 is too strong, and suggest a few possible fixes, which leads to
    96 our proof which we believe was the most natural and effective method.
   113 our proof which we believe was the most natural and effective method.
    97 
   114 
    98 
   115 
    99 
   116 
   102 %From this chapter we start with the main contribution of this thesis, which
   119 %From this chapter we start with the main contribution of this thesis, which
   103 
   120 
   104 The $\blexer$ proof relies on a lockstep POSIX
   121 The $\blexer$ proof relies on a lockstep POSIX
   105 correspondence between the lexical value and the
   122 correspondence between the lexical value and the
   106 regular expression in each derivative and injection.
   123 regular expression in each derivative and injection.
   107 If we zoom into the diagram \ref{graph:in} and look specifically at
   124 If we zoom into the diagram \ref{graph:inj} and look specifically at
   108 the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
   125 the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
   109 the invariant that the same bitcodes can be extracted from the pairs:
   126 the invariant that the same bitcodes can be extracted from the pairs:
       
   127 \tikzset{three sided/.style={
       
   128         draw=none,
       
   129         append after command={
       
   130             [-,shorten <= -0.5\pgflinewidth]
       
   131             ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
       
   132         edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) 
       
   133             ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
       
   134         edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)            
       
   135             ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
       
   136         edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
       
   137         }
       
   138     }
       
   139 }
       
   140 
       
   141 \tikzset{three sided1/.style={
       
   142         draw=none,
       
   143         append after command={
       
   144             [-,shorten <= -0.5\pgflinewidth]
       
   145             ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
       
   146         edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) 
       
   147             ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
       
   148         edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)            
       
   149             ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
       
   150         edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
       
   151         }
       
   152     }
       
   153 }
       
   154 
       
   155 \begin{figure}[H]
       
   156 	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
       
   157 		\node [rectangle, draw] (1)  at (-7, 2) {$\ldots$};
       
   158 		\node [rectangle, draw] (2) at  (-4, 2) {$_{bs'}(_Za+_Saa)^*$};
       
   159 		\node [rectangle, draw] (3) at  (4, 2) {$_{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_0a+_1aa)^*$};
       
   160 		\node [rectangle, draw] (4) at  (7, 2) {$\ldots$};
       
   161 		\node [rectangle, draw] (5) at  (-7, -2) {$\ldots$};
       
   162 		\node [rectangle, draw] (6) at  (-4, -2) {$\Stars \; [\Left (a)]$};
       
   163 		\node [rectangle, draw] (7) at  ( 4, -2) {$\Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
       
   164 		\node [rectangle, draw] (8) at  ( 7, -2) {$\ldots$};
       
   165 		\node [rectangle, draw] (9) at  (-7, -6) {$\ldots$};
       
   166 		\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits} = \retrieve \; r_i\;v_i$};
       
   167 		\node [rectangle, draw] (11) at (4, -6) {$\textit{bits} = \retrieve \; r_{i+1}\;v_{i+1}$};
       
   168 		\node [rectangle, draw] (12) at  (7, -6) {$\ldots$};
       
   169 
       
   170 
       
   171 		\path (1) edge [] node {} (2);
       
   172 		\path (2) edge [] node {$\backslash a$} (3);
       
   173 
       
   174 %		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
       
   175 %		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
       
   176 %		\path	(r)
       
   177 %			edge [] node {$\internalise$} (a);
       
   178 %		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
       
   179 %		\path	(a)
       
   180 %			edge [] node {$\backslash a$} (a1);
       
   181 %
       
   182 %		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
       
   183 %		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
       
   184 %		\path	(a1)
       
   185 %			edge [] node {$\backslash a$} (a21);
       
   186 %		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
       
   187 %		\path	(a22)
       
   188 %			edge [] node {$\backslash b$} (a3);
       
   189 %		\path	(a21)
       
   190 %			edge [dashed, bend right] node {} (a3);
       
   191 %		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
       
   192 %		\path	(a3)
       
   193 %			edge [below] node {$\bmkeps$} (bs);
       
   194 %		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
       
   195 %		\path 	(bs)
       
   196 %			edge [] node {$\decode$} (v);
       
   197 
       
   198 
       
   199 	\end{tikzpicture}
       
   200 	\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
       
   201 \end{figure}
   110 
   202 
   111 When simplifications are added, the inhabitation relation no longer holds,
   203 When simplifications are added, the inhabitation relation no longer holds,
   112 causing the above diagram to break.
   204 causing the above diagram to break.
   113 
   205 
   114 Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.
   206 Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.