ChengsongTanPhdThesis/Chapters/Bitcoded1.tex
author Chengsong
Wed, 23 Aug 2023 03:02:31 +0100
changeset 668 3831621d7b14
parent 667 660cf698eb26
permissions -rwxr-xr-x
added technical Overview section, almost done introduction
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     1
% Chapter Template
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     2
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     3
% Main chapter title
538
8016a2480704 intro and chap2
Chengsong
parents: 537
diff changeset
     4
\chapter{Bit-coded Algorithm of Sulzmann and Lu}
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     5
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     6
\label{Bitcoded1} % Change X to a consecutive number; for referencing this chapter elsewhere, use \ref{ChapterX}
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     7
%Then we illustrate how the algorithm without bitcodes falls short for such aggressive 
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     8
%simplifications and therefore introduce our version of the bitcoded algorithm and 
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     9
%its correctness proof in 
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
    10
%Chapter 3\ref{Chapter3}. 
564
Chengsong
parents: 543
diff changeset
    11
In this chapter, we are going to describe the bit-coded algorithm
653
bc5571c38d1f more updates in section 4.2 and incorporating Christian comments
Chengsong
parents: 649
diff changeset
    12
introduced by Sulzmann and Lu \parencite{Sulzmann2014} and their correctness proof.
667
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    13
Just like in chapter \ref{Inj}, the algorithms and proofs have been included
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    14
for self-containedness reasons,
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    15
even though they have been originally found and described by
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    16
Sulzmann and Lu (\cite{Sulzmann2014}) and 
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    17
Ausaf et al. (\cite{AusafDyckhoffUrban2016} and \cite{Ausaf}).
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    18
%In addition to this, the 
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    19
The details of the proofs in this thesis 
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    20
also follow more closely the actual Isabelle formalisation.
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    21
For example, lemma \ref{flexStepwise} and \ref{retrieveStepwise}
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    22
are not included in the publications by Ausaf et al., despite them being
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    23
some of the key lemmas leading to the correctness result.
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    24
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    25
We will first motivate the bit-coded algorithm in section \ref{bitMotivate},
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    26
and then introduce their formal definitions in section \ref{bitDef},
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    27
followed by a description of the correctness proof of $\blexer$ in section \ref{blexerProof}.
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    28
653
bc5571c38d1f more updates in section 4.2 and incorporating Christian comments
Chengsong
parents: 649
diff changeset
    29
%to address the growth problem of 
bc5571c38d1f more updates in section 4.2 and incorporating Christian comments
Chengsong
parents: 649
diff changeset
    30
%derivatives of
bc5571c38d1f more updates in section 4.2 and incorporating Christian comments
Chengsong
parents: 649
diff changeset
    31
%regular expressions. 
667
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    32
%We have implemented their algorithm in Scala and Isabelle, 
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    33
%and found problems 
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    34
%in their algorithm, such as de-duplication not working properly and redundant
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    35
%fixpoint construction. 
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
    36
\section{The Motivation Behind Using Bitcodes}\label{bitMotivate}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
    37
Let us give again the definition of $\lexer$ from Chapter \ref{Inj}:
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    38
\begin{center}
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    39
\begin{tabular}{lcl}
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    40
	$\lexer \; r \; [] $ & $=$ & $\textit{if} \; (\nullable \; r)\; \textit{then}\;  \Some(\mkeps \; r) \; \textit{else} \; \None$\\
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    41
	$\lexer \; r \;c::s$ & $=$ & $\textit{case}\; (\lexer \; (r\backslash c) \; s) \;\textit{of}\; $\\
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    42
	& & $\quad \phantom{\mid}\; \None \implies \None$\\
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    43
	& & $\quad \mid           \Some(v) \implies \Some(\inj \; r\; c\; v)$
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    44
\end{tabular}
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    45
\end{center}
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    46
\noindent
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    47
This algorithm works nicely as a functional program that utilizes Brzozowski derivatives:
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    48
each derivative character is remembered and stacked up, and
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    49
injected back in reverse order as they have been taken derivative of.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    50
The derivative operation $\backslash$ and its reverse operation
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    51
$\inj$ is of similar shape and compexity, and work in lockstep with each other.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    52
However if we take a closer look into the example run 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    53
of $\lexer$ we have shown in chapter \ref{Inj},
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    54
many inefficiencies exist:
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    55
\begin{center}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    56
	\begin{tabular}{lcl}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    57
		$(a+\textcolor{magenta}{a}\textcolor{blue}{a})^* \cdot \textcolor{red}{c}$ & $\stackrel{\backslash \textcolor{magenta}{a}}{\rightarrow}$ & $((\ONE + \textcolor{magenta}{\ONE} \textcolor{blue}{a})\cdot (a+aa)^*)\cdot \textcolor{red}{c}+\ZERO$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    58
				   %& $\stackrel{\rightarrow}{\backslash a}$ & $((\ONE + \ONE a)\cdot (a+aa)^*)\cdot c + \ZERO$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    59
				    & $\stackrel{\backslash \textcolor{blue}{a}}{\rightarrow}$  &
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    60
				    $(((\ZERO+(\ZERO a+ \textcolor{blue}{\ONE}))\cdot (a+aa)^* + (\ONE+\ONE a)\cdot (a+aa)^* )\cdot \textcolor{red}{\mathbf{c}} + \ZERO)+\ZERO$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    61
				    & $\stackrel{\backslash \textcolor{red}{c}}{\rightarrow}$  &
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    62
				    $((r_{=0}\cdot c + \textcolor{red}{\ONE})+\ZERO)+\ZERO$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    63
				    & $\stackrel{\mkeps}{\rightarrow}$ & $\Left (\Left \; (\Right \; \textcolor{red}{\Empty}))$ \\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    64
				   & $\stackrel{\inj \;\textcolor{red}{c} }{\rightarrow}$ & 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    65
				   $\Left \; (\Left \; (\Seq \;(\Left \; (\Seq \; (\Right \; (\Right\; \textcolor{blue}{\Empty})) \; \Stars \, [])) \; \textcolor{red}{c}))$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    66
				   & $\stackrel{\inj \;\textcolor{blue}{a}}{\rightarrow}$ & 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    67
				   $\Left\; (\Seq \; (\Seq\; (\Right \; (\Seq \; \textcolor{magenta}{\Empty }\; \textcolor{blue}{ \mathbf{a}}) )
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    68
				   \;\Stars \,[]) \; \textcolor{red}{c})$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    69
				   & $\stackrel{\inj \;\textcolor{magenta}{a}}{\rightarrow}$ & $\Seq \; (\Stars \; [\Right \; (\Seq \; \mathbf{\textcolor{magenta}{a}} \; \textcolor{blue}{a})]) \; \textcolor{red}{ c}$
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    70
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    71
				   %$\inj \; r \; c\A$
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    72
		%$\inj \; (a\cdot b)\cdot c \; \Seq \; \ONE \; b$ & $=$ & $(a+e)$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    73
	\end{tabular}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    74
\end{center}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    75
\noindent
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    76
For the un
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
    77
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    78
\begin{center}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    79
	\begin{tabular}{lcl}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    80
$\inj$ &  $\quad ((\ONE + {\ONE} 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    81
	\textcolor{blue}{a})\cdot (a+aa)^*)\cdot
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    82
	 + \ZERO \; \quad \textcolor{blue}{a} \; $ & \\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    83
$\quad \Left \; 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    84
(\Left \; (\Seq \;(\Left \; (\Seq \; (\Right \; (\Right\; 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    85
	\textcolor{blue}{\Empty})) \; \Stars \, [])) \; c))$ & \\$=$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    86
	$\Left \; (\inj \; ((\ONE + \ONE 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    87
	\textcolor{blue}{a})\cdot (a+aa)^*)\cdot 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    88
	c \quad  \textcolor{blue}{a} \quad (\Left \; (\Seq\; ( \Left \; (\Seq \; (\Right \; (\Right\; 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    89
	\textcolor{blue}{\Empty})) \; \Stars \, []) ) \; c ) )\;\;)$ &\\ $=$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    90
	$\Left \; (\Seq \; (\inj \; (\ONE + \ONE \textcolor{blue}{a})\cdot(a+aa)^* \quad \textcolor{blue}{a} \quad
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    91
	(\Left \; (\Seq \; (\Right \; (\Right\;\textcolor{blue}{\Empty})))) ) \; c ) $ & \\$=$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    92
	$\Left \; (\Seq \; (\Seq \; (\inj \quad (\ONE + \ONE \textcolor{blue}{a}) \quad \textcolor{blue}{a}\quad \Right \;(\Right \; \textcolor{blue}{\Empty}) ) \; \Stars \,[])\; c)$ &\\ $=$ \\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    93
	$\Left \; (\Seq \; (\Seq \; (\Right \; (\inj \; \ONE \textcolor{blue}{a} \quad \textcolor{blue}{a}\quad (\Right \;\textcolor{blue}{\Empty})))) \; \Stars \, [])$
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    94
																		   & \\ $=$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    95
																		   $\Left \; (\Seq \; (\Seq \; (\Right \; (\Seq \; (\mkeps \; \ONE)\;(\inj \;\textcolor{blue}{a} \; \textcolor{blue}{a} \; \textcolor{blue}{\Empty})) ) ) \; \Stars \, [] )$
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    96
																		   & \\ $=$\\
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    97
																		   $\Left \; (\Seq \; (\Seq \; (\Right \; (\Seq \; \Empty \; \textcolor{blue}{a}) ) ) \; \Stars \, [] )$
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    98
 	\end{tabular}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
    99
\end{center}
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   100
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   101
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   102
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   103
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   104
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   105
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   106
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   107
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   108
% The first problem with this algorithm is that
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   109
% for the function $\inj$ to work properly
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   110
% we cannot destroy the structure of a regular expression,
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   111
% and therefore cannot simplify along the way without mechanisms
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   112
% that restores the values during the simplification process.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   113
% %too aggressively.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   114
% For example,
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   115
% \[
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   116
% 	r + (r + a) \rightarrow r + a
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   117
% \]
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   118
% cannot be applied because that would require
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   119
% breaking up the inner alternative.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   120
% The $\lexer$ plus $\textit{simp}$ therefore only enables
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   121
% same-level de-duplications like
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   122
% \[
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   123
% 	r + r \rightarrow r.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   124
% \]
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   125
% Secondly, the algorithm recursively calls $\lexer$ on 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   126
% each new character input,
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   127
% and before starting a child call
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   128
% it stores information of previous lexing steps
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   129
% on a stack, in the form of regular expressions
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   130
% and characters: $r_0$, $c_0$, $r_1$, $c_1$, etc.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   131
% Each descent into deeper recursive calls in $\lexer$
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   132
% causes a new pair of $r_i, c_i$ to be pushed to the call stack.
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   133
% \begin{figure}[H]
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   134
% \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   135
% %\draw (-6,-6) grid (6,6);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   136
% \node  [ circle ] (r) at (-6, 5) {$r$};
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   137
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   138
% %\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   139
% \node  [circle, minimum size = 0.1, draw] (c1) at (-4, 5.4) {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   140
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   141
% %\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   142
% \node  [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   143
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   144
% \node [minimum width = 2cm, rectangle, draw] (stack) at (0, 3) {Stack};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   145
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   146
% \path
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   147
% 	(r)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   148
%         edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   149
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   150
% \path   (r1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   151
% 	edge [bend right, dashed] node {saved} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   152
% \path   (c1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   153
% 	edge [bend right, dashed] node {} (stack);
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   154
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   155
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   156
% \end{tikzpicture}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   157
% \caption{First derivative taken}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   158
% \end{figure}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   159
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   160
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   161
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   162
% \begin{figure}[H]
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   163
% \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   164
% %\draw (-6,-6) grid (6,6);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   165
% \node  [ circle ] (r) at (-6, 5) {$r$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   166
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   167
% %\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   168
% \node  [circle, minimum size = 0.1, ] (c1) at (-4, 5.4) {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   169
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   170
% %\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   171
% \node  [minimum size = 0.5, circle, ] (r1) at (-2, 5) {$r_1$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   172
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   173
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   174
% \node [circle, minimum size = 0.1, draw] (c2) at (0, 5.4) {$c_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   175
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   176
% %\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   177
% \node [circle, draw] (r2) at (2, 5) {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   178
% \node [minimum width = 3cm, minimum height = 1cm, rectangle, draw] (stack) at (0, 2) {\large Stack};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   179
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   180
% \path
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   181
% 	(r)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   182
%         edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   183
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   184
% \path   (r2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   185
% 	edge [bend right, dashed] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   186
% \path   (c2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   187
% 	edge [bend right, dashed] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   188
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   189
% \path   (r1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   190
% 	edge [] node {} (r2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   191
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   192
% \end{tikzpicture}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   193
% \caption{Second derivative taken}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   194
% \end{figure}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   195
% \noindent
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   196
% As the number of derivative steps increases,
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   197
% the stack would increase:
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   198
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   199
% \begin{figure}[H]
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   200
% \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   201
% %\draw (-6,-6) grid (6,6);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   202
% \node  [ circle ] (r) at (-6, 5) {$r$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   203
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   204
% %\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   205
% \node  [circle, minimum size = 0.1, ] (c1) at (-4, 5.4) {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   206
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   207
% %\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   208
% \node  [minimum size = 0.5, circle, ] (r1) at (-2, 5) {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   209
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   210
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   211
% \node [circle, minimum size = 0.1, ] (c2) at (0, 5.4) {$c_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   212
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   213
% %\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   214
% \node [circle, ] (r2) at (2, 5) {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   215
% \node [minimum width = 4cm, minimum height = 2.5cm, rectangle, draw] (stack) at (0, 1) { \large Stack};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   216
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   217
% \node [] (ldots) at (3.5, 5) {};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   218
% %\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   219
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   220
% \node [minimum size = 0.5, circle, ] (rn) at (6, 5) {};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   221
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   222
% \node (rldots) at ($(ldots)!.4!(rn)$) {\ldots};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   223
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   224
% \path
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   225
% 	(r)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   226
%         edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   227
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   228
% \path   (rldots)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   229
% 	edge [bend left, dashed] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   230
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   231
% \path   (r1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   232
% 	edge [] node {} (r2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   233
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   234
% \path   (r2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   235
% 	edge [] node {} (ldots);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   236
% \path   (ldots)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   237
% 	edge [bend left, dashed] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   238
% \path   (5.03, 4.9)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   239
% 	edge [bend left, dashed] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   240
% \end{tikzpicture}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   241
% \caption{More derivatives taken}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   242
% \end{figure}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   243
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   244
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   245
% \begin{figure}[H]
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   246
% \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   247
% %\draw (-6,-6) grid (6,6);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   248
% \node  [radius = 0.5, circle, draw] (r) at (-6, 5) {$r$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   249
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   250
% %\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   251
% \node  [circle, minimum size = 0.1, draw] (c1) at (-4, 5.4) {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   252
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   253
% %\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   254
% \node  [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   255
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   256
% %\node (0, 6)  (c2) circle [radius = 0.3] {$c_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   257
% \node [circle, minimum size = 0.1, draw] (c2) at (0, 5.4) {$c_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   258
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   259
% %\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   260
% \node [circle, draw] (r2) at (2, 5) {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   261
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   262
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   263
% \node [] (ldots) at (4.5, 5) {};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   264
% %\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   265
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   266
% \node [minimum size = 0.5, circle, draw] (rn) at (6, 5) {$r_n$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   267
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   268
% \node at ($(ldots)!.4!(rn)$) {\ldots};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   269
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   270
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   271
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   272
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   273
% \node [minimum size = 6cm, rectangle, draw] (stack) at (0, 0) {\Huge Stack};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   274
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   275
% \path
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   276
% 	(r)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   277
%         edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   278
% \path
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   279
% 	(r1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   280
%         edge [] node {} (r2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   281
% \path   (r2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   282
% 	edge [] node {} (ldots);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   283
% \path   (r)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   284
% 	edge [dashed, bend right] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   285
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   286
% \path   (r1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   287
% 	edge [dashed, ] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   288
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   289
% \path   (c1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   290
% 	edge [dashed, bend right] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   291
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   292
% \path   (c2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   293
% 	edge [dashed] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   294
% \path   (4.5, 5)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   295
% 	edge [dashed, bend left] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   296
% \path   (4.9, 5)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   297
% 	edge [dashed, bend left] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   298
% \path   (5.3, 5)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   299
% 	edge [dashed, bend left] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   300
% \path (r2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   301
% 	edge [dashed, ] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   302
% \path (rn)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   303
% 	edge [dashed, bend left] node {} (stack);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   304
% \end{tikzpicture}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   305
% \caption{Before injection phase starts}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   306
% \end{figure}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   307
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   308
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   309
% \noindent
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   310
After all derivatives have been taken, the stack grows to a maximum size
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   311
and the pair of regular expressions and characters $r_i, c_{i+1}$ 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   312
are then popped out and used in the injection phase.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   313
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   314
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   315
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   316
% \begin{figure}[H]
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   317
% \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   318
% %\draw (-6,-6) grid (6,6);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   319
% \node  [radius = 0.5, circle, ] (r) at (-6, 5) {$r$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   320
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   321
% %\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   322
% \node  [circle, minimum size = 0.1, ] (c1) at (-4, 5.4) {$c_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   323
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   324
% %\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   325
% \node  [minimum size = 0.5, circle, ] (r1) at (-2, 5) {$r_1$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   326
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   327
% %\node (0, 6)  (c2) circle [radius = 0.3] {$c_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   328
% \node [circle, minimum size = 0.1, ] (c2) at (0, 5.4) {$c_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   329
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   330
% %\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   331
% \node [circle, ] (r2) at (2, 5) {$r_2$};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   332
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   333
% %
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   334
% \node [] (ldots) at (4.5, 5) {};
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   335
% %\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   336
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   337
% \node [minimum size = 0.5, circle, ] (rn) at (6, 5) {$r_n$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   338
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   339
% \node at ($(ldots)!.4!(rn)$) {\ldots};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   340
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   341
% \node [minimum size = 0.5, circle, ] (vn) at (6, -5) {$v_n$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   342
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   343
% \node [] (ldots2) at (3.5, -5) {};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   344
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   345
% \node  [minimum size = 0.5, circle, ] (v2) at (2, -5) {$v_2$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   346
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   347
% \node at ($(ldots2)!.4!(v2)$) {\ldots};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   348
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   349
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   350
% \node [circle, ] (v1) at (-2, -5) {$v_1$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   351
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   352
% \node  [radius = 0.5, circle, ] (v) at (-6, -5) {$v$};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   353
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   354
% \node [minimum size = 6cm, rectangle, draw] (stack) at (0, 0) {\Huge Stack};
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   355
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   356
% \path
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   357
% 	(r)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   358
%         edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   359
% \path
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   360
% 	(r1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   361
%         edge [] node {} (r2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   362
% \path   (r2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   363
% 	edge [] node {} (ldots);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   364
% \path   (rn)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   365
% 	edge [] node {$\mkeps$} (vn);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   366
% \path   (vn) 
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   367
% 	edge [] node {} (ldots2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   368
% \path   (v2)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   369
% 	edge [] node {$\inj \; r_1 \; c_2\;v_2$} (v1);
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   370
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   371
% \path   (v1)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   372
% 	edge [] node {$\inj \; r \; c_1 \; v_1$} (v);
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   373
668
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   374
% \path (stack)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   375
% 	edge [dashed] node {} (-4.2, -5.2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   376
% \path (stack)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   377
% 	edge [dashed] node {} (-4, -5.2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   378
% \path (stack)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   379
% 	edge [dashed] node {} (-0.1, -5.2);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   380
% \path (stack)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   381
% 	edge [dashed] node {} (0.2, -5.26);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   382
% \path (stack)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   383
% 	edge [dashed] node {} (3.2, -5);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   384
% \path (stack)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   385
% 	edge [dashed] node {} (2.7, -5);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   386
% \path (stack)
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   387
% 	edge [dashed] node {} (3.7, -5);
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   388
% \end{tikzpicture}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   389
% \caption{Stored $r_i, c_{i+1}$ Used by $\inj$}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   390
% \end{figure}
3831621d7b14 added technical Overview section, almost done introduction
Chengsong
parents: 667
diff changeset
   391
% \noindent
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   392
Storing all $r_i, c_{i+1}$ pairs recursively
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   393
allows the algorithm to work in an elegant way, at the expense of 
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   394
storing quite a bit of verbose information on the stack.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   395
The stack seems to grow at least quadratically with respect
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   396
to the input (not taking into account the size bloat of $r_i$),
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   397
which can be inefficient and prone to stack overflows.
667
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
   398
\section{Bitcoded Algorithm}\label{bitDef}
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   399
To address this,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   400
Sulzmann and Lu defined a new datatype 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   401
called \emph{annotated regular expression},
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   402
which condenses all the partial lexing information
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   403
(that was originally stored in $r_i, c_{i+1}$ pairs)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   404
into bitcodes.
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   405
The bitcodes are then carried with the regular
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   406
expression, and augmented or moved around
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   407
as the lexing proceeds.
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   408
It becomes unnecessary
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   409
to remember all the 
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   410
intermediate expressions, but only the most recent one
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   411
with this bit-carrying regular expression.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   412
Annotated regular expressions 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   413
are defined as the following 
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   414
Isabelle datatype:\footnote{ We use subscript notation to indicate
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   415
	that the bitcodes are auxiliary information that does not
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   416
interfere with the structure of the regular expressions }
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   417
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   418
\begin{tabular}{lcl}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   419
  $\textit{a}$ & $::=$  & $\ZERO$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   420
                  & $\mid$ & $_{bs}\ONE$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   421
                  & $\mid$ & $_{bs}{\bf c}$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   422
                  & $\mid$ & $_{bs}\sum\,as$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   423
                  & $\mid$ & $_{bs}a_1\cdot a_2$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   424
                  & $\mid$ & $_{bs}a^*$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   425
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   426
\end{center}  
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   427
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   428
where $bs$ stands for bit-codes, $a$  for $\mathbf{a}$nnotated regular
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   429
expressions and $as$ for lists of annotated regular expressions.
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   430
The alternative constructor, written $\sum$, has been generalised to 
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   431
take a list of annotated regular expressions rather than just two.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   432
Why is it generalised? This is because when we analyse nested 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   433
alternatives, there can be more than two elements at the same level
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   434
after de-duplication, which can no longer be stored in a binary
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   435
constructor.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   436
Bits and bitcodes (lists of bits) are defined as:
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   437
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   438
		$b ::=   S \mid  Z \qquad
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   439
bs ::= [] \mid b::bs    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   440
$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   441
\end{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   442
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   443
We use $S$ and $Z$ rather than $1$ and $0$ is to avoid
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   444
confusion with the regular expressions $\ZERO$ and $\ONE$.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   445
The idea is to use the bitcodes
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   446
to indicate which choice was made at a certain point
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   447
during lexing.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   448
For example,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   449
$(_{Z}a+_{S}b) \backslash a$ gives us
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   450
$_{Z}\ONE + \ZERO$, where the $Z$ bitcode will
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   451
later be used to decode that a left branch was
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   452
selected at the time when the part $a+b$ was analysed by
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   453
derivative.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   454
\subsection{A Bird's Eye View of the Bit-coded Lexer}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   455
Before we give the details of the functions and definitions 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   456
related to our
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   457
$\blexer$ (\emph{b}-itcoded lexer), we first provide a high-level
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   458
view of the algorithm.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   459
\begin{figure}[H]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   460
\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   461
%\draw (-6,-6) grid (6,6);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   462
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   463
	\node [circle, draw] (r0) at (-6, 2) {$r$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   464
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   465
	\node  [radius = 0.5, circle, draw] (r) at (-6, 5) {$_{bs}a$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   466
	\path (r0)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   467
		edge [] node {$\internalise$} (r);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   468
%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   469
\node  [circle, minimum size = 0.1, draw] (c1) at (-4, 5.4) {$c_1$};
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   470
%
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   471
%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   472
\node  [minimum size = 1.0cm, circle, draw] (r1) at (-2, 5) {$_{bs_1}a_1$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   473
%
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   474
%\node (0, 6)  (c2) circle [radius = 0.3] {$c_2$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   475
\node [circle, minimum size = 0.1, draw] (c2) at (0, 5.4) {$c_2$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   476
%
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   477
%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   478
\node [circle, draw, minimum size = 1.4cm] (r2) at (2, 5) {$_{bs_2}a_2$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   479
%
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   480
%
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   481
\node [] (ldots) at (4.5, 5) {};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   482
%\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   483
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   484
\node [minimum size = 2.2cm, circle, draw] (rn) at (6, 5) {$_{bs_n}a_n$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   485
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   486
\node at ($(ldots)!.1!(rn)$) {\ldots};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   487
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   488
\node [minimum size = 0.5, circle, ] (v) at (6, 2) {$v$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   489
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   490
%\node [] (v2) at (4, -5) {};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   491
%
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   492
%\node [draw, cross out] (ldots2) at (5, -5) {};
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   493
%
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   494
%\node at ($(ldots2)!.4!(v2)$) {\ldots};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   495
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   496
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   497
\node [align = center] (decode) at (6.6, 3.2) {$\bmkeps$\\$\decode$};
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   498
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   499
\path (c1)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   500
	edge [dashed, bend left] node {} (r0);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   501
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   502
\path (c2)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   503
	edge [dashed, bend left] node {} (r0);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   504
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   505
\path (r1)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   506
	edge [dashed, bend right] node {} (r2);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   507
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   508
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   509
\path
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   510
	(r)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   511
        edge [dashed, bend right] node[left] {} (r1);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   512
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   513
\path
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   514
	(r)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   515
        edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   516
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   517
\path
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   518
	(r1)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   519
        edge [] node {} (r2);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   520
\path   (r2)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   521
	edge [] node {} (ldots);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   522
\path   (rn)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   523
	edge [] node {} (v);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   524
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   525
\path	(r0)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   526
	edge [dashed, bend right] node {} (decode);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   527
%\path	(v)
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   528
	%edge [] node {} (ldots2);
579
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   529
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   530
35df9cdd36ca more chap3
Chengsong
parents: 576
diff changeset
   531
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   532
\end{tikzpicture}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   533
\caption{A bird's eye view of $\blexer$. The $_{bsi}a_{i}$s stand
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   534
	for the annotated regular expressions in each derivative step.
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   535
	The characters used in each derivative are written as $c_i$.
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   536
The size of the derivatives typically increases during each derivative step.
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   537
The process starts with an internalise phase and concludes with a decoding phase.}
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   538
\end{figure}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   539
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   540
The plain regular expressions
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   541
are first ``lifted'' to an annotated regular expression,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   542
with the function called \emph{internalise} ($r \rightarrow _{bs}a$).
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   543
Then the annotated regular expression $_{bs}a$ will
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   544
be transformed by successive derivatives with respect 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   545
to the input stream of characters $c_1, c_2$ etc.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   546
Each time a derivative is taken, the bitcodes
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   547
are moved around, discarded or augmented together 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   548
with the regular expression they are attached to.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   549
After all input has been consumed, the 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   550
bitcodes are collected by $\bmkeps$,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   551
which traverses the nullable part of the regular expression
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   552
and collects the bitcodes along the way.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   553
The collected bitcodes are then $\decode$d with the guidance
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   554
of the input regular expression $r$ ( $_{bs}a \rightarrow r$).
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   555
The most notable improvements of $\blexer$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   556
over $\lexer$ are 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   557
\begin{itemize}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   558
	\item
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   559
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   560
		the absence of the second injection phase.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   561
	\item
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   562
		one does not need to store each pair of the 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   563
		intermediate regular expressions $_{bs_i}a_i$ and  $c_{i+1}$. 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   564
		The previous annotated regular expression $_{bs_i}a_i$'s information is passed
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   565
		on without loss to its successor $_{bs_{i+1}}a_{i+1}$,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   566
		and $c_{i+1}$'s information is stored in $L\;r$.\footnote{
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   567
		which will be used during the decode phase, where we use $r$ as
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   568
	a source of information.}
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   569
	\item
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   570
		simplification works much smoother--one can maintain correctness
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   571
		while applying quite aggressive simplifications.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   572
\end{itemize}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   573
%In the next section we will 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   574
%give the operations needed in $\blexer$,
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   575
%with some remarks on the idea behind their definitions.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   576
\subsection{Operations in $\textit{Blexer}$}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   577
The first operation we define related to bit-coded regular expressions
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   578
is how we move bits to the inside of regular expressions.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   579
This operation is called $\fuse$:
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   580
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   581
\begin{tabular}{lcl}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   582
	$\textit{fuse}\;bs \; (\ZERO)$ & $\dn$ & $\ZERO$\\
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   583
	$\textit{fuse}\;bs\; (_{bs'})\ONE$ & $\dn$ &
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   584
     $_{bs @ bs'}\ONE$\\
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   585
	$\textit{fuse}\;bs\;(_{bs'}{\bf c})$ & $\dn$ &
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   586
     $_{bs@bs'}{\bf c}$\\
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   587
	$\textit{fuse}\;bs\;(_{bs'}\sum\textit{as})$ & $\dn$ &
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   588
     $_{bs@bs'}\sum\textit{as}$\\
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   589
	$\textit{fuse}\;bs\; (_{bs'}a_1\cdot a_2)$ & $\dn$ &
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   590
     $_{bs@bs'}a_1 \cdot a_2$\\
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   591
  $\textit{fuse}\;bs\;(_{bs'}a^*)$ & $\dn$ &
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   592
     $_{bs @ bs'}a^*$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   593
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   594
\end{center} 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   595
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   596
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   597
With \emph{fuse} we are able to define the $\internalise$ function, 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   598
written $(\_)^\uparrow$,
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   599
that translates a ``standard'' regular expression into an
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   600
annotated regular expression.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   601
This function will be applied before we start
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   602
with the derivative phase of the algorithm.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   603
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   604
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   605
\begin{tabular}{lcl}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   606
  $(\ZERO)^\uparrow$ & $\dn$ & $\ZERO$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   607
  $(\ONE)^\uparrow$ & $\dn$ & $_{[]}\ONE$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   608
  $(c)^\uparrow$ & $\dn$ & $_{[]}{\bf c}$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   609
  $(r_1 + r_2)^\uparrow$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   610
  $_{[]}\sum[\textit{fuse}\,[Z]\,r_1^\uparrow,\,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   611
  \textit{fuse}\,[S]\,r_2^\uparrow]$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   612
  $(r_1\cdot r_2)^\uparrow$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   613
         $_{[]}r_1^\uparrow \cdot r_2^\uparrow$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   614
  $(r^*)^\uparrow$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   615
         $_{[]}(r^\uparrow)^*$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   616
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   617
\end{center}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   618
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   619
The opposite of $\textit{internalise}$ is
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   620
$\erase$, where all bit-codes are removed,
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   621
and the alternative operator $\sum$ for annotated
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   622
regular expressions is transformed into the binary version 
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   623
in (plain) regular expressions. This can be defined as follows:
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   624
\begin{center}\label{eraseDef}
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   625
	\begin{tabular}{lcl}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   626
		$\ZERO_\downarrow$ & $\dn$ & $\ZERO$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   627
		$( _{bs}\ONE )_\downarrow$ & $\dn$ & $\ONE$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   628
		$( _{bs}\mathbf{c} )_\downarrow$ & $\dn$ & $\mathbf{c}$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   629
		$( _{bs} a_1 \cdot a_2 )_\downarrow$ & $\dn$ & 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   630
		$ (a_1) _\downarrow \cdot  (a_2) _\downarrow$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   631
		$( _{bs} [])_\downarrow $ & $\dn$ & $\ZERO $\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   632
		$( _{bs} [a]  )_\downarrow$ & $\dn$ & $a_\downarrow$\\
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   633
		$(_{bs} \sum [a_1, \; a_2])_\downarrow$ & $\dn$ & $ (a_1) _\downarrow + ( a_2 ) _\downarrow $\\
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   634
		$(_{bs} \sum (a :: as))_\downarrow$ & $\dn$ & $ a_\downarrow + \; (_{[]} \sum as)_\downarrow$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   635
		$( _{bs} a^* )_\downarrow$ & $\dn$ & $(a_\downarrow)^*$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   636
	\end{tabular}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   637
\end{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   638
\noindent
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   639
Where we abbreviate the $\erase\; a$ operation
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   640
as $(a)_\downarrow$, for conciseness.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   641
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   642
Determining whether an annotated regular expression
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   643
contains the empty string in its language requires
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   644
a dedicated function $\bnullable$.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   645
In our formalisation this function is defined by simply calling $\erase$
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   646
before $\nullable$.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   647
\begin{definition}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   648
		$\bnullable \; a \dn  \nullable \; (a_\downarrow)$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   649
\end{definition}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   650
The function for collecting 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   651
bitcodes from a 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   652
(b)nullable regular expression is called $\bmkeps$.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   653
This function  is a lifted version of the $\textit{mkeps}$ function,
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   654
which follows the path $\mkeps$ takes to traverse the
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   655
$\nullable$ branches when visiting a regular expression,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   656
but collects bitcodes instead of generating a value.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   657
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   658
\begin{tabular}{lcl}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   659
  $\textit{bmkeps}\,(_{bs}\ONE)$ & $\dn$ & $bs$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   660
  $\textit{bmkeps}\,(_{bs}\sum a::\textit{as})$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   661
     $\textit{if}\;\textit{bnullable}\,a$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   662
  & &$\textit{then}\;bs\,@\,\textit{bmkeps}\,a$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   663
  & &$\textit{else}\;bs\,@\,\textit{bmkeps}\,(_{[]}\sum \textit{as})$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   664
  $\textit{bmkeps}\,(_{bs} a_1 \cdot a_2)$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   665
     $bs \,@\,\textit{bmkeps}\,a_1\,@\, \textit{bmkeps}\,a_2$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   666
  $\textit{bmkeps}\,(_{bs}a^*)$ & $\dn$ &
585
4969ef817d92 chap4 more
Chengsong
parents: 582
diff changeset
   667
     $bs \,@\, [S]$
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   668
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   669
\end{center}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   670
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   671
This function, just like $\mkeps$, 
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   672
visits a regular expression tree respecting
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   673
the POSIX rules. 
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   674
It traverses each child of the sequence regular expression
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   675
from left to right and creates a bitcode by stitching
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   676
together bitcodes obtained from the children expressions.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   677
In the case of alternative regular expressions, 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   678
it looks for the leftmost
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   679
$\nullable$ branch
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   680
to visit and ignores the other siblings.
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   681
%Whenever there is some bitcodes attached to a
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   682
%node, it returns the bitcodes concatenated with whatever
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   683
%child recursive calls return.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   684
The only time when $\bmkeps$ creates new bitcodes
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   685
is when it completes a star's iterations by attaching a $S$ to the end of the bitcode
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   686
list it returns.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   687
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   688
The bitcodes extracted by $\bmkeps$ need to be 
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   689
$\decode$d (with the guidance of a ``plain'' regular expression):
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   690
%\begin{definition}[Bitdecoding of Values]\mbox{}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   691
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   692
\begin{tabular}{@{}l@{\hspace{1mm}}c@{\hspace{1mm}}l@{}}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   693
  $\textit{decode}'\,bs\,(\ONE)$ & $\dn$ & $(\Empty, bs)$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   694
  $\textit{decode}'\,bs\,(c)$ & $\dn$ & $(\Char\,c, bs)$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   695
  $\textit{decode}'\,(Z\!::\!bs)\;(r_1 + r_2)$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   696
     $\textit{let}\,(v, bs_1) = \textit{decode}'\,bs\,r_1\;\textit{in}\;
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   697
       (\Left\,v, bs_1)$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   698
  $\textit{decode}'\,(S\!::\!bs)\;(r_1 + r_2)$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   699
     $\textit{let}\,(v, bs_1) = \textit{decode}'\,bs\,r_2\;\textit{in}\;
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   700
       (\Right\,v, bs_1)$\\                           
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   701
  $\textit{decode}'\,bs\;(r_1\cdot r_2)$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   702
        $\textit{let}\,(v_1, bs_1) = \textit{decode}'\,bs\,r_1\;\textit{in}$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   703
  & &   $\textit{let}\,(v_2, bs_2) = \textit{decode}'\,bs_1\,r_2$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   704
  & &   \hspace{35mm}$\textit{in}\;(\Seq\,v_1\,v_2, bs_2)$\\
585
4969ef817d92 chap4 more
Chengsong
parents: 582
diff changeset
   705
  $\textit{decode}'\,(S\!::\!bs)\,(r^*)$ & $\dn$ & $(\Stars\,[], bs)$\\
4969ef817d92 chap4 more
Chengsong
parents: 582
diff changeset
   706
  $\textit{decode}'\,(Z\!::\!bs)\,(r^*)$ & $\dn$ & 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   707
         $\textit{let}\,(v, bs_1) = \textit{decode}'\,bs\,r\;\textit{in}$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   708
  & &   $\textit{let}\,(\Stars\,vs, bs_2) = \textit{decode}'\,bs_1\,r^*$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   709
  & &   \hspace{35mm}$\textit{in}\;(\Stars\,v\!::\!vs, bs_2)$\bigskip\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   710
  
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   711
  $\textit{decode}\,bs\,r$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   712
     $\textit{let}\,(v, bs') = \textit{decode}'\,bs\,r\;\textit{in}$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   713
  & & $\textit{if}\;bs' = []\;\textit{then}\;\textit{Some}\,v\;
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   714
       \textit{else}\;\textit{None}$                       
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   715
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   716
\end{center} 
537
Chengsong
parents: 536
diff changeset
   717
\noindent
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   718
The function $\decode'$ returns a pair consisting of 
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   719
a partially decoded value and some leftover bit-list.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   720
The function $\decode'$ succeeds if the left-over 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   721
bit-sequence is empty.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   722
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   723
%$\decode$ is terminating as $\decode'$ is terminating.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   724
%$\decode'$ is terminating 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   725
%because at least one of $\decode'$'s parameters will go down in terms
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   726
%of size.\\
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   727
The reverse operation of $\decode$ is $\code$.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   728
This function encodes a value into a bitcode by converting
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   729
$\Left$ into $Z$, $\Right$ into $S$, and marks the start of any non-empty
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   730
star iteration by $S$. $ Z$ marks the border where a star iteration
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   731
terminates. 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   732
This coding is lossy, as it throws away the information about
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   733
characters, and does not encode the ``boundary'' between two
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   734
sequence values. Moreover, with only the bitcode we cannot even tell
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   735
whether the $S$s and $Z$s are for $\Left/\Right$ or $\Stars$,
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   736
but this will not be necessary for our correctness proof.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   737
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   738
\begin{tabular}{lcl}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   739
  $\textit{code}(\Empty)$ & $\dn$ & $[]$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   740
  $\textit{code}(\Char\,c)$ & $\dn$ & $[]$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   741
  $\textit{code}(\Left\,v)$ & $\dn$ & $Z :: code(v)$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   742
  $\textit{code}(\Right\,v)$ & $\dn$ & $S :: code(v)$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   743
  $\textit{code}(\Seq\,v_1\,v_2)$ & $\dn$ & $code(v_1) \,@\, code(v_2)$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   744
  $\textit{code}(\Stars\,[])$ & $\dn$ & $[Z]$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   745
  $\textit{code}(\Stars\,(v\!::\!vs))$ & $\dn$ & $S :: code(v) \;@\;
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   746
                                                 code(\Stars\,vs)$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   747
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   748
\end{center} 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   749
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   750
We can prove that given a value $v$ and regular expression $r$
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   751
with $\vdash v:r$,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   752
then we have the property that $\decode$ and $\code$ are
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   753
reverse operations of one another:
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   754
\begin{lemma}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   755
\[If \vdash v : r \; then \;\decode \; (\code \; v) \; r = \textit{Some}(v) \]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   756
\end{lemma}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   757
\begin{proof}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   758
By proving a more general version of the lemma, on $\decode'$:
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   759
\[\vdash v : r \implies \decode' \; ((\code \; v) @ ds) \; r = (v, ds) \]
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   760
Then we obtain the property by setting $ds$ to be $[]$ and unfolding the $\decode$ definition.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   761
\end{proof}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   762
\noindent
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   763
Now we define the central part of Sulzmann and Lu's second lexing algorithm,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   764
the $\bder$ function (which stands for \emph{b}itcoded-derivative):
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   765
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   766
  \begin{tabular}{@{}lcl@{}}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   767
  $(\ZERO)\,\backslash c$ & $\dn$ & $\ZERO$\\  
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   768
  $(_{bs}\ONE)\,\backslash c$ & $\dn$ & $\ZERO$\\  
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   769
  $(_{bs}{\bf d})\,\backslash c$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   770
        $\textit{if}\;c=d\; \;\textit{then}\;
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   771
         _{bs}\ONE\;\textit{else}\;\ZERO$\\  
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   772
  $(_{bs}\sum \;\textit{as})\,\backslash c$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   773
  $_{bs}\sum\;(\textit{map} \; (\_\backslash c) \; as )$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   774
  $(_{bs}\;a_1\cdot a_2)\,\backslash c$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   775
     $\textit{if}\;\textit{bnullable}\,a_1$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   776
					       & &$\textit{then}\;_{bs}\sum\,[(_{[]}\,(a_1\,\backslash c)\cdot\,a_2),$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   777
					       & &$\phantom{\textit{then},\;_{bs}\sum\,}(\textit{fuse}\,(\textit{bmkeps}\,a_1)\,(a_2\,\backslash c))]$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   778
  & &$\textit{else}\;_{bs}\,(a_1\,\backslash c)\cdot a_2$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   779
  $(_{bs}a^*)\,\backslash c$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   780
      $_{bs}(\textit{fuse}\, [Z] \; r\,\backslash c)\cdot
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   781
       (_{[]}r^*))$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   782
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   783
\end{center}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   784
\noindent
639
80cc6dc4c98b until chap 7
Chengsong
parents: 638
diff changeset
   785
For $\bder \; c\; a$, we use the infix notation $a\backslash c$ for readability.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   786
The $\bder$ function tells us how regular expressions can be recursively traversed,
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   787
where the bitcodes are augmented and carried around 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   788
when a derivative is taken.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   789
We give the intuition behind some of the more involved cases in 
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   790
$\bder$.
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   791
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   792
For example,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   793
in the \emph{star} case,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   794
a derivative of $_{bs}a^*$ means 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   795
that one more star iteration needs to be taken.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   796
We therefore need to unfold it into a sequence,
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   797
and attach an additional bit $Z$ to the front of $a \backslash c$
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   798
as a record to indicate one new star iteration is unfolded.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   799
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   800
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   801
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   802
  \begin{tabular}{@{}lcl@{}}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   803
  $(_{bs}a^*)\,\backslash c$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   804
  $_{bs}(\underbrace{\textit{fuse}\, [Z] \; a\,\backslash c}_{\text{One more iteration}})\cdot
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   805
       (_{[]}a^*))$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   806
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   807
\end{center}   
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   808
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   809
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   810
This information will be recovered later by the $\decode$ function.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   811
%The intuition is that the bit $Z$ will be decoded at the right location,
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   812
%because we accumulate bits from left to right (a rigorous proof will be given
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   813
%later).
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   814
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   815
%\begin{tikzpicture}[ > = stealth, % arrow head style
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   816
%        shorten > = 1pt, % don't touch arrow head to node
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   817
%        semithick % line style
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   818
%    ]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   819
%
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   820
%    \tikzstyle{every state}=[
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   821
%        draw = black,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   822
%        thin,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   823
%        fill = cyan!29,
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   824
%        minimum size = 7mm
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   825
%    ]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   826
%    \begin{scope}[node distance=1cm and 0cm, every node/.style=state]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   827
%		\node (k) [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   828
%        {$bs$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   829
%         \nodepart{two} $a^*$ };
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   830
%	 \node (l) [below =of k, rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   831
%        { $bs$ + [Z]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   832
%         \nodepart{two}  $(a\backslash c )\cdot a^*$ };
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   833
%    \end{scope}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   834
%    \path[->] 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   835
%	      (k) edge (l);
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   836
%\end{tikzpicture}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   837
%
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   838
%Pictorially the process looks like below.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   839
%Like before, the red region denotes
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   840
%previous lexing information (stored as bitcodes in $bs$).
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   841
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   842
%\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   843
%	\begin{scope}[node distance=1cm]   
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   844
%		\node (a) [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   845
%        {$bs$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   846
%         \nodepart{two} $a^*$ };
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   847
%	 \node (b) [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   848
%        { $bs$ + [Z]
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   849
%         \nodepart{two}  $(a\backslash c )\cdot a^*$ };
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   850
%%\caption{term 1 \ref{term:1}'s matching configuration}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   851
% 	\end{scope}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   852
%\end{tikzpicture}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   853
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   854
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   855
Another place the $\bder$ function differs
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   856
from derivatives on regular expressions
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   857
is the sequence case:
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   858
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   859
  \begin{tabular}{@{}lcl@{}}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   860
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   861
  $(_{bs}\;a_1\cdot a_2)\,\backslash c$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   862
     $\textit{if}\;\textit{bnullable}\,a_1$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   863
					       & &$\textit{then}\;_{bs}\sum\,[(_{[]}\,(a_1\,\backslash c)\cdot\,a_2),$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   864
					       & &$\phantom{\textit{then},\;_{bs}\sum\,}(\textit{fuse}\,(\textit{bmkeps}\,a_1)\,(a_2\,\backslash c))]$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   865
  & &$\textit{else}\;_{bs}\,(a_1\,\backslash c)\cdot a_2$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   866
\end{tabular}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   867
\end{center}    
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   868
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   869
%We assume that $\bmkeps$ 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   870
%correctly extracts the bitcode for how $a_1$
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   871
%matches the string prior to $c$ (more on this later).
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   872
%The bitsequence $\textit{bs}$ which was initially 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   873
%attached to the first element of the sequence
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   874
%$a_1 \cdot a_2$, has now been 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   875
%elevated to the top level of the $\sum$ constructor. 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   876
%This is because this piece of information will be needed 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   877
%whichever way the sequence is matched,
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   878
%regardless of whether $c$ belongs to $a_1$ or $a_2$.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   879
The difference mainly lies in the $\textit{if}$ branch (when $a_1$ is $\bnullable$):
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   880
we use $\bmkeps$ to store the lexing information
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   881
in $a_1$ before collapsing 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   882
it (as it has been fully matched by string prior to $c$),
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   883
and attach the collected bit-codes to the front of $a_2$
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   884
before throwing away $a_1$. 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   885
In the injection-based lexer, $r_1$ is immediately thrown away in 
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   886
the $\textit{if}$ branch,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   887
the information $r_1$ stores is therefore lost:
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   888
\begin{center}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   889
	\begin{tabular}{lcl}
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   890
$(r_1 \cdot r_2)\backslash c$ & $\dn$ & $\mathit{if} \, [] \in L(r_1)$\\
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   891
	&   & $\mathit{then}\;(r_1\backslash c) \cdot r_2 \,+\, r_2\backslash c$\\
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   892
	&   & $\mathit{else}\; \ldots$\\
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   893
	\end{tabular}
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   894
\end{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   895
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   896
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   897
%\noindent
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   898
%this loss of information makes it necessary
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   899
%to store on stack all the regular expressions'
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   900
%``snapshots'' before they were
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   901
%taken derivative of.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   902
%So that the related information will be available 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   903
%once the child recursive 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   904
%call finishes.\\
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   905
The rest of the clauses of $\bder$ is rather similar to
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   906
$\der$. \\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   907
Generalising the derivative operation with bitcodes to strings, we have 
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   908
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   909
	\begin{tabular}{@{}lcl@{}}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   910
		$a\backslash_s [] $ & $\dn$ & $a$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   911
		$a\backslash (c :: s) $ & $\dn$ & $(a \backslash c) \backslash_s s$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   912
	\end{tabular}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   913
\end{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   914
\noindent
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   915
As we did earlier, we omit the $s$ subscript at $\backslash_s$ when there is no danger
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   916
of confusion.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   917
\subsection{Putting Things Together}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   918
Putting these operations altogether, we obtain a lexer with bit-coded regular expressions
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   919
as its internal data structures, which we call $\blexer$:
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   920
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   921
\begin{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   922
\begin{tabular}{lcl}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   923
  $\textit{blexer}\;r\,s$ & $\dn$ &
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   924
      $\textit{let}\;a = (r^\uparrow)\backslash s\;\textit{in}$\\                
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   925
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   926
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   927
  & & $\;\;\textit{else}\;\textit{None}$
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   928
\end{tabular}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   929
\end{center}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   930
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   931
This function first attaches bitcodes to a
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   932
plain regular expression $r$, and then builds successive derivatives
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   933
with respect to the input string $s$, and
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   934
then test whether the result is (b)nullable.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   935
If yes, then extract the bitcodes from the nullable expression,
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   936
and decodes the bitcodes into a lexical value.
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
   937
If there does not exist a match between $r$ and $s$, the lexer
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   938
outputs $\None$ indicating a mismatch.
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   939
Ausaf and Urban formally proved the correctness of the $\blexer$, namely
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   940
\begin{property}
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   941
$\blexer \;r \; s = \lexer \; r \; s$.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   942
\end{property}
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   943
\noindent
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   944
This was claimed but not formalised in Sulzmann and Lu's work.
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   945
We introduce the proof later, after we give all
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   946
the needed auxiliary functions and definitions.
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   947
\subsection{An Example $\blexer$ Run}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
   948
Before introducing the proof we shall first walk 
580
e0f0a81f907b halfway chap3
Chengsong
parents: 579
diff changeset
   949
through a concrete example of our $\blexer$ calculating the right 
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   950
lexical information through bit-coded regular expressions.
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
   951
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   952
Consider the regular expression $(aa)^* \cdot (b+c)$ matching the string $aab$.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   953
We give again the bird's eye view of this particular example 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   954
in each stage of the algorithm:
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   955
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   956
\tikzset{three sided/.style={
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   957
        draw=none,
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   958
        append after command={
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   959
            [-,shorten <= -0.5\pgflinewidth]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   960
            ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   961
        edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   962
            ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   963
        edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)            
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   964
            ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   965
        edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   966
        }
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   967
    }
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   968
}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   969
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   970
\tikzset{three sided1/.style={
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   971
        draw=none,
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   972
        append after command={
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   973
            [-,shorten <= -0.5\pgflinewidth]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   974
            ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   975
        edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   976
            ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   977
        edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)            
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   978
            ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   979
        edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   980
        }
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   981
    }
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   982
}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   983
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   984
\begin{figure}[H]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   985
	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   986
		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   987
		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   988
		\path	(r)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   989
			edge [] node {$\internalise$} (a);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   990
		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   991
		\path	(a)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   992
			edge [] node {$\backslash a$} (a1);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   993
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   994
		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   995
		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   996
		\path	(a1)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   997
			edge [] node {$\backslash a$} (a21);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   998
		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
   999
		\path	(a22)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1000
			edge [] node {$\backslash b$} (a3);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1001
		\path	(a21)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1002
			edge [dashed, bend right] node {} (a3);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1003
		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1004
		\path	(a3)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1005
			edge [below] node {$\bmkeps$} (bs);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1006
		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1007
		\path 	(bs)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1008
			edge [] node {$\decode$} (v);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1009
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1010
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1011
	\end{tikzpicture}
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1012
	\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1013
\end{figure}
537
Chengsong
parents: 536
diff changeset
  1014
\noindent
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1015
The one dashed arrow indicates that $_Z(\ONE \cdot (aa)^*)$
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1016
turned into $ZS$ after derivative w.r.t $b$
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1017
is taken, which calls $\bmkeps$ on the nullable $_Z\ONE\cdot (aa)^*$
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1018
before processing $_Zb+_Sc$.
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1019
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1020
The annotated regular expressions
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1021
would look overwhelming if we explicitly indicate all the
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1022
locations where bitcodes are attached.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1023
For example,
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1024
$(aa)^*\cdot (b+c)$ would 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1025
look like $_{[]}(_{[]}(_{[]}a \cdot _{[]}a)^*\cdot _{[]}(_{[]}b+_{[]}c))$ 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1026
after 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1027
internalise.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1028
Therefore for readability we omit bitcodes if they are empty. 
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1029
This applies to all annotated 
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1030
regular expressions in this thesis.
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1031
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1032
%and assume we have just read the first character $a$:
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1033
%\begin{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1034
%\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1035
%    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1036
%	    {$(_{[Z]}(\ONE \cdot a) \cdot (aa)^* )\cdot bc$ 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1037
%	    \nodepart{two} $[Z] \iff \Seq \; (\Stars \; [\Seq\; a \; ??, \;??]) \; ??$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1038
%\end{tikzpicture}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1039
%\end{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1040
%\noindent
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1041
%We use the red area for (annotated) regular expressions and the blue 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1042
%area the (partially calculated) bitcodes 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1043
%and its corresponding (partial) values.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1044
%The first derivative 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1045
%generates a $Z$ bitcode to indicate
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1046
%a new iteration has been started.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1047
%This bitcode is attached to the front of
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1048
%the unrolled star iteration $\ONE\cdot a$
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1049
%for later decoding.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1050
%\begin{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1051
%\begin{tikzpicture}[]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1052
%    \node [rectangle split, rectangle split horizontal, 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1053
%	    rectangle split parts=2, rectangle split part fill={red!30,blue!20}, draw, rounded corners, inner sep=10pt]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1054
%	    (der2) at (0,0)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1055
%	    {$(_{[Z]}(\ONE \cdot \ONE) \cdot (aa)^*) \cdot bc $ 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1056
%	    \nodepart{two} $[Z] \iff \Seq \; (\Stars \; [\Seq \; a\;a, ??]) \; ??$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1057
%
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1058
%\node [draw=none, minimum size = 0.1, ] (r) at (-7, 0) {$a_1$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1059
%\path
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1060
%	(r)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1061
%	edge [->, >=stealth',shorten >=1pt, above] node {$\backslash a$} (der2);
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1062
%%\caption{term 1 \ref{term:1}'s matching configuration}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1063
%\end{tikzpicture}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1064
%\end{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1065
%\noindent
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1066
%After we take derivative with respect to 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1067
%second input character $a$, the annotated
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1068
%regular expression has the second $a$ chopped off.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1069
%The second derivative does not involve any 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1070
%new bitcodes being generated, because
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1071
%there are no new iterations or bifurcations
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1072
%in the regular expression requiring any $S$ or $Z$ marker
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1073
%to indicate choices.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1074
%\begin{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1075
%\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1076
%    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1077
%	    {$(_{[Z]}(\ONE \cdot \ONE) \cdot (aa)^*) \cdot (\ONE \cdot c) $ 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1078
%	    \nodepart{two} $[Z] \iff \Seq \; (\Stars \; [\Seq \; a\;a, ??]) \; ??$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1079
%%\caption{term 1 \ref{term:1}'s matching configuration}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1080
%\end{tikzpicture}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1081
%\end{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1082
%\noindent
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1083
%
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1084
%
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1085
%\begin{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1086
%\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1087
%    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1088
%	    {$\stackrel{Bitcoded}{\longrightarrow} \Seq(\Stars[\Char(a), \Char(a)], ???)$
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1089
%	\nodepart{two} $\Seq(\ldots, \Seq(\Char(b), \Char(c)))$  $\stackrel{Inj}{\longleftarrow}$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1090
%%\caption{term 1 \ref{term:1}'s matching configuration}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1091
%\end{tikzpicture}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1092
%\end{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1093
%\noindent
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1094
%If we do this kind of "attachment"
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1095
%and each time augment the attached partially
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1096
%constructed value when taking off a 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1097
%character:
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1098
%\begin{center}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1099
%\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1100
%	\node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},] (spPoint)
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1101
%        {$\Seq(\Stars[\Char(a), \Char(a)], \ldots)$
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1102
%         \nodepart{two} Remaining: $b c$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1103
%\end{tikzpicture}\\
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1104
%$\downarrow$\\
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1105
%\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1106
%    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1107
%        {$\Seq(\Stars[\Char(a), \Char(a)], \Seq(\Char(b), \ldots))$
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1108
%         \nodepart{two} Remaining: $c$};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1109
%\end{tikzpicture}\\
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1110
%$\downarrow$\\
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1111
%\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1112
%    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1113
%        {$\Seq(\Stars[\Char(a), \Char(a)], \Seq(\Char(b), \Char(c)))$
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1114
%         \nodepart{two} EOF};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1115
%\end{tikzpicture}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1116
%\end{center}
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1117
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1118
In the next section we introduce the correctness proof
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1119
found by Ausaf and Urban
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1120
of the bitcoded lexer.
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1121
%-----------------------------------
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1122
%	SUBSECTION 1
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1123
%-----------------------------------
667
660cf698eb26 added example of how inj and lexer works
Chengsong
parents: 657
diff changeset
  1124
\section{Correctness Proof of $\textit{Blexer}$}\label{blexerProof}
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1125
Why is $\blexer$ correct?
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1126
In other words, why is it the case that 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1127
$\blexer$ outputs the same value as $\lexer$?
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1128
Intuitively,
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1129
that is because 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1130
\begin{itemize}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1131
	\item
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1132
		$\blexer$ follows an almost identical
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1133
		path as that of $\lexer$, 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1134
		for example $r_1, r_2, \ldots$ and $a_1, a_2, \ldots$ being produced,
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1135
		which are the same up to the application of $\erase$.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1136
	\item
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1137
		The bit-encodings work properly,
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1138
		allowing the possibility of 
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1139
		pulling out the right lexical value from an
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1140
		annotated regular expression at 
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1141
		any stage, using $\bmkeps$ or $\retrieve$ (details in
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1142
		lemmas \ref{bmkepsRetrieve} and \ref{blexer_retrieve}).
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1143
\end{itemize}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1144
We will elaborate on this, with the help of
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1145
some helper functions such as $\retrieve$ and
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1146
$\flex$.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1147
\subsection{Specifications of Some Helper Functions}
622
4b1149fb5aec incorporated more comments, bib
Chengsong
parents: 585
diff changeset
  1148
The functions we introduce will give a more detailed view into 
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
  1149
the lexing process, which is not possible
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1150
using $\lexer$ or $\blexer$ alone.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1151
\subsubsection{$\textit{Retrieve}$}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1152
The first function we shall introduce is $\retrieve$.
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1153
Sulzmann and Lu gave its definition, and
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1154
Ausaf and Urban found it
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1155
useful in their mechanised proofs.
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1156
Our bit-coded lexer ``retrieve''s the bitcodes using $\bmkeps$,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1157
after all the derivatives have been taken:
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1158
\begin{center}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1159
\begin{tabular}{lcl}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1160
	& & $\ldots$\\
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1161
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1162
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1163
  & & $\ldots$
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1164
\end{tabular}
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1165
\end{center}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1166
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1167
The function $\bmkeps$ retrieves the value $v$'s
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1168
information in the format
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1169
of bitcodes, by travelling along the
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1170
path of the regular expression that corresponds to a POSIX match,
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1171
collecting all the bitcodes.
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1172
We know that this "retrieved" bitcode leads to the correct value after decoding,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1173
which is $v_0$ in the injection-based lexing diagram:
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1174
\vspace{20mm}
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1175
\begin{figure}[H]%\label{graph:injLexer}
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1176
\begin{center}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1177
\begin{tikzcd}[
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1178
	every matrix/.append style = {name=p},
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1179
	remember picture, overlay,
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1180
	]
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1181
	a_0 \arrow[r, "\backslash c_0"]  \arrow[d] & a_1 \arrow[r, "\backslash c_1"] \arrow[d] & a_2 \arrow[r, dashed] \arrow[d] & a_n \arrow[d] \\
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1182
v_0           & v_1 \arrow[l,"inj_{r_0} c_0"]                & v_2 \arrow[l, "inj_{r_1} c_1"]              & v_n \arrow[l, dashed]         
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1183
\end{tikzcd}
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1184
\end{center}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1185
\begin{tikzpicture}[
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1186
	remember picture, overlay,
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1187
E/.style = {ellipse, draw=blue, dashed,
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1188
            inner xsep=4mm,inner ysep=-4mm, rotate=90, fit=#1}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1189
                        ]
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1190
\node[E = (p-1-1) (p-2-1)] {};
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1191
\node[E = (p-1-4) (p-2-4)] {};
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1192
\node[E = (p-1-2) (p-2-2)] {};
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1193
\node[E = (p-1-3) (p-2-3)] {};
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1194
\end{tikzpicture}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1195
\caption{Injection-based lexing diagram, 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1196
with matching value and regular expression pairs
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1197
encircled}\label{Inj2}
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1198
\end{figure}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1199
\vspace{20mm}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1200
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1201
We have that $\vdash v_i:(a_i)_\downarrow$,
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1202
where $v_i$ are the intermediate values generated step by step in $\lexer$.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1203
These values are not explicitly created in $\blexer$.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1204
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1205
We encircled in the diagram  all the pairs $v_i, a_i$ to show that these values
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1206
and regular expressions share the same structure.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1207
These pairs all contain adequate information to 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1208
obtain the final lexing result.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1209
For example, in the leftmost pair the 
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1210
lexical information is condensed in 
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1211
$v_0$, whereas for the rightmost pair,
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1212
the lexing result hides is in the bitcodes of $a_n$.
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1213
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1214
The function $\bmkeps$ is able to extract from $a_n$ the result
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1215
by looking for nullable parts of the regular expression.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1216
However for regular expressions $a_i$ in general,
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1217
they might not necessarily be nullable.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1218
For those regular expressions that are not nullable,
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1219
$\bmkeps$ will not work. 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1220
Therefore they need some additional ``help'' 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1221
finding the POSIX bit-encoding.
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1222
The most straightforward ``help'' comes from $a_i$'s corresponding
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1223
value $v_i$, and this suggests a function $f$ satisfying the
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1224
following properties:
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1225
\begin{itemize}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1226
	\item
581
9db2500629be chap3 almost done
Chengsong
parents: 580
diff changeset
  1227
		$f \; a_i\;v_i = f \; a_n \; v_n = \bmkeps \; a_n$%\decode \; (\bmkeps \; a_n) \; (\erase \; a_0)$
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1228
	\item
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1229
		$f \; a_i\;v_i = f \; a_0 \; v_0 = \code \; v_0$ %\decode \;(\code \; v_0) \; (\erase \; a_0)$
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1230
\end{itemize}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1231
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1232
Sulzmann and Lu came up with the following definition for $f$ satisfying 
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1233
the above
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1234
requirements, and named it \emph{retrieve}:
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1235
\vspace{-7mm}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1236
\begin{center}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1237
\begin{tabular}{llcl}
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1238
	$\retrieve \; \, _{bs} \ONE$ & $   \Empty$ & $\dn$ & $\textit{bs}$\\
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1239
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1240
	$\retrieve \; \, _{bs} \mathbf{c}$ & $ (\Char \; c) $ & $\dn$ & $ \textit{bs}$\\
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1241
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1242
	$\retrieve \; \, _{bs} a_1 \cdot a_2$ & $   (\Seq \; v_1 \; v_2) $ & 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1243
	$\dn$ &  $\textit{bs} \; @\;  (\retrieve \; a_1\; v_1)\; @ \;(\retrieve \; a_2 \; v_2)$\\
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1244
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1245
	$\retrieve \; \, _{bs} \Sigma (a :: \textit{as})$ & $   (\Left \; v) $ & 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1246
	$\dn$ & $\textit{bs}\; @\; (\retrieve \; a \; v)$\\
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1247
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1248
	$\retrieve \; \, _{bs} \Sigma (a :: \textit{as})$ & $   (\Right \; v) $ & 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1249
	$\dn$ & $\textit{bs}\; @\; (\retrieve \; (_{[]}\Sigma \textit{as}) \; v)$\\
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1250
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1251
	$\retrieve \; \, _{bs} a^* $ & $   (\Stars \; (v :: vs)) $ & $\dn$ & 
585
4969ef817d92 chap4 more
Chengsong
parents: 582
diff changeset
  1252
	$\textit{bs}\; @\; [Z] \; @ \; (\retrieve \; a \; v)\; @ \; 
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1253
	(\retrieve \; (_{[]} a^*) \; (\Stars \; vs) )$\\
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1254
585
4969ef817d92 chap4 more
Chengsong
parents: 582
diff changeset
  1255
	$\retrieve \; \, _{bs} a^* $ & $  (\Stars \; []) $ & $\dn$ & $\textit{bs} \; @ \; [S]$
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1256
\end{tabular}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1257
\end{center}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1258
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1259
As stated, $\retrieve$ collects the right bit-codes from the 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1260
final derivative $a_n$, guided by $v_n$. This can be proved
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1261
as follows:
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1262
\begin{lemma}\label{bmkepsRetrieve}
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1263
	\mbox{}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1264
	\begin{itemize}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1265
		\item
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1266
If $\forall v \in vs. \vdash v : r$, and  $\code \; v = \retrieve \; (\internalise\; r) \; v$\\
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1267
then $\code \; (\Stars \; vs) = \retrieve \; _{[]}((\internalise \;  r)^*) \; (\Stars \; vs)$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1268
		\item
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1269
	$\vdash v : r \implies \retrieve \; (r)^\uparrow \; v = \code \; v$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1270
		\item
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1271
	$\bnullable \; a \implies \bmkeps \; a = \retrieve \; a \; (\mkeps \; (a_\downarrow))$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1272
	\end{itemize}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1273
\begin{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1274
	The first part is by induction on the value list $vs$.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1275
	The second part is by induction on $r$,
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1276
	and the star case uses the first part.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1277
	The last part is by a routine induction on $a$.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1278
\end{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1279
\noindent
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1280
\end{lemma}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1281
\begin{proof}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1282
	By a routine induction on $a$.
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1283
\end{proof}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1284
\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1285
%The design of $\retrieve$ enables us to extract bitcodes
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1286
%from both annotated regular expressions and values.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1287
%$\retrieve$ always ``sucks up'' all the information.
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1288
%When more information is stored in the value, we would be able to
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1289
%extract more from the value, and vice versa.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1290
%For example in star iterations, $\retrieve$ will be able to extract from $\Stars \; vs$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1291
%exactly the same bitcodes as $\code \; (\Stars \; vs)$:
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1292
%\begin{lemma}\label{retrieveEncodeSTARS}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1293
%  If $\forall v \in vs. \vdash v : r$, and  $\code \; v = \retrieve \; (\internalise\; r) \; v$\\
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1294
%  then $\code \; (\Stars \; vs) = \retrieve \; _{[]}((\internalise \;  r)^*) \; (\Stars \; vs)$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1295
%\end{lemma}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1296
%\begin{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1297
%	By induction on the value list $vs$.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1298
%\end{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1299
%\noindent
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1300
%Similarly, when the value list does not hold information,
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1301
%only the bitcodes plus some delimiter will be recorded:
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1302
%\begin{center}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1303
%  $\retrieve \; _{bs}((\internalise \;  r)^*) \; (\Stars \; [] )  = bs @ [S]$.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1304
%\end{center}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1305
%In general, if we have a regular expression that is just internalised
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1306
%and the final lexing result value, we can $\retrieve$ the bitcodes
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1307
%that look as if we have en$\code$-ed the value as bitcodes:
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1308
%\begin{lemma}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1309
%	$\vdash v : r \implies \retrieve \; (r)^\uparrow \; v = \code \; v$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1310
%\end{lemma}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1311
%\begin{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1312
%	By induction on $r$.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1313
%	The star case uses lemma \ref{retrieveEncodeSTARS}.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1314
%\end{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1315
%\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1316
The following property of $\retrieve$ is crucial, as
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1317
it provides a "bridge" between $a_0$ and $a_n $ in the
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1318
lexing diagram by linking adjacent regular expressions $a_i$ and
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1319
$a_{i+1}$.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1320
The property says that one 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1321
can retrieve the same bits from a derivative 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1322
regular expression as those from 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1323
before the derivative took place,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1324
provided that the corresponding values are used respectively:
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1325
\begin{lemma}\label{retrieveStepwise}
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1326
	$\vdash v : (a\backslash c)_\downarrow \implies \retrieve \; (a \backslash c)  \;  v= 
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1327
							\retrieve \; a \; (\inj \; a_\downarrow\; c\; v)$
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1328
\end{lemma}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1329
\begin{proof}
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
  1330
	We do an induction on $r$, generalising over $v$.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1331
	The induction principle in our formalisation
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1332
	is function $\erase$'s cases.
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1333
\end{proof}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1334
\noindent
649
Chengsong
parents: 640
diff changeset
  1335
Intuitively the lemma can be understood as all lexical value
Chengsong
parents: 640
diff changeset
  1336
information being preserved and recoverable throughout each lexing step.
Chengsong
parents: 640
diff changeset
  1337
We shall see in the next chapter that this no longer
Chengsong
parents: 640
diff changeset
  1338
holds with simplifications which takes away certain sub-expressions
Chengsong
parents: 640
diff changeset
  1339
corresponding to non-POSIX lexical values.
Chengsong
parents: 640
diff changeset
  1340
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1341
%Before we move on to the next
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1342
%helper function, we rewrite $\blexer$ in
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1343
%the following way which makes later proofs more convenient:
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1344
%\begin{lemma}\label{blexer_retrieve}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1345
%$\blexer \; r \; s = \decode  \; (\retrieve \; (\internalise \; r) \; (\mkeps \; (r \backslash s) )) \; r$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1346
%\end{lemma}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1347
%\begin{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1348
%	Using lemma \ref{bmkepsRetrieve}.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1349
%\end{proof}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1350
%\noindent
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1351
The function $\retrieve$ allows connecting
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1352
between the intermediate 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1353
results $a_i$ and $a_{i+1}$ in $\blexer$.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1354
For plain regular expressions something similar is required.
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1355
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1356
\subsection{The Function $\flex$}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1357
Ausaf and Urban defined an auxiliary function called $\flex$ for $\lexer$,
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1358
defined as
536
aff7bf93b9c7 comments addressed all
Chengsong
parents: 532
diff changeset
  1359
\begin{center}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1360
\begin{tabular}{lcl}
536
aff7bf93b9c7 comments addressed all
Chengsong
parents: 532
diff changeset
  1361
$\flex \; r \; f \; [] \; v$       &  $=$ &   $f\; v$\\
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1362
$\flex \; r \; f \; (c :: s) \; v$ &  $=$ &   $\flex \; r \; (\lambda v. \, f (\inj \; r\; c\; v)) \;\; s \; v$
536
aff7bf93b9c7 comments addressed all
Chengsong
parents: 532
diff changeset
  1363
\end{tabular}
aff7bf93b9c7 comments addressed all
Chengsong
parents: 532
diff changeset
  1364
\end{center}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1365
which accumulates the characters that need to be injected back, 
640
bd1354127574 more proofreading done, last version before submission
Chengsong
parents: 639
diff changeset
  1366
and does the injection in a stack-like manner (the last character being chopped off in the derivatives phase is first injected).
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1367
The function
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1368
$\flex$ can calculate what $\lexer$ calculates, given the input regular expression
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1369
$r$, the identity function $id$, the input string $s$ and the value
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1370
$v_n= \mkeps \; (r\backslash s)$:
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1371
\begin{lemma}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1372
	$\flex \;\; r \;\; \textit{id}\;\; s \;\; (\mkeps \;(r\backslash s)) = \lexer \; r \; s$
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1373
\end{lemma}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1374
\begin{proof}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1375
	By reverse induction on $s$.
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1376
\end{proof}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1377
\noindent
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1378
The main advantage of using $\flex$
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1379
in $\lexer$ is that
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1380
$\flex$ makes the value $v$ and function $f$
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1381
that manipulates the value  explicit parameters,
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1382
which ``exposes'' $\lexer$'s recursive call
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1383
arguments and allows us to ``set breakpoints'' and ``resume''
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1384
at any point during $\lexer$'s recursive calls. Therefore
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1385
the following stepwise property holds. 
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1386
\begin{lemma}\label{flexStepwise}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1387
	$\textit{flex} \;\; r \;\; f \;\; (s@[c]) \;\; v = \flex \;\; r \;\; f  \;\; s \;\; (\inj \; (r\backslash s) \; c \; v) $
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1388
\end{lemma}
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1389
\begin{proof}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1390
	By induction on the shape of $r\backslash s$.
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1391
\end{proof}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1392
\noindent
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1393
With $\flex$ and $\retrieve$, 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1394
we are ready to connect $\lexer$ and $\blexer$,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1395
giving the correctness proof for $\blexer$.
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1396
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1397
%----------------------------------------------------------------------------------------
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1398
%	SECTION  correctness proof
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1399
%----------------------------------------------------------------------------------------
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1400
\section{Correctness of the  Bit-coded Algorithm (Without Simplification)}
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1401
We can first show that 
638
dd9dde2d902b comments till chap4
Chengsong
parents: 624
diff changeset
  1402
$\flex$ and $\blexer$ calculates the same value.
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1403
\begin{lemma}\label{flex_retrieve}
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1404
	If $\vdash v: (r\backslash s)$, then\\
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1405
$\flex \; r \; \textit{id}\; s\; v = \decode \; (\retrieve \; (r\backslash s )\; v) \; r$
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1406
\end{lemma}
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1407
\begin{proof}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1408
By induction on $s$. 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1409
We prove the interesting case where
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1410
both $\flex$ and $\decode$ successfully terminates
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1411
with some value.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1412
We take advantage of the stepwise properties  
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1413
both sides.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1414
The induction tactic is reverse induction on string $s$.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1415
The inductive hypothesis says that $\flex \; r \; id \;s \; v =
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1416
\decode \; (\retrieve \; (r\backslash s) \; v) \; r$ holds,
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1417
where $v$ can be any value satisfying 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1418
the assumption $\vdash v: (r\backslash s)$.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1419
The crucial point is to rewrite 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1420
\[
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1421
	\retrieve \;\; (r \backslash (s@[c])) \;\; (\mkeps\; (r \backslash (s@[c]) ))
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1422
\]
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1423
as
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1424
\[
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1425
	\retrieve \;\; (r \backslash s) 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1426
	\;\; (\inj \; (r \backslash s) \; c\;  \mkeps (r \backslash (s@[c])))
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1427
\]
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1428
using lemma \ref{retrieveStepwise}.
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1429
This enables us to equate 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1430
\[
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1431
	\retrieve \; (r \backslash (s@[c])) \; (\mkeps \; (r \backslash (s@[c]) ))
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1432
\] 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1433
with 
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1434
\[
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1435
\flex \; r \; \textit{id} \; s \; (\inj \; (r\backslash s) \; c\; (\mkeps (r\backslash s@[c])))
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1436
\]
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1437
using IH,
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1438
which in turn can be rewritten as
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1439
\[
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1440
	\flex \; r \; \textit{id} \; (s@[c]) \;  (\mkeps \; (r \backslash (s@[c])))
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1441
\].
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1442
\end{proof}
582
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1443
\noindent
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1444
With this pivotal lemma we can now link $\flex$ and $\blexer$
3e19073e91f4 chap3 done
Chengsong
parents: 581
diff changeset
  1445
and finally give the correctness of $\blexer$--it outputs the same result as $\lexer$:
649
Chengsong
parents: 640
diff changeset
  1446
\begin{theorem}\label{blexerCorrect}
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1447
	$\blexer\; r \; s = \lexer \; r \; s$
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1448
\end{theorem}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1449
\begin{proof}
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1450
	We can rewrite the expression $\blexer \; r \; s$ by using lemma \ref{bmkepsRetrieve}:
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1451
\[
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1452
	\blexer \; r \; s = \decode  \; (\retrieve \; (\; r^\uparrow) \; (\mkeps \; (r \backslash s) )) \; r
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1453
\]
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1454
	From \ref{flex_retrieve} we have that the left-hand-side is equal to 
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1455
\[
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1456
	\textit{flex} \; r \; \textit{id} \; s \; \mkeps \; (r \backslash s),
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1457
\]
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1458
which in turn equals 
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1459
\[
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1460
	\lexer \; r \; s.
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1461
\]
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1462
	%which immediately implies this theorem.
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1463
\end{proof}
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1464
\noindent
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1465
To piece things together 
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1466
%and spell out the correctness
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1467
%result of the bitcoded lexer more explicitly,
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1468
and give the correctness result explicitly,
576
3e1b699696b6 thesis chap5
Chengsong
parents: 575
diff changeset
  1469
we use the fact from the previous chapter that
3e1b699696b6 thesis chap5
Chengsong
parents: 575
diff changeset
  1470
\[
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1471
	 (r, s) \rightarrow v \;\; \textit{iff} \;\; \lexer \; r \; s =\Some \;  v
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1472
	 \quad \quad
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1473
	 \nexists v. (r, s) \rightarrow v\;\; \textit{iff} \;\; \lexer \;r\;s = None
576
3e1b699696b6 thesis chap5
Chengsong
parents: 575
diff changeset
  1474
\]
3e1b699696b6 thesis chap5
Chengsong
parents: 575
diff changeset
  1475
to obtain this corollary:
3e1b699696b6 thesis chap5
Chengsong
parents: 575
diff changeset
  1476
\begin{corollary}\label{blexerPOSIX}
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1477
	The $\blexer$ function correctly implements a POSIX lexer, namely
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1478
	\begin{itemize}
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1479
	\item
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1480
	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \; s = \Some \; v$
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1481
	\item
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1482
	$\nexists v. (r, s) \rightarrow v\;\; \textit{iff} \;\; \blexer \;r\;s = None$
657
00171b627b8d Fixed some annotated/unannotated a/r notation inconsistencies.
Chengsong
parents: 653
diff changeset
  1483
	\end{itemize}
576
3e1b699696b6 thesis chap5
Chengsong
parents: 575
diff changeset
  1484
\end{corollary}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1485
Our main reason for analysing the bit-coded algorithm over 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 622
diff changeset
  1486
the injection-based one is that it allows us to define
542
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1487
more aggressive simplifications.
a7344c9afbaf chapter3 finished
Chengsong
parents: 538
diff changeset
  1488
We will elaborate on this in the next chapter.
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1489
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1490